from Hacker News

Perma.cc – Permanent Link Service

by brianzelip on 2/7/25, 2:04 PM with 79 comments

  • by JackC on 2/10/25, 5:00 PM

    Hi! Perma is made by the Harvard Library Innovation Lab, which I direct, and I wrote a bunch of the early code for it back in 2015 or so.

    For HN readers, I'd suggest checking out https://tools.perma.cc/, where we post a bunch of the open source work that backs this. Due to the shift from WARC to WACZ (a zipped web archive format developed by WebRecorder), it's now possible to pass around fully interactive, high-fidelity web archives as simple files and host them with client-side JavaScript, which opens up a bunch of new possibilities for web archive designs. You can see some tech demos of that on our page https://warcembed-demo.lil.tools/, where each page is just a static file on the server and some client-side JavaScript.
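    To make the "simple files" point concrete, here is a minimal sketch of inspecting a WACZ-style archive with nothing but a ZIP reader. The entry names (datapackage.json, archive/, pages/pages.jsonl) follow my understanding of the WACZ layout and should be treated as assumptions; build_demo_wacz and summarize_wacz are hypothetical helpers, and the demo file is a placeholder, not a real capture or anything from Perma's codebase.

```python
import io
import json
import zipfile


def build_demo_wacz() -> bytes:
    """Build a tiny stand-in ZIP with a WACZ-like layout (placeholder contents)."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # Assumed metadata entry; a real WACZ lists its resources here.
        zf.writestr(
            "datapackage.json",
            json.dumps({"profile": "data-package", "wacz_version": "1.1.1", "resources": []}),
        )
        # Placeholder for the captured traffic; a real file would hold WARC records.
        zf.writestr("archive/data.warc.gz", b"")
        # Placeholder page list, one JSON object per line.
        zf.writestr("pages/pages.jsonl", '{"format": "json-pages-1.0", "id": "pages"}\n')
    return buf.getvalue()


def summarize_wacz(raw: bytes) -> dict:
    """Read the ZIP directory and metadata without unpacking to disk."""
    with zipfile.ZipFile(io.BytesIO(raw)) as zf:
        names = zf.namelist()
        meta = json.loads(zf.read("datapackage.json"))
    return {
        "warc_files": [n for n in names if n.startswith("archive/")],
        "has_pages": "pages/pages.jsonl" in names,
        "wacz_version": meta.get("wacz_version"),
    }


if __name__ == "__main__":
    print(summarize_wacz(build_demo_wacz()))
```

    Because the container is an ordinary ZIP, a static file host plus a client-side reader is enough to serve it, which is the property the demos rely on.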

    It's best to think of Perma.cc itself, the service, as some UX and user support wrapping to help solve linkrot, primarily for courts, law journals, and journalists (for example, dashboards for a law journal to collaborate on the links they're archiving for their authors), and our work on this as building from that use case to try to make it easier for everyone to build similar things.

    I saw some mentions of the Internet Archive, which is great, and is also kind enough to keep a copy of our archives and expose them through the Wayback Machine. One thing I've been thinking about recently in archiving is that there's a risk to overstandardizing -- you don't want everything captured with the same software platforms, funded through the same models, governed by the same people, exposed through the same interfaces, etc. There are supposed to be thousands of libraries, not one library. Unlike "don't roll your own crypto," I'd honestly love to see more people roll their own archives.

    Happy to answer any questions!

  • by lolinder on 2/10/25, 5:50 PM

    Heads up that the .cc TLD is frequently used for malicious purposes and will likely get blocked by a lot of networks.

    When I've worked on spam prevention in the past, that TLD always comes up disproportionately often. I've never personally built a filter that blocks the entire TLD, but I'm sure from looking at the data that people with stricter compliance requirements have.

    The Anti-Phishing Working Group ranked the TLD the second-worst in the ratio of phishing domains to total registrations, with the highest total volume of phishing (page 13):

    https://docs.apwg.org//reports/APWG_Global_Phishing_Report_2...

  • by LorenDB on 2/10/25, 4:07 PM

    Permanent, until they go out of business. We should just standardize on archive.org and figure out a way to distribute redundant copies of its data around in such a way that it can survive even if the original Internet Archive goes down.

    I hate to push blockchain stuff, but something like IPFS might actually be a good idea here.

  • by jbullock35 on 2/10/25, 4:31 PM

    Prospective users are understandably concerned that perma.cc will go out of business. No institution can guarantee that it will exist in perpetuity. But perma.cc has at least published a contingency plan: https://perma.cc/contingency-plan.
  • by emddudley on 2/10/25, 4:55 PM

    PURL (https://purl.archive.org/) is a similar permanent URL service but you choose the URL.

    It used to be hosted at purl.org and run by the OCLC but in 2016 it was transferred to the Internet Archive.

    https://web.archive.org/web/20161002094639/https://www.oclc....

  • by DanAtC on 2/10/25, 4:46 PM

    Using a country code TLD is a bold choice https://www.theregister.com/2024/10/10/io_domain_uk_mauritiu...
  • by hombre_fatal on 2/10/25, 5:53 PM

    The diagram of link rot is depressing.

    I used the same unique online alias from age ten to eighteen.

    I used to be able to google it and see hundreds of results. Dozens of forums I posted on. Dozens of games I played. In my twenties, I'd do this for the nostalgia of reading posts I'd written in my preteen era.

    Now, there are just seven results.

  • by zoezoezoezoe on 2/7/25, 2:44 PM

    If the world has taught me anything, it's that nothing is permanent, and nothing is perfect. Forums from days of yore are littered with tinybucket 404 pictures and anonymous Imgur images are gone. We like to imagine that the internet will stay the way it is forever, but I don't believe it. Free internet services like email, file uploads, etc. won't last forever. The idea is amazing, and exactly what we need for a constantly changing internet, but the only thing that is forever is nothingness.
  • by LightHugger on 2/10/25, 5:14 PM

    There have been cases where companies convinced the Internet Archive to take down archives of sites for various reasons, ranging from copyright concerns to social outrage. How do you plan to handle these eventualities?
  • by jmuguy on 2/10/25, 5:29 PM

    Why use the .cc TLD? I'm guessing that choice was made a while back? Unfortunately, due to its association with phishing etc., it immediately gave me the impression that the service isn't legit.
  • by Pikamander2 on 2/10/25, 5:13 PM

    Does this solve any of the problems that other link shorteners have, like eventually breaking when the site goes bankrupt or getting blocked by sites like Reddit due to their ability to conceal spam?
  • by bArray on 2/10/25, 5:33 PM

    Personally, I started building tooling around Data URIs to provide permanent archives of external resources. You can only really store text or single images like this, and you have to get the size below 16 KB [1] (64 KB at a push) to be reliable. You can do some form of compression, but it would be nicer to have support for longer Data URIs.

    I've literally embedded these into PDFs and all sorts, all of which still open today. It doesn't really increase the document size notably, but does make them more robust.

    [1] https://stackoverflow.com/a/695167
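
    A minimal sketch of the Data URI approach described above, assuming base64 encoding and the conservative 16 KB ceiling mentioned in the comment. The helper names (to_data_uri, from_data_uri, fits_limit) and the MAX_DATA_URI_BYTES constant are hypothetical and are not taken from the commenter's tooling.

```python
from __future__ import annotations

import base64
from urllib.parse import unquote_to_bytes

# Conservative ceiling taken from the comment above; actual limits vary by client.
MAX_DATA_URI_BYTES = 16 * 1024


def to_data_uri(payload: bytes, mime_type: str = "text/plain;charset=utf-8") -> str:
    """Encode a payload as a base64 data URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{mime_type};base64,{encoded}"


def from_data_uri(uri: str) -> tuple[str, bytes]:
    """Split a data URI into (media type, decoded bytes)."""
    header, _, data = uri.partition(",")
    if not header.startswith("data:"):
        raise ValueError("not a data URI")
    meta = header[len("data:"):]
    if meta.endswith(";base64"):
        return meta[: -len(";base64")], base64.b64decode(data)
    # Non-base64 data URIs carry percent-encoded bytes instead.
    return meta, unquote_to_bytes(data)


def fits_limit(uri: str, limit: int = MAX_DATA_URI_BYTES) -> bool:
    """Check the encoded length, since base64 inflates the payload by about a third."""
    return len(uri) <= limit


if __name__ == "__main__":
    uri = to_data_uri("hello".encode("utf-8"))
    print(uri, fits_limit(uri))
    print(from_data_uri(uri))
```

    The size check runs on the encoded string rather than the raw payload, so roughly 12 KB of raw data is the practical budget under a 16 KB ceiling.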

  • by aghilmort on 2/10/25, 4:59 PM

    How does this compare and contrast with DOI? Does it complement it, replace it, etc.?
  • by abetancort on 2/11/25, 2:49 AM

    Way too expensive. I will keep using archive.is and forget I ever saw this post.
  • by pinoy420 on 2/10/25, 11:42 PM

    So who maintains the database? Or is it hashed? Or what?
  • by terrycody on 2/11/25, 2:49 AM

    But even Google retired their link shortener service.
  • by RobLach on 2/10/25, 10:13 PM

    Very cool
  • by zouhair on 2/10/25, 4:22 PM

    It's like another website that could break.
  • by lorenzowood on 2/10/25, 8:02 PM

    Thought I would try this with a browser plug-in — “This item is not available” says the Chrome store ¯\_(ツ)_/¯
  • by rsync on 2/10/25, 8:59 PM

    Scroll down to:

    "Perma.cc’s Supporting Partners"

    ... one of those partners is not like the others ...

  • by nikolay on 2/10/25, 8:20 PM

    This is old, dead, and super expensive!
  • by liotier on 2/10/25, 4:21 PM

    The About page says "We’re both in the forever business" but doesn't back up the claim with any financial information to even credibly argue that they are in the ten years business.
  • by bananapub on 2/10/25, 4:28 PM

    95% of the work for a useful product like this is figuring out how to ensure it survives your dumb business model or the idiocy of the tech industry, and yet none of the words on this landing page explain what work they've done on that.

    did they not understand that at all? did they just not do any work on it? or did they understand it and do great work but just fail to put it in 100pt text as item 0 on the list? who knows.