by brianzelip on 2/7/25, 2:04 PM with 79 comments
by JackC on 2/10/25, 5:00 PM
For HN readers, I'd suggest checking out https://tools.perma.cc/, where we post a bunch of the open source work that backs this. Due to the shift from WARC to WACZ (a zipped web archive format developed by WebRecorder), it's now possible to pass around fully interactive, high-fidelity web archives as simple files and host them with client-side JavaScript, which opens up a bunch of new possibilities for web archive designs. You can see some tech demos of that at our page https://warcembed-demo.lil.tools/, where each page is just a static file on the server and some client-side JavaScript.
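To make the "archives as simple files" point concrete, here is a minimal sketch in Python that treats a WACZ as an ordinary ZIP and lists what is inside. It assumes the commonly described WACZ layout (`datapackage.json`, WARC data under `archive/`, and a `pages/pages.jsonl` page list); the helper names `summarize_wacz` and `build_demo_wacz` are hypothetical, the demo archive holds placeholder bytes instead of a real WARC, and none of this is Perma.cc's actual code.

```python
import io
import json
import zipfile


def summarize_wacz(data: bytes) -> dict:
    """Summarize a WACZ-style ZIP given its raw bytes (hypothetical helper)."""
    with zipfile.ZipFile(io.BytesIO(data)) as zf:
        names = zf.namelist()
        summary = {
            "files": sorted(names),
            # Assumption: WARC payloads live under archive/ in the ZIP.
            "warcs": sorted(n for n in names if n.startswith("archive/")),
            "pages": [],
        }
        # Assumption: pages/pages.jsonl holds one JSON object per line,
        # with a header line first and one entry per captured page after it.
        if "pages/pages.jsonl" in names:
            text = zf.read("pages/pages.jsonl").decode("utf-8")
            for line in text.splitlines():
                if not line.strip():
                    continue
                record = json.loads(line)
                if "url" in record:  # the header line has no "url" key
                    summary["pages"].append(record["url"])
    return summary


def build_demo_wacz() -> bytes:
    """Build a tiny stand-in archive in memory so the sketch is runnable."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_STORED) as zf:
        zf.writestr("datapackage.json", json.dumps({"profile": "data-package"}))
        # Placeholder bytes, NOT a valid WARC record.
        zf.writestr("archive/data.warc.gz", b"placeholder")
        pages = [
            {"format": "json-pages-1.0", "id": "pages", "title": "All Pages"},
            {"url": "https://example.com/", "title": "Example"},
        ]
        zf.writestr("pages/pages.jsonl", "\n".join(json.dumps(p) for p in pages))
    return buf.getvalue()


if __name__ == "__main__":
    print(summarize_wacz(build_demo_wacz()))
```

Because everything sits in one ZIP, a static file server is enough to hand the archive to a client-side viewer, which can then fetch the parts it needs.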
It's best to think of Perma.cc itself, the service, as some UX and user support wrapping to help solve linkrot primarily for courts, law journals, and journalists (for example, dashboards for a law journal to collaborate on the links they're archiving for their authors), and of our work on this as building from that use case to try to make it easier for everyone to build similar things.
I saw some mentions of the Internet Archive, which is great, and is also kind enough to keep a copy of our archives and expose them through the Wayback Machine. One thing I've been thinking about recently in archiving is that there's a risk to overstandardizing -- you don't want too many things captured with the same software platforms, funded through the same models, governed by the same people, exposed through the same interfaces, etc. There are supposed to be thousands of libraries, not one library. Unlike "don't roll your own crypto," I'd honestly love to see more people roll their own archives.
Happy to answer any questions!
by lolinder on 2/10/25, 5:50 PM
When I've worked on spam prevention in the past, that TLD always comes up disproportionately often. I've never personally built a filter that blocks the entire TLD, but I'm sure from looking at the data that people with stricter compliance requirements have.
The Anti-Phishing Working Group ranked the TLD the second-worst in the ratio of phishing domains to total registrations, with the highest total volume of phishing (page 13):
https://docs.apwg.org//reports/APWG_Global_Phishing_Report_2...
by LorenDB on 2/10/25, 4:07 PM
I hate to push blockchain stuff, but something like IPFS might actually be a good idea here.
by jbullock35 on 2/10/25, 4:31 PM
by emddudley on 2/10/25, 4:55 PM
It used to be hosted at purl.org and run by the OCLC but in 2016 it was transferred to the Internet Archive.
https://web.archive.org/web/20161002094639/https://www.oclc....
by DanAtC on 2/10/25, 4:46 PM
by hombre_fatal on 2/10/25, 5:53 PM
I used the same unique online alias from age ten to eighteen.
I used to be able to google it and see hundreds of results. Dozens of forums I posted on. Dozens of games I played. In my twenties, I'd do this for the nostalgia of reading posts I'd written in my preteen era.
Now, there are just seven results.
by zoezoezoezoe on 2/7/25, 2:44 PM
by LightHugger on 2/10/25, 5:14 PM
by jmuguy on 2/10/25, 5:29 PM
by Pikamander2 on 2/10/25, 5:13 PM
by bArray on 2/10/25, 5:33 PM
I've literally embedded these into PDFs and all sorts, all of which still open today. It doesn't really increase the document size notably, but does make them more robust.
by aghilmort on 2/10/25, 4:59 PM
by abetancort on 2/11/25, 2:49 AM
by pinoy420 on 2/10/25, 11:42 PM
by terrycody on 2/11/25, 2:49 AM
by RobLach on 2/10/25, 10:13 PM
by zouhair on 2/10/25, 4:22 PM
by lorenzowood on 2/10/25, 8:02 PM
by rsync on 2/10/25, 8:59 PM
"Perma.cc’s Supporting Partners"
... one of those partners is not like the others ...
by nikolay on 2/10/25, 8:20 PM
by liotier on 2/10/25, 4:21 PM
by bananapub on 2/10/25, 4:28 PM
did they not understand that at all? did they just not do any work on it? or did they understand it and do great work but just fail to put it in 100pt text as item 0 on the list? who knows.