from Hacker News

Cool URIs Don't Change (1998)

by tarikozket on 7/17/20, 12:19 AM with 154 comments

by joosters on 7/17/20, 9:32 AM
While the main concept (Don't change your URIs!) is good, I can't agree at all with their advice on picking names, in particular the 'what to leave out' section. No subject or topic? The justification for this is flimsy at best - 'the meaning of the words might change' So what? People cope with this all the time in other media, e.g. old books. It's not too confusing. What's more confusing is a URI that has all the meaning removed, after all this whole URI discussion is about the human appearance of URIs. Take out the topics and you are just left with dates, numbers and unspecific cruft. If I was designing a company's website, I'm sure as hell going to put the product pages under '/products'.
FWIW, the document's own URI is terrible: 'https://www.w3.org/Provider/Style/URI' - who could have any idea what the page is about from that? And what if the meaning of the word 'Provider' or 'Style' changes in x years from now? :) You could argue that the meaning/usage of 'URI' has already changed, because practically no-one uses that term any more. Everyone knows about URLs, not URIs. Not many people could tell you what the difference was. So the article's URI has already failed by its own rules.
by niftich on 7/17/20, 2:11 AM
This is one of those classic, foundational documents about the Web. But it's rarely followed. Tool use has come to dominate the form that URIs take; tools are used both for delegation and to absolve humans from crafting URIs by hand. Switching tools frequently ruins past URIs.
Additionally, widespread use of web search engines has made URI stability less relevant for humans. Bookmarks are not the only solution to find a leaf page by topic again. A dedicated person might find that archiving websites may have preserved content at their old URIs.
Some of this is allowed to happen because the content is ultimately disposable, expires, or possesses limited relevance outside of a limited audience. Some company websites are little more than brochures. Documents and applications that are relevant within organizations can be communicated out of band. Ordinary people and ordinary companies don't want to be consciously running identifier authorities forever.
by mapgrep on 7/17/20, 1:52 AM
Rhetorical question: Why must we charge annually to control domains? Should we stop doing this in the name of greater URL stability?
The article states early on, “Except insolvency, nothing prevents the domain name owner from keeping the name.” As it turns out, insolvency is a pretty significant source of URL rot, but so is non-renewal of domains by choice or by apathy, whether for financial or mere personal energy reasons (“who is my registrar again? Where do I go to renew?”), especially by individuals. You start a project and ten years later your interest has waned.
Domains are an increasingly abundant resource as TLDs proliferate. Why not default to a model where you pay once up front for the domain, and thereafter continued control is contingent on maintaining a certain percentage of previously published resources, and if you fail at that some revocable mechanism kicks in that serves mirrored versions of your old urls. Funding of these mirrors comes from the up front domain fees. Design of the mechanism is left as an exercise for the reader :-)
by dang on 7/17/20, 1:52 AM
If curious see also
2016: https://news.ycombinator.com/item?id=11712449
2012: https://news.ycombinator.com/item?id=4154927
2011: https://news.ycombinator.com/item?id=2492566
2008 ("I just noticed that this classic piece of advice has never been directly posted to HN."): https://news.ycombinator.com/item?id=175199
also one comment from 7 months ago: https://news.ycombinator.com/item?id=21720496
by heinrichhartman on 7/17/20, 8:18 AM
I think this is just unrealistic. Let's look at this example:
```
    http://www.pathfinder.com/money/moneydaily/1998/981212.moneyonline.html
```
This consists of:
0. Access protocol
1. Hostname/DNS name
2. Arbitrarily chosen path hierarchy
3. File extension
This is really a description where to find a document ("locator" not "identifier"). So, if you are:
- reorganizing or cleaning up your file structure
- changing or hiding the file extension
- enabling HTTPS
- migrating files to a different domain name
This WILL change the URL. What are you going to do? Not clean up your space anymore? Stick to HTTP? So URLs DO change. That's just the reality.
If you want something that does not change, don't link to a location but link to content directly: E.g.
- git hashes do not change
- torrent/magnet Links don't change
- IPFS links do not change.
Or use a central authority, that stewards the identifier:
- DOI numbers don't change
- ISBN numbers don't change
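To make the "link to content directly" idea concrete, here is a minimal Python sketch of content addressing. It computes the object ID Git gives a blob (SHA-1 over a `blob <size>\0` header plus the bytes) and a plain SHA-256 digest as a generic stand-in for other content-addressed schemes. The function names are made up for illustration, and the SHA-256 variant is not the exact format any particular system uses.

```python
import hashlib


def git_blob_id(content: bytes) -> str:
    """Return the SHA-1 object ID Git assigns to a blob with this content."""
    # Git hashes a small header ("blob <size>\0") followed by the raw bytes.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


def sha256_address(content: bytes) -> str:
    """A generic content address: the SHA-256 digest of the bytes (illustrative)."""
    return hashlib.sha256(content).hexdigest()


if __name__ == "__main__":
    page = b"<h1>Cool URIs don't change</h1>\n"
    # The identifier depends only on the bytes, not on where they are hosted.
    print(git_blob_id(page))
    print(sha256_address(page))
    # Any edit to the content yields a different identifier.
    print(git_blob_id(page + b"edited"))
```

The trade-off is that such an identifier names one exact version of the bytes: it never breaks, but it also never points at an updated page.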
by dfabulich on 7/17/20, 1:32 AM
In the very footer of this page:
> Historical note: At the end of the 20th century when this was written, "cool" was an epithet of approval particularly among young, indicating trendiness, quality, or appropriateness. In the rush to stake our DNS territory involved the choice of domain name and URI path were sometimes directed more toward apparent "coolness" than toward usefulness or longevity. This note is an attempt to redirect the energy behind the quest for coolness.
It's 2020 and "cool" still has that same meaning, as an informal positive epithet. I believe "cool" is the longest surviving informal positive epithet in the English language.
"Cool" has been cool since the 1920s, and it's still cool today. "Cool" has outlived "hip," "happening," "groovy," "fresh," "dope," "swell," "funky," "bad," "clutch," "epic," "fat," "primo," "radical," "bodacious," "sweet," "ace," "bitchin'," "smooth," and "fly."
My daughter says things are "cool." I predict that her children will say "cool," too.
Isn't that cool?
by whym on 7/17/20, 4:10 AM
One thing I have been wondering about - speaking of changing URIs, did they (W3C) change/merge the domain name from w3c.org to w3.org at some point? Some old documents seem to point to w3c.org instead of w3.org. (e.g. http://www.w3c.org/2001/XMLSchema) Not that it hugely matters, the old (?) w3c.org links still work, since they are redirected anyway.
Example from a book: https://books.google.com/books?id=yLj8m3K0kNoC&pg=PA224&dq=h...
by prepend on 7/17/20, 1:15 AM
This is a great link and I think I’ll share it to people. I find that I struggle trying to explain why URIs shouldn’t change because it’s so ingrained in me.
One of my pet peeves with OneDrive is that moving a file changes its URI. So any time someone moves a file, or renames it from foo-v1 to foo-v2, it breaks all the links that point to it. I wish they’d adopt Google Docs.
by bloaf on 7/17/20, 1:27 AM
If you have sequential pages, I don't like dates in the URIs. For example if you have something spread over 5-pages (e.g. a 5-part blog post), I should be able to guess the URIs for all 5 parts just given one. Dates mean that I cannot do that.
by matijs on 7/17/20, 5:50 AM
There is a pretty cool bet [1] on longbets.org about exactly this.
[1] http://longbets.org/601/
by vxNsr on 7/17/20, 6:09 AM
> I didn't think URLs have to be persistent - that was URNs. This is the probably one of the worst side-effects of the URN discussions. Some seem to think that because there is research about namespaces which will be more persistent, that they can be as lax about dangling links as they like as "URNs will fix all that". If you are one of these folks, then allow me to disillusion you.
Most URN schemes I have seen look something like an authority ID followed by either a date and a string you choose, or just a string you choose. This looks very like an HTTP URI. In other words, if you think your organization will be capable of creating URNs which will last, then prove it by doing it now and using them for your HTTP URIs. There is nothing about HTTP which makes your URIs unstable. It is your organization. Make a database which maps document URN to current filename, and let the web server use that to actually retrieve files.
Did this fail as a concept? Are there any active live examples of URNs?
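For what it's worth, the "database which maps document URN to current filename" from the quote can be sketched in a few lines of Python. This is only an illustration under assumptions: the table layout, the function names, and the example paths are all hypothetical, and an in-memory SQLite database stands in for whatever store a real server would use.

```python
import sqlite3
from typing import Optional


def open_registry() -> sqlite3.Connection:
    """Create an in-memory table mapping a stable name to its current file."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE documents (name TEXT PRIMARY KEY, filename TEXT NOT NULL)"
    )
    return conn


def publish(conn: sqlite3.Connection, name: str, filename: str) -> None:
    """Register a document, or update where an existing name currently lives."""
    conn.execute(
        "INSERT OR REPLACE INTO documents (name, filename) VALUES (?, ?)",
        (name, filename),
    )


def resolve(conn: sqlite3.Connection, name: str) -> Optional[str]:
    """Return the current filename for a stable name, or None if unknown."""
    row = conn.execute(
        "SELECT filename FROM documents WHERE name = ?", (name,)
    ).fetchone()
    return row[0] if row else None


if __name__ == "__main__":
    registry = open_registry()
    publish(registry, "/1998/uri-style", "/srv/www/old-site/style.html")
    # The files get reorganized; only the mapping changes, not the public name.
    publish(registry, "/1998/uri-style", "/srv/www/archive/1998/style.html")
    print(resolve(registry, "/1998/uri-style"))
```

The public name is the primary key, so reorganizing files means updating a row rather than breaking a link.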
by RcouF1uZ4gsC on 7/17/20, 1:30 AM
That is nice in theory, but in practice services like archive.org are vital. If you see a document you want to refer to later, you need to archive it, either in a personal archive or via archive.org.
There are too many moving parts to trust that even domain names will be the same. See geocities and tumblr for recent examples. If you want a document, you should have archived it.
by jacquesm on 7/17/20, 10:08 AM
The problem with URIs is that they weren't foreseen as the gateway to a whole slew of web applications, whose URIs can have a lifetime no longer than to serve that one request. There is a continuum here from long lived useful URIs all the way to ephemeral ones.
And then there are the URIs that aren't even made for human consumption, ridiculously long, impossible to parse or pass around. Another class is those that get destroyed on purpose. Your favorite search engine should just link to the content. Instead they link to a script that then forwards you to the content. This has all kinds of privacy implications as well as making it impossible to pass on for instance the link to a pdf document that you have found to a colleague because the link is unusable before you click it and after you click it you end up in a viewer.
by EamonnMR on 7/17/20, 3:40 AM
Whenever I see a person or API use URI instead of URL I feel like I'm in an alternate universe. Turns out the distinction is that URIs can include things like ISBN numbers, but everything with a protocol string is a URL so really URL is probably the right term for most modern uses.
by cryptos on 7/17/20, 6:16 AM
It would be good if more care would be taken when designing URL schemes. It is not accidental that URL shorteners are used everywhere.
Look for example at this link:
```
    https://www.amazon.com/Fundamentals-Software-Architecture-Engineering-Approach-ebook/dp/B0849MPK73/ref=sr_1_1?dchild=1&keywords=software+architecture&qid=1594966348&sr=8-1
```
Maybe each part has a solid reason to exist, but the result is a monster.
I would prefer something like this:
```
    https://amazon.com/dp/B0849MPK73
```
And guess what, the above short link actually works! But Amazon didn't use this kind of link as a standard.
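A rough Python sketch of collapsing the long form into the short one, going only by the two URLs above. It assumes the product identifier is the 10-character uppercase alphanumeric segment right after `/dp/`, and that everything else (title slug, `ref=...`, query string) can be dropped. That is inferred from this one example and is not a documented Amazon rule; `short_product_url` is a made-up name.

```python
import re
from typing import Optional
from urllib.parse import urlsplit

# Assumption: the ID is the 10-character segment following "/dp/".
_DP_SEGMENT = re.compile(r"/dp/([A-Z0-9]{10})(?:/|$)")


def short_product_url(long_url: str) -> Optional[str]:
    """Reduce a long product URL to the short /dp/<id> form, if an ID is found."""
    path = urlsplit(long_url).path  # drops the query string and fragment
    match = _DP_SEGMENT.search(path)
    if match is None:
        return None
    return f"https://amazon.com/dp/{match.group(1)}"


if __name__ == "__main__":
    long_url = (
        "https://www.amazon.com/Fundamentals-Software-Architecture-"
        "Engineering-Approach-ebook/dp/B0849MPK73/ref=sr_1_1"
        "?dchild=1&keywords=software+architecture&qid=1594966348&sr=8-1"
    )
    print(short_product_url(long_url))
```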
by jauco on 7/17/20, 5:55 AM
If you’re interested in taking this to a new level, you should check out initiatives like
handle.net (technically it’s like a URL shortener, but there’s an escrow agreement you need to sign first to make sure that the URLs stay available), PURL and w3id.org (which allow for easy moving of whole sites to a new domain name), and of course https://robustlinks.mementoweb.org/spec/
by emmanueloga_ on 7/17/20, 2:01 AM
TL;DR (from [1]). Guidelines for the "best" URIs:
* Simplicity: Short, mnemonic URIs will not break as easily when sent in emails and are in general easier to remember.
* Stability: Once you set up a URI to identify a certain resource, it should remain this way as long as possible ("the next 10/20 years"). Keep implementation-specific bits and pieces such as .php out, you may want to change technologies later.
* Manageability: Issue your URIs in a way that you can manage. One good practice is to include the current year in the URI path, so that you can change the URI-schema each year without breaking older URIs.
1: https://www.w3.org/TR/cooluris/#cooluris
by dhosek on 7/17/20, 1:29 AM
I'm in the midst of moving a website from mediawiki to a bespoke solution for hosting the data which will enforce structure on what's being presented. In the process, URLs will change, but, part of the migration is setting things up so that, for example, if someone goes to http://www.rejectionwiki.com/index.php?title=Acumen they will be redirected automatically to http://www.rejectionwiki.com/j/acumen so old links will always work. This seems a minimal level of backwards compatibility (although I wonder if there is any specific protocol for how to implement this that will keep search engine mojo—but not a lot because the site gets most of its traffic from word of mouth between users).
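A small Python sketch of the mapping described here, assuming the rule is "take the `title` query parameter of `/index.php`, lowercase it, and put it under `/j/`". That rule is guessed from the single Acumen example, and the function name is hypothetical. The server would then answer the old URL with an HTTP 301 (Moved Permanently) response whose `Location` header carries the new path; 301 is the status code HTTP defines for a permanent move.

```python
from typing import Optional
from urllib.parse import parse_qs, quote, urlsplit


def redirect_target(old_url: str) -> Optional[str]:
    """Map an old MediaWiki-style URL to the new path, or None if it isn't one."""
    parts = urlsplit(old_url)
    if parts.path != "/index.php":
        return None
    titles = parse_qs(parts.query).get("title")
    if not titles:
        return None
    # Assumption: new slugs are the lowercased page title, percent-encoded.
    slug = quote(titles[0].lower(), safe="")
    return f"/j/{slug}"


if __name__ == "__main__":
    old = "http://www.rejectionwiki.com/index.php?title=Acumen"
    target = redirect_target(old)
    if target is not None:
        # What the server would send back for the old URL.
        print("HTTP/1.1 301 Moved Permanently")
        print(f"Location: {target}")
```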
by ph1l337 on 7/17/20, 12:02 PM
It's kind of fun to see that this has been posted several times on hn before, but never took off.
e.g.: https://news.ycombinator.com/item?id=8454570 https://news.ycombinator.com/item?id=10086156 https://news.ycombinator.com/item?id=803901
In this one https://news.ycombinator.com/item?id=1472611 the URI is actually broken - not sure if it changed or if it just was a mistake of OP back then.
by jcahill on 7/17/20, 8:34 AM
A comment I didn't post 7 hours ago (was busy):
True. Yet this submission will have dramatically greater visibility than it otherwise would have because the HN facebook bot linked it 5 minutes ago[1]. As a web archivist, I've dealt a lot with the erosion of URI stability at the hands of platform-centric traffic behavior and I don't see it letting up any time soon.
Sidenote: The fb botpage with a far larger audience, @hnbot[2], stopped posting some months ago.
[1]: https://facebook.com/hn.hiren.news/posts/2716971055212806
[2]: https://facebook.com/hnbot
by arkis22 on 7/17/20, 4:13 AM
Does this go against REST, where a url is a specific resource and http transforms it?
by indymike on 7/17/20, 1:50 PM
SEO has caused many companies to adopt unsustainable naming schemes. A URL that references an ID is not going to have to change if a word in the title of an article is changed.
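One way to get both, sketched in Python: keep the ID in the path as the only part that is looked up, and treat any title slug after it as decoration. The `/articles/<id>/<slug>` layout and the function name are assumptions for illustration only.

```python
import re
from typing import Optional

# Assumed layout: /articles/<numeric id>, optionally followed by /<title slug>.
_ARTICLE_PATH = re.compile(r"^/articles/(\d+)(?:/[^/]*)?$")


def article_id(path: str) -> Optional[int]:
    """Extract the article ID from a path, ignoring any trailing title slug."""
    match = _ARTICLE_PATH.match(path)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Both paths resolve to the same article even though the title changed.
    print(article_id("/articles/42/cool-uris-dont-change"))
    print(article_id("/articles/42/cool-uris-do-not-change"))
```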
by vxNsr on 7/17/20, 6:03 AM
The number one worst offender of this is microsoft onedrive. Document name or location changed? well you'll need to reshare the file/folder with everyone.
by lazysheepherd on 7/17/20, 12:23 PM
> When someone follows a link and it breaks, they generally lose confidence in the owner of the server.
Is it a bias I've developed, or has anyone else noticed just how many dangling links there are on microsoft.com? Redistributables, small tools, patches, support pages, documentation pages. I've recently found that when a link's domain is microsoft.com, I subconsciously expect it to be a 404 with about a 50% chance.
by jabroni_salad on 7/17/20, 3:15 AM
I've noticed that the fashion industry is just rife with linkrot, and they spoil very quickly. If you're looking at a forum post from longer than 3 months ago chances are links to specific products will instead redirect to the store's front page or a 404.
Is there a benefit to this? I am mostly just frustrated.
by totorovirus on 7/17/20, 8:51 AM
It's really interesting to see old findings become relevant once the problem becomes an actual pain for practitioners. The recent hype around functional programming languages and immutable data was already out there among academics in the 90s, but wasn't really used in practice until now.
by based2 on 7/18/20, 11:44 AM
http://perdu.com/
by Polylactic_acid on 7/17/20, 4:18 AM
There is a new reason that probably didn't exist back then: the application/CMS powering the old pages has been replaced, and it would be a massive effort to get the old pages working on the same URLs they did before.
I think archive.org is the better long term plan. Not only does it preserve urls forever, it also preserves the content on them.
by pachico on 7/17/20, 10:33 AM
Side topic, sorry in advance, but am I the only one frustrated by how this page is rendered in a mobile browser? I know this probably wasn't an issue back in 1998, but I would have expected something more resilient across devices from the W3C. Of course, I might be overlooking issues.
by _pmf_ on 7/17/20, 7:56 AM
I have a lot of bookmarks with nice URLs that still don't exist anymore.
by iggldiggl on 7/17/20, 7:40 AM
"An URI is for life, not just for Christmas."
by aabbcc1241 on 7/17/20, 6:40 AM
That brings us to the story of IPFS and NDN.
by tmwed on 7/17/20, 5:00 AM
“Dope” URIs Dont Change, that’s gas.
by wolco on 7/17/20, 1:16 AM
urn?