by Mustafabei on 12/4/13, 12:08 PM with 118 comments
by smizell on 12/4/13, 1:38 PM
Here's the great thing about it. If APIs are built this way, and if clients are built to read and understand common and registered hypermedia types, there could be a time when clients and servers are able to communicate in such a way that the media type becomes seemingly invisible to the developer. We see this with the most popular REST client/server combo, the browser and the web server that serves up HTML. As the user, you can traverse the RESTful HTML API that websites have while the media type, HTML, is mostly concealed from you. In other words, there is a chance that a good number of HTML websites are more RESTful than most of the APIs we see today.
In reducing REST to simply RPC over the web and skipping over the ideas of content negotiation and hypermedia types, we are missing out on the genius behind how the web was designed to be used. The author really wants us to go back to that instead of continuing down the current path of fracturing our resources into separate APIs.
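A minimal sketch of what that negotiation looks like from the client side, assuming the Python "requests" library and a hypothetical endpoint: one URL, and the Accept header decides which representation comes back.

    import requests

    # Hypothetical resource: one URL, multiple representations.
    url = "https://example.com/products/42"

    html = requests.get(url, headers={"Accept": "text/html"})
    data = requests.get(url, headers={"Accept": "application/json"})

    print(html.headers["Content-Type"])  # e.g. text/html; charset=utf-8
    print(data.headers["Content-Type"])  # e.g. application/json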
by gwu78 on 12/4/13, 3:35 PM
If these API keys are for "developers", then why is there an assumption that the developer cannot (or does not want to) work with raw data? Or, at least, why is there no demand from "developers" for raw data? I have never understood this "API key" phenomenon.
With today's storage space prices and capacities (physical media, not "cloud"), in many cases a user could transfer all the data she ever needed to her device and have the fastest access possible (i.e., local, no network needed) for all future queries/requests. Not to mention the privacy gains of not needing to access a public network.
Using a bakery as an example, implementing API keys is like making customers fill out ID cards and present ID at your bakery every time they want a slice of bread. Your policy is "Sorry, we do not provide loaves to customers."
This might be palatable if your bread is something truly unique, a work of culinary art. But in practice the "bread" of web sites is data that they gathered somewhere else in the public domain. They are like the "bakery" who buys from a bulk supplier and resells at a markup. Except, with web sites, the data they obtained to "resell" cost them nothing but the electricity and effort to gather it.
The easiest way to stop "web scraping" is to make data dumps available. Then you really have no obligation to provide JSON or XML via an "API key" system. It is less work, less expense, and far more efficient.
by angersock on 12/4/13, 5:02 PM
+ Use content negotiation headers instead of explicit content extensions for resources.
+ Don't pass auth tokens as part of the URL (you monster).
+ Don't have onerous processes for obtaining API keys.
+ Web scraping is totally a legit way of providing programmatic access to data.
~
Sadly, the author is kind of wrong in these cases.
First, as I've run into on some of my own projects, specifying the desired content type (.html, .csv, .json) in the URL is actually pretty handy. In Rails, for example, you just use a respond_to format block. This lets clients using dumb web browsers (and you'd be surprised how many of those there are) easily download the type of content they want. Accept headers are useful, but they don't solve everything.
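A rough analogue of that pattern in Flask, serving the same resource both by extension and by Accept header. This is a sketch, not the Rails mechanism itself, and the route and data are invented:

    from flask import Flask, jsonify, render_template_string, request

    app = Flask(__name__)

    REPORT = {"title": "Q4 totals", "rows": [1, 2, 3]}  # placeholder data

    def report_response(fmt):
        # Render the same resource in the requested representation.
        if fmt == "json":
            return jsonify(REPORT)
        if fmt == "csv":
            body = ",".join(str(r) for r in REPORT["rows"])
            return body, 200, {"Content-Type": "text/csv"}
        return render_template_string("<h1>{{ title }}</h1>", **REPORT)

    @app.route("/report.<fmt>")   # extension style: works from any dumb browser
    def report_by_extension(fmt):
        return report_response(fmt)

    @app.route("/report")         # Accept-header style: content negotiation
    def report_negotiated():
        best = request.accept_mimetypes.best_match(
            ["text/html", "application/json", "text/csv"], default="text/html")
        return report_response(
            {"application/json": "json", "text/csv": "csv"}.get(best, "html"))

Both routes funnel into the same handler, so neither representation can drift out of sync with the other.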
Second, I do agree that auth tokens should go in the header--that's just reasonable. If I'm doing something that needs an auth token, I probably am curl'ing, and so I can easily set headers.
Third, keys are a necessary evil. They are the least annoying way to track access and handle authorization. That said, it shouldn't be awful to get hold of one--in our previous startup, API keys were similar to auth tokens, and that worked out fine.
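For the token-in-header point, the shape is simply this (endpoint and key are hypothetical): the credential travels in a header, not in the URL, where it would leak into logs, referrers, and shell history.

    import requests

    # Hypothetical endpoint and key: the token goes in a header.
    resp = requests.get(
        "https://api.example.com/v1/widgets",
        headers={"Authorization": "Bearer MY_API_KEY"},
    )
    resp.raise_for_status()
    print(resp.json())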
Fourth, web scraping is not a good solution. "Herf derf just have your dev scrape the thing" is cool and all, but if the document is not marked up in a friendly way, extracting that information can be very brittle. Moreover, you run the risk of having cosmetic changes break scrapers silently. It's far better to just expose a machine-friendly API (which is handy for testing and monitoring anyway) and let your frontend devs do whatever wacky stuff they want in the name of UX.
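A toy illustration of that silent breakage, assuming BeautifulSoup and made-up markup: the scraper keys off a purely cosmetic class name, and a redesign kills it without raising a single error.

    from bs4 import BeautifulSoup

    # Yesterday's markup: the scraper keys off a cosmetic CSS class.
    page_v1 = '<span class="price-large">$9.99</span>'
    # Today a designer renames the class. Same data, same look, dead scraper.
    page_v2 = '<span class="price-xl">$9.99</span>'

    for page in (page_v1, page_v2):
        node = BeautifulSoup(page, "html.parser").select_one(".price-large")
        print(node.text if node else "scraper silently found nothing")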
EDIT:
I am all for rate-limiting as a basic step where keys do not suffice.
As for scraping, the article is a bit weird on this point. The author's insistence on "DONT USE APIS EVER RAWR" and then on "hey, let's use application/json to provide documents under the same paths for machines" is goofy. It's like they don't want you to use an API, except when they do.
The wording and phrasing just really get in the way of the article--had the tone been a bit less hyperbolic, it would've been a decent "this is why I find web APIs frustrating to work with" piece, with examples.
EDIT EDIT:
The author is a Semantic Web wonk. That explains it.
by barrkel on 12/4/13, 1:38 PM
A documented API that doesn't come with some form of commitment not to break it is little better than web scraping.
Web scraping, meanwhile, is subject to breakage at every whim of the web site designers.
by handelaar on 12/4/13, 12:51 PM
by unwind on 12/4/13, 12:53 PM
It would probably be less than fun to write and maintain such a monster, but it would at least make it possible to expose a single API from the server's point of view ... Yay?
by jheriko on 12/4/13, 12:37 PM
The concrete examples make this painfully obvious - the API referred to is the 'modern hipster' flavour of it, nothing to do with any of the APIs I use day to day, which don't go across the web.
There is a much more classical programming problem at the root of this: clients asking for implementation details instead of describing the result they want. Couple this with a lack of sensibility about encapsulation and interfaces, sprinkle in the use of 'REST' as a buzzword, and voila...
by Ygg2 on 12/4/13, 1:14 PM
There is a scene where the father of the boy "translates" what the German officer is telling the prisoners. This is essentially what every UI (APIs included) does. Yeah, it's a lie, but it's a lie that actually shields us from the awful truth of how everything works.
by cognivore on 12/4/13, 2:32 PM
Them: I need you to pull data from a web site to integrate with our system.
Me: Neat, how is the data exposed?
Them: It's a website. Web pages.
Me: I'm going to stab myself in the head now.
After spending days pulling messy HTML, attempting to navigate around with whatever method the site uses (maybe JavaScript only), and hammering everything into some sort of cohesive form, you'll be seriously wishing they had "wasted" the money and time putting an API on their site.
I see he's a PhD researcher. Just sayin'.
by dblotsky on 12/4/13, 12:58 PM
by crazygringo on 12/4/13, 10:38 PM
People like to be able to browse pages in an "intuitive" way. This often means combining multiple pieces of content onto a single page, or splitting a single piece of content across multiple pages, or both.
In the real world, URLs are human-friendly pages which generally try to hit a sweet spot between too little and too much visible information, not unique identifiers of logical content.
Which is exactly why APIs are useful -- they are designed around accessing logical content. But this is not what normal human-readable webpages are generally designed for, and rightfully so. They serve different purposes, and insisting that they should be the same is just silly.
by supermatt on 12/4/13, 12:40 PM
by gizzlon on 12/4/13, 2:08 PM
Because you then can change one without affecting the other. If your html is parsed automatically, the parsing can break when you update your html to fix a design flaw.
OP has some good points, though; those APIs do look ridiculous.
Content negotiation could be nice, but it doesn't remove the need for keys in most cases, and adding this to your stack could be harder than just making a simple API.
"Ask for new representations on your existing URLs. Only by embracing the information-oriented nature of the Web can we provide sustainable access to our information for years to come."
Yes. But won't the answer, in most cases, be a simple "API"? (not a real API, in the programming sense)
by girvo on 12/4/13, 12:40 PM
If your "API" is literally just a machine parsable version of data you have on your HTML, well, yeah, doing it the way the OP described as better will work.
But if you're writing an API to access a proper web application, it needs more than just data retrieval, and it needs ACL, and it needs to not show things to certain people, and allow bi-directional communication, and all sorts of other things.
That's where what the OP is asking for breaks down, and I don't think APIs are a "lie", perhaps they can be a leaky abstraction and sometimes the wrong choice, but they can also be super useful.
Its funny he brought up Amazon early on: they run entirely on SOA, APIs everywhere, controlling everything. Seemed cute to me :)
by bonaldi on 12/4/13, 2:14 PM
(Like so many of the ranty genre, it's taking a single use-case and insisting it covers all cases. Yes, some APIs could be replaced by a negotiated machine-readable version of an HTML page, but other APIs serve specific machine access patterns that don't (and shouldn't) map neatly to the pages humans see.)
by lmm on 12/4/13, 2:21 PM
The notion of a single canonical URL for each object is attractive, but it breaks down as soon as you want to use many-to-many relationships efficiently. Like databases, APIs are and should be denormalized for efficiency. Given this, there's very little benefit to keeping the human- and machine-readable URLs for a given object the same, and there are downsides - do you really want every AJAX request to include all your user's cookies?
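A sketch of the denormalization point (all names invented): instead of making a client walk foreign keys across several requests, one response embeds the related rows it will need anyway.

    # Normalized: the client needs N+1 requests to render one screen.
    #   GET /posts/7 -> {"id": 7, "author_id": 3, "tag_ids": [1, 4]}
    #   GET /users/3, GET /tags/1, GET /tags/4 ...
    #
    # Denormalized: one request, related rows embedded in the payload.
    post_response = {
        "id": 7,
        "title": "APIs and lies",
        "author": {"id": 3, "name": "Alice"},  # embedded, not a foreign key
        "tags": [{"id": 1, "name": "rest"},
                 {"id": 4, "name": "hypermedia"}],
    }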
The value of API keys is that they give you a point of contact with the developers using your API. If you want to redesign the web page, you can just do it - users might get annoyed, but they'll be able to figure out how to find their information on the new page. If you want to redesign your API, you'll break existing clients. By forcing developers to provide an email address and log in every 6 months to get a new key, you get a way to give them fair warning of upcoming changes.
(And the gripe about multiple interfaces is a red herring; the webapp (whether traditional or client-side) should be one more client of your API.)
by dreamfactory on 12/4/13, 9:26 PM
by Houshalter on 12/4/13, 1:10 PM
by xixixao on 12/4/13, 3:57 PM
by duaneb on 12/4/13, 6:05 PM
Good luck.
by prottmann on 12/4/13, 1:58 PM
What would you do with your links when the web designers change all the URLs (because of a "fancy cool" new SEO style)?
Which problem did you solve with this view of an API?
by hawleyal on 12/4/13, 1:53 PM
by benihana on 12/4/13, 5:43 PM
"Really, nobody takes your website seriously anymore if you don’t offer an API."
Especially when this article gets posted to Hacker News.