by jolie on 3/2/13, 7:10 PM with 87 comments
by tvladeck on 3/3/13, 12:06 AM
The point is that one company promised a level of service with their product that they did not deliver, and the difference was significant and persistent. The fact that the consumer could have used the product more efficiently is immaterial to that fact.
Other things that don't matter:
- that RG could/should move to another provider. That is of course their choice now, but it does not change the money they've spent and wasted with Heroku.
- that the routing problem is hard. If anything this makes it worse - it's a hard problem so people would pay a lot of money for a solution. What matters is that Heroku claimed to solve it and did not.
- that other consumers of the product managed to figure this out before RG. Heroku was still advertising through their documentation that they offered a routing solution, and they did not make clear to their customers that a significant feature of their product was now different.
Furthermore, Heroku appeared to obfuscate this fact and shift blame to the customer during the time RG was trying to diagnose their issues.
Now, by attacking RG's tone, Heroku have employed argument-level DH2 [1], which at least according to pg is not even worth considering. They have at least acknowledged their mistake, but to me that means that by extension they have sold something that they did not deliver on. The only honest way to move forward is for Heroku to offer some kind of compensation to the customers that were affected.
by brown9-2 on 3/2/13, 9:15 PM
I used to feel this way about Heroku, and I might again in the future, but I don’t right now.
I have a hard time understanding why, for all the money Rap Genius pays Heroku, they don't simply set up their own instances on EC2 and run the app there themselves. It seems like with a few days' work in Puppet or Chef you could automate getting your code onto dozens of EC2 instances and installing the necessary tools/server processes, plus you wouldn't have to complain anymore about how you can't run Unicorn.
Yes I get that there is a certain amount of value in being able to pay someone else to do all these things for you and saving time - but if you aren't happy with the result and the value given the money you are paying (and RG is not), then at a certain point it's time to just bite the bullet and fix things yourselves instead of continuing to be hamstrung by problems that the hosting provider won't/can't fix. There comes a point where you get large enough, and you are paying enough to Heroku, that it would be worth it to do things yourself and eliminate the problems.
by thraxil on 3/2/13, 8:21 PM
Is this really accurate? 512mb is barely adequate for serving a single request at a time? I'm not a Rails developer, but that sounds terrible. I'm all for trading off some performance for rapid development, but that seems a bit extreme.
I'm currently running twelve Django apps on one 512MB Rackspace VM. It's a bit tight, and I don't get a lot of traffic on them, but it's basically fine. And that's with Apache worker mpm + mod_wsgi (with an Nginx reverse proxy in front) which probably isn't even the lightest approach. And having been writing apps in Erlang and Go recently, I'm starting to feel like Python/Django are unforgivably bloated in comparison.
by benologist on 3/2/13, 8:11 PM
What are (edit:) Rails developers getting in exchange for these enormous penalties that makes it worth choosing?
by aelaguiz on 3/2/13, 10:12 PM
They were literally ignoring our repeated customer service tickets pleading for assistance or a phone call or something. We were paying them hundreds of dollars per month at the time.
When we finally got through, the only people we could get ahold of were salesmen. Essentially, we were made to believe that we would receive customer support only with a $1000/mo support contract.
FWIW, our issue was frequent network timeouts to other EC2 services. They did eventually resolve those after months, but they never assisted us.
Heroku's platform is a significant accelerator of development for a startup. Using the platform has enabled us to do things faster and better than we'd otherwise be able to do them for the money and time we've invested.
That being said, I look forward to the day they have a true/viable competitor and are forced to compete on service. I'm extremely bitter towards them at the moment as a result of my customer support torture experience.
by kmfrk on 3/2/13, 7:48 PM
I've always wondered whether the cut-off is time- or success-based. Maybe pg should write a Boolean return function for that. :P
Big props to Rap Genius for explaining the problem so plainly in the article. Unfortunately, many people of prominence in tech aren't even capable of talking about what they do to laymen.
by sologoub on 3/2/13, 9:52 PM
Apparently, the fact that requests can be queued at the dyno level was common public knowledge back in 2011! Here's a quote from a Stack Overflow answer:
"Your best indication if you need more dynos (aka processes on Cedar) is your heroku logs. Make sure you upgrade to expanded logging (it's free) so that you can tail your log.
You are looking for the heroku.router entries and the value you are most interested is the queue value - if this is constantly more than 0 then it's a good sign you need to add more dynos. Essentially this means than there are more requests coming in than your process can handle so they are being queued. If they are queued too long without returning any data they will be timed out."
Source: http://stackoverflow.com/a/8428998/276328
Using a PaaS doesn't mean you can stop taking it seriously and forget about all the technical aspects. Granted, the queue value should have been included with New Relic from day one, but that hardly justifies such a direct and persistent attack on Heroku.
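For anyone who wants to automate the check the quoted answer describes, here is a rough Python sketch. It assumes each router log line contains the text `heroku.router` and a `queue=<n>` field; that format, the helper names, and the "more than half of the entries" threshold standing in for "constantly more than 0" are my assumptions, not something taken from Heroku's documentation.

```python
import re

# Assumed field format: "queue=<integer>" somewhere in a router log line.
QUEUE_FIELD = re.compile(r"\bqueue=(\d+)")


def queue_depths(lines):
    """Collect the queue value from every line that looks like a router entry."""
    depths = []
    for line in lines:
        if "heroku.router" not in line:
            continue  # skip application and system log lines
        match = QUEUE_FIELD.search(line)
        if match:
            depths.append(int(match.group(1)))
    return depths


def needs_more_dynos(lines, threshold=0.5):
    """Return True when the share of router entries with queue > 0 exceeds threshold.

    The threshold is a hypothetical stand-in for "constantly more than 0".
    """
    depths = queue_depths(lines)
    if not depths:
        return False
    queued = sum(1 for depth in depths if depth > 0)
    return queued / len(depths) > threshold
```

You would feed it the lines you get from tailing your log; the threshold is something you would have to tune for your own traffic.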
by jonmc12 on 3/3/13, 4:12 AM
by spronkey on 3/4/13, 8:56 PM
There's a concept called a Bus Factor. Basically, it's the number of people who would have to be hit by a bus, or otherwise made unavailable, to completely derail your business.
With $60k spent on a single sysadmin and an army of EC2 instances, that's a pretty effing small bus factor - 1. So... that one guy gets taken out of action, and they're more or less toast? Yeah, no. Heroku gives them a massive bus factor for perhaps a little bit more money than it would take to do it themselves. It's a cheap way to avert risk.
They're probably at the size now where they could handle taking it in-house, but you've still then got to factor in hiring, developing the procedures for ops inhouse etc., and migrating. It's not easy to just flip the switch.
In any case, Heroku's behaviour is pretty shoddy. Though, knowing how much of a pain documentation is, I'm not surprised. I don't think they realised just how bad the change from intelligent to random routing actually was - and didn't treat it as such. This is giving them the benefit of the doubt though, because the other option is that they didn't publicise it precisely because they knew how bad it is. Scary thought.
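To get a feel for why the switch from intelligent to random routing matters, here is a toy simulation in Python. It is only a sketch under simplifying assumptions: requests arrive at a fixed interval, every request takes the same time to serve, each dyno handles one request at a time, and "intelligent" routing is modeled as sending each request to the dyno that frees up first. None of this is Heroku's actual routing algorithm, and the function and parameter names are made up for illustration.

```python
import random


def mean_queue_wait(num_dynos, num_requests, arrival_gap, service_time,
                    policy, seed=0):
    """Return the average time a request waits in a dyno queue before service."""
    rng = random.Random(seed)
    free_at = [0.0] * num_dynos  # time at which each dyno clears its backlog
    total_wait = 0.0
    for i in range(num_requests):
        now = i * arrival_gap
        if policy == "random":
            # Router picks any dyno, whether or not it is busy.
            dyno = rng.randrange(num_dynos)
        else:
            # "least_busy": router picks the dyno that will be free soonest.
            dyno = min(range(num_dynos), key=free_at.__getitem__)
        start = max(now, free_at[dyno])  # wait if the dyno is still busy
        total_wait += start - now
        free_at[dyno] = start + service_time
    return total_wait / num_requests


if __name__ == "__main__":
    for policy in ("least_busy", "random"):
        wait = mean_queue_wait(10, 1000, 1.0, 8.0, policy)
        print(policy, round(wait, 2))
```

With ten dynos at 80% utilization, the least-busy policy in this model always finds an idle dyno, while random routing piles requests onto busy dynos even though others sit idle.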
by plasma on 3/2/13, 11:16 PM
by dkhenry on 3/3/13, 3:44 AM
by olefoo on 3/3/13, 5:29 AM
But, my sympathy is going to them, because what I see coming from Rap Genius looks like classic blame-game. So a vendor's documentation was unclear and your server sucked publicly for some time? Shameful. You didn't know about it because you expected your vendors to give you extra hand-holding? That's really rough. Instead of fixing the issues and moving on, you make it the one thing that everyone thinks about when your company is mentioned... that might not be in your best long term interests.
After this, I would be hesitant to enter into any sort of relations with Rap Genius, and I'm not that sure of what they do or what their product is.
by paul_f on 3/2/13, 11:36 PM
by dtweney on 3/3/13, 1:16 AM
by neya on 3/3/13, 6:52 AM
by hashset on 3/2/13, 9:10 PM
by ChuckMcM on 3/3/13, 12:38 AM
This has taken on the patina of a really huge fight between operations and engineering with nobody to step in and say "Hey, we both want to make progress here, let's see what we can do." There is no common point of contact here, sadly.
What is the end goal? One of these companies being out of business? What? It's pretty clear that Heroku doesn't have any ideas on how to implement routing the way Rap Genius believed it worked; they even said as much. So what is the next step?
by ctovision on 3/3/13, 3:31 PM
by woah on 3/3/13, 2:11 AM
by Sujan on 3/2/13, 8:24 PM