by citricsquid on 9/29/14, 7:12 PM with 11 comments
by leesalminen on 9/29/14, 7:32 PM
I agree with your points though, 9:20 PM on a Friday was a bad time to notify customers. I had to write an email to all my customers at 11PM on a Friday. Not only that, I had to call my business partner to read it before sending, as it was a Friday night and I was inebriated.
Interesting story: after seeing the notice from AWS here on HN, I called Rackspace on Wednesday night. It was late, and the gentleman answering the phone said he worked in the NOC. He said that Rackspace was not affected by the "Shell Shock" CVE, and that they would not need to perform disruptive maintenance. Imagine my surprise Friday night.
by jms703 on 9/29/14, 7:32 PM
You have servers running on infrastructure inside a datacenter that you don't control. Surviving reboots and outages needs to be a part of your strategy if these systems are important.
Oh, and there's this: https://github.com/Netflix/SimianArmy/wiki/Chaos-Monkey
by underyx on 9/29/14, 9:28 PM
>#2) [...] Had we gotten notice on a monday, we could have switched servers or companies altogether. Hard to do that on a weekend with short notice.
These two appear to be the same.
>#3) Be sure to NOT use the word downtime. Disguise the issue as a likely small inconvenience.
>#4) Give Zero Expectations of Impact.
These two also appear to be the same. (The sentences themselves are actually contradictory, but the point of the description in both is the same.)
>#5) Put details in riddle format. The official email said, “Maintenance window (all times CDT, UTC-5).” Whew! That cleared it up.
Why is specifying the time zone of the times given below a riddle? I can't think of any way to specify exact times in a simpler manner. I'm really not sure if 'When it's the middle of the day in some part of the USA' would have served customers better. I assume there were times specified, as the quoted sentence wouldn't really serve any purpose otherwise.
>The Rackspace maintenance window was 6am - 6pm. Hmmm, thats odd.. thats exactly peak time. Sounds perfect.
You have no way to know how much traffic Rackspace serves in that timespan. Peak time for them might be way different than your peak time. They serve all kinds of countries all over the world, you know.
>#7) Spin lack of information as “protection.” Certain details? Like estimated impact? Progress estimates?
Most likely, they aren't trying to deceive anyone there; they are obviously referring to a lack of information about the issue's nature, not a lack of information about the maintenance work.
>#8) To pacify frustrated customers, just lie. I called into support 50 minutes into the outage and was told the migration was 85% complete. That pacified me enough to get off the phone. Then two more hours passed.
I don't think Rackspace was actively trying to deceive people. This sounds more like an issue with unexpected issues coming up, or the customer service rep being poorly informed.
>#9 Bonus Tip) Direct customers to remedies that dont work. My support gal was empathetic but at best, offered me the option to create a ticket so I would be heard by those who care. Lets forget how impersonal that is for a moment.
'Heard by those who care'? Again, I find it hard to believe that someone whose profession is to resolve customer issues via phone would care any less than people on the other side of the ticketing system. It is more probable that the representative was just unable to assist any further.
by Paul_Dessert on 9/29/14, 7:46 PM
I've put up with it because it's a PITA to transfer servers. I think it's time to start shopping around.
by RowanH on 9/29/14, 7:34 PM
by jedicoffee on 9/29/14, 7:44 PM
by vblord on 9/29/14, 7:28 PM