from Hacker News

DigitalOcean block storage is down

by kaendfinger on 9/30/19, 9:33 PM with 83 comments

by CaliforniaKarl on 9/30/19, 11:56 PM
Personally, if DO don’t have anything new in a status post, I’d prefer seeing an update that says something like “We are continuing to work on the issue. Nothing new to report. Next update in X minutes.” That is a lot easier for me to parse than the text that someone seems to be copy/pasting in each update.
by kyledrake on 9/30/19, 10:05 PM
What unholy thing did they do that broke it across 12 different datacenters, good lord.
by pastrami_panda on 10/1/19, 6:50 AM
This is OT, but I have a droplet on DO and I'm amazed at the amount of malicious traffic it gets. Is it normal for a very private VPS to receive thousands of SSH attempts per hour? I have fail2ban installed and the jail is so busy it's quite astounding. Can anyone with more web hosting experience weigh in?
by hartator on 9/30/19, 10:38 PM
Not sure why the previous incident page got flagged. This is the new one.
It's affecting us for real: almost our whole service, serpapi.com, is down because we store database files on block storage volumes.
by louwrentius on 10/1/19, 6:25 AM
Isn't DigitalOcean running Ceph for their block storage?
I wonder, as others suggested, whether they may have stretched the cluster across datacenters?!
Would be interested in the post-mortem.
by jacquesm on 10/1/19, 5:50 AM
Thank you, DigitalOcean, for once again proving that 'The Cloud' is not a backup.
by stephenr on 10/1/19, 5:08 AM
This is your weekly reminder that anything you want to be reasonably “HA” should span multiple vendors in multiple DCs.
by simplehuman on 10/1/19, 4:44 AM
Anyone have a review of using DO k8s or DO managed DB in production?
by unilynx on 10/4/19, 3:48 PM
DigitalOcean just posted a post-mortem at http://status.digitalocean.com/incidents/g76kgjxqrzxs
(the same url)
by privateSFacct on 10/1/19, 3:35 AM
Higher latency (per the status page) is not the end of the world, especially if it’s just “may experience” higher latency.
by imglorp on 10/1/19, 12:23 AM
Hrm, Atlassian BitBucket is also down. Just a coincidence? Does BB use DO?
https://bitbucket.status.atlassian.com/incidents/4t1pkwrdtl8...
by seaghost on 10/1/19, 5:08 AM
Their block storage is such a failure. I’ve been going back and forth with support for over 2 months now about automatically deleting files with lifecycles, and it’s still not resolved.
by sb8244 on 10/1/19, 3:45 AM
It looks like they have just updated it as resolved and monitoring.
by sunasra on 10/1/19, 3:57 AM
I was always wondering how I could find out proactively when something like this breaks or some service has an outage. As a result, I built this tool (http://incidentok.com).
by sodosopa on 10/1/19, 12:47 PM
So that’s why bot attacks and spam traffic were lower.
by golanggeek on 9/30/19, 10:26 PM
This has really been down for more than 2 hours!!!
by irfanbaigse on 10/1/19, 1:42 PM
DigitalOcean: bad experience.
by jbverschoor on 10/1/19, 6:28 AM
Their ad was “you’ve been developing like a beast and your app is ready to go live”
DO is a nice thing to play around with and maybe launch something, but I wouldn’t run full production on it.