from Hacker News

DigitalOcean block storage is down

by kaendfinger on 9/30/19, 9:33 PM with 83 comments

  • by CaliforniaKarl on 9/30/19, 11:56 PM

    Personally, if DO don’t have anything new in a status post, I’d prefer seeing an update that says something like “We are continuing to work on the issue. Nothing new to report. Next update in X minutes.” That is a lot easier for me to parse than the text that someone seems to be copy/pasting in each update.
  • by kyledrake on 9/30/19, 10:05 PM

    What unholy thing did they do that broke it across 12 different datacenters, good lord.
  • by pastrami_panda on 10/1/19, 6:50 AM

    This is OT, but I have a droplet on DO and I'm amazed at the amount of malicious traffic it gets. Is it normal for a very private VPS to receive thousands of SSH attempts per hour? I have fail2ban installed and the jail is so busy it's quite astounding. Can anyone with more web hosting experience weigh in?
  • by hartator on 9/30/19, 10:38 PM

    Not sure why the previous incident page got flagged. This is the new one.

    It's affecting us for real. It has taken almost our whole service, serpapi.com, down, as we store database files on block storage volumes.

  • by louwrentius on 10/1/19, 6:25 AM

    Isn't Digital Ocean running Ceph for their block storage?

    I wonder, as others suggested, whether they may have stretched the cluster across datacenters?!

    Would be interested in the post-mortem.

  • by jacquesm on 10/1/19, 5:50 AM

    Thank you Digital Ocean for once again proving that 'The Cloud' is not a backup.
  • by stephenr on 10/1/19, 5:08 AM

    This is your weekly reminder that anything you want to be reasonably “HA” should span multiple vendors in multiple DCs.
  • by simplehuman on 10/1/19, 4:44 AM

    Anyone have a review of using DO k8s or DO managed DB in production?
  • by unilynx on 10/4/19, 3:48 PM

    DigitalOcean just posted a post-mortem on http://status.digitalocean.com/incidents/g76kgjxqrzxs

    (the same url)

  • by privateSFacct on 10/1/19, 3:35 AM

    Higher latency (per status) is not the end of the world, especially if it’s just “may experience” higher latency.
  • by imglorp on 10/1/19, 12:23 AM

    Hrm, Atlassian BitBucket is also down. Just a coincidence? Does BB use DO?

    https://bitbucket.status.atlassian.com/incidents/4t1pkwrdtl8...

  • by seaghost on 10/1/19, 5:08 AM

    Their block storage is such a failure. I’ve been going back and forth with support for over 2 months now to get files automatically deleted with lifecycles, and it’s still not resolved.
  • by sb8244 on 10/1/19, 3:45 AM

    It looks like they have just updated it as resolved and monitoring.
  • by sunasra on 10/1/19, 3:57 AM

    I was always wondering how I could know proactively if something like this breaks or some service has an outage. As a result, I have built this tool (http://incidentok.com).
  • by sodosopa on 10/1/19, 12:47 PM

    So that’s why bot attacks and spam traffic were lower.
  • by golanggeek on 9/30/19, 10:26 PM

    This has really been down for more than 2 hours!!!
  • by irfanbaigse on 10/1/19, 1:42 PM

    DigitalOcean bad experience
  • by jbverschoor on 10/1/19, 6:28 AM

    Their ad was “you’ve been developing like a beast and your app is ready to go live”

    DO is a nice thing to play around with and maybe launch something, but I wouldn’t run full production on it.