from Hacker News

Ask HN: How do you back up your site hosted on a VPS such as Digital Ocean?

by joeclef on 10/9/16, 7:06 PM with 87 comments

I have a production site running on Digital Ocean. I'm looking for a cheap and reliable backup solution. What are your suggestions? Thanks.
  • by dangrossman on 10/9/16, 7:30 PM

    Write a little program in your favorite shell or scripting language that

    * rsyncs the directories containing the files you want to back up

    * mysqldumps/pg_dumps your databases

    * zips/gzips everything up into a dated archive file

    * deletes the oldest backup (the one with X days ago's date)

    Put this program on a VPS at a different provider, on a spare computer in your house, or both. Create a cron job that runs it every night. Run it manually once or twice, then actually restore your backups somewhere to ensure you've made them correctly.
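    A minimal sketch of such a script, assuming MySQL, site data under /var/www, credentials in ~/.my.cnf, and a 14-day retention window (all of these are placeholders to adapt):

      #!/usr/bin/env bash
      # nightly-backup.sh -- copy files, dump the database, archive, prune.
      set -euo pipefail

      TODAY=$(date +%F)
      KEEP_DAYS=14
      WORK_DIR=/var/backups/staging
      ARCHIVE_DIR=/var/backups/archives
      mkdir -p "$WORK_DIR" "$ARCHIVE_DIR"

      # 1. rsync the directories you care about into the staging area.
      rsync -a --delete /var/www/ "$WORK_DIR/www/"
      rsync -a --delete /etc/nginx/ "$WORK_DIR/nginx/"

      # 2. Dump the databases (use pg_dump instead for PostgreSQL).
      mysqldump --single-transaction --all-databases > "$WORK_DIR/db.sql"

      # 3. Roll everything into one dated, compressed archive.
      tar -czf "$ARCHIVE_DIR/backup-$TODAY.tar.gz" -C "$WORK_DIR" .

      # 4. Delete archives older than the retention window.
      find "$ARCHIVE_DIR" -name 'backup-*.tar.gz' -mtime +"$KEEP_DAYS" -delete

    A nightly cron entry such as "0 3 * * * /usr/local/bin/nightly-backup.sh" runs it unattended; the archives directory is what you then copy offsite.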

  • by yvan on 10/9/16, 7:35 PM

    The simplest way for us is to use rsync; there is a service more than a decade old that is just perfect for offsite backup: http://rsync.net/index.html

    We basically create a backup folder (our assets and a MySQL dump), then rsync it to rsync.net. Our source code is already in git, so it's effectively backed up on GitHub and on every developer's computer.

    On top of that, rsync has very clear and simple documentation, so it's quick to set up on any Linux distro.
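    In practice that can be a couple of lines per run (the account, host, and paths below are placeholders):

      # Dump the database into the backup folder, then push everything offsite.
      mysqldump --single-transaction mydb > /srv/backup/mydb.sql
      rsync -az --delete /srv/backup/ youruser@rsync.net:backup/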

  • by Kjeldahl on 10/9/16, 8:20 PM

    DigitalOcean has a droplet backup solution priced at 20% of the monthly cost of your droplet. Doesn't get much easier than that, if you can afford it. For a small droplet ($10/month) that's a full backup of everything for two bucks a month. https://www.digitalocean.com/community/tutorials/understandi...
  • by no_protocol on 10/9/16, 10:12 PM

    Whatever strategy you use, test the process of recreating the server from a backup so you know you will actually be able to recover. You'll also have an idea of how long it will take, and you can create scripts to automate the entire flow so you don't have to figure it all out while you're frantic.

    I use tarsnap, as many others in this thread have shared. I also have the Digital Ocean backups option enabled, but I don't necessarily trust it. For the handful of servers I run, the small cost is worth it. Tarsnap is incredibly cheap if most of your data doesn't change from day to day.

  • by rsync on 10/9/16, 9:38 PM

    Some of our customers have already recommended rsync.net to you - let me remind folks that there is a "HN Readers Discount" - just email us[1] and ask for it.

    [1] info@rsync.net

  • by kumaraman on 10/9/16, 7:43 PM

    I use AWS S3 for this as the storage prices are so cheap, at $0.03 per GB. I recommend a utility called s3cmd, which is similar to rsync in that you can back up directories. I just have this set up with a batch of cron jobs which dump my databases and then sync the directories to S3 weekly.
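    For example (bucket name and paths are made up; s3cmd's sync subcommand mirrors a local directory to a bucket):

      #!/usr/bin/env bash
      # weekly-s3-backup.sh -- dump the databases, then mirror the folder to S3.
      set -euo pipefail
      mysqldump --single-transaction --all-databases | gzip > "/srv/backup/db-$(date +%F).sql.gz"
      s3cmd sync --delete-removed /srv/backup/ s3://my-backup-bucket/backups/

    A weekly cron entry like "0 4 * * 0 /usr/local/bin/weekly-s3-backup.sh" matches the schedule described above.
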
  • by stevekemp on 10/10/16, 9:00 AM

    I see a lot of people mentioning different tools, but one thing you'll discover if you need to restore in the future is that it is crucial to distinguish between your "site" and your "data".

    My main site runs a complex series of workers, CGI scripts, and daemons. I can deploy them from scratch onto a remote node via fabric & ansible.

    That means that I don't need to back up the whole server "/" (although I do!). If I can set up a new instance immediately, the only data that needs to be backed up is the contents of some databases, and to do that I run an offsite backup once an hour.

  • by AdamGibbins on 10/9/16, 7:34 PM

    I use config management to build the system (Puppet in my case, purely due to experience rather than strong preference) so it's fully reproducible. I push my data with borg (https://github.com/borgbackup/borg) to rsync.net (http://rsync.net/products/attic.html) for offsite backup.
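    The borg side of that setup is roughly the following, assuming the repository was already created with borg init (the account, repo path, and passphrase below are placeholders):

      export BORG_REPO='youruser@rsync.net:backups/myhost'
      export BORG_PASSPHRASE='changeme'

      # Create a dated, deduplicated, encrypted archive of the data directories.
      borg create --compression lz4 ::'{hostname}-{now:%Y-%m-%d}' /etc /srv /var/www

      # Thin out old archives so the repository doesn't grow without bound.
      borg prune --keep-daily 7 --keep-weekly 4 --keep-monthly 6
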
  • by xachen on 10/9/16, 7:31 PM

    www.tarsnap.com - it's pay as you go, encrypted and super simple to use and script using cron
  • by touch_o_goof on 10/9/16, 9:50 PM

    All automated, with one copy to AWS, one copy to Azure, and a local copy scp'd to my home server. Rolling window of 10; every 10th backup goes into cold storage. And I use a different tool for each, just in case.
  • by bretpiatt on 10/9/16, 8:01 PM

    For a static site, put it in version control and keep a copy of your full site and deployment code.

    For a database-driven dynamic site, or a site with content uploads, you can also use your version control via a cron job to upload that content. Have the database journal out the tables you need to back up before syncing to your DVCS host of choice.

    If you're looking for a backup service to manage multiple servers with reporting, encryption, deduplication, etc., I'd love your feedback on our server product: https://www.jungledisk.com/products/server (starts at $5 per month).

  • by billhathaway on 10/9/16, 9:25 PM

    Remember to have automated restore testing that validates that restores succeed and that the data "freshness" is within a reasonable window, such as the timestamp of the last updated record in a database.

    Lots of people only do a full test of their backup solution when first installing it. Without constant validation of the backup->restore pipeline, it is easy to get into a bad situation and not realize it until it is too late.
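    A rough illustration of that kind of check, assuming the latest dump is restored into a scratch MySQL instance and there is an updated_at column to inspect (the host, database, and table names are hypothetical):

      #!/usr/bin/env bash
      # verify-restore.sh -- restore the newest dump and check data freshness.
      set -euo pipefail

      LATEST=$(ls -1t /var/backups/archives/db-*.sql.gz | head -n1)
      gunzip -c "$LATEST" | mysql --host=restore-test-db myapp

      # Fail if the newest record is more than 24 hours old.
      NEWEST=$(mysql --host=restore-test-db -N -e \
        "SELECT UNIX_TIMESTAMP(MAX(updated_at)) FROM myapp.orders;")
      AGE=$(( $(date +%s) - NEWEST ))
      if [ "$AGE" -gt 86400 ]; then
        echo "Backup is stale: newest record is ${AGE}s old" >&2
        exit 1
      fi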

  • by darkst4r on 10/10/16, 2:08 AM

    http://tarsnap.com + bash scripts for mysqldump and removing old dumps + cron
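    The tarsnap part of that is just tar-style commands (archive names and paths are placeholders):

      # Dump the database, then send a dated, encrypted archive to tarsnap.
      mysqldump --single-transaction --all-databases | gzip > /srv/backup/db.sql.gz
      tarsnap -c -f "myhost-$(date +%F)" /srv/backup /var/www

      # List archives and delete old ones by name.
      tarsnap --list-archives
      tarsnap -d -f "myhost-2016-09-01"
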
  • by pmontra on 10/9/16, 8:51 PM

    On OVH I rsync to another VPS in a different data center. I pick the lowest priced VPS with enough space. I also rsync to a local disk at my home. I would do the same with DO.

    OVH has a premium backup-by-FTP service, but the FTP server is accessible only by the VPS it backs up. Pretty useless, because in my experience when an OVH VPS fails, technical support has never been able to bring it back online.

  • by jenkstom on 10/10/16, 1:18 PM

    Backupninja. It handles backing up to remote servers via rdiff, so I have snapshots going back as far as I need them. The remote server is on another provider. As long as I have SSH login via key to the remote server enabled, backupninja will install the dependencies on the remote server for me.
  • by Osiris on 10/10/16, 5:13 AM

    I use attic backup (there's a fork called borg backup). It runs daily to make incremental backups to a server at my home.

    For database, I use a second VPS running as a read only slave. A script runs daily to create database backups on the VPS.

  • by 2bluesc on 10/9/16, 7:43 PM

    I use a daily systemd timer on my home machine to remotely back up the data on my VPS. From there, my home machine backs up a handful of data from different places to a remote server.

    Make sure you check the status of backups, I send journald and syslog stuff to papertrail[0] and have email alerts on failures.

    I manually verify the backups at least once a year, typically on World Backup Day. [1]

    [0] https://papertrailapp.com/ [1] http://www.worldbackupday.com/en/
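    The systemd pair for that can be as small as this (unit names and the script path are made up; the units go in /etc/systemd/system/):

      # vps-backup.service
      [Unit]
      Description=Pull backup data from the VPS

      [Service]
      Type=oneshot
      ExecStart=/usr/local/bin/pull-vps-backup.sh

      # vps-backup.timer
      [Unit]
      Description=Run the VPS backup daily

      [Timer]
      OnCalendar=daily
      Persistent=true

      [Install]
      WantedBy=timers.target

    Enable it with "systemctl enable --now vps-backup.timer"; "systemctl list-timers" and the journal show whether it actually ran.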

  • by spoiledtechie on 10/9/16, 7:38 PM

    I use https://www.sosonlinebackup.com.

    Stupid simple and stupid cheap. Install, select directories you want backed up, set it and forget it.

    All for $7.00 a month.

  • by stephenr on 10/10/16, 5:39 AM

    How is this at 70+ comments without a mention of rsync.net?

    Collect your files, rsync/scp/sftp them over.

    Read-only snapshots on the rsync.net side mean even an attacker can't just delete all your previous backups.

  • by aeharding on 10/9/16, 8:49 PM

    Because I use Docker Cloud, I use Dockup to back up a certain directory daily to S3 from my DO VPS. https://github.com/tutumcloud/dockup

    I just use a simple scheduled AWS lambda to PUT to the redeploy webhook URL.

    I use an IAM role with put-only permissions to a certain bucket. Then, if your box is compromised, the backups cannot be deleted or read. S3 can also be set up to automatically remove files older than X days... Also very useful.
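    Both pieces can be configured from the CLI; a sketch with made-up bucket and role names:

      # Allow the backup role to write new objects only (no read, no delete).
      aws iam put-role-policy --role-name backup-writer --policy-name put-only \
        --policy-document '{"Version":"2012-10-17","Statement":[{"Effect":"Allow",
          "Action":"s3:PutObject","Resource":"arn:aws:s3:::my-backup-bucket/*"}]}'

      # Expire objects older than 30 days so old backups age out automatically.
      aws s3api put-bucket-lifecycle-configuration --bucket my-backup-bucket \
        --lifecycle-configuration '{"Rules":[{"ID":"expire-old-backups",
          "Filter":{"Prefix":""},"Status":"Enabled","Expiration":{"Days":30}}]}'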

  • by geocrasher on 10/9/16, 7:54 PM

    I run a couple of Virtualmin web servers which do Virtualmin-based backups (each website, with all its files/email/DBs/zones etc., goes into a single file, very much like how cPanel does its account backups), and those are rsynced (cron job) to my home server, which runs two mirrored 1TB disks. A simple bash script keeps a few days of backups, plus a weekly backup that I keep two copies of. Overall pretty simple, and it's free since I'm not paying for cloud storage.
  • by colinbartlett on 10/9/16, 8:09 PM

    The sites I host on DigitalOcean are all very simple Rails sites deployed with Dokku. The source code is on GitHub, and the databases I back up hourly to S3 with a very simple cron job.
  • by mike503 on 10/10/16, 12:22 AM

    Bash script to dump all DBs locally and tar up any config files.

    Then the script sends it to S3 using "aws s3 sync". With versioning enabled on the bucket you get version history for free, and you can ship your actual data and webdocs-type stuff up extremely fast; it's browsable via the console or other tools. Set a retention policy however you desire. The industry's best durability, and nearly the cheapest too.
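    Something along those lines, with placeholder bucket and paths (versioning and any lifecycle rules are configured on the bucket itself):

      #!/usr/bin/env bash
      # backup-to-s3.sh -- dump databases, tar configs, sync to a versioned bucket.
      set -euo pipefail

      STAGE=/var/backups/s3-stage
      mkdir -p "$STAGE"

      mysqldump --single-transaction --all-databases | gzip > "$STAGE/all-dbs.sql.gz"
      tar -czf "$STAGE/etc.tar.gz" -C / etc/nginx etc/ssl

      # With bucket versioning enabled, overwritten objects keep their history.
      aws s3 sync "$STAGE" s3://my-backup-bucket/myhost/ --only-show-errors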

  • by kevinsimper on 10/10/16, 11:38 AM

    This is the same question I had [1], just framed as "how can I outsource this cheaply" instead of "how can I do this cheaply". I also use Docker, so I would only need a hosted database.

    [1] https://news.ycombinator.com/item?id=12659437

  • by dotancohen on 10/10/16, 5:16 AM

    I see lots of great suggestions for backup hosts and methods, but I don't see anybody addressing encrypting said backups. I'm uncomfortable with rsync.net / Backblaze / etc having access to my data. What are some good ways to encrypt these multiple-GB backups before uploading them to a third-party backup server?
  • by extesy on 10/9/16, 8:49 PM

    I currently use https://github.com/backup/backup on my Digital Ocean instances, but https://github.com/bup/bup also looks nice.
  • by benbristow on 10/9/16, 7:39 PM

    What type of site is it?
  • by moreentropy on 10/10/16, 6:30 AM

    I use restic[1] to make encrypted backups to S3 (self hosted minio service in my case).

    I can't praise restic enough. It's fast, secure, easy to use and set up (golang) and the developer(s) are awesome!

    [1] https://restic.github.io/
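    For reference, the day-to-day restic workflow is only a few commands (the repository URL, bucket, and password below are placeholders; S3/minio credentials go in the usual AWS_ACCESS_KEY_ID / AWS_SECRET_ACCESS_KEY variables):

      export RESTIC_REPOSITORY='s3:s3.amazonaws.com/my-backup-bucket'   # or your minio endpoint
      export RESTIC_PASSWORD='changeme'

      restic init                        # once, to create the encrypted repository
      restic backup /etc /srv /var/www   # incremental, encrypted snapshot
      restic forget --keep-daily 7 --keep-weekly 5
      restic prune                       # reclaim space from forgotten snapshots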

  • by wtbob on 10/10/16, 12:26 AM

    I have duplicity set up, sending encrypted backups to S3. It works pretty well, and is pretty cheap.
  • by educar on 10/9/16, 9:21 PM

    If you use Docker to deploy, see cloudron.io. You can install custom apps and it takes care of encrypted backups to S3, and it automates Let's Encrypt as well.
  • by 00deadbeef on 10/9/16, 9:12 PM

    I have BackupPC running on another system

    http://backuppc.sourceforge.net/

  • by bedros on 10/10/16, 9:24 PM

    I use borg backup to a backup drive formatted as btrfs, then use the btrfs snapshot feature to create a snapshot after every backup.
  • by voycey on 10/10/16, 6:35 AM

    I really rate Jungledisk; you can choose S3 or Rackspace Cloudfiles as your storage medium. Very much set it and forget it!
  • by ausjke on 10/10/16, 4:47 AM

    There are many ways to back up, but I always encrypt the backups rather than just copying them somewhere.
  • by yakamok on 10/9/16, 9:13 PM

    I run a Python/shell program to rsync and collect what I want backed up into one folder. I then compress it, GPG-encrypt it, and send it to my backup server.
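    A sketch of that flow, using public-key encryption so no passphrase has to live in the cron job (the key ID, paths, and destination host are placeholders; the recipient key must already be imported and trusted):

      # Collect, compress, and encrypt before anything leaves the box;
      # only the ciphertext ever reaches the backup server.
      mkdir -p /srv/backup/staging /srv/backup/out
      rsync -a /var/www /etc/nginx /srv/backup/staging/
      tar -czf - -C /srv/backup/staging . \
        | gpg --encrypt --recipient backups@example.com \
        > "/srv/backup/out/backup-$(date +%F).tar.gz.gpg"
      rsync -a /srv/backup/out/ backupuser@backup.example.net:backups/

      # Restore side: gpg --decrypt backup-YYYY-MM-DD.tar.gz.gpg | tar -xz
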
  • by edoceo on 10/9/16, 11:26 PM

    I make archives and put them in S3.

    Use pg_dump and tar, then just s3cp.

  • by chatterbeak on 10/9/16, 8:47 PM

    Here's how we do it:

    All the databases and other data are backed up to s3. For mysql, we use the python mysql-to-s3 backup scripts.

    But the machines themselves are "backed up" by virtue of being able to be rebuilt with saltstack. We verify through nightly builds that we can bring a fresh instance up, with the latest dataset restored from s3, from scratch.

    This makes it simple for us to switch providers, and we can run our "production" instances locally on virtual machines running the exact same version of CentOS or FreeBSD we use in production.

  • by X86BSD on 10/10/16, 5:04 AM

    I don't know what the OP is running OS-wise, but if it's any modern Unix variant it uses ZFS, and a simple ZFS send/receive would be perfect. There are tons of scripts for that and for replication.

    If you're not using a modern Unix variant with ZFS... well there isn't a good reason why you would be.
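    A minimal send/receive sketch, assuming a pool named tank, a dataset holding the site data, and SSH access to the backup host (all names are placeholders):

      # Snapshot the dataset, then replicate the snapshot to the backup machine.
      TODAY=$(date +%F)
      zfs snapshot tank/www@"$TODAY"
      zfs send tank/www@"$TODAY" | ssh backup-host zfs receive -F backuppool/www

      # Later runs send only the delta between two snapshots:
      #   zfs send -i tank/www@2016-10-09 tank/www@2016-10-10 | ssh backup-host zfs receive backuppool/www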

  • by nwilkens on 10/9/16, 7:56 PM

    We have cheap, reliable storage servers at https://mnx.io/pricing -- $15/TB. Couple our storage servers with R1Soft CDP (r1soft.com), Attic, rsync, or innobackupex, etc.

    You can also use https://r1softstorage.com/ and receive storage + R1soft license (block based incremental backups) -- or just purchase the $5/month license from them and use storage where you want.