from Hacker News

Ask HN: Best Centralized Backup Solution

by bkgh on 12/16/18, 12:41 PM with 67 comments

What is the best free software for managing and scheduling backups of folders and databases across many clients? I want a centralized system that has fine-grained scheduling and can show the backup status of folders and databases on different clients. Ideally it would have plugins for backing up with different tools such as PostgreSQL (pg_dump), MySQL, Redis, SSH, etc. Backups should be storable in different places such as S3, Dropbox and local folders.
  • by tomxor on 12/16/18, 4:45 PM

    ZFS snapshots! I _wanted_ to use rsync.net for this. They use FreeBSD with ZFS snapshots in a jail; the idea is that you get your stuff onto the account however you like over SSH (e.g. rsync), and the snapshots then serve two purposes: versions through time, and read-only access in case you get 0wned.

    I couldn't use rsync.net though, because all its datacenters are outside the EEA. I'm currently looking into doing this myself on a simple VPS; ZFS on Linux has matured quite well, it seems. I'm a bit new to doing chroot jails on Linux, but the ZFS snapshot part is very easy if that's enough for you.

        apt install zfsutils-linux
    
    It's pretty trivial to make a pool, mount it in a user directory and then take snapshots. You could easily write a script to schedule the snapshots, or there are at least two existing tools that schedule them for you via either cron or systemd timers: zfsnap or zfs-auto-snapshot respectively.
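
    A minimal sketch of such a scheduling script, assuming a hypothetical dataset `tank/backup` and an `auto-` snapshot prefix (neither comes from the comment above). Each run takes one snapshot and prunes all but the newest `KEEP`. Nothing touches ZFS unless the script is invoked with `run`.

```shell
#!/bin/sh
# Sketch only: DATASET, PREFIX and KEEP are illustrative assumptions.
DATASET="${DATASET:-tank/backup}"
PREFIX="${PREFIX:-auto-}"
KEEP="${KEEP:-14}"

# Build a snapshot name such as tank/backup@auto-20181216-164500.
snapshot_name() {
    printf '%s@%s%s\n' "$1" "$PREFIX" "$2"
}

# Read snapshot names on stdin (oldest first) and print every name
# except the newest $1, i.e. the ones that are due for deletion.
expired() {
    awk -v keep="$1" '{ name[NR] = $0 }
        END { for (i = 1; i <= NR - keep; i++) print name[i] }'
}

if [ "${1:-}" = "run" ]; then
    zfs snapshot "$(snapshot_name "$DATASET" "$(date -u +%Y%m%d-%H%M%S)")"
    # -H: no header, -s creation: oldest first, -d 1: this dataset only.
    zfs list -H -t snapshot -o name -s creation -d 1 "$DATASET" |
        grep -F "$DATASET@$PREFIX" |
        expired "$KEEP" |
        while read -r snap; do zfs destroy "$snap"; done
fi
```

    A crontab entry along the lines of `0 * * * * /usr/local/bin/zfs-snap.sh run` (the path is made up) would then take hourly snapshots. zfsnap and zfs-auto-snapshot handle more retention cases than this does.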

    RE databases: with some extra work you could also use ZFS on the source server and take a snapshot of the database (once you invoke the correct commands to lock it) rather than do a dump. This would be very fast because it avoids duplicating the data in a dump, and could therefore be done much more frequently. You do, however, have the additional complexity of then syncing the snapshot to another server's ZFS pool; there are tools for this, but I haven't bothered going that far.

  • by m_b on 12/16/18, 1:19 PM

    BackupPC is the best one: https://backuppc.github.io/backuppc/

    You get all the professional features out of the box (full/incremental backups, deduplication, compression...) and everything is automated.

  • by reacharavindh on 12/16/18, 1:41 PM

    I think you're looking more for the "scheduling, managing pulls from different clients" part than the ability to back up to S3, Dropbox, a local folder, etc.

    There are plenty of software projects to choose from for the latter: Borgbackup, Restic, plain old NFSv3 and rsync, Tarsnap, Backblaze B2, etc.

    I simply use a combination of cron, rsync and Ansible to make backups to our central NAS, and that central NAS is mirrored to our off-site NAS (with extensive snapshots and tape backups).

    I had to use Netbackup for the tape system, which does everything you listed: managing and scheduling backup procedures on "clients", fine-grained schedules, showing backup status, etc. But I HATE that thing with all my gut. It is one complicated pile of crap that I have to fight all the time to make it write to the damn tapes. Nothing like plain old simple Unix tools.

  • by geekuillaume on 12/16/18, 4:10 PM

    Self-promotion ahead: I worked on a SaaS product called DBacked that backs up databases to S3. After some time I decided to open-source most of it. It's not made to back up files, only databases. You can look at the open-source project here: https://github.com/dbacked/agent and the SaaS product here: https://dbacked.com/
  • by padelt on 12/16/18, 3:49 PM

    I like https://www.urbackup.org for Windows boxes. It does incremental file and image backups. The next system for the Linux world will most likely be https://restic.net with an rclone backend that saves to B2. The idea is to have an append-only service that is safe from ransomware deleting backups for extortion purposes.
  • by doomjunky on 12/16/18, 7:57 PM

    I recommend that everyone read The Tao of Backup. It's a lovely but smart writeup covering the seven most crucial aspects of backup.

    http://taobackup.com/

  • by domsl on 12/16/18, 1:56 PM

    I like using Syncthing to sync files to another (sometimes multiple) location and then use restic (https://restic.net/) :)
  • by memset on 12/16/18, 3:48 PM

    For the second part of your requirement - supporting different backup providers - I'm a huge fan of rclone: https://rclone.org
  • by bagsvaerd70 on 12/16/18, 1:09 PM

    I like to rsync into a BTRFS volume, and then take a snapshot.

    It's simple, fine-grained (you can use .rsyncignore files like .gitignore files) and versioned.
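
    A rough sketch of that workflow. As far as I know rsync has no built-in `.rsyncignore` support, so this assumes the per-directory merge filter `--filter=':- .rsyncignore'` is how the ignore files get picked up. All paths are hypothetical, and `/mnt/backup/current` is assumed to already be a btrfs subvolume.

```shell
#!/bin/sh
# Sketch only: all paths below are illustrative assumptions.
SRC="${SRC:-/home/}"
CURRENT="${CURRENT:-/mnt/backup/current}"      # must be a btrfs subvolume
SNAPDIR="${SNAPDIR:-/mnt/backup/snapshots}"

# Path of the read-only snapshot for a given timestamp.
snapshot_path() {
    printf '%s/%s\n' "$SNAPDIR" "$1"
}

if [ "${1:-}" = "run" ]; then
    # ':- .rsyncignore' reads exclude patterns from a .rsyncignore file
    # in each directory, similar in spirit to .gitignore.
    rsync -a --delete --filter=':- .rsyncignore' "$SRC" "$CURRENT/" &&
        btrfs subvolume snapshot -r "$CURRENT" \
            "$(snapshot_path "$(date -u +%Y%m%d-%H%M%S)")"
fi
```

    The snapshot is only taken if rsync succeeds, and `-r` makes it read-only, so each timestamped directory is a frozen version of the tree.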

  • by spaceknarf on 12/21/18, 2:43 PM

    There are a lot of good descriptions in this book:

    Backup & Recovery - Inexpensive Backup Solutions for Open Systems

    https://www.oreilly.com/library/view/backup-recovery/0596102...

    It's from 2009, but going through the comments here it seems not much has changed. It will help you choose the right tools, and with many of the open source ones, configure them.

  • by kakwa_ on 12/18/18, 8:21 AM

    I personally use Bacula: https://www.baculasystems.com/

    But in truth, it's a bit complex to set up. You have 3 daemons (agent, director and storage), each with its own configuration.

    These configuration files must match exactly (a bit like a Nagios configuration) (http://www.bacula.org/2.4.x-manuals/en/main/Conf-Diagram.png).

    Adding SSL is also a bit annoying, as there are many moving parts.

    But once set up, it works quite well and bconsole is nice.

    It's also quite easy to install since it's already available in most distributions.

    You can set backup policies on a per-job basis and do full, incremental and differential backups. You can also set pre- and post-backup scripts, and do the same for restorations. It is quite comprehensive, albeit a bit complex.

    At home, I use it with an old LTO-1 tape recorder/bank (IBM 3581-H17) repaired with pieces of bicycle tube (I find these kinds of robots fascinating: a bit anachronistic in our age, but still relevant).

    In the end, I'm not sure I would recommend it, but it's definitely a viable option.

  • by sshanky on 12/21/18, 5:09 PM

    I've been playing with Keybase for backups. It's neat because you don't need to deal with any proprietary APIs or scripts (but you do have to install Keybase). The backups are encrypted and can be accessed from another device as well either through your own account or as part of a Keybase Team. For now you get 250 gigs free storage.
  • by nicolaslem on 12/16/18, 2:17 PM

    I started using Backblaze B2 on servers for storing database backups and so far it's been great.

    Now I wonder if `b2 sync` is a good candidate for more traditional desktop backups (documents, lots of photos...). Does anyone have feedback on it?

  • by TekMol on 12/16/18, 5:07 PM

    rdiff-backup works like magic. I love it.

    It's a command line tool that does all stuff related to backup perfectly. It's in the Debian repos.

    It syncs a given directory to a backup directory, with an additional dir for metadata and history. It's super fast, even over the wire. It seems to achieve this by only sending the changed parts of files. I wonder if it also uses compression?

    You can always change your directory back to any state you backed up.

    It has tons of nice features for checking the integrity of the backup, recovering single files from a given point of time etc.

    Did I mention how much I love it?

  • by mbogda on 12/17/18, 12:54 AM

    GitLab is surprisingly good for that. I keep backup schedules in it, executed as CI pipelines. A GitLab runner executes my custom Ansible code in a Docker container, using an image prepared for that purpose.
  • by m3nu on 12/16/18, 3:08 PM

    The backup hosting service (https://www.borgbase.com/) I'm building for BorgBackup will show you the last backup of all repositories in one place. It will also alert you on missed backups.

    For backing up databases, you'd need to install small packages, like `automysqlbackup`. We already provide an Ansible role for setting up automatic backups, which could be extended to apply your schedule or add "plugins".

  • by zmix on 12/16/18, 1:40 PM

    burp (https://burp.grke.org/) comes to mind, as well. Windows and UNIX-like only, no macOS.
  • by tatoalo on 12/16/18, 5:45 PM

    I'm currently using duplicacy and it works pretty well as far as my workload goes. Has anyone also tried rclone who could draw a quick comparison?
  • by based2 on 12/16/18, 1:19 PM

  • by rorykoehler on 12/16/18, 1:06 PM

    Unison is good for syncing between 2 locations. Personally I would treat DBs and other files differently.
  • by parfamz on 12/16/18, 11:50 PM

    restic with s3 back-end
  • by flavious on 12/22/18, 4:59 AM

    borgbackup.
  • by InGodsName on 12/16/18, 1:01 PM

    Using RDS seems like the best option.

    Many startups use it, and I think backup is also easy.