from Hacker News

Build An Opensource Dropbox Clone

by tommynazareth on 9/1/10, 4:28 PM with 58 comments

  • by telemachos on 9/1/10, 8:39 PM

    This is an off-topic short rant. Vote as you like (obvious enough), but skip reading this if you want on-topic discussion.

    Shoot me if this is how the web is heading. (Recent browsing suggests maybe it is...) We get not only the insufferable "bottom bar" (is there a standard name for those yet?), but also a "top bar", a hover "Click Here to Share" box and a pop-up "Learn More, Instantly" box (not to mention the general clutter and overdense design).

    Man, that's foul.

  • by dododo on 9/1/10, 5:24 PM

    lsyncd does the brunt of the work. it uses inotify, which is a facility of the linux kernel, to tell which files should be copied. but i think there is a problem.

    inotify is not guaranteed to see every file system access. in fact, it often misses them when the file system is busy because there is an upper bound upon the number of inotify events that can be queued, set in /proc/sys/fs/inotify/max_queued_events.

    i.e., you could get data loss using this mechanism alone. you need to also run rsync every so often, i think.

    the lsyncd guys are aware of this: http://code.google.com/p/lsyncd/

    (maybe this is how dropbox works under linux too, if so it presumably has the same problem..)
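The overflow case described above is visible to the watcher: when the queue fills up, the kernel delivers an event with the `IN_Q_OVERFLOW` bit set, and a sync tool can treat that as a cue to fall back to a full rsync pass. The following is a minimal sketch of that check, assuming the caller has already read a raw buffer from an inotify file descriptor. The helper names `parse_events` and `needs_full_resync` are hypothetical and are not taken from lsyncd.

```python
import struct

# Fixed header of struct inotify_event: int wd; uint32 mask; uint32 cookie; uint32 len.
# The variable-length, NUL-padded file name follows the header.
EVENT_HEADER = struct.Struct("iIII")

# Bit set by the kernel when the event queue overflowed and events were dropped.
IN_Q_OVERFLOW = 0x00004000


def parse_events(buf):
    """Yield (wd, mask, name) tuples from a raw buffer read from an inotify fd."""
    offset = 0
    while offset + EVENT_HEADER.size <= len(buf):
        wd, mask, _cookie, name_len = EVENT_HEADER.unpack_from(buf, offset)
        offset += EVENT_HEADER.size
        raw_name = buf[offset:offset + name_len]
        offset += name_len
        yield wd, mask, raw_name.rstrip(b"\0").decode("utf-8", "replace")


def needs_full_resync(buf):
    """Return True if any event in the buffer reports a queue overflow."""
    return any(mask & IN_Q_OVERFLOW for _wd, mask, _name in parse_events(buf))
```

A caller might run a complete rsync of the watched tree whenever `needs_full_resync` returns True, in addition to a periodic pass as a safety net.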

  • by tommynazareth on 9/1/10, 5:17 PM

    I don't plan on ditching Dropbox, but I'm definitely interested in having a similar sort of automatic backup with diffs that I can host myself. This looks like it might be an option.

    Does anyone else handle this in a different way?

  • by gvb on 9/1/10, 7:00 PM

    What about using unison http://www.cis.upenn.edu/~bcpierce/unison/ instead of lsyncd? It seems like it would be more resilient. It does look like unison would require a cron job or human trigger where lsyncd is triggered by a file/directory change. On the plus side, it seems like unison would work better in an offline mode.
  • by mikeyg on 9/1/10, 7:30 PM

    Zimbra's open source edition's WebDAV sharing works well once you get the URLs figured out. You get much more fine-grained permissions. It's cross-platform, there are shared folders, etc., if that's what you're looking for. You can make folders public to share with others, and so on. With bandwidth as fat as it is these days, DAV's heaviness doesn't feel so heavy anymore. Am I the only one who feels this way? Did I mention it works on Mac OS X 10.6, Linux, and Windows 7?

    If you're using Dropbox for backups, some of these other solutions might make more sense. A script using hard links and rsync can give you daily snapshots of any filesystem at only the storage cost of the delta, and it can easily be turned into a cron job. Here's a decent resource for that:

    http://www.mikerubel.org/computers/rsync_snapshots/
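One common way to get hard-linked snapshots is rsync's `--link-dest` option, which hard-links files that are unchanged relative to a previous snapshot instead of copying them again. The sketch below only builds the command line; the function name `snapshot_command` and the directory layout (one dated directory per snapshot under a common root) are assumptions for illustration, not the exact script from the linked page.

```python
def snapshot_command(source, snapshot_root, new_name, previous_name=None):
    """Build an rsync command that writes a new snapshot directory.

    Files unchanged since `previous_name` become hard links into that
    snapshot, so only changed files consume additional space.
    Use absolute paths: a relative --link-dest is resolved against the
    destination directory, which is easy to get wrong.
    """
    root = snapshot_root.rstrip("/")
    cmd = ["rsync", "-a", "--delete"]
    if previous_name is not None:
        cmd.append("--link-dest=" + root + "/" + previous_name)
    # Trailing slash on the source copies its contents, not the directory itself.
    cmd += [source.rstrip("/") + "/", root + "/" + new_name + "/"]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from a daily cron job, with `new_name` set to the current date and `previous_name` to the most recent existing snapshot.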

  • by please on 9/1/10, 6:05 PM

    Except that it misses 90% of what Dropbox does, like multi-client sync, cross-platform support, shared folders, a web interface, and so on.
  • by res0nat0r on 9/1/10, 5:11 PM

    I've looked at this project before. It looks like a nice implementation of rsync with inotify, but it seems to cover only the typical rsync use case, i.e., pushing your local changes to a remote host.

    I think if you try to use this like a real Dropbox, i.e., with changes occurring on multiple hosts at or around the same time, you are going to run into issues. Also, does lsyncd handle deleting data on the remote end and propagating those changes out (rsync --del)? I'm also wondering how it will handle conflicts and open files (which it says isn't recommended). I see the .dropbox dir using some type of UUIDs or SHA-1 sigs, so I'm sure it's doing something more sophisticated to keep things in sync. I've noticed my daemon has died on my Linux box every once in a while, and it somehow tracks multiple changes from multiple hosts and replays those transactions correctly, and everything rolls up fine.
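To make the multi-host conflict problem concrete, here is a minimal sketch of a three-way comparison based on content digests. It is a generic illustration, not a description of how Dropbox or lsyncd actually work; the function names and the returned action labels are hypothetical. Each side is compared against the digest recorded at the last successful sync, and `None` stands for a file that does not exist.

```python
import hashlib


def digest(data):
    """Return the SHA-1 hex digest of a file's contents."""
    return hashlib.sha1(data).hexdigest()


def reconcile(base, local, remote):
    """Decide what to do with one file given three content digests.

    base   -- digest recorded at the last successful sync
    local  -- digest of the file on this host now (None if deleted)
    remote -- digest of the file on the other host now (None if deleted)
    """
    if local == remote:
        return "in_sync"        # both sides agree, nothing to transfer
    if local == base:
        return "take_remote"    # only the remote side changed or deleted it
    if remote == base:
        return "push_local"     # only the local side changed or deleted it
    return "conflict"           # both sides changed in different ways
```

A one-way push such as plain rsync has no `base` to compare against, which is why simultaneous edits on two hosts simply overwrite each other there.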

  • by xxpor on 9/1/10, 5:25 PM

    I don't understand why Dropbox doesn't open source their client. Their business isn't built on their technology, which, while nice, isn't exactly novel. They make their money from selling storage. Open sourcing their client doesn't change the fundamental fact that people need the space to store stuff.
  • by dochtman on 9/2/10, 9:09 AM

    I thought http://www.sparkleshare.org/ was more interesting.
  • by khingebjerg on 9/1/10, 8:33 PM