Solutions for cross-platform synchronization

Some notes on the choices.

The aim

In a sentence: I want all my project data accessible (readable and writeable) on every computer.

I'm plagued constantly by wanting that dataset that's on that computer over there, being unable to tell if I've got the latest version of a manuscript in front of me, having to refer back to earlier versions to justify some analysis and finding out I did that work on that other machine ...

Some of these problems are mitigated by using cloud services (e.g. Gmail, Mendeley), distributed version control (e.g. git) and transferring files around (e.g. Dropbox, Google Drive). But the issue isn't broadly solved and I keep discovering that I need files and data that are somewhere else. This is complicated by the fact that I work in a heterogeneous environment. I'd prefer to do most work under Linux or Unix-like environments, but there's the home Mac, workplaces often insist that I use Windows, I have an Android tablet ...

Quotes below are from various forums and people I asked for their experiences.

The parameters

  • Reliability: it's no good if 99% of files sync 99% of the time
  • No babysitting: a solution that takes care of itself and doesn't need constant attention (or any attention) is preferred
  • Cost: self-evident
  • Security: I've got no personal information on there, but there's no need to be cavalier
  • Space: probably only a small amount of storage will be needed at first (i.e. a few gig) but this is certain to grow
  • All files and all info: solutions that exclude files or transform them are right out

Suggested solutions

Google Drive

I use Google for a lot of stuff (email etc.) so it seems natural to look at Google Drive. It's very secure and the extra storage is cheap enough, but I've found it incredibly unreliable: some files just refuse to sync or take ages to appear, seemingly without rhyme or reason. Broadly, the sync seems to be a bit stupid. On Ubuntu there's no native client, but Insync uses your Google space and seems to work well - although there are also a few reports of it not syncing or excluding files. Google's web UI isn't entirely to my liking either - it insists on treating everything as sorted by keywords / tags and so deep folder hierarchies are turned into a flat mess.

Pros: secure, cheap storage, compatibility with other Google services, web interface

Cons: no native Linux client, unreliable sync, weird web interface

Dropbox

I've used Dropbox for years: it's reliable and "just works". The Linux client is good. (The Android client less so, but it's improving.) But extra space isn't cheap and their patchy record on security gives me pause.

Pros: just works, web interface

Cons: expensive extra storage, dodgy security

SparkleShare

A git-based system for syncing and revising your work. While this seems like a great idea - and does work well for small stuff - size and performance start to suffer with larger data, especially binaries. This is apparently being looked at but there is no timetable for a fix.

Pros: it's git! And we can get git everywhere!

Cons: problems with binaries and large files.

OwnCloud

A self-hosted solution that needs a full LAMP stack on the server, although that provides a nice web interface. There are mixed reports on this one - some claimed that the software was slow and crashy on both the client and server ends. Others said they'd "had no problems".

Pros: web interface, self hosted so you can have as much space as you want

Cons: self-hosted so you have to keep it running and working, reliability

SpiderOak

SpiderOak is basically a far more configurable Dropbox: you can point at individual folders on individual machines and get them backed up into the cloud, then link any of those folders to the others so that they are synchronized. There's a web interface, a handy "share this folder on the web" feature, and a new "Hive" feature that basically duplicates Dropbox. Also the whole thing is ridiculously secure, with all the data being encrypted in transmission and storage, to the point that SpiderOak HQ says if you lose your password, they are unable to help you. On the down side, when a new folder or machine is introduced, it seems to be very slow to propagate.

Pros: security, does everything Dropbox does, configurability, Android client

Cons: extra storage is not cheap, client may be slow

Others

There's a bewildering array of solutions for this, so many that a lot had to be dismissed for fairly minor reasons (small user base, a lot of software installation, "techy", etc.). Solutions not examined in depth include:

  • Seafile
  • Roll my own manual solution using rsync: charming but I'd have to trigger it myself (although a cron job could be written) and I've got enough else to do.
  • Jungle Disk: reported to have a slow client and dodgy synchronization
  • Syncany: not ready and by one report "probably never will be"
  • Box.net: no official Linux client "but if you have root and can install WebDAV you should be able to make it work". Urgh, no.
  • BitTorrent Sync
  • Wuala
  • SugarSync
  • Bitcasa
  • SkyDrive
  • Tresorit
  • Syncplicity
  • Apple iCloud
  • git-annex assistant
  • Tahoe-LAFS
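
For the record, the roll-my-own rsync option from the list above might look something like the following. This is only a minimal sketch, not something I actually ran: the function names, the paths and the server address are all made up for illustration, and it assumes rsync is installed and that SSH access to the remote machine already works. Note that it is a one-way push, so it is really a backup rather than true synchronization between machines.

```python
import shlex
import subprocess


def build_rsync_command(source, destination, delete=False):
    """Build the argument list for a one-way rsync push.

    -a preserves permissions, timestamps and directory structure;
    -z compresses data in transit.
    """
    command = ["rsync", "-az"]
    if delete:
        # Also remove files at the destination that no longer exist locally.
        command.append("--delete")
    # A trailing slash on the source copies the folder's contents
    # rather than nesting the folder itself inside the destination.
    command += [source.rstrip("/") + "/", destination]
    return command


def sync(source, destination, delete=False):
    """Run rsync and return its exit code (0 means success)."""
    command = build_rsync_command(source, destination, delete)
    print("Running:", shlex.join(command))
    return subprocess.run(command, check=False).returncode


# Hypothetical usage (paths and host are placeholders):
#   sync("/home/me/projects", "me@server.example.org:backup/projects")
#
# Hypothetical crontab entry to run a script containing that call
# every 15 minutes:
#   */15 * * * * python3 /home/me/bin/sync_projects.py
```

Even with the cron job doing the triggering, this still needs someone to notice when a run fails, which is exactly the babysitting I wanted to avoid.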

Conclusion

I went with SpiderOak. US$10 a month gives me 100G and after 3 months it seems to be working fine. True, the initial upload and sync was sluggish (a few days for 10G) but this may have been affected by a poor internet connection here. The ability to configure it for multiple folders has proven to be surprisingly handy - I sync up my project folder, ebooks folder and image folder. (The music will be next.) And it needs no attention and just does its job in the background.

Having said that, Dropbox will be a fine solution for most people. Certainly the storage is expensive, but if you've only got a little data and are a bit canny with their referral bonuses, the free space should be enough. It's inflexible, being limited to a single folder, but again that will suit a lot of people. Most of all, it just works.