Once upon a time, before Dropbox was a thing, I had a big desktop and a small laptop. I thought, wouldn’t it be wonderful if my files could be the same on both sides? And if it could do that automatically, without me asking. So I started writing a program called SyncDroid (no relation to the Android app) to do file-sync-over-LAN. I wrote some blog posts to explain my thinking.
Time passed, and I got busy with other things. I ceased having a consistent desk and became completely dependent on my laptop, which got much bigger to compensate for having to do everything and at the same time only slightly physically heavier thanks to the wonders of technology.
Time passed some more, and I find myself at the same desk for three days of the week, with a really nice desktop and another very nice but just-not-quite-as-beefy laptop. Dropbox is not a good option due to the proud tradition of Crap Australian Internet, and besides, security and cloud services do not mix. (Yes, I am aware of SpiderOak. No, I will not use it until I can audit it and compile it myself.)
So BitTorrent Sync is a thing, which is basically what I dreamed of when I started SyncDroid. Zero-interaction LAN file sync between machines. No dependency on Internet services. Free. Sold.
File sync is a really tricky problem. It cannot be fully and correctly solved without massively overhauling how applications deal with data. (The Cloud helps a lot. It is not the complete solution.) Therefore, I have some advice on how to make BitTorrent Sync work without too much pain or unexpected data loss.
When you’re starting out with, say, your Documents folder, don’t try to sync two complete versions of the folder. You’ll end up with all files from both sides on both sides, and/or a bunch of conflicts, where you expected nothing to happen. It is much better to delete all of one side and sync it across. It is slow, sadly, but you only have to do it once.
BitTorrent Sync has a ‘relay service’. If two machines are not on the same LAN but do have Internet access, they can talk through the relay service.
I live in the Land Down Under with Slow Internet, so relaying through servers in the US is too slow to be useful. In each folder’s preferences (on every single peer) you need to untick ‘Use tracker server’, ‘Search DHT network’ and ‘Use relay server when required’. (Try version 1.4.83 if the setting isn’t working.)
Checksumming and transferring large files (like VM images) takes a long time. There is still room for someone to make an efficient VM synchronisation system. It might be impossible to make it ‘nice’, but you could at least provide snapshots or something rather than leaving one side corrupted most of the time. Parallels might do this by accident, but I’m not willing to risk my data to find out.
For the initial setup, I found it easiest to copy the Share URLs into a file in Dropbox and copy-paste them into BTS on the receiving end.
DO NOT copy the keys into Dropbox if you worry about the NSA reading your data. Those keys don’t expire and give access to your data. Dropbox keeps snapshots of everything and the NSA works with Dropbox. Of course, we can’t audit BTS anyway, so probably best to keep government secrets locked away a little more securely.
Pay attention to the ‘Store deleted files in folder archive’ setting. Definitely turn it off for VM folders.
Deleted or modified files go into an archive folder (.sync/Archive under the synced folder root). I’ve seen references to a 30-day cleanup period, but am yet to confirm this.
Don’t use BTS to sync your Dropbox folder between machines unless the Dropbox client is only running on one of them. They’ll confuse each other. Dropbox already does LAN sync.
Rather than having many shares, you can store the canonical copy of each folder in a synced folder and then symlink it to where you want it to appear.
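The symlink trick can be scripted so it is easy to repeat on each machine. This is a minimal sketch, not anything BTS provides: `link_into_place` is a hypothetical helper name, and the paths in the usage note are made up for illustration.

```python
import os
from pathlib import Path


def link_into_place(canonical: Path, link: Path) -> None:
    """Make `link` a symlink pointing at `canonical`, which lives in the synced folder."""
    if link.is_symlink():
        if Path(os.readlink(link)) == canonical:
            return  # already linked to the right place, nothing to do
        raise FileExistsError(f"{link} is a symlink to somewhere else")
    if link.exists():
        # Refuse to clobber real data; move it into the synced folder by hand first.
        raise FileExistsError(f"{link} already exists")
    link.parent.mkdir(parents=True, exist_ok=True)
    link.symlink_to(canonical, target_is_directory=canonical.is_dir())
```

For example, `link_into_place(Path.home() / "Sync" / "Projects", Path.home() / "Projects")` would make `~/Projects` point at the copy inside a single synced `~/Sync` folder (both paths are assumptions). The helper deliberately refuses to replace an existing file or folder, since silently deleting one side is exactly the kind of data loss you are trying to avoid.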
Sync isn’t necessarily between two machines. You can sync three or more machines to the same folder.
This would be awesome if you had, say, office workers who need disconnected access to a shared folder. You can then disconnect a machine, keep the local copy, modify it and have your changes sync when you reconnect. This might cut down your need for corporate fileservers and VPNs.
If you’re using a Mac, you might want to prevent BTS from syncing .FinderInfo and .ResourceFork files. As of October 2014 (version 1.4.83) they fail to sync but BTS can’t figure out why, causing your folders to perpetually be out of sync. Add the following to the end of .sync/IgnoreList in your folder:
Hopefully you don’t actually need the resource fork for any of your files. Does OS X actually use the resource fork these days?
I had to disconnect the folder from each peer and reconnect it. The peers remember all of the FinderInfo files that they’re meant to be ignoring. Disconnecting forces BTS to start over without the FinderInfo files. Sometimes you can just disconnect and reconnect one peer (usually the one that started with all of the data).
This also highlights a nuisance in the system: configuration is not synced. You need to do this on every single machine. Did I mention that file sync is nontrivial?
.SyncIgnore is not a thing any more. It is now called .sync/IgnoreList. The format and use of the file is the same, but .SyncIgnore no longer works.
This is a bit of a shame, really, because (I never tested this, but…) .SyncIgnore could get synced automatically, saving you from manually making the same config changes on each host. Perhaps it caused flapping and conflicts.
There is useful logging in ~/Library/Application Support/BitTorrent Sync/sync.log. You don’t need to turn on debug logging in the menu (it’s extremely verbose).
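If you just want a quick look at the most recent log entries, something like the following works. It is a rough sketch: the path is the one given above, `last_lines` is a hypothetical helper, and it makes no assumptions about the log format beyond it being line-oriented text.

```python
from collections import deque
from pathlib import Path

# Path taken from the text above (OS X); adjust it if your install differs.
SYNC_LOG = Path.home() / "Library" / "Application Support" / "BitTorrent Sync" / "sync.log"


def last_lines(path, count=50):
    """Return the last `count` lines of a text file, keeping only those lines in memory."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        # deque with maxlen discards older lines as it reads through the file.
        return [line.rstrip("\n") for line in deque(f, maxlen=count)]
```

Calling `print("\n".join(last_lines(SYNC_LOG, 20)))` shows the last twenty lines.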
Sleep mode might be affected on Macs? My MacBook seems to run down the battery while it’s supposed to be sleeping. And it’s definitely awake some of the time – sync continues while it’s asleep. Hopefully it doesn’t do this while it’s disconnected from the network (i.e. in my bag, away from home).
Parallels virtual machines take forever to sync after modification and burn a lot of CPU power.
Hopefully you realised this already, but never boot the same VM image on two different machines at once. BTS will faithfully propagate the changes to the other machines, which will be unaware that their disk images are changing underneath them, and you’ll probably end up with corrupt, unusable VMs everywhere.
I still can’t recommend BTS for synchronising virtual machines. Hashing a 60GB image just takes too long. To reduce the time, you can:
I also see relatively slow sync speeds (3-6MB/sec, even though the network will easily do ten times that). That’s a different avenue that I should explore.
I am tempted to write a VM-specific sync application that solves these problems, but it’s very likely that I can do no better anyway. If you have some spare time and want to try it, I suggest:
Underneath it all, most filesystems (a) do not track changes within a file, and (b) do not checksum files. If you were to put your VM images on, say, a ZFS volume, changes to them could be synchronised very quickly and efficiently (seconds, instead of hours) simply because the filesystem already keeps the hashes and diffs that are needed for the synchronisation app to do its job. Without that information the app must scan through the (extremely large) VM images to find the (relatively small) changes that it should propagate.
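To make the cost concrete, here is a minimal sketch of the scan a sync app is forced to do when the filesystem gives it no help. It is not how BTS works internally (I can’t see inside it); the block size and the function names `block_hashes` and `changed_blocks` are assumptions chosen for illustration.

```python
import hashlib

BLOCK_SIZE = 4 * 1024 * 1024  # 4 MiB; an arbitrary choice for illustration


def block_hashes(path, block_size=BLOCK_SIZE):
    """Return one SHA-256 digest per fixed-size block of the file."""
    digests = []
    with open(path, "rb") as f:
        while True:
            block = f.read(block_size)
            if not block:
                break
            digests.append(hashlib.sha256(block).digest())
    return digests


def changed_blocks(old, new):
    """Indices of blocks that differ between two digest lists, including added or removed blocks."""
    longest = max(len(old), len(new))
    return [
        i for i in range(longest)
        if i >= len(old) or i >= len(new) or old[i] != new[i]
    ]
```

Only the blocks listed by `changed_blocks` would need to cross the network, which is the small part. The expensive part is `block_hashes`: it has to read every byte of the image each time, even if only a few megabytes changed. A filesystem that already stores per-block checksums lets you skip that read entirely.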
If you restore a peer from Time Machine, things seem to go screwy. By ‘screwy’, I mean:
If you’re going to restore a peer from Time Machine, I would suggest removing any synced folders from it altogether and resyncing them from the other peers.
Outlook for Mac 2011 stores its data files in Documents, so if you sync that, rename each machine’s identity so they don’t conflict. (You’ll get duplicate emails and error messages.)
Shut down all Office applications
Go to Documents/Microsoft User Data/Office 2011 Identities and rename ‘Main Identity’ (or whatever you use) to something else; I use ‘Main Identity
Open Microsoft Database Utility, click your renamed identity, click the gear icon and click ‘Set as Default’