rsync Tutorial

When copying or moving large numbers of files, the generic UNIX utilities cp and mv are risky. Because the operation can take a long time, there is a fair chance that something will interrupt it. When that happens during a move, your data is left in an inconsistent state, with part of it still in the original location and part of it at the target destination. Even for a plain copy, restarting is less than ideal because it recopies everything that was already copied. The same issue applies to the SSH copy utility scp.

The rsync utility is an advanced file transfer utility that solves these issues. It ships with macOS (OS X) and with most Linux distributions. If you look at the man page for rsync you will see it has a ton of options. Don't let that apparent complexity scare you. Using it for most copy or move jobs is very simple.

Simple Copy/Move Example

Take a look at this example of copying /source/dir/to/copy into /target/dir:

rsync -a /source/dir/to/copy /target/dir/

The end result will be a copy of /source/dir/to/copy located at /target/dir/copy. At this point you can run exactly the same command again, and in fact you should, to confirm that the copy completed. rsync looks at each file and copies only what is missing or different at the target destination; by default it treats a file as different when its size or modification time does not match. If you want to see each file as it is copied, add the -v option. You can also add --progress to get a progress display for each file, which is helpful when you have very large files.
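
The re-run behaviour described above can be tried safely on throwaway data. The following is a minimal sketch, not part of the original instructions: the scratch directories created with mktemp and the file names a.txt and b.txt are placeholders, and the -i (--itemize-changes) option is added only so that the second pass reports what, if anything, it updated.

```shell
# Scratch directories stand in for /source/dir/to/copy and /target/dir (assumption).
work=$(mktemp -d)
mkdir -p "$work/source/copy" "$work/target"
echo "hello" > "$work/source/copy/a.txt"
echo "world" > "$work/source/copy/b.txt"

# First pass: copy the directory, listing each file as it is transferred (-v).
rsync -av "$work/source/copy" "$work/target/"

# Second pass: exactly the same paths. -i prints one line per item rsync
# updates, so files that already match produce no output.
second_pass=$(rsync -ai "$work/source/copy" "$work/target/")
echo "items updated on second pass: ${second_pass:-none}"
```

If the first pass completed, the second pass should have nothing to transfer.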

If your intention was to move the data instead of just copying it, you would then run:

rm -r /source/dir/to/copy

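GOTCHA WARNING: be careful with trailing slashes. Normally you NEVER want a trailing slash on the source directory, but you DO want a trailing slash on the target directory. A trailing slash on the source tells rsync to copy the contents of that directory rather than the directory itself, so the files would land directly in /target/dir instead of in /target/dir/copy. See the man page for more info.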
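
The effect of the trailing slash on the source is easiest to see side by side. This is a small illustration on scratch directories; the names no_slash, with_slash and file.txt are made up for the example.

```shell
# Scratch layout (assumption): one source directory and two empty targets.
work=$(mktemp -d)
mkdir -p "$work/source/copy" "$work/no_slash" "$work/with_slash"
echo "x" > "$work/source/copy/file.txt"

# No trailing slash on the source: the directory itself is copied.
rsync -a "$work/source/copy" "$work/no_slash/"

# Trailing slash on the source: only the directory's contents are copied.
rsync -a "$work/source/copy/" "$work/with_slash/"

# List the resulting files to compare the two layouts.
find "$work/no_slash" "$work/with_slash" -type f
```

The first target ends up holding copy/file.txt, while the second holds file.txt directly, with no copy directory around it.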

File Syncing

In the simple example above, files in the target destination that are not present at the source are left untouched. Sometimes you want the target destination to become an exact copy of the source, also known as a mirror. To do that, files on the target side must be deleted if they do not exist at the source. You get this behaviour by adding the --delete option to rsync.

rsync -a --delete /source/dir/to/copy /target/dir/

Now any files under /target/dir/copy that are not also present under /source/dir/to/copy will be deleted.

File Transfer over the Network

If you want to copy or move files to a directory on another computer over the network, prefix the destination directory with the hostname of the remote computer followed by a colon.

rsync -a /source/dir/to/copy remotehost:/target/dir/

You will normally be prompted for your password on the remote host before the copy starts. If your user name on the remote host differs from the one on the computer where you are running rsync, specify username@remotehost rather than just remotehost.
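
As an illustration of the username@remotehost form, the sketch below builds the destination string and prints the resulting command rather than running it, since running it needs a reachable host. The user name alice is hypothetical, and remotehost is the same placeholder used in the example above.

```shell
# Hypothetical account and host names; replace them with your own.
remote_user="alice"
remote_host="remotehost"
dest="${remote_user}@${remote_host}:/target/dir/"

# Print the command instead of running it, since it needs a reachable host.
echo "rsync -a /source/dir/to/copy ${dest}"
```

Once the printed command looks right, it can be run directly in place of the echo.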

If you are using rsync from a remote site outside MGH, please use the gateway server for your data transfers.

Further Examples

Please check out the man page for other examples of rsync usage and how to use more advanced options.
