Automatic backups



Learn about rsync

  • lobe is our backup server. It doesn't provide any special services for doing backups. We just use rsync to copy stuff onto it. (rsync is a tool that computes the difference between two folders and copies only what is necessary to make them the same.) Example:
rsync -a --delete --exclude '*~' --exclude '*.o' /home/yourname/yourprojectsfolder yourname@lobe:/media/backupdrive/yourname
Note that the "--delete" flag tells rsync to delete files from the backup machine when you delete them from your machine. If you omit this flag, the backup machine will accumulate historical junk. This can be useful if you have a habit of deleting things and then eventually regretting it, but you will need to clean up the backup machine periodically to be sure you're not wasting space, and if your machine fails and you need to restore everything from the backup, you might wish the historical junk wasn't there. I recommend keeping the "--delete" flag. Backups are good for protecting against hardware failures; they aren't a very good solution for sloppiness.
Note that the "--exclude" flag lets you avoid backing up junk that is difficult to keep out of your projects folder.
Note that the "-z" flag compresses files during transmission, which is useful if you are sending them over the Internet. Since we're just sending them over a local network, that's not really necessary.
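Before you point rsync at the backup server for the first time, it's worth previewing what it will do. The "-n" ("--dry-run") flag makes rsync report its plan without copying or deleting anything. A small self-contained sketch, using two throwaway local folders as stand-ins for your projects folder and the backup destination:

```shell
# Two throwaway folders standing in for the source and the backup target.
src=$(mktemp -d); dst=$(mktemp -d)
echo "keep me" > "$src/notes.txt"
touch "$src/scratch.o"   # an object file, which the --exclude rule should skip

# -n makes this a dry run: rsync prints what it would transfer, but the
# destination is left untouched.
rsync -anv --delete --exclude '*~' --exclude '*.o' "$src/" "$dst/"
```

The listing includes notes.txt but not scratch.o; once you're happy with the plan, drop the "-n" to do the real copy.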

How to set up your machine to automatically make backups

On Linux

Set up an account on the backup machine

  • ssh lobe (Log into the backup server using the lab password)
  • sudo adduser yourname (Make an account for yourself. Replace "yourname" with your username)
  • exit (log out)

Set up ssh without using a password:

  • Create authentication keys
ssh-keygen -t rsa
(Do NOT enter a pass phrase.)
  • Create the .ssh directory on the backup machine
ssh lobe mkdir -p .ssh
  • Copy the public key to the backup machine
cat ~/.ssh/id_rsa.pub | ssh lobe 'cat >> .ssh/authorized_keys'
  • TROUBLESHOOTING: If you are still having trouble, check the permissions on the backup machine. ssh refuses key-based logins if the .ssh directory or the authorized_keys file is writable by anyone else, because it considers that insecure. Run these on the backup machine:
chmod 700 .ssh
chmod 600 .ssh/authorized_keys
  • Create a directory for the backups (run these on the backup machine)
sudo mkdir /media/backupdrive/yourname
sudo chown yourname /media/backupdrive/yourname
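On most modern Linux machines the key-generation and key-copying steps above can be collapsed into two commands: ssh-copy-id appends your public key to .ssh/authorized_keys on the server and sets the permissions for you, so the troubleshooting step is usually unnecessary. A sketch, assuming the default RSA key location:

```shell
# Generate a key pair only if you don't already have one. -N "" gives an
# empty passphrase so cron can use the key unattended.
[ -f ~/.ssh/id_rsa ] || ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# Append the public key to .ssh/authorized_keys on the backup server and
# fix up the permissions there in one step.
ssh-copy-id yourname@lobe
```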

RSYNC with the backup server

  • Please make sure you have a place on your machine where you put your temporary junk, and another place where you put things that have value, so you don't swamp our backup server with garbage.
  • make a file somewhere called "backup.bash" with this line:
rsync -a --delete --exclude '*~' --exclude '*.o' /home/yourname/yourprojectsfolder yourname@lobe:/media/backupdrive/yourname
(You can also add more lines with different folders to back up multiple folders. Be sure to omit the trailing slash "/" so that rsync creates the folder itself on the backup machine.)
  • chmod 755 backup.bash
(Now you can make backups whenever you want by running your backup script)
(Be sure to test it before you put it in your crontab)
  • crontab -e
37 3 * * 6 /home/yourname/bin/backup.bash
(This will run your backup script every Saturday morning at 3:37am. You should change the time so everyone doesn't hit the backup server at the same time.)

On Windows

  • Install Cygwin (ssh keys are really finicky between Windows and Linux machines; Cygwin can bridge this for you)
  • In Cygwin, follow the same steps as for Linux machines except for the crontab step
  • Write a bash script with the rsync command and put it somewhere Cygwin can find it (this implies that your project folder is accessible from Cygwin)
  • In Windows, open the Task Scheduler
  • Set up a trigger for the appropriate time (and set it to repeat)
  • Create an action with the following:
C:\cygwin\bin\bash.exe --login "/myfolder/backup.bash"
  • You can tell the task to run immediately to test the functionality

What to do when the backup drive gets full

The drive has 2 terabytes. When we bought it, that was a lot. It is likely that someone is just wasting space in an irresponsible manner. Here's a useful script that will help you find who is hogging all the space. It prints the largest files and folders in the current directory. So, if you run it from the home directory on lobe, you will see who is to blame. You can then change into their home folder and see which sub-folder is to blame, etc.

set -u -e
ls -alS | head -n 7
du --max-depth 1 | sort -g -r | head -n 7
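If you just want a per-user total rather than drilling down one folder at a time, du can summarize every folder on the backup drive in one pass ("-h" and "sort -h" print and sort human-readable sizes; the path is the one used above):

```shell
# Total size of each user's backup folder on the drive, biggest first.
du -sh /media/backupdrive/* 2>/dev/null | sort -rh | head -n 7
```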