BACKUPS

When someone asks "how safe is my data?", there are at least two different aspects to what "safe" means: data loss/corruption and data access security. The first issue is discussed here. For security issues, see the Security web page.

Reasons for loss/corruption

Data loss/corruption can happen for many reasons, which fall into three categories: user, system, and environment.

The main methods of user-side loss/corruption are:

  1. Accidental file deletion
  2. Accidental file overwrite
  3. Bugs in analysis software

Some major examples of system-side loss/corruption are:

  1. Hard disk failure
  2. Hard disk/RAID controller failures
  3. Operating system (kernel) bugs
  4. Network failures

Some examples of environment-side loss/corruption are:

  1. Fire
  2. Flood
  3. Electric surges

These categories are somewhat interrelated. For example, when a user puts their desktop in a badly ventilated area so that the drives overheat or, even worse, the dust bunnies set the whole system on fire, it looks like a system or environment failure but is really a user category problem.

NOTE ON RAID: A RAID only protects against system-side hard disk failure, and only when just one disk has failed. It does not protect against multi-disk failures or any of the other failure modes listed.

RAID IS NOT A REPLACEMENT FOR BACKUP!

Different backup schemes also protect against different failure modes to different degrees, as discussed below.

Backup Policy and Methods

There are in general three main classes of storage in use at the center: Central Service storage, Central Purchased storage, and Desktop storage.

Central Service storage includes your home directories (with a 2GB quota), mail boxes, and website directories. This storage has a weekly mirror backup. Contact IT Support for recovery.

Central Purchased storage is the NFS storage we sell at $26/TB/month on central RAID servers. These volumes are most often under /cluster paths. This storage has a weekly mirror backup.
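As a rough illustration of the pricing arithmetic, here is a minimal Python sketch. The function name and the assumption of whole terabytes and whole months are ours for illustration only; actual billing rules (proration, minimums) are not stated on this page.

```python
# Rate quoted above for Central Purchased storage, in dollars per TB per month.
RATE_PER_TB_MONTH = 26


def storage_cost(terabytes: int, months: int, rate: int = RATE_PER_TB_MONTH) -> int:
    """Total cost in dollars, assuming whole terabytes and whole months."""
    return terabytes * months * rate


# Example: 4 TB kept for one year.
print(storage_cost(4, 12))  # 4 * 12 * 26 = 1248
```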

Desktop storage is any volume on a user's Linux desktop workstation. These volumes are most often under /space paths. Users are responsible for arranging backup; there is no default backup. One thing users can do is ask us to set up weekly mirror jobs. For this, the user needs to supply a volume equal to or greater in size than the volume they want backed up. Preferably, this volume is on another desktop, but at the very least it needs to be on a different, independent disk or RAID in the same machine to protect against disk/RAID failure. We also sell backup-only volumes on the central RAID servers for this purpose at $13/TB/month.
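Before requesting a mirror job, the two requirements above (enough space, and a different disk) can be sanity-checked with a short script. The following is a minimal Python sketch, not a tool provided by the IT group; the function names are hypothetical. Note the limitation: comparing device IDs only shows that two paths are on different filesystems. Two partitions of the same physical disk would pass this check, so it cannot prove that the disks are truly independent.

```python
import os
import shutil


def size_ok(source_bytes: int, backup_bytes: int) -> bool:
    """The backup volume must be at least as large as the source volume."""
    return backup_bytes >= source_bytes


def check_backup_target(source: str, backup: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    src_total = shutil.disk_usage(source).total
    dst_total = shutil.disk_usage(backup).total
    if not size_ok(src_total, dst_total):
        problems.append("backup volume is smaller than the source volume")
    # Same device ID means both paths live on the same filesystem, so a
    # failure of that disk or RAID would take out production and backup.
    if os.stat(source).st_dev == os.stat(backup).st_dev:
        problems.append("source and backup are on the same filesystem")
    return problems
```

A call such as `check_backup_target("/space/mybox/1", "/space/otherbox/1")` (both paths made up for this example) returns an empty list when neither problem is detected.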

Your Windows/Mac files are on your local desktop (except in the rare case, which affected users should already know about, where UNIX volumes are mapped on their Windows/Mac systems). There is no central backup of Windows and Mac local disks.

The history of mirror backup jobs can be searched at the Desktop Backup History Page.

Note that when power outages take down desktops, the backups will not happen. Normally the backups will resume and catch up the following week when both machines are back up. For failures of this type, the IT team does not force a by-hand backup. It is the user's responsibility to request a special makeup backup for that week.

Therefore users should check that their desktop backups look "right" once a week and inform the IT group if they find otherwise. There are some situations in which no error message is generated, for instance if both machines are down at the backup times.
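The weekly check can be reduced to one question: is the last successful backup more than a week old? Here is a minimal Python sketch of that test. It assumes you have read the time of the last successful run from the backup history yourself; the function name and the 7-day default are illustrative, not part of any IT-provided tool.

```python
from datetime import datetime, timedelta


def backup_is_stale(last_backup: datetime, now: datetime,
                    max_age: timedelta = timedelta(days=7)) -> bool:
    """True if the last successful backup is older than one weekly cycle."""
    return now - last_backup > max_age


now = datetime(2024, 1, 15, 9, 0)
print(backup_is_stale(datetime(2024, 1, 14, 2, 0), now))  # False: ran yesterday
print(backup_is_stale(datetime(2024, 1, 7, 2, 0), now))   # True: a weekly run was missed
```

Because a missed run may produce no error message, a check based on the age of the last success catches cases that a search for errors would not.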

Mirror Backup Discussion

With the mirror job, your backup is anywhere from zero to 7 days old. As long as there was no error during the most recent backup, you are well protected against most kinds of system-side and environment-side failures, since you can easily recover to the last backup, which is no more than 7 days old. Of course, a fire or similar catastrophe that takes out both the machine with the production volume and the machine with the backup volume will result in a complete loss. The more you can geographically separate the two, the better.

Mirror jobs are not very good at dealing with user-side errors. This is mainly because those errors are often not recognized by users right away. Let's say you accidentally delete something on Friday, the mirror job happens on Sunday, and you realize your mistake on Monday. Well, tough luck. No recovery at that point.
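The Friday/Sunday/Monday scenario can be made concrete with a toy model in Python. This is only an illustration, under the assumption that a mirror job makes the backup an exact copy of the current production state, deletions included, and keeps no older versions. The file names are made up.

```python
def run_mirror(production: dict[str, str]) -> dict[str, str]:
    """A mirror is an exact copy of the current state; nothing older is kept."""
    return dict(production)


production = {"results.csv": "v1", "notes.txt": "draft"}

mirror = run_mirror(production)        # previous Sunday: both files backed up
del production["results.csv"]          # Friday: accidental deletion
mirror = run_mirror(production)        # Sunday: the deletion is mirrored too
recoverable = "results.csv" in mirror  # Monday: the mistake is noticed

print(recoverable)  # False
```

Had the mistake been noticed on Saturday, before the Sunday run, the file would still have been in the mirror.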

For questions, to arrange a mirror backup, or to ask for a restore from tape backup, contact the Webmaster.