BACKUPS
When someone asks "how safe is my data?", there are at least two
different aspects to what "safe" means: data loss/corruption
and data access security. The first issue is discussed here.
For security issues, see the Security
web page.
Reasons for loss/corruption
Data loss/corruption can happen for many reasons which fall into
three categories: user, system and environment.
The main methods of user-side loss/corruption are:
- Accidental file deletion
- Accidental file overwrite
- Bugs in analysis software
Some major examples of system-side loss/corruption are:
- Hard disk failure
- Hard disk/RAID controller failures
- Operating system (kernel) bugs
- Network failures
Some examples of environment-side loss/corruption are:
- Fire
- Flood
- Electric surges
These categories are somewhat interrelated. For example, when a user
puts their desktop in a badly ventilated area so that the drives
overheat or, even worse, dust bunnies set the whole system on fire, it
is really a user-side problem.
NOTE ON RAID: RAID only protects against system-side hard disk failure,
and only when just one disk has failed. It does not protect
against multi-disk failures or any of the other failure modes listed.
RAID IS NOT A REPLACEMENT FOR BACKUP!
Different backup schemes also protect against different failure modes
to different degrees, as discussed below.
Backup Policy and Methods
There are in general three main classes of storage in use at the
center: Central Service storage, Central Purchased storage, and
Desktop storage.
Central Service storage includes your home directories (with a 2GB quota),
mailboxes, and website directories. This storage has a weekly mirror backup.
Contact IT Support for recovery.
Central Purchased storage is the NFS storage we sell at $26/TB/month
on central RAID servers. These volumes are most often under /cluster paths.
This storage has a weekly mirror backup.
Desktop storage is any volume on a user Linux desktop workstation. These
volumes are most often under /space paths. Users are responsible for
arranging backup -- there is no default backup.
One thing users can do is ask us to set up weekly mirror jobs.
For this, the user needs to supply a volume equal to or greater in size than
the volume they want backed up. Preferably, this volume is on another
desktop, but at the very least it needs to be on a different, independent
disk or RAID in the same machine to protect against disk/RAID failure.
We also sell backup-only volumes on the central RAID servers for this
purpose at $13/TB/month.
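
The size and separation requirements above can be checked before
requesting a mirror job. The following is a minimal sketch, not the IT
group's actual procedure: the function names are invented for
illustration, and it assumes both volumes are mounted on the machine
where the script runs. It relies only on the standard Python calls
shutil.disk_usage and os.stat.

```python
import os
import shutil


def volume_total_bytes(path):
    """Return the total size in bytes of the filesystem that holds path."""
    return shutil.disk_usage(path).total


def backup_is_large_enough(source_total, backup_total):
    """True when the backup volume is equal to or greater in size than the source."""
    return backup_total >= source_total


def on_different_filesystems(source_path, backup_path):
    """True when the two paths live on different filesystems (device IDs differ).

    This only shows that the paths are separate volumes. It cannot prove
    that they sit on physically independent disks or RAIDs.
    """
    return os.stat(source_path).st_dev != os.stat(backup_path).st_dev
```

A typical use would be to pass the two mount points to
volume_total_bytes and feed the results to backup_is_large_enough.
Whether two volumes are on independent hardware still has to be
confirmed by hand.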
Your Windows/Mac files
are on your local desktop (except in the rare case of users mapping
UNIX volumes on their Windows/Mac systems, which those users should
know about). There is no central backup of Windows and Mac local disks.
The history of mirror backup jobs can be searched on the
Desktop Backup History Page.
Note that when power outages take down desktops, the
backups will not happen. Normally the backups will resume and catch up
the following week when both machines are back up. For failures of
this type, the IT team does not force a by-hand backup. It is the
user's responsibility to request a special makeup backup for that
week.
Therefore users should check that their desktop backups look
"right" once a week and inform the IT group if they find
otherwise. There are some situations in which no error message is
generated, for instance if both machines are down at the backup
times.
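
One way to do such a weekly check is to compare a production directory
with its mirror. This is only a sketch under stated assumptions: the
function name spot_check is hypothetical, both directories must be
readable from the machine running the script, and only the top level of
each directory is compared. It uses filecmp.dircmp from the Python
standard library.

```python
import filecmp


def spot_check(production_dir, mirror_dir):
    """Compare the top level of a production directory with its mirror.

    Returns the names missing from the mirror, the names present only in
    the mirror, and the files that exist in both but differ.
    """
    comparison = filecmp.dircmp(production_dir, mirror_dir)
    return {
        "missing_from_mirror": sorted(comparison.left_only),
        "only_in_mirror": sorted(comparison.right_only),
        "differing": sorted(comparison.diff_files),
    }
```

Because the mirror can be up to 7 days old, some differences are
expected for files changed since the last job. Differences in files
that have not been touched for weeks are the ones worth reporting.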
Mirror Backup Discussion
With the mirror job, your backup is anywhere from zero to 7 days old.
As long as the most recent backup completed without error, you are
well protected against most system-side and environment-side
failures, since you can easily recover to the last backup, which is
no more than 7 days old. Of course, a fire or similar catastrophe that takes out
the machine with the production volume and backup volume will result
in a complete loss. The more you can geographically separate the two,
the better.
Mirror jobs are not very good at dealing with user-side errors,
mainly because those errors are often not recognized by users right
away. Let's say you accidentally delete something on Friday, the
mirror job runs on Sunday, and you realize your mistake on Monday.
Well, tough luck: the mirror now matches the volume without the file,
so no recovery is possible at that point.
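
The Friday/Sunday/Monday scenario can be modeled in a few lines. This
is a toy illustration only: volumes are represented as Python
dictionaries, the file names are hypothetical, and run_mirror stands in
for whatever tool the real mirror job uses.

```python
def run_mirror(production):
    """Return a new mirror that is an exact copy of production.

    Anything absent from production is therefore absent from the mirror.
    """
    return dict(production)


# Hypothetical file names and contents, for illustration only.
production = {"results.dat": "v1", "notes.txt": "v1"}
mirror = run_mirror(production)                  # previous Sunday's mirror job

del production["results.dat"]                    # Friday: accidental deletion
recoverable_saturday = "results.dat" in mirror   # last week's copy still exists

mirror = run_mirror(production)                  # Sunday: mirror job runs
recoverable_monday = "results.dat" in mirror     # Monday: mistake noticed

print(recoverable_saturday, recoverable_monday)  # prints: True False
```

The file is recoverable only in the window between the deletion and the
next mirror job, which is why mistakes need to be reported before the
following run.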
For questions or to arrange a mirror backup or to ask for a restore
from tape backup, contact the