Data Storage Space
Groups beginning new funded projects are advised to purchase their
own disk space for analysis and storage of data. You can get disk space
for data in one of two ways: buy space on the Martinos Center storage
cluster at $306/TB/year, OR buy workstations with large terabyte-sized
disks and have the IT group partition and export them.
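As a rough budgeting aid, the storage cluster rate quoted above can be
turned into a multi-year estimate. This is only a sketch: it assumes the
$306/TB/year rate stays flat and that space is bought in whole terabytes.
The function name storage_cost is made up for illustration.

```shell
#!/bin/sh
# Rate taken from the text above; update it if the published rate changes.
RATE_PER_TB_YEAR=306

# storage_cost TB YEARS -> prints the total cost in whole dollars.
# Assumes whole terabytes and a flat rate (hypothetical helper).
storage_cost() {
    echo $(( RATE_PER_TB_YEAR * $1 * $2 ))
}

storage_cost 10 3   # 10 TB for 3 years, prints 9180
```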
We highly recommend buying space on our storage cluster. This space
is on high-availability, high-performance equipment with a high-speed
connection to our compute cluster. It comes with a weekly mirror backup,
and group allocations can be sized to many tens of terabytes. To
purchase storage cluster space, fill out our
Cluster Storage Purchase form.
Alternatively, when you buy your workstation(s), you can order them
with plenty of internal disk space.
This can be a bit cheaper than the storage cluster but has several
drawbacks. You will need to supply your own backup (usually by buying
twice the disk you need and setting up weekly mirror jobs).
Lots of disk in your workstation will make your office much hotter
and noisier and may trip your power breakers. Performance will be
poor or even unusable for jobs you run on the compute cluster.
The storage will only be as stable as your workstation (nowhere near
the stability of the compute cluster).
All data storage spaces appear in Linux under the /space and /cluster
namespaces. Anything under /cluster is on the storage cluster. Some
volumes in /space have been moved to /cluster, and their /space paths
now just point there.
Checking Disk Free/Usage
The normal command in UNIX for checking how much space is free on a disk
is the 'df' command. This works fine for the /space volume tree where
we place all the workstation disks and some of the older storage servers.
However, for the new /cluster volumes one must determine how much
of the quota allocated is left by running the clusterquota
command. For example, to check the space left on the /cluster/itgroup
volume, one would run:
Temporary Scratch Areas
Limited areas of temporary disk space are available to all for data
analysis. Users must remove data from these disks when
analyses are complete. These spaces are cleaned weekly and are not
backed up by the Martinos Center.
Any old data left on the day-of-the-week spaces will be removed
early in the morning of the named day (that is, /space/monday is
cleaned on Monday morning, and so on).
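To see what you still have sitting in a scratch area before the cleanup
runs, something like the following may help. It is a minimal sketch:
list_my_files is a hypothetical helper name, and /space/monday is used
only because it is the example path given above.

```shell
#!/bin/sh
# list_my_files DIR -> prints regular files under DIR owned by the
# current user (hypothetical helper, not a Martinos Center tool).
list_my_files() {
    # id -u gives the numeric user ID, which find -user accepts.
    find "$1" -type f -user "$(id -u)" -print
}

# Example, using the scratch path named in the text above:
# list_my_files /space/monday
```

Review the list and remove finished data by hand; the helper only lists
files and never deletes anything.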
All other /space and /cluster filesystems are project specific and
can be used only with permission of the owner of that space.
Home directories are NOT to be used for image data. They are for
code, documents, summarized results, etc. This covers anything under
the /homes namespace.
Home directory disk space on central UNIX/Linux based systems is
limited to a 2GB quota for each user. To determine how much is currently
being used, run the quota command. To find out which files and
directories are using the most space, run the following command at a
UNIX/Linux command line from your home directory:
du -sk .??* * | sort -n
The resulting sizes are in kilobytes. The sort orders the results so
the biggest space hogs are listed last.
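The du command above uses the pattern .??* for hidden entries, which
skips dotfile names with only one character after the dot (such as .a).
A variant that covers every top-level entry and shows only the largest
few is sketched below. The helper name biggest_entries is hypothetical,
and the -mindepth/-maxdepth options assume GNU or BSD find.

```shell
#!/bin/sh
# biggest_entries [DIR] [N] -> prints the N largest top-level entries
# in DIR (default: current directory, 10 entries), sizes in kilobytes,
# largest last. Hypothetical helper for illustration.
biggest_entries() {
    dir=${1:-.}
    n=${2:-10}
    # Depth 1 only: every file and directory directly inside DIR,
    # including all hidden ones, but not DIR itself.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + \
        | sort -n | tail -n "$n"
}

# Example: run from your home directory
# biggest_entries . 10
```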
See the Backups page for information
on storage volume backups.
See the Understanding Group Permissions in UNIX
page for information on general file permissions in UNIX and how
to set up group areas for shared write access by the whole group.
See the Security page for information
about data file security issues.
It is important to note that NO NETWORK DISK SPACE in the Martinos
Center is safe for HIPAA-sensitive files.