Data Storage Space
Groups beginning new funded projects are advised to purchase their
own disk space for analysis and storage of data. You can get disk space
for data in one of two ways: buy space on the Partners Research Computing
ERIS cluster storage servers, OR buy workstations with large terabyte-sized
disks and have the IT group partition and export them.
We highly recommend buying space on the ERIS cluster. This space
is on high-availability, high-performance equipment with a high-speed
connection to our compute cluster. We can set up a weekly mirror backup to
cheaper, slower network disk servers. The total cost for the ERIS storage
and backup will run about $390/TB/year. Group allocations can be sized to
many tens of terabytes.
Contact us at firstname.lastname@example.org for more details.
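As a rough budgeting aid, the quoted rate can be turned into a quick
estimate. The sketch below assumes the rate stays at about $390/TB/year for
the whole period and that sizes are whole terabytes; the function name
estimate_cost is only illustrative and is not an existing tool.

```shell
# Rate quoted above: about $390 per terabyte per year (storage plus backup).
RATE_PER_TB_YEAR=390

# Print the estimated total cost in dollars for a given size and duration.
#   $1 = allocation size in whole terabytes
#   $2 = number of years
estimate_cost() {
    tb=$1
    years=$2
    echo $(( tb * RATE_PER_TB_YEAR * years ))
}

estimate_cost 20 3   # 20 TB for a 3-year project; prints 23400
```

Actual pricing may differ, so confirm the current rate before putting a
figure in a grant budget.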
The GPFS cluster storage we obtained through a 2009 Shared Instrument Grant
is now fully allocated and no longer available for new requests. If you need
more space on a current GPFS cluster volume, contact us about options.
Alternatively, when you buy your workstation(s), you can order them
with plenty of internal disk space.
This can be a bit cheaper than the storage cluster but has several
drawbacks. You will need to supply your own backup (usually by buying
twice the disk you need and setting up weekly mirror jobs).
Lots of disk in your workstation will make your office much hotter
and noisier and may trip your power breakers. Performance will be
horrible, or the disks even unusable, for jobs you run on the compute
cluster. The storage will be only as stable as your workstation (nowhere
near the stability of the compute cluster).
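For groups that take the workstation route, the weekly mirror job mentioned
above could look something like the following. This is a minimal sketch and
not the IT group's actual setup: it assumes rsync is installed, and the two
paths are hypothetical placeholders for your working disk and your backup
disk.

```shell
# Hypothetical mount points: the working disk and the second disk bought
# for backup. The trailing slash on SRC makes rsync copy the directory's
# contents rather than the directory itself.
SRC=/space/example/1/
DST=/space/example/2/

# Mirror one directory tree onto another.
#   -a        preserve permissions, ownership, timestamps, and symlinks
#   --delete  remove files from the copy that were deleted from the source,
#             so the backup is an exact mirror, not an archive of old files
# With DRY_RUN set to a non-empty value, the command is printed, not run.
mirror() {
    ${DRY_RUN:+echo} rsync -a --delete "$1" "$2"
}

DRY_RUN=1            # remove this line to perform the copy for real
mirror "$SRC" "$DST"

# Example crontab entry (03:00 every Sunday), assuming the script is saved
# at the hypothetical path /usr/local/bin/weekly-mirror.sh:
#   0 3 * * 0  /usr/local/bin/weekly-mirror.sh
```

Because --delete propagates deletions, a file removed by mistake disappears
from the mirror at the next run, so the copy protects against disk failure
more than against user error.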
All data storage spaces appear on Linux under the /space and /cluster
namespaces. Anything under /cluster is on the storage cluster. Some
volumes in /space have been moved to /cluster, and their old /space paths
now simply point there.
Checking Disk Free/Usage
The normal command in UNIX for checking how much space is free on a disk
is the 'df' command. This works fine for the /space volume tree where
we place all the workstation disks and some of the older storage servers.
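As an illustration of the df approach, the sketch below prints the free
space on whichever filesystem holds a given path. The helper name free_kb is
an assumption made for this example; only df and awk themselves are standard
tools.

```shell
# Print the kilobytes still available on the filesystem holding a path.
#   -P  portable output format, one line per filesystem
#   -k  report sizes in 1024-byte blocks
# The fourth column of the second line is the "Available" figure.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Demonstrated on the current directory; on a workstation you would pass
# the /space volume you are interested in instead.
free_kb .
```

For a quick interactive look, running df -h on the path gives the same
information in human-readable units.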
However, for the new /cluster volumes, one must instead determine how much
of the allocated quota is left by running the clusterquota
command. For example, to check the space left on the /cluster/itgroup
volume, one would run:
Temporary Scratch Areas
Limited areas of temporary disk space are available to all for data
analysis. Users must remove data from these disks when
analyses are complete. These spaces are cleaned weekly and are not
backed up by the Martinos Center.
Any old data left on the day-of-the-week spaces will be removed
early in the morning of the named day (that is, /space/monday is
cleaned on Monday morning, and so on).
Please run 'clusterquota scratch' to see how much space is available before you copy data to any of these partitions.
All other /space and /cluster filesystems are project specific and
can be used only with permission of the owner of that space.
Home directories are NOT to be used for image data. They are for
code, documents, summarized results, etc. This covers anything under
the /homes namespace.
Home directory disk space on central UNIX/Linux based systems is
limited to a 2GB quota for each user. To see how much you are currently
using, run the quota command. To find out which files and directories are
using the space in your home directory, run the following command at a
UNIX/Linux command line from your home directory:
du -sk .??* * | sort -n
The resulting numbers will be in kilobytes. The sort will order the
results so the biggest space hogs are last.
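One limitation of the glob in the command above is that .??* skips hidden
entries with a single-character name (such as .x), and du prints an error
if a pattern matches nothing. A possible variant that avoids both is
sketched below. The function name largest_entries is illustrative, and the
-mindepth/-maxdepth options assume GNU or BSD find, which is what Linux
systems normally provide.

```shell
# List the N largest entries (hidden ones included) directly under a
# directory, smallest first, with sizes in kilobytes.
#   $1 = directory to inspect (defaults to your home directory)
#   $2 = how many entries to show (defaults to 10)
largest_entries() {
    dir=${1:-$HOME}
    count=${2:-10}
    # -mindepth 1 -maxdepth 1 selects only the directory's immediate
    # children; du -sk then totals each one in kilobytes.
    find "$dir" -mindepth 1 -maxdepth 1 -exec du -sk {} + |
        sort -n |
        tail -n "$count"
}
```

Calling largest_entries with no arguments shows the ten biggest items in
your home directory; largest_entries "$HOME" 3 narrows that to the top
three.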
See the Backups page for information
on storage volume backups.
See the Understanding Group Permissions in UNIX
page for information on general file permissions in UNIX and how
to set up group areas for shared write access by the whole group.
See the Security page for information
about data file security issues.
It is important to note that NO NETWORK DISK SPACE in the Martinos
Center is safe for HIPAA-sensitive files.