Disk Space

Data Storage Space

Groups beginning new funded projects are advised to purchase their own disk space for analysis and storage of data. You can get disk space for data in one of two ways: buy space on the Martinos Center storage cluster at $306/TB/year, OR buy workstations with large terabyte-sized disks and have the IT group partition and export them.

We highly recommend buying space on our storage cluster. This space is on high-availability, high-performance equipment with a high-speed connection to our compute cluster. It comes with a weekly mirror backup, and group allocations can be sized to many tens of terabytes. To purchase storage cluster space, fill out our Cluster Storage Purchase form.

Alternatively, when you buy your workstation(s), you can make sure to buy plenty of internal disk space. This can be a bit cheaper than the storage cluster but has several drawbacks. You will need to supply your own backup (usually by buying twice the disk you need and setting up weekly mirror jobs). Lots of disk in your workstation will make your office much hotter and noisier and may trip your power breakers. Performance will be horrible or even unusable when the disk is accessed from jobs you run on the compute cluster. The storage will only be as stable as your workstation (nowhere near the stability of the compute cluster).
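As a rough illustration of the weekly mirror job mentioned above, here is a minimal sketch using rsync. The function name mirror_dir and all paths are placeholders, and it assumes rsync is installed and that the second disk is already mounted; it is one possible approach, not the setup the IT group uses.

```shell
# Hypothetical helper: mirror one directory tree onto a second disk.
# -a preserves permissions, timestamps and symlinks; --delete removes
# files from the mirror that no longer exist in the source.
mirror_dir() {
    src=$1
    dst=$2
    rsync -a --delete "$src"/ "$dst"/
}

# Example crontab entry (placeholder paths) to mirror every Sunday at 02:00:
# 0 2 * * 0  rsync -a --delete /path/to/data/ /path/to/mirror/
```

Because --delete also removes files from the mirror, a file deleted by mistake is gone from the copy after the next run. A mirror of this kind protects against disk failure, not against accidental deletion.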

On Linux, all data storage spaces appear under the /space and /cluster namespaces. Anything under /cluster is on the storage cluster. Some things in /space have been moved to /cluster and just point there now.

Checking Disk Free/Usage

The normal UNIX command for checking how much space is free on a disk is 'df'. This works fine for the /space volume tree, where we place all the workstation disks and some of the older storage servers. Example:

df /space/sake/5/users
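If you need the free space as a plain number, for example to check it in a script before copying data, a small wrapper around df can help. The sketch below is an illustration only; the function name free_kb is made up for this example.

```shell
# Hypothetical helper: print the available space, in kilobytes, on the
# filesystem that holds the given path.
# -P forces the one-line-per-filesystem POSIX layout; -k reports 1024-byte blocks.
free_kb() {
    df -Pk "$1" | awk 'NR == 2 { print $4 }'
}

# Usage (path taken from the df example):
# free_kb /space/sake/5/users
```

For interactive use, 'df -h' prints the same information in human-readable units.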

However, for the new /cluster volumes one must determine how much of the quota allocated is left by running the clusterquota command. For example, to check the space left on the /cluster/itgroup volume, one would run:

clusterquota itgroup

Temporary Scratch Areas

Limited areas of temporary disk space are available to all for data analysis. Users must remove data from these disks when analyses are complete. These spaces are cleaned weekly and are not backed up by the Martinos Center.

/cluster/scratch/monday
/cluster/scratch/tuesday
/cluster/scratch/wednesday
/cluster/scratch/thursday
/cluster/scratch/friday

Any old data left on the day-of-the-week spaces will be removed early in the morning of the named day (that is, /cluster/scratch/monday is cleaned on Monday morning, etc.).
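To see what you still have in a scratch area before it is cleaned, you can list the files you own there. The following is a minimal sketch; the function name my_files_in is hypothetical.

```shell
# Hypothetical helper: list regular files owned by the current user
# under the given directory. Errors from unreadable subdirectories
# are discarded.
my_files_in() {
    find "$1" -type f -user "$(id -un)" 2>/dev/null
}

# Usage:
# my_files_in /cluster/scratch/monday
```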

Please run 'clusterquota scratch' to see how much space is available before you copy data to any of these partitions.

All other /space and /cluster filesystems are project specific and can be used only with permission of the owner of that space.

Home Directories

Home directories are NOT to be used for image data. They are for code, documents, summarized results, etc. This covers anything under the /homes namespace.

Home directory disk space on central UNIX/Linux-based systems is limited to a 2GB quota for each user. To determine how much is currently being used, run the quota command. To figure out which files and directories are using all the space in your home directory, type the following command at a UNIX/Linux command line from your home directory:

du -sk .??* * | sort -n

The resulting numbers will be in kilobytes. The sort will order the results so the biggest space hogs are last.
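The '.??*' pattern in the command above skips hidden entries whose names have only one character after the dot. A variant based on find picks those up as well and can show only the largest entries. This is a sketch, assuming a find that supports -mindepth and -maxdepth (as GNU find on Linux does); the function name top_usage is made up for this example.

```shell
# Hypothetical helper: show the N largest entries (default 10) directly
# inside a directory, sizes in kilobytes, largest last.
top_usage() {
    find "$1" -mindepth 1 -maxdepth 1 -exec du -sk {} + | sort -n | tail -n "${2:-10}"
}

# Usage:
# top_usage "$HOME" 5
```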

Other

See the Backups page for information on storage volume backups.

See the Understanding Group Permissions in UNIX page for information on general file permissions in UNIX and how to set up group areas for shared write access by the whole group.

See the Security page for information about data file security issues. It is important to note that NO NETWORK DISK SPACE in the Martinos Center is safe for HIPAA-sensitive files.
