Docker/Singularity at the Martinos Center
Due to security concerns with Docker, we do not support running Docker in
full access mode on our Linux workstations or the compute cluster.  In
limited cases, we do support Docker in isolation (namespace remap) mode.
This mode lets you build Docker containers, but the isolation restriction
prevents proper binding of local storage into the container when running
them.  This means the programs in the container cannot operate on your
data outside the container.
 
In the HPC community a popular alternative is Singularity.  It lets you
run most Docker containers used for data-analysis workflows.  It also lets
you access data on storage outside the container with the bind mechanism,
giving you the same access you have normally.  The big issue with
Singularity, however, is that in general it requires root access to build
new Singularity images from scratch.  There are a couple of workarounds to
that, described below.
 
Setting up your environment for Docker/Singularity
Docker is not installed on the center's Linux workstations by default.
  If you want it installed, you need to request it from the Martinos Help
  Desk; you can check whether it is installed by looking for the
  /var/run/docker.sock file.  Singularity, being a simple user-space
  program, is installed everywhere.
 
  A very important issue is that Docker and Singularity can end up
    writing many GBs to areas of your home directory, which will overflow
    your quota.  To prevent this you must symlink the directories they use
    to point to other storage volumes owned by you or your group.

    The two places that need symlinking are ~/.singularity and, for
    Docker/Podman, ~/.local/share/containers.  Here is an example:
   
  cd /cluster/mygroup/users/raines  #<- change this to your group area
  mkdir singularity
  mkdir singularity/tmp
  mkdir singularity/cache
  rm -rf ~/.singularity
  ln -s $PWD/singularity ~/.singularity
  mkdir docker
  rm -rf ~/.local/share/containers
  ln -s $PWD/docker ~/.local/share/containers 
   (do this, or Singularity/Docker will fill up your home directory or the workstation OS disk at /tmp)
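  A quick way to confirm the symlinks point where you expect:

  ls -ld ~/.singularity ~/.local/share/containers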
  
 
NOTE: In some cases SINGULARITY_TMPDIR and SINGULARITY_CACHEDIR must be
  on local disk space instead of network space.  In those cases you should
  temporarily set them in your shell to space under /scratch, if that
  exists on the machine.  On the MLSC cluster this is done automatically.
  You need to do this for builds (see below).
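  For example, on a machine with a local /scratch disk you could set them
  like this before building (the per-user subdirectory is just a
  convention; adjust to your setup):

  mkdir -p /scratch/$USER/{tmp,cache}
  export SINGULARITY_TMPDIR=/scratch/$USER/tmp
  export SINGULARITY_CACHEDIR=/scratch/$USER/cache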
Running pre-built images from Docker Hub
  
  Here is an example using dhcp-structural-pipeline. 
 
  cd /cluster/mygroup/users/raines
  mkdir dhcp-structural-pipeline
  cd dhcp-structural-pipeline
  singularity pull dhcpSP.sif docker://biomedia/dhcp-structural-pipeline:latest
   (this will take a long time)
  mkdir data
   (copy your T1 and T2 data into new data subdir)
  singularity run -e -B $PWD/data:/data dhcpSP.sif \
    subject1 session1 44 -T1 /data/T1.nii.gz -T2 /data/T2.nii.gz -t 8 -d /data 
 
The -e option sets a clean environment before running the container.
  For some software you don't want to do this, so that you can pass in
  settings via environment variables.  If you do not use -e, then you
  should remove certain variables that can break the container.

  LD_LIBRARY_PATH is one example that will really screw things up.
  Variables with XDG and DBUS in their names can also cause problems.
  In dhcp-structural-pipeline, for example, if you have FSLDIR set to
  something under /usr/pubsw or /space/freesurfer it will fail, since
  those paths will not be found in the container.  Be aware and be careful.
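  If you skip -e, a minimal sketch of scrubbing the risky variables from
  your current shell first (the exact list is just a suggestion) is:

  # remove a variable known to confuse containers
  unset LD_LIBRARY_PATH
  # drop any XDG_* and DBUS_* session variables
  for v in $(env | grep -E '^(XDG|DBUS)' | cut -d= -f1); do unset $v; done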
 
Check out using -e combined with the --env-file option for more
  consistent control of your shell environment when using containers.
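  For instance, you could keep just the settings the container needs in a
  file and pass it in along with -e (the variable shown is only
  illustrative):

  cat myenv.txt
  ----------------------------------------------------------
  | OMP_NUM_THREADS=8
  ----------------------------------------------------------
  singularity run -e --env-file myenv.txt -B $PWD/data:/data dhcpSP.sif \
    subject1 session1 44 -T1 /data/T1.nii.gz -T2 /data/T2.nii.gz -t 8 -d /data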
 
  Also note you do not need to pull the image every time you run it.
  You only pull it again to get new versions.
 
To make the file path environment inside the container very much like
  it is outside the container when running things normally on Martinos
  Linux machines, you can add the following options: 
  -B /autofs -B /cluster -B /space -B /homes -B /vast -B /usr/pubsw -B /usr/local/freesurfer
 
This would let you source the FreeSurfer environment as normal inside
  the container.  For that, though, the container would need to be one
  with a non-minimal OS install that has all the system libraries
  FreeSurfer requires.
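  As a rough sketch (the image name and the FreeSurfer version directory
  are placeholders; use whatever install you normally source):

  singularity exec -B /autofs -B /cluster -B /space -B /homes -B /vast \
    -B /usr/pubsw -B /usr/local/freesurfer myimage.sif \
    bash -c 'export FREESURFER_HOME=/usr/local/freesurfer/stable && source $FREESURFER_HOME/SetUpFreeSurfer.sh && echo $SUBJECTS_DIR'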
 
WARNING: The NVIDIA NGC containers do a 'find -L /usr ...' in the
entrypoint script on startup.  So doing -B /usr/pubsw ends up making
startup take over 15 minutes, as it then searches the hundreds of GBs of
files in /usr/pubsw!  This 'find' is pretty useless, so there are two
solutions:

  - Just do not use -B /usr/pubsw if you don't need that path in what you
    are running.

  - Add -B /cluster/batch/IMAGES/nvidia_entrypoint.sh:/usr/local/bin/nvidia_entrypoint.sh
    to your Singularity command line to overwrite the entrypoint script
    with a copy I made that removes the 'find'.
  
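  Put together, a command using the second workaround might look something
  like this (the image name is just a placeholder):

  singularity run --nv \
    -B /cluster/batch/IMAGES/nvidia_entrypoint.sh:/usr/local/bin/nvidia_entrypoint.sh \
    -B /cluster/mygroup/data:/data my_ngc_image.sif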
 
 
  We have also discovered that most Docker images built for NVIDIA GPU
  use in AI/ML try to do some fancy stuff in their entrypoint script on
  startup to put the "correct" CUDA libs in a directory named
  /usr/local/cuda/compat/lib.  You will get errors about changing this in
  Singularity, since the container's internal filesystem is unwritable.
  Your CUDA programs in the container might also fail if the CUDA libs in
  that directory are used.
 
To fix this, add the following option to singularity:

  -B /opt:/usr/local/cuda/compat

  This basically just nullifies that directory so it is not used and has
  no libraries.  Singularity automatically adds to the LD_LIBRARY_PATH
  defined in the container a directory with the correct CUDA libs matching
  the driver running on the host.
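  For example, combined with the TensorFlow image mentioned below (a quick
  nvidia-smi is enough to sanity-check that the GPU is visible):

  singularity exec --nv -B /opt:/usr/local/cuda/compat \
    /cluster/batch/IMAGES/tensorflow-20.12-tf2-py3.sif nvidia-smi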
  
  For more info on Singularity options run man singularity-run or
  man singularity-exec, or read the User Guide.

  The difference between run and exec is that run will run the default
  entrypoint startup script built into the container, while exec will just
  run the command you give on the command line instead and skip the
  startup configuration.
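  For example (myimage.sif is a placeholder):

  singularity run myimage.sif               # runs the container's built-in startup script
  singularity exec myimage.sif python3 -V   # skips it and runs only the given command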
 
Building your own Singularity images
Typically, building Singularity images locally requires full root access
  via sudo, which we will not give on our Linux workstations.  There are
  two workarounds to this.  The simplest is that the organization that
  makes Singularity now has a remote build option, which works well and
  which you can register for:
   
  create a Singularity definition file (a sketch is shown below)
  singularity remote login
  singularity build --remote myimage.sif Singularity
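  A minimal sketch of what that definition file (conventionally just named
  Singularity) might contain; the base image and packages here are only
  examples:

  Bootstrap: docker
  From: ubuntu:22.04

  %post
      apt-get update && apt-get install -qqy python3 python3-pip
      python3 -m pip install nibabel

  %runscript
      python3 "$@"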
 
If there is anything sensitive in your build, though, you should not use
  this.  Instead you should pull a SIF image of the base image you want to
  start with and then modify it as shown in the fakeroot/writable example
  in the next section below.
Modifying existing Singularity images
First you should check whether you really need to modify the image.
  For example, if you are using Python in an image and simply need to add
  new packages via pip, you can do that without modifying the image by
  using a PYTHONUSERBASE directory that you bind mount into the container.
  For example:
  cd /cluster/itgroup/raines
  mkdir -p local/lib
  vi vars.txt #create it with your favorite editor (emacs, pico)
  cat vars.txt
  ----------------------------------------------------------
  | PYTHONUSERBASE=/cluster/itgroup/raines
  | PYTHONPATH=$PYTHONUSERBASE/lib/python3.7/site-packages
  | PATH=$PYTHONUSERBASE/bin:$PATH
  ----------------------------------------------------------
  singularity exec --nv --env-file vars.txt \
    -B /cluster/itgroup/raines -B /scratch:/scratch \
    -B /autofs -B /cluster -B /space -B /vast \
    /cluster/batch/IMAGES/tensorflow-20.12-tf2-py3.sif \
    pip3 install --user nibabel
  singularity exec --nv --env-file vars.txt \
    -B /cluster/itgroup/raines -B /scratch:/scratch \
    -B /autofs -B /cluster -B /space -B /vast \
    /cluster/batch/IMAGES/tensorflow-20.12-tf2-py3.sif \
    python3 /cluster/itgroup/raines/script_needing_nibabel_and_TF.py
   
To modify an existing SIF image container file, you first need to convert
  it to a sandbox, run a shell inside the sandbox in fakeroot/writable
  mode, and do the steps in that shell to modify the container as desired.
  Then you exit the container and convert the sandbox back to a SIF file.
 
For this to work you will have to email help@nmr.mgh.harvard.edu to
  request to be added to the /etc/subuid file on the machine you will use
  for builds, to turn on user namespace mapping.  That machine also needs
  to have a large /scratch volume (sandboxes do not work on
  network-mounted volumes).  You then do something like this example:
  mkdir -p /scratch/$USER/{tmp,cache}
  cd /scratch/$USER
  export SINGULARITY_TMPDIR=/scratch/$USER/tmp
  export SINGULARITY_CACHEDIR=/scratch/$USER/cache
  singularity build --sandbox --fakeroot myTF \
    /cluster/batch/IMAGES/tensorflow-20.11-tf2-py3.sif
  singularity shell --fakeroot --writable --net myTF
  > apt-get update
  > apt-get install -qqy python3-tk
  > python3 -m pip install matplotlib
  > exit
  singularity build --fakeroot /cluster/mygroup/users/$USER/myTF.sif myTF
 
NOTE: you can do a rm -rf /scratch/$USER afterward, but there will be a
  few files you cannot delete due to the namespace mapping that happens.
  The daily /scratch cleaner job will eventually clean them up.
 
Building your own Docker image and running with Singularity
  There are plenty of tutorials on building Docker images online.  You
  should go read one of them to get started (here is the official one).
  The main things you need to keep in mind are to tag each build of your
  image with a unique version tag, and that you DO NOT need to push/upload
  the image to any hub.  The image you build is not, however, a single
  file.  It is a special overlay that ends up under /var/lib/docker but
  that you never touch directly.  All interaction is via the docker
  subcommands.
 
    docker build --tag proj_A_recon:v1 .
    docker image ls
 
 
  Note that not all directives in a Dockerfile will convert to
  Singularity, so some should be avoided.  More info can be found here.
  Basically, only the FROM, COPY, ENV, RUN, CMD, HEALTHCHECK, WORKDIR and
  LABEL directives are supported.  Directives that affect the eventual
  runtime of the container, like VOLUME, will not translate.
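  A minimal sketch of a Dockerfile that sticks to those directives (the
  base image, packages, and script name are purely illustrative):

  FROM ubuntu:22.04
  LABEL maintainer="you@example.org"
  ENV LANG=C.UTF-8
  RUN apt-get update && apt-get install -qqy python3 python3-pip && \
      python3 -m pip install nibabel
  COPY recon.py /opt/recon.py
  WORKDIR /opt
  CMD ["python3", "/opt/recon.py"]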
 
  You can also do a "docker run -it --rm proj_A_recon:v1 bash" to shell
  into your container to verify and test things internally.
 
The next step is to convert the image to a Singularity SIF file.  This
  will be a single file created in the directory you run the command in.
 
  singularity build proj_A_recon.sif docker-daemon://proj_A_recon:v1
 
And once that is done you can run it.
 
  singularity run -B /cluster/mygroup/data:/data proj_A_recon.sif
 
or 
  singularity exec -B /cluster/mygroup/data:/data proj_A_recon.sif /bin/bash
 