P41-Project 2 Improved Statistical Analysis of fMRI Data

Chris Long, Emery Brown, Vic Solo

The Wavelet Regularised SpatioTemporal toolbox (WRST) comprises a set of matlab-based programs that implement some of our recently developed analysis methods for single subject fMRI data. The WRST toolbox fits a physiological model for the BOLD signal to the motion corrected fMRI data under appropriate noise assumptions, and estimates these noise processes using an EM algorithm. WRST  merges the resulting voxelwise estimates into a spatial estimation procedure that uses a modified wavelet thresholding to preserve spatial heterogeneity in the activation maps while providing the necessary regularisation. Thus the popular approach of spatially presmoothing the fMRI data with a fixed-width Gaussian kernel is not required, since all spatial operations are treated within a full spatiotemporal estimation. The wavelet estimation is nonstandard in this regard since unlike typical spatial wavelet techniques, it is modulated by the temporal characteristics of the data, providing better spatial adaptivity to the functional clusters.


The user may if desired, compare WRST performance with either standard presmoothing, or a related strategy that replaces the wavelet step with Gaussian smoothing. I.e. the signal parameter estimates are smoothed directly as opposed to smoothing the raw data  (presmoothing). This latter choice allows one to directly determine the benefits of adding the wavelet smoothing in the overall procedure. For completeness, the program also includes an option that allows replacement of the modified wavelet procedure, distinct to WRST, with a standard spatial wavelet smoother. This enables the user to contrast between a spatial wavelet + temporal approach and the full spatiotemporal wavelet analysis.


Please note: This program requires matlab 6.1 to run correctly. Later versions of matlab are not yet supported,
although may run if the appropriate matlab C-libraries are present.
  1. Download the compressed tar file.
  2. Uncompress using <gzip -d code.tar.Z> or <uncompress code.tar.Z>
  3. Extract the archive. <tar xvf code.tar>. This will create a subdirectory 'wrst'  in the current directory, wherein 'wrst/toolbox' contains all the scripts. Either point to the 'toolbox' directory in your matlab path, or cd wrst/toolbox to get started quickly.
  4. To point to the scripts in wrst, locate your matlab startup.m file. This is normally located in <home directory>/matlab
Add the commands 'wrsttoolbox = '<your path>/wrst/toolbox'; '


And 'path(path, wrsttoolbox);'
Start matlab and check you can see the directory by typing 'which bd_v4'.
 

Recompile the C code for your platform. In matlab, go to <your path>/wrst/toolbox, and type

     'mex Get_fs_D_SpatWavs_ARMA.c '
     'mex get_resids_mex.c'
     'mex emblock2d_mex_noreg.c'
     'mex get_slope_mex.c'
     'mex emblock2d_mex_v2.c'

If using matlab v6.5, try recompliling the above with the -V5 option.
i.e.
     'mex -V5 Get_fs_D_SpatWavs_ARMA.c '
etc.
Thats all.
To run through an example dataset, go to the Data Analysis section below.


 
  • Setting up the configuration file
Below is an example of a typical configuration file for the WRST program. A full description follows.
Part 1

-TR 2

-FlowTimeConstant 2

-VolumeTimeConstant 11

-DelayRange(Samples) 4

-DelayResolution(Samples) 1

-Slice(s) 9

-TrendRemoval 3

-Task E

-SpatialOption P % G = gaussian smooth, W =stdwavelet, VW= WRST, N=none, P=Presmooth

-FWHM 8 % Filter width choice for pre- or Gaussian smoothing spatial option

-SpikeRemoval 0

-h .05 % Noise regularisation parameter

-K 4 % low frequency cut-off for wavelet transform K<J. J=log2(n)

-lambda .5 % amount by which to scale universal thresholding i.e. lambda*sqrt(2*log(n))

    % rule of thumb for soft thresholding (.5), hard thresholding (1).

Part 2

%Input volume paths go here

-stem /autofs/space/inisheer_001/users/cjl/fbert/bold/007/fmc

-stem /autofs/space/inisheer_001/users/cjl/fbert/bold/008/fmc

-stem /autofs/space/inisheer_001/users/cjl/fbert/bold/009/fmc

% paradigm files go here

-designfile /autofs/space/inisheer_001/users/cjl/fbert/bold/007/sem_assoc.par

-designfile /autofs/space/inisheer_001/users/cjl/fbert/bold/008/sem_assoc.par

-designfile /autofs/space/inisheer_001/users/cjl/fbert/bold/009/sem_assoc.par

%output dirs for estimation parameters go here

-EstimationOutput /autofs/space/inisheer_001/users/cjl/fbert/maincjl2/est-007

-EstimationOutput /autofs/space/inisheer_001/users/cjl/fbert/maincjl2/est-008

-EstimationOutput /autofs/space/inisheer_001/users/cjl/fbert/maincjl2/est-009

%output dirs for Inference go here

-InferenceOutput /autofs/space/inisheer_001/users/cjl/fbert/maincjl2/inf-tot

%contrast file to be used for inference

-ContrastFile /autofs/space/inisheer_001/users/cjl/fbert/maincjl/omnibus.mat

The configuration file provides the program with the location of the functional data and associated paradigm files, and sets up the options for that analysis.  Let us look at Part 2 first. In the example above, the data were acquired from one subject, with three runs on that subject, numbered 007, 008, 009 in this case. The fmc prefix indicates that this data have been motion corrected. For this software, this is the only preprocessing requirement.
Each run requires its own paradigm file and these are listed below the input volumes. Also, you should create a subdirectory (in this
case 'maincjl2'). into which the estimated parameters for each slice and run will be stored. Finally, the inference directory will contain the Pvalues for each slice,  averaged across runs. The Contrast file option is unused at the moment. Currently only crass (or overall) significances across the conditions can be observed, though it is a simple step to implement between-condition contrasts.

Now onto Part 1. To fully understand the relative significance of these parameters, it may be best to refer to some of our publications. However, here we will provide a brief insight into their interpretation wrt to fMRI data analysis.

Time Constants
Our strategy is based on a convolution model, with two components, a blood flow and a blood volume piece to represent cerebral, BOLD-related activity. Naturally, each of these has an associated time-constant, or dispersion, that govern the response characteristics of the cerbrovasculature.  To first order, the values in place  (1.5 or 2 seconds for the flow and a slower, 11 or 12 seconds for the volume) work well for most applications.

Delay
One of the unique features of this package lies in its ability to automatically capture delay on a voxel by voxel basis. As long as slice timings are properly accounted for, this allows the user to investigate the spatial behaviour of the hemodynamic latency term.
For running the program, we constrain the search space for the delay - i.e. we expect the latency to occur anywhere between 4 and 8 seconds. For a repetition time (TR) of 2s, we would expect the delay to occur anytime up to 4 TRs, or samples.
The delay resolution defines the fineness of the grid across which we search. Since all our temporal computations are done in the frequency domain, this resolution can be arbitrarily fine. 

Slice(s)
One slice (give slice number starting from zero), or all slices ('ALL') may be analysed. 

Trend Removal
Specifies the order of low-frequency trend to be removed from each timecourse. 1 is linear, anything higher specifies the maximum number of harmonics (cosines) used to regress out trend (similar to SPM). 

Task
Unused at the moment - always choose E for Estimate - Inference is currently performed simultaneously. 

Spatial Option
At the moment, the program has several spatial options. These can be divided into those based on Gaussian smoothing, and those based on spatial wavelets. The main option 'VW' defines our latest work (see paper) on spatial conditioning of the activation maps, the others options are there mainly for the purposes of comparison. The basic idea of using wavelets is to preserve anatomic information in the activations, and also provide a multiresolutional spatial analysis of these maps. But also, these wavelets
are applied in the context of maximising a spatiotemporal likelihood, eg not applied as an adhoc preprocessing step. The other choices (including W for standard wavelet estimation) and apart from 'P' which does a conventional Gaussian presmoothing, are invoked at a similar point to 'VW' in the computations but while intuitively sensible, are adhoc in the sense that they do not minimise a proper cost function.
For example, very simply, the program (1) computes the noise, (2) uses weighted least squares to get maps of raw regression coefficients, and (3) treats these spatially. (4) Repeat the process. The spatial treatment can be 'none'  'N', 'G' which smooths the  maps using a Gaussian kernel of a particular FWHM (specified below), or 'W' which is a standard wavelet smoothing. However, 'VW' draws on temporal aspects of the data to inform how much spatial smoothing to apply. In this respect, 'VW' is a 'true' spatiotemporal procedure. 

FWHM
Full Width Half Max (mm) of the Gaussian kernel used either to directly smooth the activation coefficients ('G') or to presmooth the data ('P'). 

h
One of the key features of our method relates to the way in which the noise is computed. We assume an ARMA(1,1) temporal noise process, and estimate its parameters pixel by pixel using Expectation-Maximisation. However, we also recognise spatial continuity in these parameters which improves their estimation. This amounts to solving a locally weighted likelihood (see paper). h then is the amount of regularisation or smoothing that is applied to the noise field and its choice is still something of an empirical issue. We find that quite large values for h (between .04 - .1) improve the quality of the activation maps quite significantly. 

K
This parameter is only used for the spatial wavelet methods, 'W' & 'VW' and defines the number of resolutions at which the data is analysed. For example K=4, means that blocks of 2^(4-1) i.e. 8-pixels will be analysed in wavelet space.  The smaller the size of these blocks in wavelet space, the coarser (or larger) the features that can be analysed. E.g. a block of 2*2 pixels would correspond to very low frequencies in the image, eg. those features that extend across half the field of view of the image. For the current application we would not expect features much larger than an eighth of the total size of the image, and this motivates our choice of K=4.

Lambda
This value weights the universal thresholding criterion. The program implements soft thresholding which as a rule of thumb must be about half that of the equivalent hard thresholding rule to maintain comparable SNR yet visually superior maps. (see 'A Wavelet Tour of Signal Processing' , edition 2, by S. Mallat p458-) 



Data Analysis

To demonstrate some of the programs capabilities, we will work through an fMRI dataset that records a memory task, the Sternberg paradigm, which in this case is an event-related, modified version of the original task (see paper). This dataset was kindly provided by Dr. Manoach from the Department of Psychiatry here at MGH.
After unpacking the code, 

  1. Go to wrst/toolbox and open the configuration file 'example.cfg' using a text editor.
  2. Edit the path names <your path>/wrst/data/2000926SH/bold/ etc to reflect the path on your machine.
  3. Make sure you create a directory for the program outputs, eg <your path>/wrst/data/2000926SH/bold/mainvarwavtest/ .In this case ensure mainvarwavtest exists. The parameters for each run will be placed in subdirectories here. Start up matlab. Ideally you already have <your path>/wrst/toolbox placed in your matlab startup file, see Installation.Point to the configuration file. At the matlab prompt >> type 'cfgfile=['example.cfg'];'
  4. Run the script '>>bd_v4'

  5.  
If any problems occur, first check you have the correct location of the data entered in the configuration file.


Also check you are running matlab version 6.1. Earlier versions of matlab may also work, but have not yet been tested. 

The program shows the updated Fstatistic computations from the program - these are accumulated across conditions, (three in this case) and across runs. For slice 6, you should see something like: 

for the F-statistic (left), and the p-values (right), thresholded at p=.01. Note this data has been masked, mainly for computational speed.
To do this, look at the variables 'Fstat' and 'Pvals' in the matlab workspace; EG. the commands used to draw these figures were:
'figure;imagesc(Fstat);colormap spectral;colorbar'
'figure;imagesc((Pvals>2).*Pvals);colormap spectral;colorbar'
However it is also a good idea to mask from the point of view of estimating the noise. Since the noise is smoothed or regularised, masking prevents averaging brain noise with the background noise. The same holds true when regularising or smoothing the signal parameters.
To run unmasked, remove the mask_* files from the directories containing the BOLD data. 

These maps can be overlaid onto corresponding structural scans, using a program such as yakview, which is packaged with the MGH freesurfer distribution. When this is done, we see that most of the active regions are in the Doroslateral Prefrontal Cortex (DLPFC) with some sensory motor area activation.
Naturally, all slices in the study may be processed in this way, by setting the slices option to 'ALL' in the configuration file. Also. the effect of the different spatial processing strategies may be investigated. In the above example we used 'VW' which is the variable wavelet thresholder.
Try running this slice again using 'G' for Gaussian smoothing on the parameters or 'W' for standard wavelet estimation. 

It may also be of interest to inspect the noise and/or delay maps for a given run and slice. To do this, just define one run in the configuration file - you will need to comment out (using '%') the estimation and designfile directories corresponding to the runs that are taken out. 

You can inspect the three noise maps psigma_ar, psigma_whi. and p_ar. These correspond respectively to the parameters of our ARMA(1,1) noise model eg a white noise variance plus correlated physiological noise. The correlated noise consists of an independent (to the white noise) driving noise, and the correlation coefficient. For example, in slice 2 of the Sternberg dataset, you should be able to see the following (unmasked):

As expected the noise is most strongly correlated in the grey matter, indicating the presence of low frequency effects due to breathing, or cardiac pulsations for example. The correlated noise variance shows structure in areas we expect to be associated with experimental effect, perhaps indicating shortfalls in the signal model. 


last updated: Jan 2004