Software

IMAGE PROCESSING

In our experiment, we used FreeSurfer (surfer.nmr.mgh.harvard.edu; distribution version 5.1), a freely available, widely used, and extensively validated brain MRI analysis software package, to process the structural brain MRI scans and compute morphological measurements.

The FreeSurfer pipeline is fully automatic. It includes steps to compute a representation of the cortical surface between white and gray matter, a representation of the pial surface (Dale et al., 1999; Fischl et al., 1999a), and a segmentation of white matter regions. It also performs skull stripping, B1 bias field correction, nonlinear registration of an individual's cortical surface to a stereotaxic atlas (Fischl et al., 1999b), labeling of regions of the cortical surface (Fischl et al., 2004), and labeling of sub-cortical brain structures (Fischl et al., 2002). Furthermore, for each MRI scan, FreeSurfer automatically computes: subject-specific thickness measurements across the entire cortical mantle and within anatomically defined cortical regions of interest (ROIs); volume estimates of a wide range of sub-cortical structures; estimates of the intra-cranial volume (ICV); and measures of image quality, such as the white-matter signal-to-noise ratio (WM-SNR), which is computed based on the noise level (standard deviation of intensities) within the white matter.
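As a rough illustration of the WM-SNR measure described above, the following sketch assumes the common definition of SNR as the mean white-matter intensity divided by the standard deviation of white-matter intensities. The function name, the flat-list inputs, and the use of the population standard deviation are assumptions made for this example; FreeSurfer's own computation may differ in its details (for instance, in how the white-matter mask is derived).

```python
import statistics


def wm_snr(intensities, wm_mask):
    """Return mean / standard deviation of the intensities inside the mask.

    `intensities` and `wm_mask` are flat sequences of equal length. Both are
    hypothetical inputs for illustration, not FreeSurfer data structures.
    """
    if len(intensities) != len(wm_mask):
        raise ValueError("intensities and wm_mask must have the same length")
    # Keep only the voxels labeled as white matter.
    wm_values = [v for v, inside in zip(intensities, wm_mask) if inside]
    if len(wm_values) < 2:
        raise ValueError("need at least two white-matter voxels")
    # Noise level: standard deviation of white-matter intensities
    # (population form; an assumption of this sketch).
    noise = statistics.pstdev(wm_values)
    if noise == 0:
        raise ValueError("zero noise level; SNR is undefined")
    return statistics.mean(wm_values) / noise


# Toy example: four white-matter voxels and two voxels outside the mask.
voxels = [100.0, 110.0, 90.0, 100.0, 20.0, 300.0]
mask = [True, True, True, True, False, False]
print(round(wm_snr(voxels, mask), 3))
```

A higher value indicates lower noise relative to the white-matter signal, which is why the measure can serve as a simple image quality indicator.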

To download and install this software, please visit the FreeSurfer website (surfer.nmr.mgh.harvard.edu).

An alternative software package for processing structural brain MRI scans is SPM (Statistical Parametric Mapping), which is available from the SPM website.

MACHINE LEARNING ALGORITHMS

We employed publicly available implementations of the following three classes of multivariate pattern analysis (MVPA) algorithms.

1) The Support Vector Machine (SVM) is one of the most popular generic machine learning methods. In our experiments, we used the publicly available implementation LibSVM (csie.ntu.edu.tw/~cjlin/libsvm). For the classification problems, we used a radial basis function (RBF) kernel, whose parameters were optimized in a cross-validation loop over the training dataset (using the “grid.py” tool available on the LibSVM website). We preferred the RBF kernel because it yielded higher accuracies in the cross-validation loops than the other kernels. We trained the SVM model to output probability estimates. These estimates were used directly for the ROC analysis and were thresholded at p=0.5 to compute the correct classification ratio. For the regression problem, we used a linear kernel and optimized its parameters as in the classification case. Here, the linear kernel yielded higher accuracy than the other kernels in the cross-validation loops.
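The evaluation steps described for the SVM can be sketched in plain Python as follows. This is a minimal illustration, not the LibSVM code: the function names are hypothetical, the default grid ranges are assumed to mirror the default log2 search ranges of grid.py (the ranges actually used are not stated here), and treating p exactly equal to 0.5 as the positive class is also an assumption.

```python
def parameter_grid(log2c=(-5, 15, 2), log2g=(3, -15, -2)):
    """Enumerate (C, gamma) pairs on a log2 grid, end points included.

    The default (begin, end, step) triples are assumed to match grid.py.
    """
    def steps(begin, end, step):
        values = []
        value = begin
        while (step > 0 and value <= end) or (step < 0 and value >= end):
            values.append(value)
            value += step
        return values

    return [(2.0 ** c, 2.0 ** g) for c in steps(*log2c) for g in steps(*log2g)]


def correct_classification_ratio(probs, labels, threshold=0.5):
    """Fraction of cases whose thresholded probability matches the 0/1 label."""
    predictions = [1 if p >= threshold else 0 for p in probs]
    hits = sum(1 for pred, y in zip(predictions, labels) if pred == y)
    return hits / len(labels)


def roc_auc(probs, labels):
    """Area under the ROC curve by pairwise comparison (ties count as 0.5)."""
    positives = [p for p, y in zip(probs, labels) if y == 1]
    negatives = [p for p, y in zip(probs, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Toy probability estimates for six held-out cases (made-up numbers).
probs = [0.9, 0.8, 0.4, 0.6, 0.2, 0.1]
labels = [1, 1, 1, 0, 0, 0]
print(len(parameter_grid()))
print(correct_classification_ratio(probs, labels))
print(roc_auc(probs, labels))
```

In a cross-validation loop, each (C, gamma) pair from the grid would be scored on held-out folds of the training data, and the best-scoring pair would be used to train the final model.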
 
2) The Neighborhood Approximation Forest (NAF) is a generic variant of random decision forests (Konukoglu et al., 2013). The underlying principle of NAF is to approximate the “closest” training images to a given test image, where proximity between images is defined based on the variable of interest, such as diagnosis. During training, NAF learns to estimate the closest neighbors from image-derived measurements, such as ROI volumes or cortical thickness measurements. For a test image, NAF estimates its closest neighbors within the training set, along with a weight for each neighbor that indicates its approximate proximity to the test image. The prediction is then given as the weighted average of the labels of these closest neighbors. In all experiments, we used the 15 closest neighbors for both binary and continuous variables. The remaining parameters of NAF were set heuristically based on experiments reported in a previous publication (Konukoglu et al., 2013). These are: number of trees = 800, maximum tree depth = 12, stopping criterion = 10 samples, and number of random samples per node = 20 for feature sets 1-3 and 1000 for feature set 4. The NAF software is available at http://www.nmr.mgh.harvard.edu/~enderk/software.html.
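The final prediction step of NAF, a weighted average over the closest neighbors, can be illustrated with the short sketch below. It assumes that the forest has already produced a proximity weight for each candidate training image; the function name and the dictionary-based inputs are hypothetical and do not reflect the interface of the NAF software.

```python
def naf_style_prediction(neighbor_weights, training_labels, k=15):
    """Weighted average of the labels of the k highest-weighted neighbors.

    neighbor_weights: dict mapping training image id -> proximity weight
                      (assumed to come from a trained forest).
    training_labels:  dict mapping training image id -> label (0/1 for a
                      binary variable, any number for a continuous one).
    """
    # Rank candidates by weight and keep the k closest.
    ranked = sorted(neighbor_weights.items(), key=lambda item: item[1],
                    reverse=True)[:k]
    total = sum(weight for _, weight in ranked)
    if total <= 0:
        raise ValueError("neighbor weights must sum to a positive value")
    return sum(weight * training_labels[image_id]
               for image_id, weight in ranked) / total


# Toy example with made-up weights and k=2 instead of 15.
weights = {"img_a": 3.0, "img_b": 1.0, "img_c": 0.5}
labels = {"img_a": 1, "img_b": 0, "img_c": 1}
print(naf_style_prediction(weights, labels, k=2))
```

For a binary variable, the output lies between 0 and 1 and can be read as a soft score. For a continuous variable, it is a direct estimate of the target value.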
 
3) The Relevance Voxel Machine (RVoxM, tinyurl.com/rvoxm) (Sabuncu and Van Leemput, 2012) is an adaptation of Tipping's Bayesian Relevance Vector Machine (RVM), customized to handle image data. The RVM model assumes that the target variable is a noisy observation of a linear weighted sum of the features. For regression, the noise model is additive Gaussian; for classification, a logistic link function is used. RVM builds on MacKay's Automatic Relevance Determination (ARD) framework and places a Gaussian prior on the weight parameters, which are (approximately) integrated out (marginalized) during learning and prediction. RVM's prior encourages sparsity, i.e., a small number of non-zero weights. RVoxM modifies this prior to also encourage spatial smoothness. For feature set 4 (thick), we used the neighborhood structure of the fsaverage5 surface mesh to define the Laplacian matrix that encourages the weights to be spatially smooth. For feature sets 1-3, we used no spatial smoothness (Laplacian) term. Thus, for the aseg and aparc features, the RVoxM model was essentially equivalent to an RVM model on the feature dimensions. We therefore refer to this algorithm as RVM.
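To make the model structure concrete, the sketch below shows a linear score passed through a logistic link, and a Laplacian quadratic form that penalizes differences between the weights of neighboring mesh vertices. It illustrates the general idea only and is not the RVoxM implementation: the function names, the unnormalized graph Laplacian (degree matrix minus adjacency matrix), and the dense matrix storage are assumptions of this example, and the Bayesian learning of the weights is omitted entirely.

```python
import math


def linear_score(weights, features, bias=0.0):
    """Linear weighted sum of the features (the regression mean)."""
    return bias + sum(w * x for w, x in zip(weights, features))


def classification_probability(weights, features, bias=0.0):
    """Logistic link applied to the linear score."""
    score = linear_score(weights, features, bias)
    if score >= 0:
        return 1.0 / (1.0 + math.exp(-score))
    exp_score = math.exp(score)  # avoids overflow for very negative scores
    return exp_score / (1.0 + exp_score)


def graph_laplacian(num_nodes, edges):
    """Dense unnormalized Laplacian (degree minus adjacency) of a mesh graph.

    A real surface mesh would call for a sparse representation.
    """
    lap = [[0.0] * num_nodes for _ in range(num_nodes)]
    for i, j in edges:
        lap[i][i] += 1.0
        lap[j][j] += 1.0
        lap[i][j] -= 1.0
        lap[j][i] -= 1.0
    return lap


def smoothness_penalty(weights, laplacian):
    """Quadratic form w^T L w, equal to the sum over edges of (w_i - w_j)^2."""
    n = len(weights)
    return sum(weights[i] * laplacian[i][j] * weights[j]
               for i in range(n) for j in range(n))


# Toy "mesh": three vertices in a chain, 0 - 1 - 2.
chain = graph_laplacian(3, [(0, 1), (1, 2)])
print(smoothness_penalty([1.0, 2.0, 4.0], chain))  # (1-2)^2 + (2-4)^2
print(classification_probability([0.0, 0.0], [1.0, 1.0]))
```

Dropping the Laplacian term, as was done for feature sets 1-3, leaves only the per-weight sparsity prior, which is why the model then behaves like a standard RVM.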
 

 

REFERENCES

Dale, A.M., Fischl, B., Sereno, M.I., 1999. Cortical surface-based analysis: I. Segmentation and surface reconstruction. Neuroimage 9, 179-194.
 
Fischl, B., Salat, D.H., Busa, E., Albert, M., Dieterich, M., Haselgrove, C., van der Kouwe, A., Killiany, R., Kennedy, D., Klaveness, S., 2002. Whole brain segmentation: automated labeling of neuroanatomical structures in the human brain. Neuron 33, 341-355.
 
Fischl, B., Sereno, M.I., Dale, A.M., 1999a. Cortical surface-based analysis: II: Inflation, flattening, and a surface-based coordinate system. Neuroimage 9, 195-207.
 
Fischl, B., Sereno, M.I., Tootell, R.B., Dale, A.M., 1999b. High-resolution intersubject averaging and a coordinate system for the cortical surface. Human Brain Mapping 8, 272-284.
 
Fischl, B., Van Der Kouwe, A., Destrieux, C., Halgren, E., Ségonne, F., Salat, D.H., Busa, E., Seidman, L.J., Goldstein, J., Kennedy, D., 2004. Automatically parcellating the human cerebral cortex. Cerebral Cortex 14, 11-22.
 
Sabuncu, M., Van Leemput, K., 2012. The Relevance Voxel Machine (RVoxM): a self-tuning Bayesian model for informative image-based prediction. IEEE Transactions on Medical Imaging.
 
Konukoglu, E., Glocker, B., Zikic, D., Criminisi, A., 2013. Neighbourhood approximation using randomized forests. Medical Image Analysis.