{ "nbformat_minor": 0, "nbformat": 4, "cells": [ { "execution_count": null, "cell_type": "code", "source": [ "%matplotlib inline" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "\n\n# Permutation t-test on toy data with spatial clustering\n\n\nFollowing the illustrative example of Ridgway et al. 2012 [1]_,\nthis demonstrates some basic ideas behind both the \"hat\"\nvariance adjustment method and the threshold-free\ncluster enhancement (TFCE) [2]_ method in MNE-Python.\n\nThis toy dataset consists of a 40 x 40 square with a \"signal\"\npresent in the center (at pixel [20, 20]) with white noise\nadded and a 5-pixel-SD normal smoothing kernel applied.\n\nIn the top row we plot the T statistic over space, peaking toward the\ncenter. Note that it has peaky edges. Second, with the \"hat\" variance\ncorrection/regularization, the peak becomes correctly centered. Third,\nthe TFCE approach also corrects for these edge artifacts. Fourth, the\ntwo methods combined provide a tighter estimate, for better or\nworse.\n\nNow considering multiple-comparisons corrected statistics on these\nvariables, note that a non-cluster test (e.g., FDR or Bonferroni) would\nmis-localize the peak due to sharpness in the T statistic driven by\nlow-variance pixels toward the edge of the plateau. Standard clustering\n(first plot in the second row) identifies the correct region, but the\nwhole area must be declared significant, so no peak analysis can be done.\nAlso, the peak is broad. In this method, all significances are\nfamily-wise error rate (FWER) corrected, and the method is\nnon-parametric so assumptions of Gaussian data distributions (which do\nactually hold for this example) don't need to be satisfied. Adding the\n\"hat\" technique tightens the estimate of significant activity (second\nplot). The TFCE approach (third plot) allows analyzing each significant\npoint independently, but still has a broadened estimate. Note that\nthis is also FWER corrected. Finally, combining the TFCE and \"hat\"\nmethods tightens the area declared significant (again FWER corrected),\nand allows for evaluation of each point independently instead of as\na single, broad cluster.\n\n
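The \"hat\" adjustment of Ridgway et al. [1]_ regularizes the per-pixel\nvariance estimate before the t statistic is formed, so that low-variance\npixels cannot produce spuriously large t values. A minimal sketch of the\nidea (``mne.stats.ttest_1samp_no_p`` implements a version of this when\n``sigma > 0``; the helper name here is hypothetical):\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "import numpy as np\n\n\ndef hat_t_stat(X, sigma=1e-3):\n    \"\"\"One-sample t statistic with 'hat' variance regularization (sketch).\"\"\"\n    var = np.var(X, axis=0, ddof=1)\n    # Inflate every variance estimate by a fraction of the largest one,\n    # so that near-zero-variance pixels cannot dominate the t map\n    var += sigma * var.max()\n    return np.mean(X, axis=0) / np.sqrt(var / len(X))" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "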

<div class=\"alert alert-info\"><h4>Note</h4><p>This example does quite a\nbit of processing, so even on a fast machine it can take a few minutes\nto complete.</p></div>
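\n\nThe TFCE score [2]_ replaces the single cluster-forming threshold with an\nintegration over all thresholds h: each point accumulates\ne(h) ** E * h ** H * dh over the heights h at which it belongs to a\nsupra-threshold cluster of extent e(h), with defaults E = 0.5 and H = 2.\nA rough sketch of the discretized computation (MNE integrates like this\ninternally when ``threshold`` is a ``dict`` with ``start`` and ``step``;\nthe function below is illustrative only):\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "import numpy as np\nfrom scipy import ndimage\n\n\ndef tfce_score(t_map, dh=0.2, e_power=0.5, h_power=2.0):\n    \"\"\"Discretized TFCE score (illustrative sketch).\"\"\"\n    score = np.zeros_like(t_map, dtype=float)\n    for h in np.arange(dh, t_map.max() + dh, dh):\n        # label connected components of the supra-threshold map\n        labels, n_clusters = ndimage.label(t_map >= h)\n        for k in range(1, n_clusters + 1):\n            mask = labels == k\n            extent = mask.sum()  # cluster extent e(h) at this height\n            score[mask] += (extent ** e_power) * (h ** h_power) * dh\n    return score" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "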
\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "# Authors: Eric Larson \n# License: BSD (3-clause)\n\nimport numpy as np\nfrom scipy import stats\nfrom functools import partial\nimport matplotlib.pyplot as plt\n# this changes hidden MPL vars:\nfrom mpl_toolkits.mplot3d import Axes3D # noqa\n\nfrom mne.stats import (spatio_temporal_cluster_1samp_test,\n bonferroni_correction, ttest_1samp_no_p)\n\ntry:\n from sklearn.feature_extraction.image import grid_to_graph\nexcept ImportError:\n from scikits.learn.feature_extraction.image import grid_to_graph\n\nprint(__doc__)" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "Set parameters\n--------------\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "width = 40\nn_subjects = 10\nsignal_mean = 100\nsignal_sd = 100\nnoise_sd = 0.01\ngaussian_sd = 5\nsigma = 1e-3 # sigma for the \"hat\" method\nthreshold = -stats.distributions.t.ppf(0.05, n_subjects - 1)\nthreshold_tfce = dict(start=0, step=0.2)\nn_permutations = 1024 # number of clustering permutations (1024 for exact)" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "Construct simulated data\n------------------------\n\nMake the connectivity matrix just next-neighbor spatially\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "n_src = width * width\nconnectivity = grid_to_graph(width, width)\n\n# For each \"subject\", make a smoothed noisy signal with a centered peak\nrng = np.random.RandomState(42)\nX = noise_sd * rng.randn(n_subjects, width, width)\n# Add a signal at the dead center\nX[:, width // 2, width // 2] = signal_mean + rng.randn(n_subjects) * signal_sd\n# Spatially smooth with a 2D Gaussian kernel\nsize = width // 2 - 1\ngaussian = np.exp(-(np.arange(-size, size + 1) ** 2 / float(gaussian_sd ** 2)))\nfor si in range(X.shape[0]):\n for ri in range(X.shape[1]):\n X[si, ri, :] = np.convolve(X[si, ri, :], gaussian, 'same')\n for ci in range(X.shape[2]):\n X[si, :, ci] = np.convolve(X[si, :, ci], gaussian, 'same')" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "Do some statistics\n------------------\n\n
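With only 10 subjects the permutation test can be made exact: under the\nnull hypothesis each subject's data can have its sign flipped, and there\nare only 2 ** 10 = 1024 distinct sign flips, which is why\n``n_permutations = 1024`` above enumerates the full null distribution.\nA sketch of the idea, without the clustering that\n``spatio_temporal_cluster_1samp_test`` adds on top (the helper name is\nhypothetical):\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "import itertools\n\nimport numpy as np\n\n\ndef exact_signflip_max_t(X):\n    \"\"\"Exhaustive sign-flip null of the max t statistic (sketch).\n\n    X has shape (n_subjects, n_tests); taking the max over tests for\n    each sign flip is what gives family-wise error control.\n    \"\"\"\n    n = len(X)\n    null_max_t = np.empty(2 ** n)\n    for i, signs in enumerate(itertools.product([1., -1.], repeat=n)):\n        flipped = X * np.array(signs)[:, np.newaxis]\n        t = flipped.mean(0) / (flipped.std(0, ddof=1) / np.sqrt(n))\n        null_max_t[i] = t.max()\n    return null_max_t" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "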

<div class=\"alert alert-info\"><h4>Note</h4><p>X needs to be a\nmulti-dimensional array of shape samples (subjects) x time x space, so\nwe reshape it below, flattening the image into a single \"space\"\ndimension and adding a singleton \"time\" axis.</p></div>
\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "X = X.reshape((n_subjects, 1, n_src))" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "Now let's do some clustering using the standard method.\n\n
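Conceptually, the standard method thresholds the t map at the\ncluster-forming threshold, finds spatially connected supra-threshold\ncomponents, and scores each cluster by its summed t statistic; the\nobserved cluster masses are then compared against the permutation\ndistribution of the maximum cluster mass.\n``spatio_temporal_cluster_1samp_test`` does all of this using the\nconnectivity matrix, so the ``scipy.ndimage``-based sketch of the\ncluster-scoring step below is illustrative only:\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "from scipy import ndimage\n\n\ndef cluster_sums(t_map, threshold):\n    \"\"\"Sum the t statistic within each supra-threshold cluster (sketch).\"\"\"\n    labels, n_clusters = ndimage.label(t_map >= threshold)\n    # one 'cluster mass' per connected component above the threshold\n    return [t_map[labels == k].sum() for k in range(1, n_clusters + 1)]" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "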

<div class=\"alert alert-info\"><h4>Note</h4><p>If no connectivity matrix\nis specified, grid-like (lattice) connectivity is assumed along the\narray dimensions. Because our 2D image has been flattened into a single\nspatial dimension, we pass the next-neighbor matrix built above with\ngrid_to_graph explicitly, so that clusters are formed over the original\n40 x 40 grid.</p></div>
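\n\nTo see what this next-neighbor connectivity looks like, it can help to\nbuild it for a tiny grid first (purely illustrative):\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "from sklearn.feature_extraction.image import grid_to_graph\n\n# A 3 x 3 grid has 9 pixels, so this is a sparse 9 x 9 adjacency matrix;\n# nonzero entries link each pixel to itself and its next neighbors\nconn = grid_to_graph(3, 3)\nprint(conn.shape)\nprint(conn.toarray())" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "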
\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "T_obs, clusters, p_values, H0 = \\\n spatio_temporal_cluster_1samp_test(X, n_jobs=1, threshold=threshold,\n connectivity=connectivity,\n tail=1, n_permutations=n_permutations)\n\n# Let's put the cluster data in a readable format\nps = np.zeros(width * width)\nfor cl, p in zip(clusters, p_values):\n ps[cl[1]] = -np.log10(p)\nps = ps.reshape((width, width))\nT_obs = T_obs.reshape((width, width))\n\n# To do a Bonferroni correction on these data is simple:\np = stats.distributions.t.sf(T_obs, n_subjects - 1)\np_bon = -np.log10(bonferroni_correction(p)[1])\n\n# Now let's do some clustering using the standard method with \"hat\":\nstat_fun = partial(ttest_1samp_no_p, sigma=sigma)\nT_obs_hat, clusters, p_values, H0 = \\\n spatio_temporal_cluster_1samp_test(X, n_jobs=1, threshold=threshold,\n connectivity=connectivity,\n tail=1, n_permutations=n_permutations,\n stat_fun=stat_fun, buffer_size=None)\n\n# Let's put the cluster data in a readable format\nps_hat = np.zeros(width * width)\nfor cl, p in zip(clusters, p_values):\n ps_hat[cl[1]] = -np.log10(p)\nps_hat = ps_hat.reshape((width, width))\nT_obs_hat = T_obs_hat.reshape((width, width))\n\n# Now the threshold-free cluster enhancement method (TFCE):\nT_obs_tfce, clusters, p_values, H0 = \\\n spatio_temporal_cluster_1samp_test(X, n_jobs=1, threshold=threshold_tfce,\n connectivity=connectivity,\n tail=1, n_permutations=n_permutations)\nT_obs_tfce = T_obs_tfce.reshape((width, width))\nps_tfce = -np.log10(p_values.reshape((width, width)))\n\n# Now the TFCE with \"hat\" variance correction:\nT_obs_tfce_hat, clusters, p_values, H0 = \\\n spatio_temporal_cluster_1samp_test(X, n_jobs=1, threshold=threshold_tfce,\n connectivity=connectivity,\n tail=1, n_permutations=n_permutations,\n stat_fun=stat_fun, buffer_size=None)\nT_obs_tfce_hat = T_obs_tfce_hat.reshape((width, width))\nps_tfce_hat = -np.log10(p_values.reshape((width, width)))" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "Visualize results\n-----------------\n\n" ], "cell_type": "markdown", "metadata": {} }, { "execution_count": null, "cell_type": "code", "source": [ "fig = plt.figure(facecolor='w')\n\nx, y = np.mgrid[0:width, 0:width]\nkwargs = dict(rstride=1, cstride=1, linewidth=0, cmap='Greens')\n\nTs = [T_obs, T_obs_hat, T_obs_tfce, T_obs_tfce_hat]\ntitles = ['T statistic', 'T with \"hat\"', 'TFCE statistic', 'TFCE w/\"hat\" stat']\nfor ii, (t, title) in enumerate(zip(Ts, titles)):\n ax = fig.add_subplot(2, 4, ii + 1, projection='3d')\n ax.plot_surface(x, y, t, **kwargs)\n ax.set_xticks([])\n ax.set_yticks([])\n ax.set_title(title)\n\np_lims = [1.3, -np.log10(1.0 / n_permutations)]\npvals = [ps, ps_hat, ps_tfce, ps_tfce_hat]\ntitles = ['Standard clustering', 'Clust. w/\"hat\"',\n 'Clust. w/TFCE', 'Clust. w/TFCE+\"hat\"']\naxs = []\nfor ii, (p, title) in enumerate(zip(pvals, titles)):\n ax = fig.add_subplot(2, 4, 5 + ii)\n plt.imshow(p, cmap='Purples', vmin=p_lims[0], vmax=p_lims[1])\n ax.set_xticks([])\n ax.set_yticks([])\n ax.set_title(title)\n axs.append(ax)\n\nplt.tight_layout()\nfor ax in axs:\n cbar = plt.colorbar(ax=ax, shrink=0.75, orientation='horizontal',\n fraction=0.1, pad=0.025)\n cbar.set_label('-log10(p)')\n cbar.set_ticks(p_lims)\n cbar.set_ticklabels(['%0.1f' % p for p in p_lims])\n\nplt.show()" ], "outputs": [], "metadata": { "collapsed": false } }, { "source": [ "References\n----------\n.. [1] Ridgway et al. 
2012, \"The problem of low variance voxels in\n   statistical parametric mapping; a new hat avoids a 'haircut'\",\n   NeuroImage, 59(3), 2131-2141.\n\n.. [2] Smith and Nichols 2009, \"Threshold-free cluster enhancement:\n   addressing problems of smoothing, threshold dependence, and\n   localisation in cluster inference\", NeuroImage, 44(1), 83-98.\n\n" ], "cell_type": "markdown", "metadata": {} } ], "metadata": { "kernelspec": { "display_name": "Python 2", "name": "python2", "language": "python" }, "language_info": { "mimetype": "text/x-python", "nbconvert_exporter": "python", "name": "python", "file_extension": ".py", "version": "2.7.13", "pygments_lexer": "ipython2", "codemirror_mode": { "version": 2, "name": "ipython" } } } }