Deploying mne.stats.permutation_cluster_test on a very large 3D dataset

Hello MNE team,

I want to run a permutation cluster test with the TFCE method on a large 3D dataset. The dataset currently contains 6 subjects in two groups of 3 (more subjects will be added soon). Each subject’s data has a size of 528x320x426. In a preprocessing step I flatten the data and remove some features, ending up with an array thresh_data of shape roughly (6, 500000). This is how I call the MNE function:

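from mne.stats import permutation_cluster_test

threshold_tfce = dict(start=0, step=0.2)
# Rows 0-2 are group 1 and rows 3-5 are group 2 (0:2 would select only two subjects).
F_obs, clusters, p_tfce, H0 = permutation_cluster_test(
    [thresh_data[0:3, :], thresh_data[3:, :]], threshold=threshold_tfce,
    adjacency=None, out_type='slice', n_permutations=200,
    buffer_size=None, n_jobs=-1)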

I am using Ubuntu 20.04.3 LTS, MNE V1.0.3, and Python V3.7.11. I have the following questions. Thanks a lot in advance.

  1. Am I passing the inputs to the function correctly?
  2. Is there a way to speed up the process? After downsampling the data with a Gaussian (sigma of 3 voxels), the command took around 3 days on a system with 10 CPU cores and 128 GB of system RAM. I also tried creating a cache directory, setting it with mne.set_cache_dir(cache_dir), and experimenting with buffer_size; however, the run froze after about 70 of the 200 permutations.
  3. I am a little confused about the theory behind the clustering that the MNE function performs. I read the main paper on clustering (Smith et al. 2009). Based on my understanding, the procedure is:
  • Compute the ANOVA F-statistic voxel-wise
  • Define clusters according to the TFCE method
  • Compute cluster statistics
  • Repeat the three steps above for each permutation

My confusion comes from the way the clusters are defined. According to the MNE tutorial, MNE uses adjacency to define clusters, which suggests that clusters are pre-defined by the user; based on the paper, however, clusters should be defined by the results of the statistic. I am wondering where I am going wrong. This is the reason I set adjacency to None.
  4. My final question is about how to interpret the results. Following the MNE tutorial, I did the following:

import numpy as np

p_clust = np.ones(thresh_data.shape[1])  # one entry per feature (voxel)
for cl, p in zip(clusters, p_tfce):
    p_clust[cl] = p

So, wherever p_clust has a value lower than 1, the voxel belongs to a cluster, and that cluster’s p-value tells me whether what happened in the cluster was significant. Is my interpretation correct? Moreover, is there a way to tell the clusters apart with this method, e.g. how can I say that these voxels belong to cluster 1 and those to cluster 2?
Thank you.

Your inputs look correct, as far as I can tell.

Permutation-based clustering is slow. TFCE is even slower, because it repeats the clustering many times at many different thresholds.
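To make the cost concrete, here is a minimal 1D sketch of the TFCE idea, not MNE's implementation. The function name tfce_1d is made up for illustration, and the exponents (0.5 for extent, 2 for height) are assumed from the defaults proposed in the TFCE paper. The point is only that each threshold needs its own full clustering pass, and the whole ladder is repeated for every permutation.

```python
import numpy as np

def tfce_1d(stat, step=0.2, e_power=0.5, h_power=2.0):
    """Toy 1D TFCE: sum extent**E * height**H * step over a ladder of thresholds."""
    stat = np.asarray(stat, dtype=float)
    scores = np.zeros_like(stat)
    n_passes = 0
    # One full clustering pass per threshold; this loop is the extra cost of TFCE.
    for h in np.arange(step, stat.max() + step / 2, step):
        above = stat >= h
        # Contiguous runs of supra-threshold points (1D lattice adjacency).
        padded = np.concatenate(([False], above, [False]))
        edges = np.flatnonzero(padded[1:] != padded[:-1])
        for start, stop in zip(edges[::2], edges[1::2]):
            extent = stop - start
            scores[start:stop] += (extent ** e_power) * (h ** h_power) * step
        n_passes += 1
    return scores, n_passes

# Hypothetical statistic map: a low, wide bump and a tall, narrow peak.
scores, n_passes = tfce_1d([0.0, 1.0, 1.0, 0.0, 3.0], step=0.5)
print(n_passes, scores)
```

With step=0.2 and a maximum statistic of, say, 10, that is roughly 50 clustering passes per permutation instead of one, which is why a coarser step is the most direct knob for run time.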

Clusters are not pre-defined by users. The adjacency matrix determines what could possibly form a cluster, based on spatial (or temporal or spectral) proximity. In other words, simultaneous activity in some left-frontal vertices and some right-parietal vertices will not form a single cluster (they might form 2 separate clusters though). But whether something counts as a cluster, and what its spatial/temporal/spectral extent is, gets determined by the statistic, just as you think it should.

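Please re-read the docstring entry for adjacency. Setting it to None assumes a regular lattice adjacency over the data as you pass it in, which is probably not what you want: you are passing a flattened/raveled 3D array for each subject with some features removed, so neighbouring columns no longer correspond to neighbouring voxels.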
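One way to get a spatially meaningful adjacency for masked, flattened data is to build the neighbour pairs from the 3D mask itself. The sketch below is an assumption about your pipeline (a boolean mask of kept voxels, data flattened as volume[mask]), and masked_lattice_edges is a hypothetical helper, not an MNE function. It uses face-adjacency (6 neighbours) only.

```python
import numpy as np

def masked_lattice_edges(mask):
    """Return (rows, cols) index pairs of face-adjacent kept voxels.

    `mask` is a boolean 3D array, True where a voxel survived preprocessing.
    Indices refer to columns of the flattened-and-masked data, i.e. volume[mask].
    """
    mask = np.asarray(mask, dtype=bool)
    # Map each kept voxel to its column in the masked data; -1 marks removed voxels.
    index = np.full(mask.shape, -1, dtype=np.int64)
    index[mask] = np.arange(mask.sum())
    rows, cols = [], []
    for axis in range(mask.ndim):
        # Pair every voxel with its next neighbour along this axis.
        a = np.moveaxis(index, axis, 0)[:-1].ravel()
        b = np.moveaxis(index, axis, 0)[1:].ravel()
        keep = (a >= 0) & (b >= 0)
        rows.append(a[keep])
        cols.append(b[keep])
    rows, cols = np.concatenate(rows), np.concatenate(cols)
    # Make the edge list symmetric (i -> j and j -> i).
    return np.concatenate([rows, cols]), np.concatenate([cols, rows])

# Tiny hypothetical volume: a 2x2x2 cube with one corner removed.
mask = np.ones((2, 2, 2), dtype=bool)
mask[0, 0, 0] = False
rows, cols = masked_lattice_edges(mask)
print(sorted(zip(rows.tolist(), cols.tolist())))
```

In this toy case columns 0 and 1 of the masked data are adjacent in memory but are not spatial neighbours, which is exactly what a 1D lattice assumption gets wrong. The pairs could then be turned into a sparse (n_features, n_features) matrix, for example with scipy.sparse.coo_matrix, and passed as adjacency. MNE also provides mne.stats.combine_adjacency for building regular lattices; check its docstring before relying on it for a masked volume.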

Your interpretation is not exactly right. The p-value for “regular” clustering applies to the cluster as a whole, and tells you something about how likely it would be to find a cluster of that size/extent by random chance. When using TFCE, each vertex/voxel gets its own (FWER-corrected) p-value, but it’s still not quite interpretable as “there was significant activity exactly here at exactly this time.” I’m having trouble finding the correct reference / link for that at the moment, maybe @britta-wstnr has it ready-to-hand?
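On telling clusters apart: for “regular” (non-TFCE) clustering, a label map can be built in the same loop as the p-value map. This is a small sketch with made-up clusters and p-values standing in for the function's output (assuming index-style clusters, as with out_type='indices'); cluster_maps is a hypothetical helper name.

```python
import numpy as np

def cluster_maps(clusters, cluster_pv, n_features):
    """Build per-feature maps of cluster p-value and cluster label (0 = no cluster)."""
    p_map = np.ones(n_features)
    label_map = np.zeros(n_features, dtype=int)
    for label, (cl, p) in enumerate(zip(clusters, cluster_pv), start=1):
        p_map[cl] = p          # every member inherits the cluster-level p-value
        label_map[cl] = label  # members of the same cluster share one integer label
    return p_map, label_map

# Hypothetical output of a non-TFCE run: two clusters given as index tuples.
clusters = [(np.array([2, 3, 4]),), (np.array([7, 8]),)]
cluster_pv = np.array([0.03, 0.40])
p_map, label_map = cluster_maps(clusters, cluster_pv, n_features=10)
print(label_map)
```

With TFCE this label map is not meaningful, because each voxel carries its own p-value rather than belonging to one cluster with a shared p-value.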

Indeed, there is “no such thing as a significant cluster” (quoting the FieldTrip homepage, link below), since the null hypothesis of the test is simply “there is no difference between condition A and condition B” and is not directed at a specific time or place.

You can find more info, including literature references, on the FieldTrip homepage: How NOT to interpret results from a cluster-based permutation test - FieldTrip toolbox
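A toy illustration of why the inference is global: under the null hypothesis the group labels are exchangeable, and each relabelling contributes a single number summarising the whole dataset. The sketch below simplifies deliberately and is not what MNE computes: it uses the absolute mean difference instead of an F-statistic, and the maximum over features instead of a cluster or TFCE statistic. The data are random and max_stat_null is a hypothetical helper.

```python
import numpy as np
from itertools import combinations

def max_stat_null(data, n_first):
    """Null distribution of the maximum |mean difference| over all features,
    enumerating every way to assign `n_first` subjects to the first group."""
    n_subjects = data.shape[0]
    null = []
    for first in combinations(range(n_subjects), n_first):
        in_first = np.zeros(n_subjects, dtype=bool)
        in_first[list(first)] = True
        diff = data[in_first].mean(axis=0) - data[~in_first].mean(axis=0)
        # One number per relabelling: the largest effect anywhere in the data.
        null.append(np.abs(diff).max())
    return np.array(null)

rng = np.random.default_rng(0)
data = rng.normal(size=(6, 50))  # hypothetical: 6 subjects, 50 features
null = max_stat_null(data, n_first=3)
observed = null[0]               # combinations() yields the split (0, 1, 2) first
p_value = (null >= observed).mean()
print(len(null), p_value)
```

Because the observed value is compared against the largest effect found anywhere under each relabelling, the resulting p-value speaks to “is there any difference between the groups”, not to a particular voxel. As a side note, with 3 vs 3 subjects there are only 20 distinct group assignments, and a statistic that ignores the sign of the difference gives the same value for a split and its mirror image, so very small p-values are not reachable until more subjects are added.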
