# creating the permutation null distribution

Hello,

I am trying to use `mne.stats.permutation_cluster_1samp_test` to run non-parametric cluster-based statistics. However, instead of letting the MNE function create the permuted time series that make up the estimated null distribution, I would like to supply these permutation time series to the function myself. Is this possible?

The reason I want to do this is that I want to implement a statistical test of representational dissimilarity matrices (RDMs) in which the null distribution is created by randomly reassigning the rows and columns of the RDM.

Papers suggesting this approach:
Kriegeskorte, N., Mur, M., & Bandettini, P. A. (2008). Representational similarity analysis - connecting the branches of systems neuroscience. Frontiers in Systems Neuroscience, 2, 4.
https://doi.org/10.3389/neuro.06.004.2008

Hebart, M. N., Bankson, B. B., Harel, A., Baker, C. I., & Cichy, R. M. (2018). The representational dynamics of task and object processing in humans. eLife, 7, e32816.
https://doi.org/10.7554/eLife.32816

Thank you!

Ana

Unfortunately, there doesn’t seem to be an easy way to do this right now. The permutation cluster test code is designed for maximum speed, not easy configuration. I can see if I can implement this in the MNE-RSA package, but it may be a while before I get to it.


That said, here is the original suggestion from Kriegeskorte et al.:

Step 5: Testing Relatedness of Two Dissimilarity Matrices by Randomization

In order to decide whether two dissimilarity matrices are related, we can perform statistical inference on the RDM correlation. The classical method for testing correlations assumes independent measurements for the two variables. For dissimilarity matrices such independence cannot be assumed, because each similarity is dependent on two response patterns, each of which also codetermines the similarities of all its other pairings in the RDM.

We therefore suggest testing the relatedness of dissimilarity matrices by randomizing the condition labels. We choose a random permutation of the conditions, reorder rows and columns of one of the two dissimilarity matrices to be compared according to this permutation, and compute the correlation. Repeating this step many times (e.g., 10,000 times), we obtain a distribution of correlations simulating the null hypothesis that the two dissimilarity matrices are unrelated. If the actual correlation (for consistent labeling between the two dissimilarity matrices) falls within the top α × 100% of the simulated null distribution of correlations, we reject the null hypothesis of unrelated dissimilarity matrices with a false-positives rate of α. The p-value for each brain region’s relatedness to each model is given beneath the model’s bar in Figure 8. They are conservative estimates based on 10,000 random relabelings, so the smallest possible estimate is 10⁻⁴.

This procedure is much easier than a full-blown cluster permutation test. No clustering is involved (hence also no control for multiple tests, so take that into account). You perform the RSA analysis a large number of times with shuffled condition labels. In the `mne_rsa` package, condition labels can be assigned using the `y` parameter of the functions, so it should be straightforward to compute RSA with shuffled condition labels.
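To make the relabeling procedure concrete, here is a minimal NumPy-only sketch for two precomputed RDMs. It does not use `mne_rsa`; the function names are hypothetical, and comparing the RDMs with Pearson correlation on the upper triangle is my assumption (a rank correlation would work the same way).

```python
import numpy as np


def upper_triangle(rdm):
    """Return the off-diagonal upper-triangle entries of a square RDM."""
    rows, cols = np.triu_indices(rdm.shape[0], k=1)
    return rdm[rows, cols]


def rdm_permutation_test(rdm_a, rdm_b, n_perm=10_000, seed=0):
    """Test RDM relatedness by shuffling the condition labels of one RDM.

    Hypothetical helper, not part of mne_rsa. Returns the observed
    correlation, the simulated null distribution and a one-sided p-value.
    """
    rng = np.random.default_rng(seed)
    vec_b = upper_triangle(rdm_b)
    observed = np.corrcoef(upper_triangle(rdm_a), vec_b)[0, 1]

    null = np.empty(n_perm)
    for i in range(n_perm):
        order = rng.permutation(rdm_a.shape[0])
        # Reorder rows and columns with the same permutation, so the
        # shuffled matrix is still a valid (symmetric) RDM.
        shuffled = rdm_a[np.ix_(order, order)]
        null[i] = np.corrcoef(upper_triangle(shuffled), vec_b)[0, 1]

    # The +1 counts the observed labeling as one of the permutations,
    # which keeps the p-value estimate conservative (never exactly 0).
    p_value = (np.sum(null >= observed) + 1) / (n_perm + 1)
    return observed, null, p_value
```

For RSA over time, you could apply the same set of label permutations at every time point, which gives a null array of shape `(n_perm, n_times)` next to the observed time course.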


Dear @wmvanvliet ,

Thank you so much for your answer and for developing the mne-rsa library!
I believe that you are correct in suggesting the approach from Kriegeskorte et al.

But if I am running the RSA over time, would I still have a multiple comparison problem?
I am conducting the analysis over time, with one RDM for each time point. Following Kriegeskorte et al. (2008), I would:

• Step 1) Re-label the RDM many times to create a simulated null distribution.
• Step 2) Use that null distribution to determine a p-value for each time point. But am I correct to assume that at this point I would need some sort of cluster approach to deal with the multiple comparison problem that comes from having a p-value for each time point?

As you mentioned, step 1 can be implemented in mne-rsa.
Do you have some pointers on how to implement step 2?
Is there a Python function that would allow me to input the simulated null distribution and get back the selected clusters?

Many thanks for your time and insight!

Ana

You are correct: with a p-value for each time point, you still have a multiple comparison problem.

As for a function that takes the simulated null distribution and outputs clusters: not in the official API. Looking through the code, there is the private `mne.stats.cluster_level._find_clusters` function that is used internally by the big `mne.stats.permutation_cluster_1samp_test` function. It seems to do what you want, so you could try that. You’ll still need to build the rest of the statistics around it, though: looping over the random iterations, finding the cluster with the largest sum(t-statistic), etc.
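As a rough illustration of "the rest of the statistics", here is a sketch for a single time course. It does not call the private MNE function; `find_clusters_1d` is a hypothetical NumPy stand-in that only handles contiguous runs along time. The sketch assumes a one-sided test (values above a cluster-forming threshold that you choose yourself), an observed statistic of shape `(n_times,)` and a null array of shape `(n_perm, n_times)`.

```python
import numpy as np


def find_clusters_1d(stat, threshold):
    """Return (start, stop) pairs of contiguous runs where stat > threshold.

    `stop` is exclusive, so stat[start:stop] selects the cluster.
    """
    above = np.concatenate(([False], stat > threshold, [False]))
    edges = np.flatnonzero(np.diff(above.astype(int)))
    return list(zip(edges[::2], edges[1::2]))


def max_cluster_mass(stat, threshold):
    """Largest summed statistic over all clusters (0.0 if there are none)."""
    clusters = find_clusters_1d(stat, threshold)
    return max((stat[a:b].sum() for a, b in clusters), default=0.0)


def cluster_test_with_custom_null(observed, null, threshold):
    """Cluster-level p-values from a user-supplied null distribution.

    observed : array, shape (n_times,)
    null     : array, shape (n_perm, n_times), one row per relabeling
    """
    # Null distribution of the maximum cluster mass, one value per permutation.
    null_max = np.array([max_cluster_mass(row, threshold) for row in null])

    clusters = find_clusters_1d(observed, threshold)
    masses = np.array([observed[a:b].sum() for a, b in clusters])
    # Compare each observed cluster mass against the max-mass null; the +1
    # counts the observed data as one permutation.
    p_values = np.array(
        [(np.sum(null_max >= m) + 1) / (len(null_max) + 1) for m in masses]
    )
    return clusters, masses, p_values
```

Comparing every observed cluster against the maximum cluster mass per permutation is what controls for the multiple comparisons across time points. How to pick the cluster-forming threshold (for example, a percentile of the null values) is left open here.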
