# Spatio-temporal custer permutation with linear regression

Hi all,

In my MEG project, I want to run a linear regression across sensors and time and combine it with cluster-based permutation to control for multiple comparisons. Based on a few tutorials I managed to write some code that gives me reasonably-looking output, but I am not sure whether I did it correctly, that’s why I am here.

For some context, I have a within-subject design with 58 subjects that each did 4 conditions. The four conditions are on a continuous scale from zero to one, so I think linear regression would be best suited to analyse it even though an Anova with 1 Factor and 4 levels would also work.

To run the regression, I created a dataframe with the shape `(4, 58, 150, 102)` (space last), and defined a custom `stat_fun` like so:

``````def linreg_cluster(*args):
labels = ["Intercept", "Coherence"]
n_cond = len(args) # 4
n_obs = args[0].shape[0] # 58
dm = np.ones((n_cond * n_obs, 2))
dm[:, 1] = np.repeat([0.03, 0.06, 0.12, 0.24], n_obs) # levels of the factor coherence
data = np.concatenate(args)
# use private function, because public API only works on Epochs or Sourceestimates
beta, stderr, t_val, p_val, mlog10_p_val = mne.stats.regression._fit_lm(data, dm, labels)
return t_val['Coherence']
``````

Then I computed the cluster-forming t-statistics threshold like this:

``````pval = 0.001
df = 58 - 4 - 1  # degrees of freedom for the test
thresh = scipy.stats.t.ppf(1 - pval / 2, df)  # two-tailed, t distribution
``````

and finally ran the test:

``````    cluster_stats = mne.stats.spatio_temporal_cluster_test(
data,
n_permutations=500,
threshold=thresh,
stat_fun=linreg_cluster,
buffer_size=None,
``````

Does this approach look right to you? I also tried the same procedure but then with a one-way F-test instead of a linear regression as statistical function. This was a bit a easier and I am reasonably confident that the results check out. But while the linear regression results are somewhat similar, they look much worse than I expected (i.e. the clusters have essentially no temporal extent, see below). So, I was wondering whether I did do anything wrong.

Here an example of one cluster (ignore the `F` values in the labels, I forgot to change them to `t`s:

Eduard

Hello Eduard,

I would suggest to first check the linear regression output. I assume you run the regression on evoked data? I think regression is best suited for single trial analysis, so instead of running on evokeds I would run a regression model over epochs for each participant and then use the resulting evoked beta coefficients for further group level statistics. Maybe this tutorial helps?

Best,

Carina

1 Like

Hi Carina,