Hi everyone,

I’ve recently been working with SEEG data in MNE, focusing primarily on (generalized linear model) **GLM analysis**. However, I have some uncertainties regarding the **statistical tests** applicable to my analysis, and I would appreciate any insights or suggestions you could provide.

## Background

### Preprocessing

Assume I am analyzing data from a **single subject**, specifically focusing on contacts located in the orbitofrontal cortex (**OFC**).

The general workflow involves preprocessing the SEEG data, epoching (`epochs`

), and then computing time-frequency representation (`tfr`

) for channels (contacts) within OFC.

```
# pseudocode
epochs = seeg_preprocess(sub)
tfr = epochs.compute_tfr(picks=["OFC"])
# tfr.data.shape: (n_trials, n_chs, n_freqs, n_timepoints)
```

Next, I extract the high-gamma (80-150Hz) activity for GLM analysis. Specifically, I extract data from this frequency band and compute the mean across these frequencies, resulting in `hga`

with the shape of `n_trials, n_chs, and n_timepoints`

, which represent high-gamma activity.

```
# pseudocode
hga = tfr.data[:, :, 80:150, :].mean(axis=2)
# shape: (n_trials, n_chs, n_timepoints)
```

### GLM

Each channel’s data in OFC (with shape `n_trials,n_timepoints`

for one channel) is then analyzed using a GLM, with a behavioural metric `X`

(assuming a single regressor, shape: `n_trials,1`

) predicting the HGA activity (`hga`

) on each time point. So the output (`glm_betas_ts`

) is a beta time course for all channels, reflecting the association between the behavioral metric and HGA over time.

```
# pseudocode
glm_betas_ts = np.array([glm(y=hga[:, ch], x=X).fit().coef for ch in range(hga.shape[1])]).squeeze()
# shape: (n_chs, n_timepoints)
```

## Statistics

Currently, I am using `permutation_cluster_1samp_test`

for statistical testing, treating multiple contacts as multiple samples.

```
# pseudocode
# glm_betas_ts.shape: (n_chs, n_timepoints)
_, c, p, _ = permutation_cluster_1samp_test(glm_betas_ts, tail=0, ...)
```

final result example (two lines correspond to two behavior metrics, horizontal segments represent significantly different from 0):

Additionally, I’ve considered not averaging over the frequency bands and directly performing GLM analysis on the entire TFR dataset for each frequency. I used a similar statistical method like the beta time course:

```
# pseudocode
glm_betas_tfr = np.array([glm(y=tfr.data[:, ch, :, :], x=X).fit().coef for ch in range(tfr.data.shape[1])]).squeeze()
# glm_betas_tfr.shape: (n_chs, n_freqs, n_timepoints)
_, c, p, _ = permutation_cluster_1samp_test(glm_betas_tfr, tail=0, ...)
```

final result example:

## Question

**Is the logic of statistical testing in these scenarios appropriate?**- Are there other considerations or methods I should consider to ensure the robustness and accuracy of my findings?
- I tried different setups for the
`permutation_cluster_1samp_test`

, including default parameters (`tail=0`

) or TFCE (with`threshold=dict(xxxx)`

), but got quite different results. I want to know if it’s normal (I will put data and results later if needed).

Any guidance or references to similar analyses or resources would be greatly appreciated. Thank you!

Best Regards,

Kun