I am working on eeg TFR data. I have two groups (1& 2) and I used morlet wavelets to get the TFR data. Now I need to perform a cluster permutation test to identify significant clusters for 2 ROIs (c1 +c3 and c2+c4 channels) between group 1 and 2.
The tfr_group_1.data and tfr_group_2.data have shape: [10, 6,40,4609] (that is 10 observations, 6 channels, 40 freqs, 4609 time points).
To conduct cluster permutation test , I need the adjacency matrix. How do I get that from the TFR data ?
Do I do it in this order ?
Use mne.channels.find_ch_adjacency(info , ch_type ) , ch_type=eeg . ch_adjacency = mne.channels.find_ch_adjacency(raw.info, ch_type='eeg')
Do you create a fake raw object here to account for 4 channels only (given I am interested in 2 ROIs ?
full_adj = combine_adjacency(1, 1, ch_adjacency) (freqs, times, chan) shape
Here ā1ā for position 1, means that you are looking at 1 freq point above and below the freq of interest ?
How do you account for ROI of interest? Do I get the mean of the TFR for the ROIs ?
Just confirming: āobservationā here means participant, right? Iām a bit confused by tfr_group_1.data because the .data part implies that this is a TFR object, which shouldnāt have data from multiple subjects in it. Maybe itās an EpochsTFR? So 10 epochs per participant? Or is the .data part wrong, and this is actually a numpy array?
I donāt think itās a good idea to restrict to 4 ROI channels before clustering. Rather, Iād suggest finding clusters using all channels you have and then afterwards looking to see if your ROIs happen to form a significant cluster at the time(s) and frequenc(ies) where you expect to see an effect. So if youāre starting out with e.g. 32 channels, use all 32.
That is basically right, but the function returns 2 things (adjacency matrix and list of channel names). So should be more like
That is correct, as long as you remember that you need to transpose the data array so that the order of dimensions is observations (AKA "subjects" in your case I think), frequency, time, channel.
As mentioned above, I would cluster on all channels and then check to see if I find a significant effect on the channels I care about. Itās effectively what youāre doing for frequency and time too (you didnāt mention cropping to just the times/frequencies of interest right? you can treat channels the same way).
Sorry for the confusion , I meant to say they are two array lists. Group 1 and 2 have 10 subjects each . The transposed dimensions are [subjects, freq, time,chan] and are put into one list X . This list has two arrays , for group 1 and 2.
Ok I understand why all channels are to be used. But what if we have unequal number of channels across subjects/groups , as some subjects had different ābadā channels that were excluded? Just take the common channels across subjects ? so run a loop to check for common channels ?
For full_adj = combine_adjacency(1, 1, ch_adjacency) , the ch_adjacency is the adj_mat right ?
And this is how you use the final cluster function for list X? and for stat_fun I want to use the stats.ttest_ind from scipy stats library
def t_stat_fun(x, y, equal_var=True):
t_statistic, _ = stats.ttest_ind(x, y, equal_var=equal_var)
return t_statistic
Normally I might say āa good use case for interpolate_badsā but I think that might be a false-positive risk (i.e., interpolated channels are kind of guaranteed to somewhat resemble their neighbors, making cluster-formation more likely). So excluding the ācommon badsā from all subjs might be the better choice? Depends on how many channels you have and how many you end up excluding. Definitely do plot_adjacency once youāve excluded all bad chans and see how many āholes in the latticeā you end up with ā it might be untenable in which case you would probably have to interpolate, and just be cautious about how you interpret significant clusters that contain interpolated channels.
yes
stat_fun wonāt work as written, because of what ttest_ind returns (not a tuple of length 2). if you call its output result then you would return result.statistic. Read the āReturnsā section of scipy.stats.ttest_ind ā SciPy v1.13.0 Manual. You may or may not also need to specify keepdims, so if you try it and it fails, try again with keepdims=True
The sample code for permutation_cluster_test looks almost right; the first argument X is missing, and Iām not sure why youāre calling the first return value w (it will be t-values)
I looked at the adjacency plot, there are not too many common channels. I am thinking of just identifying clusters for these channels only (than interpolating)
ValueError: adjacency (len 19) must be of the correct size, i.e. be equal to or evenly divide the number of tests (3064985).
If adjacency was computed for a source space, try using the fwd["src"] or inv["src"] as some original source space vertices can be excluded during forward computation
full_adj_matrix.shape is (19,19) (basically 19 channels) , does this have to be a different dimension ?
combined_tfrs[0].shape is (15 subjects, 35 freqs , 4609 times, 19 chan)
The plot is right, I have taken the common channels across subjects and groups. The frontal , temporal and occipital channels are noisy for most subjects and have been flagged as ābadā. That is why you see such few channels
itās not the number of channels iām concerned about. Itās the lines connecting them. E.g., the channel that is farthest right-frontal looks like it might connect to the one that is furthest right-temporal, and it is clearly connected to a left-parietal-ish one. Those are not plausibly āadjacentā.
Thanks, resolved! but when I run the cluster perm test I still get the error:
ValueError: adjacency (len 19) must be of the correct size, i.e. be equal to or evenly divide the number of tests (3064985).
If adjacency was computed for a source space, try using the fwd["src"] or inv["src"] as some original source space vertices can be excluded during forward computation