combine_adjacency for TFR eeg data

I am working on eeg TFR data. I have two groups (1& 2) and I used morlet wavelets to get the TFR data. Now I need to perform a cluster permutation test to identify significant clusters for 2 ROIs (c1 +c3 and c2+c4 channels) between group 1 and 2.

The tfr_group_1.data and tfr_group_2.data have shape: [10, 6,40,4609] (that is 10 observations, 6 channels, 40 freqs, 4609 time points).

To conduct cluster permutation test , I need the adjacency matrix. How do I get that from the TFR data ?

Do I do it in this order ?

  1. Use mne.channels.find_ch_adjacency(info , ch_type ) , ch_type=eeg .
    ch_adjacency = mne.channels.find_ch_adjacency(raw.info, ch_type='eeg')
    Do you create a fake raw object here to account for 4 channels only (given I am interested in 2 ROIs ?
  2. full_adj = combine_adjacency(1, 1, ch_adjacency) (freqs, times, chan) shape
    Here ā€˜1ā€™ for position 1, means that you are looking at 1 freq point above and below the freq of interest ?
  3. How do you account for ROI of interest? Do I get the mean of the TFR for the ROIs ?

@drammock @larsoner

Just confirming: ā€œobservationā€ here means participant, right? Iā€™m a bit confused by tfr_group_1.data because the .data part implies that this is a TFR object, which shouldnā€™t have data from multiple subjects in it. Maybe itā€™s an EpochsTFR? So 10 epochs per participant? Or is the .data part wrong, and this is actually a numpy array?

I donā€™t think itā€™s a good idea to restrict to 4 ROI channels before clustering. Rather, Iā€™d suggest finding clusters using all channels you have and then afterwards looking to see if your ROIs happen to form a significant cluster at the time(s) and frequenc(ies) where you expect to see an effect. So if youā€™re starting out with e.g. 32 channels, use all 32.

That is basically right, but the function returns 2 things (adjacency matrix and list of channel names). So should be more like

adj_mat, ch_names = mne.channels.find_ch_adjacency(raw.info, ch_type='eeg')

That is correct, as long as you remember that you need to transpose the data array so that the order of dimensions is observations (AKA "subjects" in your case I think), frequency, time, channel.

As mentioned above, I would cluster on all channels and then check to see if I find a significant effect on the channels I care about. Itā€™s effectively what youā€™re doing for frequency and time too (you didnā€™t mention cropping to just the times/frequencies of interest right? you can treat channels the same way).

  1. Sorry for the confusion , I meant to say they are two array lists. Group 1 and 2 have 10 subjects each . The transposed dimensions are [subjects, freq, time,chan] and are put into one list X . This list has two arrays , for group 1 and 2.

  2. Ok I understand why all channels are to be used. But what if we have unequal number of channels across subjects/groups , as some subjects had different ā€˜badā€™ channels that were excluded? Just take the common channels across subjects ? so run a loop to check for common channels ?

  3. For full_adj = combine_adjacency(1, 1, ch_adjacency) , the ch_adjacency is the adj_mat right ?

  4. And this is how you use the final cluster function for list X? and for stat_fun I want to use the stats.ttest_ind from scipy stats library

def t_stat_fun(x, y, equal_var=True):
      t_statistic, _ = stats.ttest_ind(x, y, equal_var=equal_var)
      return t_statistic

w, clusters, clusters_pv, h0 = mne.stats.permutation_cluster_test(, threshold=5, n_permutations=5000, stat_fun = t_stat_fun, adjacency=adj_matrix)

Normally I might say ā€œa good use case for interpolate_badsā€ but I think that might be a false-positive risk (i.e., interpolated channels are kind of guaranteed to somewhat resemble their neighbors, making cluster-formation more likely). So excluding the ā€œcommon badsā€ from all subjs might be the better choice? Depends on how many channels you have and how many you end up excluding. Definitely do plot_adjacency once youā€™ve excluded all bad chans and see how many ā€œholes in the latticeā€ you end up with ā€” it might be untenable in which case you would probably have to interpolate, and just be cautious about how you interpret significant clusters that contain interpolated channels.

yes

stat_fun wonā€™t work as written, because of what ttest_ind returns (not a tuple of length 2). if you call its output result then you would return result.statistic. Read the ā€œReturnsā€ section of scipy.stats.ttest_ind ā€” SciPy v1.13.0 Manual. You may or may not also need to specify keepdims, so if you try it and it fails, try again with keepdims=True

The sample code for permutation_cluster_test looks almost right; the first argument X is missing, and Iā€™m not sure why youā€™re calling the first return value w (it will be t-values)

Thanks @drammock

  1. I looked at the adjacency plot, there are not too many common channels. I am thinking of just identifying clusters for these channels only (than interpolating)

  1. I chose wilcoxon test instead and ran this for two groups and I am getting an error :
adj_matrix = mne.channels.find_ch_adjacency(raw_drop.info, ch_type='eeg')[0]
full_adj_matrix = mne.stats.combine_adjacency(1, 1, adj_matrix)

def stat_fun_wilcox(X,Y):
    result = wilcoxon(X,Y)
    return result.statistic

w,clusters, clusters_pv, h0 = mne.stats.permutation_cluster_test(combined_tfrs, threshold=None,n_permutations=1024, 
                                                                  stat_fun=stat_fun_wilcox, adjacency=full_adj_matrix)

ValueError: adjacency (len 19) must be of the correct size, i.e. be equal to or evenly divide the number of tests (3064985).

If adjacency was computed for a source space, try using the fwd["src"] or inv["src"] as some original source space vertices can be excluded during forward computation

full_adj_matrix.shape is (19,19) (basically 19 channels) , does this have to be a different dimension ?
combined_tfrs[0].shape is (15 subjects, 35 freqs , 4609 times, 19 chan)

adjacency plot doesnā€™t look right. Hereā€™s what is expected (these are the EEG channels in the MNE sample data):

notice how in the plot in the tutorial the connecting lines are all fairly ā€œlocalā€. You may need to remove some connections. See the warning in the notes section here: mne.channels.find_ch_adjacency ā€” MNE 1.7.0 documentation

The plot is right, I have taken the common channels across subjects and groups. The frontal , temporal and occipital channels are noisy for most subjects and have been flagged as ā€˜badā€™. That is why you see such few channels

itā€™s not the number of channels iā€™m concerned about. Itā€™s the lines connecting them. E.g., the channel that is farthest right-frontal looks like it might connect to the one that is furthest right-temporal, and it is clearly connected to a left-parietal-ish one. Those are not plausibly ā€œadjacentā€.

Thanks, resolved! but when I run the cluster perm test I still get the error:

ValueError: adjacency (len 19) must be of the correct size, i.e. be equal to or evenly divide the number of tests (3064985).

If adjacency was computed for a source space, try using the fwd["src"] or inv["src"] as some original source space vertices can be excluded during forward computation