Dear MNE community,
I am writing to ask for some suggestions about EEG connectivity analysis. Specifically, I would like to compare the imaginary part of coherence between pairs of sensors across two conditions in each frequency band, from theta to gamma. Since I do not have strong a priori hypotheses, I would like to compare all possible pairs of electrodes between the two conditions, but I am a bit worried about the multiple comparison problem and I fear that Bonferroni or FDR correction might not be the best solution. I am wondering whether there is a wiser strategy to compute the statistics, e.g., permutation or selection of a group of connections.
I see you suggest using graph-based analysis. Since I am quite new to the topic, could you suggest an algorithm I could use for my analysis?
Thank you,
I can’t give much advice on the graph analysis, but for your other points:
I am wondering whether there is a wiser strategy to compute the statistics, e.g., permutation or selection of a group of connections.
Without an a priori idea about which connections are relevant for the conditions being compared, I don’t see how you can select a specific group of connections to focus on.
This should really come from some previous literature or an existing hypothesis you have, etc… It would be problematic to perform the tests on the all-to-all connectivity to identify the relevant regions and then report the results for a subsequent set of tests on only these regions.
If you want to try a permutation test, there are a couple of functions for doing so in the Statistics module (mne.stats); which one to use depends on what fits your data best.
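To make the permutation idea concrete, here is a rough NumPy-only sketch of a paired sign-flip permutation test with max-statistic correction. It assumes you have one connectivity value per subject, per connection, per condition (arrays of shape (n_subjects, n_connections)); that layout, the function name `paired_max_t_permutation`, and the simulated data are all assumptions for illustration, not something taken from your script. As far as I know, `mne.stats.permutation_t_test` follows a similar max-statistic logic, but check its documentation before relying on that.

```python
import numpy as np


def paired_max_t_permutation(cond_a, cond_b, n_permutations=1000, seed=0):
    """Paired sign-flip permutation test with max-|t| correction.

    cond_a, cond_b : arrays of shape (n_subjects, n_connections).
    Returns the observed t-values and family-wise corrected p-values.
    """
    rng = np.random.default_rng(seed)
    diff = np.asarray(cond_a, float) - np.asarray(cond_b, float)
    n_subjects = diff.shape[0]

    def t_values(d):
        # One-sample t-statistic of the paired differences, per connection.
        return d.mean(axis=0) / (d.std(axis=0, ddof=1) / np.sqrt(n_subjects))

    t_obs = t_values(diff)

    # Null distribution of the maximum |t| over all connections.
    max_null = np.empty(n_permutations)
    for i in range(n_permutations):
        signs = rng.choice([-1.0, 1.0], size=(n_subjects, 1))
        max_null[i] = np.abs(t_values(diff * signs)).max()

    # Corrected p-value: how often the permuted maximum reaches the
    # observed |t| (the observed data count as one permutation).
    exceed = (max_null[None, :] >= np.abs(t_obs)[:, None]).sum(axis=1)
    p_corrected = (1 + exceed) / (n_permutations + 1)
    return t_obs, p_corrected


# Simulated example: 20 subjects, 10 connections, a true effect on index 3.
rng = np.random.default_rng(42)
cond_b = rng.normal(0.0, 0.1, size=(20, 10))
cond_a = cond_b + rng.normal(0.0, 0.1, size=(20, 10))
cond_a[:, 3] += 0.5

t_obs, p_corrected = paired_max_t_permutation(cond_a, cond_b)
print(np.flatnonzero(p_corrected < 0.05))
```

Because each connection is compared against the distribution of the maximum statistic over all connections, the p-values are already corrected for the number of connections tested, without a separate Bonferroni or FDR step.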
I would like to compare all possible pairs of electrodes between two conditions but I am a bit worried about the multiple comparison problem
A more general point: keep in mind that the code snippet you shared, where indices is not specified, will give you an array of connectivity values for all possible sensors * sensors combinations.
However, only the entries corresponding to the lower-triangular part of the [sensors x sensors] matrix are useful (the rest are zeros), so you can ignore the other values. This reduces the number of comparisons you need to correct for (if you’re not accounting for this already).
The same goes for averaging over the frequencies in each band before running the stats: it also reduces the number of comparisons.
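The two reductions above can be sketched with plain NumPy. The array `con` below is a simulated stand-in for an all-to-all result of shape (n_sensors * n_sensors, n_freqs) with zeros outside the lower triangle; the sensor count, frequency axis, and band edges are made up for illustration.

```python
import numpy as np

n_sensors = 4
freqs = np.arange(4, 46)  # hypothetical frequency axis in Hz
rng = np.random.default_rng(0)

# Indices of the unique sensor pairs (strictly below the diagonal).
rows, cols = np.tril_indices(n_sensors, k=-1)

# Simulated all-to-all output: only the lower triangle is filled.
dense = np.zeros((n_sensors, n_sensors, freqs.size))
dense[rows, cols] = rng.uniform(-1, 1, size=(rows.size, freqs.size))
con = dense.reshape(n_sensors * n_sensors, freqs.size)

# 1) Keep only the unique pairs: (n_pairs, n_freqs).
con_pairs = con.reshape(n_sensors, n_sensors, -1)[rows, cols]

# 2) Average over the frequencies within each band: (n_pairs, n_bands).
bands = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 30), "gamma": (30, 45)}
con_bands = np.column_stack([
    con_pairs[:, (freqs >= lo) & (freqs < hi)].mean(axis=1)
    for lo, hi in bands.values()
])

print(con.size, "values before,", con_bands.size, "values after")
```

If I remember correctly, the connectivity function can also do the band averaging for you via its fmin, fmax and faverage arguments, so treat the manual averaging here as an illustration of the idea rather than the recommended route.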
Hi Thomas (if I may),
thank you very much for your reply. The input data that I used to compute connectivity has shape (n_epochs, n_signals, n_times).
Yesterday I tried the following workaround. I was using the plot_sensors_connectivity function to plot the connectivity results for each condition. From the documentation I see that, by default, the 20 strongest connections are plotted. I therefore selected the 20 strongest connections for condition 1 and the 20 strongest connections for condition 2, kept only the connections appearing in both conditions, and performed the statistical analysis on them. I do not know whether this could be a good starting point.
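For reference, the selection described above could look roughly like this in NumPy. The helper name `strongest_pairs`, the simulated matrices, and the choice to rank connections by absolute value are assumptions for illustration (imaginary coherence can be negative, so ranking on the raw values would give a different set).

```python
import numpy as np


def strongest_pairs(con, n_con=20):
    """Return the n_con sensor pairs with the largest |connectivity|.

    con : square (n_sensors, n_sensors) matrix; only the lower triangle
    is read. The result is a set of (row, col) index tuples.
    """
    rows, cols = np.tril_indices(con.shape[0], k=-1)
    strength = np.abs(con[rows, cols])
    top = np.argsort(strength)[::-1][:n_con]  # indices of the largest values
    return {(int(rows[i]), int(cols[i])) for i in top}


# Simulated connectivity matrices for the two conditions (10 sensors).
rng = np.random.default_rng(1)
n_sensors = 10
con_cond1 = rng.uniform(-1, 1, size=(n_sensors, n_sensors))
con_cond2 = con_cond1 + rng.normal(0, 0.2, size=(n_sensors, n_sensors))

top_cond1 = strongest_pairs(con_cond1)
top_cond2 = strongest_pairs(con_cond2)
shared = top_cond1 & top_cond2  # connections strong in both conditions
print(len(shared), "shared connections")
```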
So just to preface, I’m not a stats expert. I can’t say for certain where the cutoff is between piloting to identify some relevant features, and a circular analysis to just find any significant p-value (not that I’m suggesting you’re doing that!). Perhaps someone more familiar can weigh in.
But, in terms of getting an idea for what the relevant connections are for your task, I would agree your approach is a sensible way to go.
I am not a “stats expert” either, but I have the same problem as you. For my data, I was thinking of generalizing the hypothesis beyond a comparison between 2 conditions, shaping the question as “is there a particular relationship between my independent variable and my dependent variable?” That allows you to test for this relationship rather than for statistical differences across multiple comparisons. I think this is best if you have no a priori hypothesis, because it’s not obvious that the strongest shared connections will be the statistically significant ones; it could be anything.
Also, if you prefer the clustering strategy, you could compute the connectivity matrix for each patient and flatten the half triangle, which gives you one vector representing the connectivity of each patient. Then you just have to cluster your vectors and see whether 2 clusters emerge that correspond to your conditions.
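A rough sketch of that clustering idea, using only NumPy and a tiny hand-written 2-cluster k-means (Lloyd's algorithm). The helper `two_means`, the simulated matrices, and the group sizes are all made up for illustration; in practice a library implementation of k-means or hierarchical clustering would be the more robust choice.

```python
import numpy as np


def two_means(X, n_iter=50):
    """Minimal k-means with k=2 and a deterministic start."""
    X = np.asarray(X, float)
    # Start from the two observations that are farthest apart.
    pairwise = np.linalg.norm(X[:, None] - X[None, :], axis=2)
    i, j = np.unravel_index(pairwise.argmax(), pairwise.shape)
    centers = X[[i, j]].copy()
    for _ in range(n_iter):
        # Assign each vector to its nearest center, then update the centers.
        dist = np.linalg.norm(X[:, None] - centers[None, :], axis=2)
        labels = dist.argmin(axis=1)
        new_centers = np.array([X[labels == k].mean(axis=0) for k in range(2)])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels


rng = np.random.default_rng(2)
n_sensors = 6
rows, cols = np.tril_indices(n_sensors, k=-1)  # the "half triangle"


def fake_matrix(shift):
    # Simulated connectivity matrix; `shift` raises five of the connections.
    m = rng.normal(0, 0.05, size=(n_sensors, n_sensors))
    m[rows[:5], cols[:5]] += shift
    return m


matrices = [fake_matrix(0.0) for _ in range(8)] + [fake_matrix(0.5) for _ in range(8)]

# One flattened lower-triangle vector per matrix: (16, 15).
X = np.array([m[rows, cols] for m in matrices])
labels = two_means(X)
print(labels)
```

Whether the clusters line up with the conditions is then a descriptive check, not a significance test, so it complements the statistics discussed above without replacing them.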