Data stratification when applying LCMV beamformers for decoding

Hi all,

I’m planning to apply decoding analyses to source estimates and could use some advice on best practices. My current approach involves applying LCMV beamformers to single-subject evoked data. So far, I’ve been estimating spatial filters based on task-related time points from single-trial data, creating one set of filters per subject.

Next, I want to decode ConditionA from spatiotemporal patterns in a specific ROI. I plan to perform the decoding separately for ConditionC1 and ConditionC2, then compare the decoding performance between these two conditions.

My question is about data stratification. From what I understand, I need to ensure my data is stratified before estimating the spatial filters to avoid bias. How strict does the balancing need to be?

Currently, I have two factors: the one whose levels I want to compare (ConditionC1 and ConditionC2), and the one I want to decode (ConditionA). Do I need to ensure that each combination of these factors appears equally often? Is one more crucial to balance than the other?

I’m concerned that if I start decoding other conditions, it could get tricky, as I might need to exclude many trials to maintain a fully balanced dataset. Would it be better in this case to create a different set of spatial filters for each analysis? For example, one set of filters for comparing decoding of ConditionA between ConditionC1 and ConditionC2, with balanced combinations of Conditions A and C, and another set for comparing decoding of ConditionB between ConditionC1 and ConditionC2, with balanced combinations of Conditions B and C?
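
To make concrete what I mean by "balanced combinations", here is a minimal sketch of the per-analysis trial selection I have in mind. It uses only the Python standard library; the function name `balance_combinations` and the label values (`a1`, `c1`, ...) are hypothetical placeholders, not anything from MNE or from my real data. It randomly subsamples trials so that every combination of the two factors that occurs in the data ends up with the same number of trials.

```python
import random
from collections import defaultdict


def balance_combinations(labels_decode, labels_compare, seed=0):
    """Return sorted trial indices with equal counts per label combination.

    Hypothetical helper: only combinations that actually occur are balanced,
    so a combination with zero trials is not detected here.
    """
    if len(labels_decode) != len(labels_compare):
        raise ValueError("label lists must have the same length")

    # Group trial indices by (decoded label, compared label).
    groups = defaultdict(list)
    for idx, combo in enumerate(zip(labels_decode, labels_compare)):
        groups[combo].append(idx)

    # Every combination is cut down to the size of the rarest one.
    n_keep = min(len(indices) for indices in groups.values())

    rng = random.Random(seed)  # fixed seed so the selection is reproducible
    kept = []
    for combo in sorted(groups):
        kept.extend(rng.sample(groups[combo], n_keep))
    return sorted(kept)


# Hypothetical per-trial labels for one subject (placeholders only).
cond_a = ["a1", "a1", "a2", "a2", "a1", "a2", "a1", "a1"]
cond_c = ["c1", "c2", "c1", "c2", "c1", "c1", "c2", "c1"]

keep = balance_combinations(cond_a, cond_c)
```

With separate filters per analysis, I would call this once with the ConditionA and ConditionC labels and once with the ConditionB and ConditionC labels, then use each index list to select epochs (for example `epochs[keep]`) before computing the data covariance and the LCMV filters. I assume MNE's `Epochs.equalize_event_counts` could serve a similar purpose, but I have not checked how it handles combinations of two factors.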

Any insights or recommendations would be appreciated!

MNE version: 1.7
OS: Ubuntu 22.04