AutoReject as a preprocessing step

Hej all,

I am trying to shape my preprocessing steps for data recorded with 64 Ag/AgCl BrainProducts active electrodes (3 eye electrodes: 2 at the outer canthi and 1 under the left eye). The stimuli were shown in the center of the screen, so I don’t have many saccades, but I do have blinks. To deal with bad trials, I’d like to mimic the example from @mainakjas, which suggests running local autoreject before and after ICA.

It seems that the number of trials interpolated or rejected depends heavily on the parameter values (e.g., the number of channels to interpolate, the random seed). How should one decide on these? What would be good practice?

Hi @gozemturan! I have had the same questions and, from experience, have a few suggestions.

First, the random seed (the random_state argument in autoreject) is only there for reproducibility, so you can pick any number and stick with it. That way, if I pick the same number, I should be able to get exactly your results.

Second, the number of channels to interpolate is crucial. I’m not a big fan of the default ([1, 4, 32]; perhaps I should make a PR to change it) because you would never want to interpolate 32 of 64 channels. That would be madness :slight_smile: and not a very good analysis. I suppose it is useful to know that when a high number of channels is being interpolated, you may need to rethink your setup and get better-quality data. I generally use something like [1, 2, 3, 4, 5, 6]. If you’ve got all day, you could go up to 10 or 20, since it only costs computation time, but again, I’m not sure I would trust an analysis of data with that many channels interpolated. That’s the key thing for me: you want to remove bad epochs, and save epochs by interpolation where there is just a pop or something weird going on in one channel, but you don’t want autoreject to fundamentally change your data. As a side note, I think [1, 4, 32] is fine for running the examples.

There isn’t a perfect consensus on how many channels are okay to interpolate, and I think there might never be, because the signal-to-noise ratio of setups differs quite substantially, as does the use of the data. For a simple analysis of a cordless EEG with 10 channels and a moving participant, you might just want to throw out the worst of the worst and keep the number of interpolations really low. For really exact timing with lots of trials and a participant sitting deathly still, you might want to throw out all bad trials and not interpolate anything. For a more average setup like the one you’re describing, 3 to 5 would be acceptable, and beyond that I’d start throwing out epochs instead and sacrifice some trials. Hope that helps!
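
For reference, a minimal sketch of what a conservative setup along these lines might look like. It assumes the autoreject package's AutoReject class with its n_interpolate and random_state arguments and an already-epoched mne.Epochs object. The function name run_local_autoreject and the seed value 42 are placeholders for illustration, not anything prescribed by the package.

```python
# Candidate values for the maximum number of channels to interpolate per
# epoch; deliberately capped well below the 32 in the default grid.
N_INTERPOLATE = [1, 2, 3, 4, 5, 6]

# Arbitrary seed (an assumption for this sketch); any fixed integer works,
# it only has to stay the same between runs.
RANDOM_STATE = 42


def run_local_autoreject(epochs):
    """Fit local autoreject on an mne.Epochs object and return cleaned epochs.

    Hypothetical helper: third-party imports are kept inside the function so
    the settings above can be inspected without autoreject installed.
    """
    import numpy as np
    from autoreject import AutoReject

    ar = AutoReject(
        n_interpolate=np.array(N_INTERPOLATE),
        random_state=RANDOM_STATE,
    )
    # return_log=True also returns the reject log, which records which epochs
    # were dropped and which channels were interpolated in each epoch.
    epochs_clean, reject_log = ar.fit_transform(epochs, return_log=True)

    n_dropped = int(reject_log.bad_epochs.sum())
    print(f"Dropped {n_dropped} of {len(epochs)} epochs")
    return epochs_clean, reject_log
```

Checking the returned reject log (how many epochs were dropped, how many channels were repaired) is probably the easiest way to judge whether the chosen grid is reasonable for a given dataset.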

Also, it kind of got buried in that long answer, but the main tradeoff with the number of channels interpolated is the number of trials rejected: if you interpolate more channels, you can save more trials. But should you? I think you should be clear and reasonable and not interpolate any more than you need to, but throwing out trials over a single electrode pop is a bit of a waste as well. Anyway, it’s about balancing.
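
To make the tradeoff concrete, here is a toy sketch with made-up numbers. It is a deliberate simplification and not how autoreject works internally (autoreject picks its parameters by cross-validation): here an epoch is simply dropped whenever its count of bad channels exceeds the interpolation cap. The function epochs_kept and the list bad_counts are hypothetical.

```python
def epochs_kept(bad_channels_per_epoch, max_interpolate):
    """Count epochs that survive when at most max_interpolate channels
    may be repaired per epoch (toy rule, not autoreject's algorithm)."""
    return sum(1 for n_bad in bad_channels_per_epoch if n_bad <= max_interpolate)


# Made-up number of bad channels in each of 10 epochs.
bad_counts = [0, 0, 1, 0, 2, 7, 1, 0, 4, 12]

for cap in (0, 1, 4, 32):
    kept = epochs_kept(bad_counts, cap)
    print(f"cap={cap:2d}: keep {kept}/{len(bad_counts)} epochs")
# cap= 0: keep 4/10 epochs
# cap= 1: keep 6/10 epochs
# cap= 4: keep 8/10 epochs
# cap=32: keep 10/10 epochs
```

Raising the cap always keeps more epochs, but the extra epochs gained at high caps are exactly the ones where a large share of the data would be interpolated rather than measured.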

Many thanks, @alexrockhill, for your detailed response and for sharing your experience! It was very helpful! I think I will go with low numbers rather than the default values. And yes, as you said, it is a tradeoff, and one should think about whether to reject the trials or interpolate them.
Thanks again! : )
