EEG processing pipeline with autoreject

Hi all. I am working on an approach to automatically process resting-state EEG recordings from many participants before extracting some features. Processing includes:

  1. filtering
  2. epoching
  3. blink artifact removal (I use Fp1 as a proxy for the EOG channel)
  4. bad epoch rejection

Assuming raw is the continuous EEG recording for a participant, I have implemented the following pipeline:

import numpy as np
import mne
from autoreject import AutoReject, get_rejection_threshold

################################################################ Filtering
filter_params = {'l_freq': 1, 
                 'h_freq': 45, 
                 'h_trans_bandwidth': 5, 
                 'fir_design': 'firwin2', 
                 'phase': 'zero-double'}

raw.filter(**filter_params)

################################################################ Epoching
# Create events defining equally spaced non-overlapping epochs
events = mne.make_fixed_length_events(raw        = raw, 
                                      start      = keep_start,
                                      stop       = keep_stop,
                                      duration   = epoch_sec, 
                                      overlap    = 0,
                                      first_samp = False)

# Create Epochs object
epochs = mne.Epochs(raw      = raw, 
                    events   = events, 
                    tmin     = 0,
                    tmax     = epoch_sec - 1/fs,
                    detrend  = 1,
                    baseline = None,
                    preload  = True)

################################################# Blink artifact removal (ICA)
# Fit ICA to epochs w/o considering artifactual epochs
reject = get_rejection_threshold(epochs, random_state = 100)
ica    = mne.preprocessing.ICA(random_state = 100).fit(epochs, reject = reject)

# Find eog epochs from Fp1 channel (first minute)
epochs_eog = mne.preprocessing.create_eog_epochs(raw = raw, ch_name = 'Fp1', reject = reject)

# Identify components with an absolute score > 0.5 as blink-related.
_, eog_scores = ica.find_bads_eog(epochs, ch_name = 'Fp1', measure = 'zscore')
ica.exclude   = np.argwhere(abs(eog_scores) > 0.5).ravel().tolist()

# Apply ICA to epochs
epochs_clean = epochs.copy()
ica.apply(epochs_clean)

##################################################### Bad epoch rejection
epochs_clean = AutoReject(random_state = 100).fit_transform(epochs_clean)

I have some concerns regarding this procedure:

  1. Is it good practice to use AutoReject after applying ICA?
  2. Should I use a fixed threshold (like 100 uV) instead of applying AutoReject?
  3. After all the processing, should I normalize my epochs prior to feature extraction?

Hello @eduardo,

The call to get_rejection_threshold(epochs, ...) above will include eog in the returned reject dictionary – which might not be what you want, as it will lead to the removal of epochs containing blinks. You may wish to pass ch_types='eeg' to retrieve a rejection threshold for the EEG channels only.

Note that reject is simply ignored if you pass epochs to ICA.fit(). This is a little bit hidden in the documentation (and a reminder for me that we should add a clear warning):

If you want to fit ICA on epochs with a peak-to-peak rejection threshold applied, you must drop the bad epochs before ICA, either by calling Epochs.drop_bad() or by passing reject to Epochs() on instantiation.

Yes, this is a good idea. In fact, using autoreject “local” both before and after ICA appears to be a good choice – we were seeing issues with autoreject “global” before ICA in EEG layouts with some electrodes placed close to the eyes, like Fp1/Fp2. The respective tutorial has been updated very recently by @mainakjas:

I think it has not yet been deployed to the autoreject documentation website, therefore I’m linking to the file on GitHub.

That’s what we do in the MNE-BIDS-Pipeline before ICA. However, the problem with fixed thresholds is that … they’re fixed (surprise!), and the same thresholds might not work across all participants in your study. On the other hand, autoreject “global” led to problems in some of our datasets (see my comments above), and running autoreject “local” twice (before and after ICA) can be quite time-consuming. It’s a tradeoff, really.

Yes, depending on your data and specific analysis, you may want to use MNE-Python’s Scaler or scikit-learn’s StandardScaler; see Decoding (MVPA) — MNE 0.24.dev0 documentation for examples of how to use them.
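As a small illustration of the scikit-learn route (the feature values here are random placeholders):

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical feature matrix: one row per epoch, one column per feature
# (e.g. band powers, entropy measures), on wildly different scales.
rng = np.random.RandomState(0)
X = rng.rand(50, 4) * np.array([1e-12, 1.0, 10.0, 100.0])

# Standardize each feature (column) to zero mean and unit variance.
X_std = StandardScaler().fit_transform(X)
print(np.allclose(X_std.mean(axis=0), 0.0))  # True
print(np.allclose(X_std.std(axis=0), 1.0))   # True
```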


Hi Eduardo,

We recently created a tutorial in autoreject with the help of Alex Rockhill, just to address these kinds of questions:

https://872-59485140-gh.circle-artifacts.com/0/html/auto_examples/plot_autoreject_workflow.html#sphx-glr-auto-examples-plot-autoreject-workflow-py

Specific answers:

  1. Yes, autoreject after ICA is the correct order. You can also apply it before, and use the bad epochs detected to improve ICA decomposition. Local autoreject before ICA seems to work better than global autoreject. If you’re lazy, you could also just use a high threshold like 500 uV pre-ICA. I haven’t investigated in detail the impact of these different choices, so make sure to inspect your data.

  2. See 1.

  3. I’m not exactly sure what kind of features you’re talking about. If you intend to do some machine learning, I would normalize the features, yes.

Best,

Mainak


Dear @richard and @mainakjas, I really appreciate your feedback. After reading your comments I think my two best choices are:

  1. Apply autoreject (local), apply ICA, and apply autoreject (local) again.
  2. Drop epochs based on threshold, apply ICA, and apply autoreject (local).

Since time is not an issue for me at the processing stage (for now, at least), I will go with the first option:

# Autoreject (local) epochs to benefit ICA (fit on 20 epochs to save time)
auto_reject_pre_ica   = AutoReject(random_state = 100).fit(epochs[:20])
epochs_ar, reject_log = auto_reject_pre_ica.transform(epochs, return_log = True)
            
# Fit ICA on non-artifactual epochs 
ica = mne.preprocessing.ICA(random_state = 100)
ica.fit(epochs[~reject_log.bad_epochs])

# Exclude blink artifact components (use Fp1 as EOG proxy)
epochs_eog    = mne.preprocessing.create_eog_epochs(raw = raw, ch_name = 'Fp1') 
_, eog_scores = ica.find_bads_eog(epochs_eog, ch_name = 'Fp1', measure = 'zscore')
ica.exclude   = np.argwhere(abs(eog_scores) > 0.5).ravel().tolist()
            
# Apply ICA
epochs_clean = epochs.copy()
ica.apply(epochs_clean)
            
# Autoreject (local) on blink-artifact-free epochs
auto_reject_post_ica = AutoReject(random_state = 100).fit(epochs_clean[:20])
epochs_clean         = auto_reject_post_ica.transform(epochs_clean)

Please, feel free to comment anything else you deem valuable. The tutorial you both pointed me to was really helpful.


Finally, regarding my third question, let me clarify things: I want to extract features (like sample entropy, PSD, fractal dimension, etc.) from the clean epochs. I am aware that, after extracting the features, I need to scale them before passing them to the learning algorithm. My question is: do I also need to standardize the EEG epochs before extracting the features?
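To make the distinction concrete, here is a sketch of what I mean, on synthetic stand-in data (the alpha-band power feature is just one example of the features mentioned above): the features are computed from the epochs as-is, and only the resulting feature matrix is standardized.

```python
import numpy as np
from scipy.signal import welch
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for clean epochs: 30 epochs x 4 channels x 2 s at 250 Hz.
rng = np.random.RandomState(0)
fs = 250
data = rng.randn(30, 4, 2 * fs) * 1e-5

# Feature extraction on the un-normalized epochs: mean alpha-band (8-12 Hz) PSD.
freqs, psd = welch(data, fs=fs, nperseg=fs, axis=-1)
alpha = (freqs >= 8) & (freqs <= 12)
features = psd[..., alpha].mean(axis=-1)  # shape (n_epochs, n_channels)

# Standardize the *features*, not the epochs, before the learning algorithm.
X = StandardScaler().fit_transform(features)
print(X.shape)  # (30, 4)
```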

Thanks again.

Since apply() returns the modified instance, you could simplify this to:

epochs_clean = ica.apply(epochs.copy())

The way you’re phrasing it now, I would say: no. Just be sure to standardize whatever you feed into the ML algorithm.


All clear now, thanks again @richard.