This post builds on my previous discussion from last week (“Preprocessing pipeline - bandpass filtering, Autoreject, and ICA”), in which I asked about the appropriate sequencing of Autoreject and ICA. At the time, I proposed sandwiching ICA between two runs of Autoreject (consistent with what’s suggested on the Autoreject website). Based on feedback from another user, I revised the pipeline to include Autoreject only after ICA. Ultimately, this modification produced high-quality ERP data without sacrificing a substantial number of epochs.
After reflecting further, however, I realized something problematic about this approach: although I initially implemented Autoreject to automate artifact rejection, incorporating MNE’s general ICA method reintroduced a manual inspection step into the pipeline. Of course, even Autoreject requires some oversight to confirm that epochs are not rejected too liberally, but as someone with limited experience identifying artifactual ICs, I began searching for a more guided approach (ideally one in which a trained algorithm could flag likely artifact components for me to review and approve).
This led me to discover ICLabel, which appeared well suited for this purpose.
I subsequently modified my pipeline to accommodate ICLabel's requirements, while also retaining the standard preprocessing steps implemented in previous iterations (i.e., importing raw data, setting an average reference, downsampling from 2048 to 256 Hz, and applying high- and low-pass filters at 0.1 and 30 Hz, respectively). Because ICLabel performs best on data high-pass filtered at 1 Hz, I created a copy of the dataset and applied a 1 Hz high-pass filter solely for the ICA decomposition. The intention was to preserve the slow neural activity in the primary dataset, which is critical given my project’s focus on the Late Positive Potential (LPP).
Next, I epoched both the minimally preprocessed data and the 1 Hz high-passed copy used for ICA. I fit the ICA model to the 1 Hz high-passed epochs, ran ICLabel to classify components, and then applied the resulting ICA solution to the 0.1–30 Hz epochs. Finally, I applied baseline correction and Autoreject for epoch-level cleaning. Below is a brief excerpt of the code from my pipeline (excluding information that is less relevant to my question, such as the event ID dictionary):
#Imports
import mne
import autoreject
from mne_icalabel import label_components

#High- and low-pass filtering
raw.filter(0.1, 30)
#ICA preparation (1 Hz high-passed copy, per ICLabel's requirements)
raw_ica = raw.copy().filter(1., None)
#Generate epochs
epochs = mne.Epochs(
    raw,
    events,
    event_ids,
    tmin=-0.2,
    tmax=1.5,
    baseline=None,  #Apply after ICA
    on_missing='warn',
    preload=True,
    reject_by_annotation=False)
epochs_ica = mne.Epochs(
    raw_ica,
    events,
    event_ids,
    tmin=-0.2,
    tmax=1.5,
    baseline=None,  #Apply after ICA
    on_missing='warn',
    preload=True,
    reject_by_annotation=False)
#Fit ICA (ICLabel expects the extended-infomax variant)
ica = mne.preprocessing.ICA(
    method='infomax',
    fit_params=dict(extended=True),
    random_state=99,
    n_components=0.99)
ica.fit(epochs_ica)
ica.plot_components()
ic_labels = label_components(epochs_ica, ica, method="iclabel")
labels = ic_labels["labels"]
exclude_idx = [idx for idx, label in enumerate(labels) if label not in ["brain", "other"]]
print(f"Excluding these ICA components: {exclude_idx}")
ica.apply(epochs, exclude=exclude_idx)
#Baseline correction
epochs.apply_baseline((None, 0))
#Autoreject
ar = autoreject.AutoReject(n_interpolate=[1, 4, 32], random_state=42,
                           n_jobs=1, verbose=True)
epochs_clean, reject_log = ar.fit_transform(epochs, return_log=True)
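One refinement worth noting: `label_components` also returns per-component probabilities under the `"y_pred_proba"` key, so the exclusion rule above could additionally require a minimum classifier confidence before a component is dropped. A minimal sketch of that logic — the dict below is a mock of ICLabel's output, and the 0.8 threshold is an arbitrary choice of mine, not an ICLabel default:

```python
# Mock of the dict returned by mne_icalabel's label_components
# (a real run yields one label and one probability per component).
ic_labels = {
    "labels": ["brain", "eye blink", "muscle artifact", "other", "eye blink"],
    "y_pred_proba": [0.97, 0.92, 0.55, 0.60, 0.85],
}

# Exclude a component only if it is (a) a non-brain/non-other class and
# (b) classified with at least 80% confidence (hypothetical threshold).
exclude_idx = [
    idx
    for idx, (label, proba) in enumerate(
        zip(ic_labels["labels"], ic_labels["y_pred_proba"])
    )
    if label not in ["brain", "other"] and proba >= 0.8
]

print(exclude_idx)  # component 2 is labeled muscle but at only 55% confidence, so it is kept
```

This keeps borderline components in the data for manual review rather than discarding them automatically.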
Overall, the results from this pipeline exceeded my expectations. The data appeared extremely clean, nearly twice as many epochs were retained as in my previous pipeline, and the entire process completed in a reasonable amount of time.
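For reporting the retention figure, the rejected-epoch count can be read off the reject log (`RejectLog.bad_epochs` in autoreject is a boolean array marking rejected epochs). A small sketch with a mocked array standing in for the real `reject_log.bad_epochs`:

```python
# Mocked stand-in for reject_log.bad_epochs (True where an epoch was
# rejected); real values come from ar.fit_transform above.
bad_epochs = [False, False, True, False, False, False, True, False]

n_total = len(bad_epochs)
n_kept = n_total - sum(bad_epochs)
print(f"Retained {n_kept}/{n_total} epochs ({100 * n_kept / n_total:.1f}%)")
```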
My primary question concerns the relative absence of examples combining Autoreject and ICLabel within the same preprocessing pipeline. I’ve spent hours searching the Internet and have not found documentation or tutorials demonstrating this approach. Is there a methodological reason these tools are not typically implemented together (e.g., an overreliance on automation), or is this simply an unexplored combination in the MNE/Autoreject examples?