Recommended preprocessing pipeline for 32-channel eyes-closed resting EEG before PSD analysis

Greetings!,

I am working with EEG data recorded from 32 channels (Brain Products) during eyes-closed, guided-meditation (Yoga Nidra) in a supine condition. The dataset includes 12 subjects. For each recording, the beginning 5 minutes are an eyes-closed resting state, and we plan to perform PSD analysis using the Multitrapper method.

I would like to confirm whether the following preprocessing pipeline is reasonable and in the correct order for this kind of analysis:

  1. Crop/remove non-experimental data
  2. Annotate bad segments / obvious artifacts, e.g., coughing, jaw clenching, swallowing, etc.
  3. Apply notch filtering (50 Hz and 100Hz) to remove line noise and band-pass filtering using appropriate low (0.1 Hz) and high cutoffs (135 Hz)
  4. Identify bad channels based on visual inspection, PSD (i.e. channels that have extremely low or high PSD compared to normal channels), variance, impedance cutoff either ≤ 5 kΩ or ≤ 20 kΩ, and any other criteria you would recommend
  5. Run ICA and remove artifact-related components
  6. Apply average reference (the online recording reference was FCz)
  7. Interpolate bad channels
  8. Epoch the data
  9. Reject epochs exceeding ±100 µV

I would specifically like your advice on the following:

  • Is this overall pipeline appropriate for PSD analysis in MNE-Python?
  • Should the average reference be applied before or after ICA in this case?
  • Should bad channels be interpolated before or after average reference?
  • Is there anything else I should consider when marking a channel as bad, besides visual inspection, PSD, variance, and impedance?
  • For this type of EEG dataset, is epoch rejection at ±100 µV a reasonable threshold, or would you recommend a different criterion?

Additional question regarding electrode impedance:

  • During acquisition, we aimed to keep electrode impedances below 20 kΩ. However, in one subject, three channels had impedances above 20 kΩ, while all other channels and subjects were below this threshold.

    When visually inspecting the EEG data, these channels do not appear obviously noisy, and their PSD also seems reasonable.

    My questions are:

    1. Is it necessary to apply the same impedance cutoff (e.g., ≤20 kΩ) consistently across all subjects during preprocessing?

    2. If a channel exceeds the impedance threshold but the EEG signal appears acceptable upon visual inspection and PSD analysis, should the channel necessarily be marked as bad?

    3. What is the standard practice in EEG preprocessing regarding impedance thresholds recorded during acquisition versus decisions based on the actual signal quality?

    4. How much weight should be given to recorded impedance values compared with signal characteristics such as visual inspection, variance, PSD, and neighboring channel behavior when deciding whether to reject a channel?

I would appreciate any guidance, corrections, or references to best practices.

Thank you.

Hi Vicky! Overall, I think your pipeline is very well thought out and largely appropriate, but I have some comments that you might want to consider.

First, you should apply average reference before ICA (see here). If you marked bad channels, these will be automatically excluded. Second, I recommend interpolating channels after the average referencing step. It is not going to be a huge difference, but if you interpolate before average reference, you slighty overcount the surrounding channels. Regarding your questions on marking bad channels and epoch rejection, I think these are good defaults, but you might have to tweak the ±100 µV threshold based on the actual data.

I cannot say much about the role of electrode impedances in rejection criteria, since I usually (almost never, OK, never) have these measurements. Even if I had impedances, I wouldn’t give them too much weight unless they are extremely large. However, since it would be nice to consistently apply rejection criteria, maybe increasing the cutoff to e.g. ≤50 kΩ (or 30/40?) would make sense?

Do be careful about rejecting channels based on their PSD, if the PSD is the thing you wish to study and draw conclusions on. You may be tempted to eliminate channels because their PSD does not agree with your hypothesis. Visual inspection is a great catch-all for eliminating bad channels, if you have the time to do it.

Thank you for the clarification.

Just to confirm, is the following ordering what you would recommend?

  1. Mark/remove bad channels
  2. Apply average reference
  3. Run and apply ICA
  4. Interpolate bad channels

I noticed that another MNE discussion @richard suggested that average referencing should be done after ICA (See here), since doing it before ICA can smear artifacts across channels. I wanted to confirm which ordering you recommend for this case.

Thank you.

Yes, that’s what I (and the EEGLAB people) would recommend, and I disagree with @richard here. ICA should have no problem separating an artifact smeared over many channels due to average reference – that’s literally the point of ICA.

Thank you, that is helpful. I agree that PSD should not be the only criterion for rejecting a channel, especially since PSD is also the main quantity we want to analyze. In my case, I use the PSD only as a screening step: when one channel looks clearly different from the others, I then inspect the raw data and its time-domain behavior before deciding whether to mark it bad.

For example, in the attached plot (low-pass, high-pass and notch filters already applied to the data), TP9 appears to be a clear outlier compared with the other channels, even though its impedance was below 5 kΩ. I often notice this pattern in my data for TP9/TP10, so I suspect these temporal/ear channels may be more prone to contamination from local muscle activity? Since the raw signal also looks abnormal, I exclude the channel based on the raw inspection rather than the PSD alone.

I have also attached two raw-data examples from the same recording.

In the first image, TP9 shows a large transient/high-amplitude deflection that appears quite different from the surrounding channels. In the second image, however, TP9 appears much more similar to the neighboring channels and does not look obviously abnormal.

Given these observations, I am unsure how to interpret the elevated PSD at TP9. Would you consider it more likely that the increased power is driven by intermittent artifacts (e.g., movement, electrode instability, temporal muscle activity, etc.) that occur during parts of the recording, rather than genuine cortical activity?

More generally, when a channel appears as a PSD outlier but only shows occasional abnormal events in the raw signal while looking relatively normal during the rest of the recording, would you typically mark the entire channel as bad, or would you instead annotate/reject the affected segments and retain the channel?

I would be interested in hearing how others would approach this situation.

Yeah, that’s a good approach – if a channel is a clear outlier (either way above or below “normal” channels), I think it is pretty safe to remove it.

Here’s an example PSD from one of my own datasets:

There’s one channel way below the others, and another one way above the others, so I removed both of them. There’s also a third channel with unusually high power, which I also removed, but that’s debatable (and not the point here, as this is a 64-channel recording, so it is much easier to discard a few channels than in a 32-channel recording).

Regarding whether to remove the entire channel or mark artifact segments, I usually discard the entire channel if its activity is clearly bad during most of the recording. If there are just a few artifact segments, I prefer mark those, but usually that’s not the case, and it is sometimes not easy to tell why a channel is clearly different from others in the PSD, but looks “normal” in the time course. In that case, I’d go over the entire signal and try to find those artifacts.

The rather “wavy” appearance of TP9 in your PSD (especially at higher frequencies) makes me think that you probably missed a large-ish artifact that’s causing that PSD (it could be that big deflection you showed us). If that’s the case, you can mark it with an annotation that starts with “bad” (I use “bad_segment”), and then recompute the PSD (as it will discard annotations starting with “bad” by default). If the PSD then looks better, you can keep the channel. If not, then I’d simply discard it and not investigate further.

Thank you very much @cbrnr for the detailed and prompt response. I really appreciate it.

One more question: would you recommend preprocessing each subject separately by hand, or using a single loop-based script with the same general pipeline for all subjects, plus subject-specific QC steps such as bad-channel marking, annotations, and ICA component rejection?

Thanks for the insight! That makes a lot of sense regarding ICA’s mathematical ability to untangle artifacts even if they are smeared by an average reference.

Interestingly, I came across a paper that used the EEGLAB toolbox where the researchers took the exact opposite approach. They handled bad channels and ran ICA before applying the common average reference. Here is the snippet from their methods section:

‘…To remove ocular and muscle artifacts, Independent Component Analysis (ICA) was applied… After artifact removal and channel interpolation, the cleaned EEG was re-referenced to the common average across all channels.’ DOI: [https://doi.org/10.3390/bs15091213\].

It seems like even within the EEGLAB and broader EEG community, the exact pipeline sequence varies quite a bit. Why is there no absolute standard guideline for this specific order?

If we do choose to follow your recommendation and apply an average reference before running ICA, we must account for the resulting rank deficiency in our code. While simply marking channels as ‘bad’ drops them from the data matrix without losing rank for the remaining channels, operations like average referencing or interpolation introduce a linear dependency that creates true rank deficiency.

In MNE-Python, running ICA with default settings on average-referenced data can cause the algorithm to struggle or split components arbitrarily because it expects full rank. To fix this, we have to explicitly calculate the true data rank of our remaining good channels and reduce the number of components by 1:

# we subtract 1 to handle the rank deficiency caused by the average reference

data_rank = mne.compute_rank(raw_filtered)
n_components = data_rank['eeg'] - 1

#Initialize ICA with the corrected component count

ica = mne.preprocessing.ICA(n_components=n_components, method='picard', random_state=42)
ica.fit(raw_filtered)

Is the necessity of manually managing rank settings like this one of the main reasons some researchers lean toward doing the average reference after ICA instead, just to keep the data at full rank during the decomposition?

If the only purpose of ICA is to remove ocular (or other artifact) components, meaning that you compute the ICA, zero out some components, and immediately transform back to channel space, then the reference does not matter at all. The recommendation to perform re-referencing before ICA is only important if you plan to continue working with sources, i.e., the unmixing matrix, which has been computed with a specific reference. So if you know what you are doing, you can also re-reference after ICA (see e.g. here).

Another point to consider is that even though the reference should not influence ICA decomposition, many people still recommend using average referenced data because it is straightforward to compute and interpret (see e.g., here, here, and here).

Regarding data rank, you are correct that average referencing reduces the rank by 1. In most cases, MNE should detect the data rank correctly, so no manual intervention is required, but to be on the safe side you can pass n_components as you proposed. However, if you also marked and interpolated bad channels, the rank will be even lower (it is based on the number of good channels), so you would have to adjust it like this:

n_components=raw.info["nchan"] - n_bads - 1

Here, n_bads is the number of bad channels that have been interpolated. That’s what I’ve been doing in my pipelines.