How many ICA components represent blink artifacts?

Hi all. I have a bandpass filtered (1-48 Hz) Epochs object (60 epochs, 16 channels, 512 samples) and I am trying to implement automatic blink artifact correction using ICA. Since I have only eeg channels, I am using correlation of ICA sources with Fp1 channel to find the components that represent blink artifact and then exclude these components. This is the code I am using:

# Reject bad epochs based on peak-to-peak amplitude (for huge artifacts)
epochs.drop_bad(reject = {'eeg': 1000e-6})
            
# Preallocate reconstructed epochs
epochs_recons = epochs.copy()
            
# Fit ICA to epochs
ica = mne.preprocessing.ICA(random_state = 100)
ica.fit(epochs)
            
# Find blink artifact component, exclude it, and reconstruct epochs
 _, eog_scores = ica.find_bads_eog(epochs, ch_name = 'Fp1', measure = 'zscore')
ica.exclude = list([np.argmax(abs(eog_scores))])
ica.apply(epochs_recons)

# Reject bad epochs based on peak-to-peak amplitude (for smaller artifacts)
epochs_recons.drop_bad(reject = {'eeg': 100e-6})

The thing is, I have noticed by looking at ica.plot_components(), that for some recordings there appear to be only an ICA component catching blink artifacts (Figure 1), while for others there are two (Figure 2). The code I shared above only excludes the ICA component that shows the highest correlation with Fp1, so, my question is the following: In cases like the one shown in Figure 2, should I exclude the first two components to get rid of the blink artifacts or would that increase the risk of removing valuable brain activity information and thus, it is preferable to remove only one ICA component?

Figure 1: only one ICA component seems to catch blink artifact activity

Figure 2: two ICA components seem to catch blink artifact activity

Thanks in advance,
Eduardo

That’s also something I have experienced. Sometimes there’s just one IC, and sometimes there’s two or even three.

In your specific example in Figure 2 there are two ICs that look almost the same (except for polarity). This could be due to almost rank-deficient data that enters the ICA. You might want to try to reduce PCA dimensionality by one before doing ICA and see if that changes anything.

In addition, I’m actually wondering whether ICA002 in Figure 1 and ICA004 are ocular components. They could very well represent horizontal eye movement, but you’d have to double-check the IC time course and PSD.

1 Like

Hey @cbrnr, thanks for commenting. For the particular subject in Figure 2, using 15 components instead of 16 produces the following topo map:

Nonetheless, since I am trying to implement automated blink artifact removal, I do not want to select the number of components manually. Should I then proceed with 16 components and remove ICA000 and ICA001 in Figure 2 to get rid of the blink artifact? I am worried I could remove meaningful EEG content.

This looks better, the duplicate ICs are now likely combined in ICA000 (see here for a more detailed explanation).

Is your data average referenced? Then you should definitely remove one PC. If not, you might want to try a different ICA algorithm (I recommend Picard). I still think that ICA003 could be an ocular component though.

1 Like

You should not simply remove a fixed, pre-specified selection of components.

Instead, you can use our automated EOG artifact detection, as is described in the ICA tutorial

Regarding how many components to fit – you can try to pass n_components=0.99 to ICA; this will only fit as many components as are required to explain 99% of variance. Any remaining dimensions of the data will be ignored during the ICA fit. This could potentially help.

1 Like

Hey @richard, thanks for stopping by. As you can read in the opening comment, I am not removing a fixed number of ICA components. I am using correlation between ICA sources and Fp1 EEG channel (I do not have eog channels) to find the component that best captures the blink artifact. So it is an automated procedure for each subject.

Regarding your comment about n_components and explained variance, I will consider it. Thanks again.

1 Like

I’m sorry, I initially missed the part where you’re using ica.find_bads_eog(). This is good practice, then.

However, it appears you’re running this artifact detection on the same set of epochs you wish to clean, and I assume these are time-locked to an experimental event?

We recommend creating epochs that specifically capture EOG artifacts and pass those to ica.find_bads_eog() instead. These epochs can be created via mne.preprocessing.create_eog_epochs() This will typically produce better results than using your “regular” epochs.

I created my epochs from a resting state recording using mne.make_fixed_length_events.

I think I get your point @richard . Correct me if I am wrong, the idea behind passing the epochs found by mne.preprocessing.create_eog_epochs() to ica.find_bads_eog() instead of passing my “regular” epochs is that ICA will have an “easier job” finding which components represent eog artifacts because the epochs contained in mne.preprocessing.create_eog_epochs() have been already identified as artifactual as compared to my “regular” epochs. The code could be then modified like this:

# Reject bad epochs based on peak-to-peak amplitude (for huge artifacts)
epochs.drop_bad(reject = {'eeg': 1000e-6})
            
# Preallocate reconstructed epochs
epochs_recons = epochs.copy()
            
# Fit ICA to epochs
ica = mne.preprocessing.ICA(random_state = 100)
ica.fit(epochs)

# Identify epochs containing eog artifacts
epochs_eog = mne.preprocessing.create_eog_epochs(raw, ch_name = 'Fp1')
            
# Find blink artifact component, exclude it, and reconstruct epochs
 _, eog_scores = ica.find_bads_eog(epochs_eog , ch_name = 'Fp1', measure = 'zscore')
ica.exclude = list([np.argmax(abs(eog_scores))])
ica.apply(epochs_recons)

# Reject bad epochs based on peak-to-peak amplitude (for smaller artifacts)
epochs_recons.drop_bad(reject = {'eeg': 100e-6})

I suspect the results will be pretty similar since I checked the component my previous code identified as eog artifact and it looked correct in terms of time series and topography. However, I will apply your suggestion into my pipeline since you think is best practice.

1 Like