Number of components for ICA (and processing)

Hello,

A few questions about ICA and the number of ICs to use.


First a brief summary of how ICA works as I understand it (from my reading, your excellent tutorials and your help on the forum):

ICA estimates signals from independent sources from a set of recordings. In other words, ICA isolates particular signals (blinks, muscle activity) that are statistically independent and non-Gaussian. For example, it is possible to separate a voice (analogous to an EEG artifact) from a musical recording made by several microphones (analogous to EEG channels).

First, linear correlations are removed from the data to facilitate the separation of independent sources, a step called “whitening” which makes the dimensions (channels) statistically independent of each other. PCA is used to reduce the dimensionality of the data, while preserving as much variance as possible. If n_pca_components = None, all dimensions are preserved (e.g., 64 dimensions for 64 EEG channels, ref here).

Next, ICA decomposes the data into n Independent Components (ICs) specified by the user through the n_components argument. Manually or automatically, the ICs to be excluded are selected using the IC topographies and plots.

Finally, the data is reconstructed from the original data, excluding undesirable ICs.
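The reconstruction step described above can be sketched with plain NumPy. This is only an illustration under stated assumptions: the sources, the mixing matrix and the choice of the first component as the "blink" are all hypothetical, and the unmixing matrix is taken as the exact inverse of the mixing matrix rather than estimated by an ICA algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 1000

# Three hypothetical non-Gaussian sources; the first plays the role of a blink.
sources = rng.laplace(size=(3, n_samples))

# Hypothetical mixing matrix (channels x sources). In practice ICA estimates it.
mixing = np.array([[1.0, 0.5, 0.2],
                   [0.3, 1.0, 0.4],
                   [0.6, 0.1, 1.0]])
channels = mixing @ sources

# Assumption: a perfect unmixing matrix, standing in for a fitted ICA.
unmixing = np.linalg.inv(mixing)
components = unmixing @ channels

# Zero out the unwanted component, then project back to channel space.
exclude = [0]
kept = components.copy()
kept[exclude] = 0.0
cleaned = mixing @ kept
```

In this idealized case, `cleaned` contains only the contribution of the two remaining sources to each channel. With a real ICA fit the separation is approximate.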


How do I choose the optimal number of ICs for EEG data? I read in the forum that you start by using the same number of components as EEG channels, and then if necessary you gradually reduce it and observe the results. According to the ICA function doc, for MEG (but not EEG) data it’s common to use n_components < n_channels (e.g., n_components = 40 for 306 channels) to find significant artifacts only, while reconstructing the data using the 306 PCA components.

I understand that too many ICs can isolate noise as an independent source and/or lose sources of interest by over-splitting them; while too few ICs could miss important sources. Is this reasoning correct?

→ Concretely with my data, I first performed a PCA for each subject to determine the minimum number of components needed to explain more than 80% of the cumulative variance (mean 41.5, range 31-47). Then, I performed ICA on my epochs (because I have long epochs of 9.4 s, which leaves a lot of data, and a 1 Hz filter is recommended) with 40 and 64 ICs to confirm that 40 components are sufficient to detect blinks (and to save computational resources and time). ICA works very well in both cases.
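For reference, the "minimum number of components explaining more than a given share of the variance" criterion can be sketched as follows. This is a NumPy-only illustration, not the code actually used for the analysis above: the function name `n_components_for_variance` and the toy data are hypothetical.

```python
import numpy as np


def n_components_for_variance(data, threshold=0.8):
    """Smallest number of principal components whose cumulative explained
    variance exceeds `threshold`. `data` has shape (n_channels, n_samples)."""
    centered = data - data.mean(axis=1, keepdims=True)
    # Squared singular values are proportional to the variance of each PC.
    singular_values = np.linalg.svd(centered, compute_uv=False)
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    cumulative = np.cumsum(explained)
    # Index of the first cumulative value strictly greater than the threshold.
    first_above = np.searchsorted(cumulative, threshold, side="right")
    # Clamp in case rounding keeps the last cumulative value below threshold.
    return int(min(first_above + 1, cumulative.size))


# Toy data (hypothetical): four channels with very different variances.
rng = np.random.default_rng(0)
stds = np.array([10.0, 5.0, 1.0, 0.1])
demo = stds[:, None] * rng.standard_normal((4, 10_000))
n_needed = n_components_for_variance(demo, threshold=0.9)
```

The result depends only on how the variance is distributed across components, so it says nothing about whether a given component is an artifact or brain signal.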

Given that in my situation the ICA is performed only to detect eye blinks, which are significant artifacts, and that the signal reconstruction uses the residual PCA components, what do you think about this approach using n_components = 40? What would you recommend?

Please find attached an illustration of my data with ICA.

Thanks for your clarifications, always very instructive.

Johan

Simply leave the default of n_components=None, which will pick as many components as your data’s rank.
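As a rough illustration of why the rank can differ from the number of channels, here is a NumPy-only sketch. The average reference is just one example of a rank-reducing step and is an assumption here, not something established about the data in this thread. The channel and sample counts are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels, n_samples = 8, 1000

# Hypothetical full-rank recording: one row per channel.
data = rng.standard_normal((n_channels, n_samples))
rank_before = np.linalg.matrix_rank(data)

# Average reference: subtract the mean across channels at every sample.
# The channels now sum to zero, which removes one degree of freedom.
referenced = data - data.mean(axis=0, keepdims=True)
rank_after = np.linalg.matrix_rank(referenced)
```

Here `rank_after` is one less than `rank_before`, so at most `n_channels - 1` components can be meaningfully estimated from the re-referenced data.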

Best wishes,
Richard


Just a minor correction, whitening does not make “the dimensions (channels) statistically independent of each other” – it merely decorrelates the channels.

Otherwise, your approach sounds reasonable and I agree with @richard to just compute all ICs, since your application is finding ocular ICs. These will always be among the first few ICs, so it doesn’t really matter if you compute all ICs or only e.g. 20.

I would not, however, reduce (compress) your data with PCA prior to ICA, but rather keep all PCA components, because this might negatively impact ICA decomposition (it might be necessary in specific situations though).
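The distinction between decorrelated and independent can be checked numerically. The sketch below is a toy example with assumed values: two uniform sources and a hypothetical mixing matrix. It whitens the mixed signals with PCA and then looks at a higher-order statistic that whitening does not touch.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 100_000

# Two independent, non-Gaussian (uniform) sources with unit variance.
sources = rng.uniform(-np.sqrt(3), np.sqrt(3), size=(2, n_samples))

# Hypothetical mixing: a 45-degree rotation followed by unequal scaling.
angle = np.pi / 4
rotation = np.array([[np.cos(angle), -np.sin(angle)],
                     [np.sin(angle), np.cos(angle)]])
mixing = np.diag([2.0, 1.0]) @ rotation
channels = mixing @ sources

# PCA whitening: rotate onto the principal axes and rescale to unit variance.
centered = channels - channels.mean(axis=1, keepdims=True)
eigvals, eigvecs = np.linalg.eigh(np.cov(centered))
whitened = np.diag(eigvals ** -0.5) @ eigvecs.T @ centered

# Second-order statistics: the covariance is now the identity matrix.
whitened_cov = np.cov(whitened)

# Higher-order check: the squared signals are still clearly correlated,
# so the whitened signals are uncorrelated but not independent.
corr_of_squares = np.corrcoef(whitened[0] ** 2, whitened[1] ** 2)[0, 1]
```

After whitening, the remaining dependence is a rotation of the original sources. Finding that rotation is the part ICA has to solve using non-Gaussianity.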


@richard and @cbrnr, thanks for your comments!

I note your recommendation and the minor correction about whitening. I will therefore use n_components = None, to use all available components.

You’re right about the PCA. My idea was simply to determine, prior to the ICA, a minimum number of components based on some justifiable logic. The PCA itself was not used to reduce the data: all the data were used for the ICA fit, as you recommend.

Thanks,
Best wishes,

Johan

