What happens when you give a "fake" or interpolated channel to an ICA algorithm (and especially the MNE one)?

  • MNE 1.3.1
  • Windows 10

Hello,

Some context: I am working with a 4-channel Muse headset. I tried an ICA and obtained 4 components that are mixtures of artifact and brain signals. Each component seems to contain too much brain signal for me to discard it.

From there I wondered: what if you artificially add a new channel (for example via interpolation) in order to obtain an additional component from the ICA? It looks wrong, because you do not add any new information for the ICA, so the components will still be a “blending” of the same channels. But you will surely get a new component and “reroll” the “mix percentage” of all the components (and maybe get one that is more distinctly an artifact or a brain signal?).

My question is: can artificially adding a channel (via interpolation, for example) before performing ICA give interesting results, or must all the resulting components be considered bad and not be looked at? How will the MNE ICA function handle these cases? If it is wrong, what is the theory behind it?
And lastly, what would change if the added artificial channel is:

  • an interpolation or a linear combination from another channel
  • an exact copy of another channel's time signal
  • an exact copy of another channel with white noise added
  • a sliding-window polynomial approximation of another channel, i.e. the output of a Savitzky-Golay filter applied to that channel
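
To make the four cases concrete, here is a minimal NumPy sketch (not MNE code) that builds each kind of artificial channel on synthetic data and checks the rank of the resulting 5-channel matrix. The data, the mixing matrix, and the moving-average filter (used as a simple stand-in for a Savitzky-Golay smoother) are all assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples = 5000

# Hypothetical stand-in for a 4-channel recording: 4 mixed non-Gaussian sources.
sources = rng.laplace(size=(4, n_samples))
mixing = rng.normal(size=(4, 4))
data = mixing @ sources  # shape (4, n_samples)


def with_extra_channel(extra):
    """Stack one artificial channel under the 4 real ones."""
    return np.vstack([data, extra])


# Moving average as a simple stand-in for a Savitzky-Golay smoother.
kernel = np.ones(11) / 11
fake_channels = {
    "linear combination": 0.5 * data[0] + 0.5 * data[1],
    "exact copy": data[0].copy(),
    "copy + white noise": data[0] + rng.normal(scale=0.5, size=n_samples),
    "smoothed copy": np.convolve(data[0], kernel, mode="same"),
}

# Rank of the 5-channel matrix for each kind of artificial channel.
ranks = {name: np.linalg.matrix_rank(with_extra_channel(ch))
         for name, ch in fake_channels.items()}
for name, rank in ranks.items():
    print(f"{name:20s} rank = {rank} (of 5 channels)")
```

In this sketch, the copy and the linear combination leave the matrix rank-deficient, while the noisy and the smoothed copies make it full rank: a temporal filter is not an instantaneous linear combination of the channels, and the added noise is a new (if useless) source.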

I quickly tried these cases with MNE ICA to see how it behaves, and here are some empirical observations:

  • It always worked (no error), even for the copied channel + white noise (even though it breaks the ICA assumption that each source signal has a non-Gaussian distribution) and for the polynomial approximation (does this one break the stationarity assumption?). For the channel + white noise, it successfully gave me a component that looks like the added white noise source.

  • For the first 2 tests, it gave me the variance warning below, but not for the white noise or the polynomial approximation:
    RuntimeWarning: Using n_components=5 (resulting in n_components_=5) may lead to an unstable mixing matrix estimation because the ratio between the largest (2.3) and smallest (1e-24) variances is too large (> 1e6); consider setting n_components=0.999999 or an integer <= 4

  • In each group of 5 ICs resulting from these tests, the same 2 components seem to appear across experiments, and they look correlated with each other.

First, ICA will not be able to separate brain sources from only four channels. In my experience, you need at least 32 channels, if not 64. If your goal is to “just” find ocular sources, then fewer electrodes (around 20) also work often, but definitely not four.

Adding artificial channels (via interpolation) before ICA won’t work as you expect (or hope), because you are not adding new information. If anything, you are making it more difficult for the algorithm to work, because the data matrix is now rank deficient. Usually, this is not problematic, because the first step of ICA is PCA, which retains only uncorrelated components (but sometimes, this automatism can fail and you need to manually specify how many PCA components to retain).
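
As a rough illustration of that PCA step, here is a minimal NumPy sketch (not MNE's actual implementation). The synthetic data and the averaged fifth channel are assumptions; the 1e6 variance ratio is borrowed from the warning quoted earlier in the thread.

```python
import numpy as np

rng = np.random.default_rng(42)
real = rng.laplace(size=(4, 10000))               # hypothetical 4-channel data
data = np.vstack([real, real[:2].mean(axis=0)])   # 5th channel: average of 2 others

# PCA via eigendecomposition of the channel covariance matrix.
centered = data - data.mean(axis=1, keepdims=True)
cov = centered @ centered.T / (centered.shape[1] - 1)
eigvals, eigvecs = np.linalg.eigh(cov)
eigvals, eigvecs = eigvals[::-1], eigvecs[:, ::-1]  # sort descending

# Keep only directions whose variance is not negligible relative to the largest.
keep = eigvals > eigvals[0] / 1e6
n_keep = int(keep.sum())

# Whitened data that an ICA step would then rotate: shape (n_keep, n_samples).
whitened = (eigvecs[:, keep].T @ centered) / np.sqrt(eigvals[keep])[:, None]
print("PCA variances:", eigvals)
print("components kept:", n_keep)
```

The fifth variance is numerically zero, so only four directions survive and the ICA rotation effectively operates on the same four-dimensional data as before the fake channel was added.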

Adding a white noise channel also doesn’t help in separating brain and artifact sources, except that you will get a component corresponding to white noise (one Gaussian source is allowed).

Thank you for your answer!
But beware, as my thirst for knowledge about the ICA magic is not quenched:

First, ICA will not be able to separate brain sources from only four channels. In my experience, you need at least 32 channels, if not 64. If your goal is to “just” find ocular sources, then fewer electrodes (around 20) also work often, but definitely not four.

Intuitively I think the same, but I am asking because:

  • I saw a YouTube video of Arnaud Delorme from EEGLAB applying ICA to data from the same 4-channel Muse headset (https://www.youtube.com/watch?v=H6-e3tNT9EQ) and getting a component that looks enough like an ocular artifact to be discarded. But it is a tutorial, and the dataset used may be a simple one.

  • ICA looks a bit like wizardry, especially the idea that “independent components = sum of blending ratios of the information provided by each channel”.
    Say I have only 4 channels, and there are >15 sources in the recording that together make up >90% of the variance of the recorded signal.
    Case 1: Can I, with 4 channels, get an independent component out of the 4 that will be:
    IC1 = w1 * Chan1 + w2 * Chan2 + w3 * Chan3 + w4 * Chan4
    (w1, w2, w3, w4 being weights from the unmixing matrix; Chan1, Chan2, Chan3, Chan4 being the signals of the respective channels)
    And it would somehow correspond (we cannot be sure) at the source level to:
    IC1 = 20% of >14 various sources + 80% of one source (example: an ocular artifact).
    => Then, in this case, it would be the jackpot! This component would be labeled as an ocular artifact by ICLabel, and I could discard it because it captured mainly an ocular artifact.
    Case 2:
    I don’t hit this miraculous blending ratio of channels after an ICA. I don’t get a component that mainly consists of one source.
    Can I add another fake channel that gives 0 additional information but re-rolls all the blending ratios and gives me new components containing different amounts of the real sources?
    Even though we cannot know the “real” source amount contained in each component, the question is: can I possibly get new results if I run ICLabel on these 5 new components?
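
To put numbers on the “blending ratio” idea, here is a minimal NumPy sketch under a purely hypothetical forward model (a random 4 x 15 mixing matrix `A`, which is an assumption and not something estimated from real EEG). A component is `w @ channels`, so its source content is `w @ A`. With only 4 weights, defined up to scale, generically at most 3 source contributions can be cancelled exactly; the other 12 leak into the component.

```python
import numpy as np

rng = np.random.default_rng(1)
n_channels, n_sources = 4, 15

# Hypothetical forward model: each channel is a weighted sum of 15 sources.
A = rng.normal(size=(n_channels, n_sources))

# Choose the channel weights w so that the component w @ channels cancels
# sources 1, 2 and 3: w must be orthogonal to those three columns of A.
U, s, Vt = np.linalg.svd(A[:, 1:4])  # U is 4x4; its last column spans the left null space
w = U[:, -1]

# How much of each source ends up in the component IC = w @ (A @ sources).
contributions = w @ A
print(np.round(contributions, 3))
n_cancelled = int(np.sum(np.abs(contributions) < 1e-10))
print("sources cancelled exactly:", n_cancelled)
```

A redundant fifth channel does not enlarge this space of reachable combinations: any weight placed on a channel that is a linear combination of the others can be folded back into the weights of the original four.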

=> In summary, I know that more channels = better components, and that below a critical number of channels nobody does ICA. But, by giving it a try, is it theoretically possible to hit what I call a blending-ratio jackpot?

Adding artificial channels (via interpolation) before ICA won’t work as you expect (or hope), because you are not adding new information.

Are the resulting components still legit? Will a rank deficiency in the channel matrix give components of lower quality? Or will the ICA algorithm only take more computation time, with the resulting components being as legit as the ones from an ICA applied to the original square channel matrix? If they are different, in what way?

If anything, you are making it more difficult for the algorithm to work, because the data matrix is now rank deficient. Usually, this is not problematic, because the first step of ICA is PCA, which retains only uncorrelated components (but sometimes, this automatism can fail and you need to manually specify how many PCA components to retain).

Interesting. If I remember correctly, EEGLAB would throw an error in the case of a rank-deficient matrix, but MNE will still try to perform the PCA to resolve the problem itself and only throw an error if the PCA does not resolve it.
But if the components resulting from an ICA performed on a rank-deficient channel matrix are of lower quality, shouldn’t the MNE ICA function perhaps print a warning that the matrix is rank-deficient?

Adding a white noise channel also doesn’t help in separating brain and artifact sources, except that you will get a component corresponding to white noise (one Gaussian source is allowed).

OK, so one Gaussian source won’t cause an error and will give an IC corresponding to it (even though it just slows down the ICA). What happens if there is more than one? An error, or, because it cannot “differentiate” 2 Gaussian sources, will it make one component containing both of them?

Thank you in advance for your time. I am asking a lot, but I feel that in order to answer these questions myself I would have to understand all these theoretical PDFs on ICA in great depth :skull:.

I haven’t worked with datasets with such a low number of channels before, but the YouTube tutorial you linked shows that removing one component from a four-channel EEG set does remove a lot of noise (but not eye blinks). However, you don’t know how much brain activity is also removed, but this is always going to be a trade off.

Yes, they are still legit. It might be the case that ICA cannot separate sources that well for rank-deficient data, see e.g. here and here.

I think this should already be the case, because it prints the number of components.

There won’t be an error, but ICA will not be able to resolve the sources. The basic idea is that, by the Central Limit Theorem, a mixture of independent sources (the EEG) is closer to Gaussian than the sources themselves, so ICA tries to find sources that are maximally non-Gaussian (see e.g. here for a nice explanation).
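
A small NumPy sketch of the non-Gaussianity argument, using excess kurtosis as a simple non-Gaussianity measure (an assumption for illustration; actual ICA algorithms use various contrast functions). Mixing two Laplace sources pulls the kurtosis toward the Gaussian value of 0, which gives ICA something to undo. Mixing two Gaussian sources changes nothing, so no rotation is preferred and the sources cannot be told apart.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000


def excess_kurtosis(x):
    """Excess kurtosis: 0 for a Gaussian, 3 for a Laplace distribution."""
    x = (x - x.mean()) / x.std()
    return float(np.mean(x ** 4) - 3.0)


def rotate(sources, angle):
    """Mix two unit-variance sources with a rotation matrix."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s], [s, c]]) @ sources


laplace = rng.laplace(scale=1 / np.sqrt(2), size=(2, n))  # unit variance, non-Gaussian
gauss = rng.normal(size=(2, n))                            # unit variance, Gaussian

k_lap_source = excess_kurtosis(laplace[0])
k_lap_mixed = excess_kurtosis(rotate(laplace, np.pi / 4)[0])
k_gauss_source = excess_kurtosis(gauss[0])
k_gauss_mixed = excess_kurtosis(rotate(gauss, np.pi / 4)[0])

print(f"Laplace: source {k_lap_source:.2f}, mixture {k_lap_mixed:.2f}")
print(f"Gauss:   source {k_gauss_source:.2f}, mixture {k_gauss_mixed:.2f}")
```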

I haven’t worked with datasets with such a low number of channels before, but the YouTube tutorial you linked shows that removing one component from a four-channel EEG set does remove a lot of noise (but not eye blinks). However, you don’t know how much brain activity is also removed, but this is always going to be a trade off.

Indeed, so I was wondering whether the artifact / brain signal ratio of the components could be influenced by adding fake channels, because that would force the ICA to return as many additional components as there are additional fake channels.
If all the components returned this way are legit, maybe they will contain different ratios of artifact / brain signal, and that could lead to a more beneficial trade-off when discarding ICs.

It might be the case that ICA cannot separate sources that well for rank-deficient data, see e.g. here and here.

=> This result is what I got when I empirically tried to add various fake channels before an ICA

“The noise above is most likely due to instability in the ICA decomposition algorithm, which here forced to create two components compensating for each others’ activity. We are not sure that removing the single blink component below is preferable to removing the two very noisy components above since we have not run any formal comparison. Our reasoning is that the two components above tend to make other components noisy, so the PCA dimension reduction solution is preferable.”

So adding a fake channel makes the ICA unstable, and we end up with 2 components with opposite polarities, plus all the other components are noisy. Here “noisy” means that the components are less like independent sources and more like a mix of all the sources with “balanced” blending ratios, i.e. closer to a Gaussian mixture.

There won’t be an error, but ICA will not be able to resolve the sources.

OK, so we’ll end up with an unstable ICA that returns the same thing as mentioned above.

Not sure if this will help, but you can think of ICA as a system of linear equations.
If you have an equation:
2x + 3y = 17
You can’t find both x and y. But if you have two equations, you can.

2x+3y=17
x+5y=19

What you have proposed above is essentially adding a multiple of the first equation to try to solve for both variables:

2x+3y=17
4x+6y=34

So you haven’t added any information, and haven’t helped solve for the two variables.
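
The same analogy as a minimal NumPy sketch: the independent system has a unique solution, while the redundant one has rank 1 and therefore cannot pin down both unknowns.

```python
import numpy as np

# Two independent equations: 2x + 3y = 17 and x + 5y = 19.
A_indep = np.array([[2.0, 3.0], [1.0, 5.0]])
b_indep = np.array([17.0, 19.0])
x, y = np.linalg.solve(A_indep, b_indep)
print(f"x = {x:.1f}, y = {y:.1f}")  # x = 4.0, y = 3.0

# The second equation is just twice the first: no new information.
A_redundant = np.array([[2.0, 3.0], [4.0, 6.0]])
rank_indep = np.linalg.matrix_rank(A_indep)
rank_redundant = np.linalg.matrix_rank(A_redundant)
print("ranks:", rank_indep, rank_redundant)
```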

Not sure if this will help, but you can think of ICA as a system of linear equations.
If you have an equation:
2x + 3y = 17
You can’t find both x and y. But if you have two equations, you can.

It is a nice way to describe it indeed, but there is one difference.
My problem was mainly that, in the case of an ICA, if you add a “multiple of one of the equations” (i.e. a “fake” equation that does not bring any new information), you get new results (because the resulting ICs will be different), unlike in the equation system, where adding a redundant equation won’t change the results.

According to the link shared by Clemens Brunner, in the case of an ICA, adding a fake equation/channel disturbs the algorithm and gives components of lower quality (hence the new results).