I am running ICA in order to process EEG data, and I would like to know the variance for each component that is removed. I can access this information if I create an HTML report, but is there a way to export this information to a dataframe or table?
Hi, please take a look at the following example, which should answer your question.
# %%
import mne
sample_dir = mne.datasets.sample.data_path()
sample_fname = sample_dir / 'MEG' / 'sample' / 'sample_audvis_raw.fif'
raw = (
    mne.io.read_raw_fif(sample_fname)
    .crop(tmax=60)
    .pick_types(eeg=True)
    .load_data()
)
# %% Fit ICA
ica = mne.preprocessing.ICA(n_components=15, method='picard')
ica.fit(raw)
# %% Retrieve explained variance
# unitize variances explained by PCA components, so the values sum to 1
pca_explained_variances = ica.pca_explained_variance_ / ica.pca_explained_variance_.sum()
# Now extract the variances for those components that were used to perform ICA
ica_explained_variances = pca_explained_variances[:ica.n_components_]
for idx, var in enumerate(ica_explained_variances):
    print(
        f'Explained variance for ICA component {idx}: '
        f'{round(100 * var, 1)}%'
    )
Explained variance for ICA component 0: 66.9%
Explained variance for ICA component 1: 11.3%
Explained variance for ICA component 2: 3.4%
Explained variance for ICA component 3: 2.4%
Explained variance for ICA component 4: 1.9%
Explained variance for ICA component 5: 1.6%
Explained variance for ICA component 6: 1.3%
Explained variance for ICA component 7: 1.1%
Explained variance for ICA component 8: 1.0%
Explained variance for ICA component 9: 0.8%
Explained variance for ICA component 10: 0.7%
Explained variance for ICA component 11: 0.6%
Explained variance for ICA component 12: 0.6%
Explained variance for ICA component 13: 0.6%
Explained variance for ICA component 14: 0.5%
@agramfort just pointed out on GitHub that this approach is actually mathematically incorrect. We're working on something that will make it easy for users to directly retrieve the explained variance from the ICA object after fitting. I'll let you know when it's ready (end of this week). If you're curious, you can track our progress here:
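For the original question about getting the values into a table, here is a minimal sketch. It assumes the method in question is `ICA.get_explained_variance_ratio(inst, components=...)` as found in released MNE-Python versions, and that it returns a dict mapping channel type to a ratio; please check this against the documentation of your installed version. The helper name `explained_variance_rows` and the column names are made up for illustration.

```python
def explained_variance_rows(ica, inst, n_components):
    """Collect one row per (component, channel type).

    Assumes `ica` offers get_explained_variance_ratio(inst, components=[...])
    returning a dict such as {'eeg': 0.12}. This helper is hypothetical and
    not part of MNE-Python.
    """
    rows = []
    for idx in range(n_components):
        # Query one component at a time to get per-component values
        ratios = ica.get_explained_variance_ratio(inst, components=[idx])
        for ch_type, ratio in ratios.items():
            rows.append(
                {
                    'component': idx,
                    'ch_type': ch_type,
                    'explained_variance_ratio': ratio,
                }
            )
    return rows


# Usage with a fitted ICA (requires mne and pandas, not run here):
# import pandas as pd
# df = pd.DataFrame(explained_variance_rows(ica, raw, ica.n_components_))
# df.to_csv('ica_explained_variance.csv', index=False)
```

A list of dicts converts directly to a pandas DataFrame, so the same rows can also be written to CSV or any other table format.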
Thanks for that function! I have a question about how to interpret those values. As mentioned in the documentation, negative explained variances are possible because ICA components are not orthogonal. Based on the formula by which the variance is computed:
that would occur when the data left after removing the components has more variance than the original data (so mean_var_diff > mean_var_orig). Intuitively, this sounds as if it wouldn't be a good idea to remove those components (if removing them increases the variance).
Do you happen to have more insight into the circumstances under which these negative variances can occur, what they mean, and what to do about them?
Hello, it's actually quite possible that the removal of one specific component increases variance of the reconstructed data. The algorithm implemented here is the one used by EEGLAB; I suggest you read up on it here:
Search that page for "pvaf", which is the metric we're calculating.
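To make this concrete, here is a toy NumPy sketch (not the MNE-Python or EEGLAB code). It assumes the ratio is computed as 1 - mean_var_diff / mean_var_orig, where mean_var_diff is the mean channel variance left after subtracting a component's back-projection. The two component time courses are deliberately made negatively correlated, which is an assumption of this example: they partly cancel in the channel signal, so removing the weaker one leaves more variance than before.

```python
import numpy as np


def explained_variance_ratio(data, component_projection):
    """Both inputs have shape (n_channels, n_times)."""
    residual = data - component_projection
    mean_var_diff = residual.var(axis=1).mean()  # variance left after removal
    mean_var_orig = data.var(axis=1).mean()      # variance of original data
    return 1.0 - mean_var_diff / mean_var_orig


rng = np.random.default_rng(42)
n_times = 100_000

# Two unit-variance time courses with correlation of about -0.5 (toy assumption)
rho = -0.5
s2 = rng.standard_normal(n_times)
s1 = rho * s2 + np.sqrt(1 - rho**2) * rng.standard_normal(n_times)

# One channel; component 1 has weight 1, component 2 has weight 2
mixing = np.array([[1.0, 2.0]])
proj_1 = mixing[:, [0]] * s1  # back-projection of component 1
proj_2 = mixing[:, [1]] * s2  # back-projection of component 2
data = proj_1 + proj_2

ratio_1 = explained_variance_ratio(data, proj_1)
ratio_2 = explained_variance_ratio(data, proj_2)
print(f'component 1: {ratio_1:.2f}, component 2: {ratio_2:.2f}')
```

Here `ratio_1` comes out negative while `ratio_2` is positive. With perfectly uncorrelated time courses this toy setup would not produce a negative value, so in this sketch the effect comes entirely from the partial cancellation.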
Thanks! Yeah, I read that; unfortunately, they don't give much info on the hows and whys of this phenomenon/algorithm. But I guess if it's rather normal, there is no reason to worry.