EEG pre-processing ICA failure due to memory load

Python version: 3.9.5
MNE version: 0.23.0
Windows 10, Anaconda, Jupyter Notebook
Intel i7-5600U CPU @ 2.60GHz, 8 GB RAM

Hi all,

I am trying to pre-process some EEG .bdf files with MNE. The data has 32 EEG channels (plus 4 EOG channels); the trials are long, at approximately 3800 seconds, and include 1440 events. I have successfully filtered (0.05 Hz – 30 Hz band-pass), downsampled to 256 Hz, re-referenced, and epoched. However, I want to perform ICA for artefact and blink detection using @richard 's Pybrain pipeline, with a dedicated ICA epoch array band-pass filtered at 2 Hz – 30 Hz and downsampled to 256 Hz. But I am running into memory issues, I think because of the number of time points, and it crashes my process even when decimating 5, 10, or 20 times. I have tried Jupyter notebooks, VS Code, and Google Colab with the same result. A colleague has previously processed this data in EEGLAB, so I am following the same pipeline in the hope of extracting similar ERPs, but I cannot get past this point.

Do you have any thoughts/ideas please? I am truly stuck!

# bandpass filter 2Hz to 30Hz for improved ICA

raw_ica = raw.copy().filter(l_freq=2, h_freq=30)

Filtering raw data in 1 contiguous segment
Setting up band-pass filter from 2 - 30 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 2.00
- Lower transition bandwidth: 2.00 Hz (-6 dB cutoff frequency: 1.00 Hz)
- Upper passband edge: 30.00 Hz
- Upper transition bandwidth: 7.50 Hz (-6 dB cutoff frequency: 33.75 Hz)
- Filter length: 423 samples (1.652 sec)

# Epoch raw_ica

def CreateICAEpochs(raw_ica, tmin=-0.2, tmax=1200, baseline=(None, 0)):
    # events and event_id come from the enclosing scope;
    # decim=10 to reduce memory load, preload=False because
    # the files are too big for memory
    epochs = mne.Epochs(raw_ica,
                        events=events,
                        event_id=event_id,
                        tmin=tmin,
                        tmax=tmax,
                        baseline=baseline,
                        decim=10,
                        preload=False)
    return epochs

epochs_ica = CreateICAEpochs(raw_ica)
epochs_ica.info

Not setting metadata
Not setting metadata
1440 matching events found
Setting baseline interval to [-0.1953125, 0.0] sec
Applying baseline correction (mode: mean)
0 projection items activated
<ipython-input-21-93a84fcd77b1>:4: RuntimeWarning: The measurement information indicates a low-pass frequency of 30 Hz. The decim=10 parameter will result in a sampling frequency of 25.6 Hz, which can cause aliasing artifacts.

|Measurement date|November 21, 2018 10:34:27 GMT|
| --- | --- |
|Experimenter|Unknown|
|Participant|Unknown|
|Digitized points|35 points|
|Good channels|0 magnetometer, 0 gradiometer, and 32 EEG channels|
|Bad channels||
|EOG channels|blink_1, blink_2, blink_3, blink_4|
|ECG channels|Not available|
|Sampling frequency|25.60 Hz|
|Highpass|2.00 Hz|
|Lowpass|30.00 Hz|
n_components = 32  # One for each channel, no bads

method = 'picard'

max_iter = 1000  # high iteration cap; can be changed

fit_params = dict(fastica_it=5)  # run 5 FastICA iterations before fitting Picard

random_state = 36

ica = mne.preprocessing.ICA(n_components=n_components,
                            method=method,
                            max_iter=max_iter,
                            fit_params=fit_params,
                            random_state=random_state)

ica.fit(epochs_ica)

Fitting ICA to data using 32 channels (please be patient, this may take a while)
Loading data for 1440 events and 307252 original time points …
:12: RuntimeWarning: The epochs you passed to ICA.fit() were baseline-corrected. However, we suggest to fit ICA only on data that has been high-pass filtered, but NOT baseline-corrected.
501 bad epochs dropped

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
MemoryError: Unable to allocate 6.88 GiB for an array with shape (28851714, 32) and data type float64

The above exception was the direct cause of the following exception:

TypeError                                 Traceback (most recent call last)
<ipython-input-22-79f412e57723> in <module>
     10                             fit_params=fit_params,
     11                             random_state=random_state)
---> 12 ica.fit(epochs_ica)

~\anaconda3\envs\mne\lib\site-packages\mne\preprocessing\ica.py in fit(self, inst, picks, start, stop, decim, reject, flat, tstep, reject_by_annotation, verbose)
    572         else:
    573             assert isinstance(inst, BaseEpochs)
--> 574             self._fit_epochs(inst, picks, decim, verbose)

~\anaconda3\envs\mne\lib\site-packages\mne\preprocessing\ica.py in _fit_epochs(self, epochs, picks, decim, verbose)
    635         data = np.hstack(data)
--> 636         self._fit(data, 'epochs')

~\anaconda3\envs\mne\lib\site-packages\mne\preprocessing\ica.py in _fit(self, data, fit_type)
    706         pca = _PCA(n_components=self._max_pca_components, whiten=True)
--> 707         data = pca.fit_transform(data.T)

~\anaconda3\envs\mne\lib\site-packages\mne\utils\numerics.py in fit_transform(self, X, y)
    816         X = X.copy()
--> 817         U, S, _ = self._fit(X)

~\anaconda3\envs\mne\lib\site-packages\mne\utils\numerics.py in _fit(self, X)
    853         X -= self.mean_
--> 855         U, S, V = _safe_svd(X, full_matrices=False)

~\anaconda3\envs\mne\lib\site-packages\mne\fixes.py in _safe_svd(A, **kwargs)
     68     try:
---> 69         return linalg.svd(A, **kwargs)

~\anaconda3\envs\mne\lib\site-packages\scipy\linalg\decomp_svd.py in svd(a, full_matrices, compute_uv, overwrite_a, check_finite, lapack_driver)
--> 127     u, s, v, info = gesXd(a1, compute_uv=compute_uv, lwork=lwork,
    128                           full_matrices=full_matrices, overwrite_a=overwrite_a)

TypeError: __init__() missing 1 required positional argument: 'dtype'

Hello,

I would say that a computer with this little memory is unsuitable for the EEG processing you have in mind; is there any way you can upgrade the RAM to at least 12 GB, ideally 16 GB?

Otherwise the only other proposals I have are:

  • decimate more (probably not really advisable though)
  • make the epochs shorter (probably not suitable for your analysis?)
  • only use a subset of epochs for fitting ICA (I guess this would be my preferred approach!), e.g.:
     ica.fit(epochs_ica[:100])  # only use the first 100 epochs for fitting
    
  • try using SSP instead of ICA for artifact rejection (might work well for EOG removal)
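
For a sense of scale, here is the back-of-the-envelope arithmetic behind the `MemoryError` in your traceback (the epoch, sample, and channel counts are taken from your log output, so treat this as an approximation, not MNE's exact internals):

```python
# Rough size of the float64 array ICA concatenates before PCA/SVD
n_epochs_kept = 1440 - 501   # epochs remaining after the drops reported in the log
n_times = 307252 // 10 + 1   # samples per epoch after decim=10 -> 30726
n_channels = 32

n_bytes = n_epochs_kept * n_times * n_channels * 8  # float64 = 8 bytes
print(round(n_bytes / 2**30, 2))  # -> 6.88 (GiB), matching the MemoryError
```

Fitting on only 100 epochs would cut that by roughly an order of magnitude, which is why the subset approach is my preferred option here.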

You’re also dropping a lot of “bad” epochs, is that intentional?

Thanks for the quick reply Richard.

Unfortunately, no, my RAM is not upgradeable. I assumed it was OK; not great, but OK.

I will look at the reduced ICA fit and SSP options, thanks.

I’m not sure why 501 epochs are being dropped. Is there a way to see where in the processing that happens, or why they are being dropped? It is not intentional.

epochs.drop_log tells you the reason why each epoch was considered for dropping. A summary is available via epochs.plot_drop_log().

You’d want to call

epochs_ica.drop_bad()

and then proceed as @drammock suggested:

epochs_ica.plot_drop_log()
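
If you want a quick textual summary instead of the plot, something like this works on the drop log (the entries below are made up for illustration; the real ones come from epochs_ica.drop_log after drop_bad(), one tuple of reasons per epoch, empty meaning the epoch was kept):

```python
from collections import Counter

# Hypothetical drop_log; real one: epochs_ica.drop_log
drop_log = ((), ('TOO_SHORT',), ('blink_1',), ('blink_1', 'blink_2'), ())

# Count how often each drop reason occurs across all epochs
reasons = Counter(reason for entry in drop_log for reason in entry)
print(reasons)  # blink_1 appears twice; TOO_SHORT and blink_2 once each

# Total number of dropped epochs (non-empty entries)
print(sum(1 for entry in drop_log if entry))  # -> 3
```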

mugwuffin.

Regarding your hardware spec: you could also consider running your analysis on Google Colab.

Thanks, I did try Google Colab with this pipeline, but it again crashed with a memory failure. As far as I understand, the free tier of Colab includes 13 GB, and they have closed the hack of crashing the runtime to be upgraded to 25 GB. I guess I could pony up the £8.10 for Colab Pro.

Have you tried my suggestion to only use a subset of epochs for fitting ICA? Since your epochs are extremely long, even a handful of them should be sufficient to allow ICA to single out EOG and heartbeat artifacts. In fact, even less than a single epoch might be sufficient, as – if I’m understanding correctly – one epoch is 1200 seconds long, which is 20 minutes. Seriously, try feeding only a single epoch to ICA and see what happens – it might do the trick for you!

I couldn’t do it today unfortunately as I was at work, but I will absolutely try tomorrow and report back. But I think you have uncovered a mistake in your last comment @richard. My epochs are supposed to be -200 ms to 1200 ms, not 1200 s! :man_facepalming: Oops. What an idiot. That might help :laughing:

I was also trying to understand why my epoch .fif files are so big: nine files of 2 GB each.
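
Just to sanity-check the unit mix-up, here is a quick calculation of samples per epoch (a rough reconstruction of the inclusive-endpoint window arithmetic, not MNE's exact internal code; sfreq is from my downsampled data):

```python
sfreq = 256.0  # Hz, after downsampling

def n_samples(tmin, tmax, sfreq=sfreq):
    # Inclusive sample count for an epoch window from tmin to tmax (seconds)
    return int(round(tmax * sfreq)) - int(round(tmin * sfreq)) + 1

wrong = n_samples(-0.2, 1200)  # seconds, as I accidentally epoched
right = n_samples(-0.2, 1.2)   # the intended -200 ms .. 1200 ms window
print(wrong, right)  # -> 307252 359
```

307252 matches the "original time points" in the log above, so the 1200-second epochs fully account for the memory blow-up (and presumably the huge .fif files too).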

Back to the code.

:grin::see_no_evil::sunglasses::ok_hand:

Hehe, good luck!!


I can report back that everything is back to normal, it was that silly time mistake. Thanks for the help and some good tips for me going forward. :wink: Great forum.
