Ideas on data preprocessing to calculate correlations on two sets of time-frequency data

  • MNE version: 1.3.dev()
  • operating system: Windows 11

Hi all, I’m working with 50 datasets recorded from 50 participants, and I’m trying to calculate the correlation of time-frequency data (i.e., PSD with time granularity) between two people. Because we want to preserve the time granularity and make sure that everyone has the same amount of data, I’m trying to repair epochs instead of dropping them during preprocessing. Does anyone have suggestions on how to go about this?
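
For concreteness, the target quantity could be computed along these lines. This is only a minimal sketch with synthetic numbers: the function name `tfr_correlation` and the (n_freqs, n_times) array layout are assumptions for illustration, not something from our actual pipeline. It also shows why equal amounts of data matter, since the two arrays must have the same shape.

```python
import numpy as np


def tfr_correlation(power_a, power_b):
    """Pearson r across time, computed separately for each frequency bin.

    Both inputs are (n_freqs, n_times) arrays of power values.
    """
    a = np.asarray(power_a, dtype=float)
    b = np.asarray(power_b, dtype=float)
    if a.shape != b.shape:
        # Dropped epochs in one participant would end up here.
        raise ValueError(f"shape mismatch: {a.shape} vs {b.shape}")
    # Center each frequency row over time, then normalize the cross-product.
    a = a - a.mean(axis=1, keepdims=True)
    b = b - b.mean(axis=1, keepdims=True)
    denom = np.sqrt((a ** 2).sum(axis=1) * (b ** 2).sum(axis=1))
    return (a * b).sum(axis=1) / denom


# Synthetic stand-in data: 3 frequency bins x 100 time points (e.g. epochs).
rng = np.random.default_rng(0)
shared = rng.standard_normal((3, 100))
person_a = shared + 0.1 * rng.standard_normal((3, 100))
person_b = shared + 0.1 * rng.standard_normal((3, 100))
r = tfr_correlation(person_a, person_b)
print(r.shape)
```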

This is our current approach:

  1. Set an average reference, band-pass filter the data between 1 and 40 Hz, and apply a notch filter
    raw.set_eeg_reference('average', projection=True)
    raw.filter(l_freq=para['l_freq'], h_freq=para['h_freq'], picks='data', verbose=False, n_jobs='cuda')
    raw.notch_filter(freqs=para['notch'], picks='data', verbose=False, n_jobs='cuda')
  2. Use ICLabel to exclude potential artifacts. Here, we kept only the components labeled "brain", because components labeled "other" seemed to capture a lot of noise.
    from mne.preprocessing import ICA
    from mne_icalabel import label_components

    ica = ICA(
        n_components=15,
        max_iter="auto",
        method="infomax",
        random_state=para['ICA']['seed'],
        fit_params=dict(extended=True),
    )
    ica.fit(raw)
    # label on the same data the ICA was fitted on
    ic_labels = label_components(raw, ica, method="iclabel")
    labels = ic_labels["labels"]
    # exclude every component that ICLabel did not label as "brain"
    exclude_idx = [idx for idx, label in enumerate(labels) if label != "brain"]
    ica.apply(raw, exclude=exclude_idx)
  3. Create epochs starting every 2 s and use RANSAC to further repair the data
    import mne
    from autoreject import Ransac

    # one zero-duration marker every 2 s, converted to events for epoching
    onsets = list(range(0, int(raw.times[-1]), 2))
    annot = mne.Annotations(onset=onsets, duration=0, description=para['EPOCH']['MarkName'])
    raw.set_annotations(annot, verbose='ERROR')
    events, event_key = mne.events_from_annotations(raw, verbose='ERROR')
    epoch = mne.Epochs(raw, preload=True, events=events, event_id=event_key, baseline=(0, 0),
                       tmin=0., tmax=para['EPOCH']['DurationTime'], verbose='ERROR')
    # RANSAC interpolates bad channels instead of dropping epochs
    ransac = Ransac(n_resample=int(raw.info['sfreq']), n_jobs=para['n_jobs'], verbose=False)
    epoch = ransac.fit_transform(epoch)

I’m wondering whether adding find_bads_eog (using Fp1) or find_bads_ecg would further improve the data quality. However, it seems redundant given that we already have ICLabel in place.

ICLabel will already label the ocular and cardiac components.
find_bads_eog, find_bads_ecg and find_bads_muscle are different approaches to finding and labeling those components. You could use them as a secondary labeling method to validate the labels provided by ICLabel.
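
As a rough illustration of that cross-check, here is a small sketch that compares the two sets of component indices. The helper name `compare_exclusions` and the example values are hypothetical; in practice the inputs might come from `ic_labels["labels"]` and from `ica.find_bads_eog(raw, ch_name="Fp1")`, which returns the flagged indices along with their scores.

```python
def compare_exclusions(ic_labels, detector_hits):
    """Compare ICLabel's labels against indices flagged by another detector.

    ic_labels: one label string per component, e.g. ic_labels["labels"].
    detector_hits: component indices flagged by a second method.
    """
    non_brain = {i for i, label in enumerate(ic_labels) if label != "brain"}
    hits = set(detector_hits)
    return {
        "confirmed": sorted(hits & non_brain),      # both methods agree
        "only_detector": sorted(hits - non_brain),  # ICLabel said "brain"
        "only_iclabel": sorted(non_brain - hits),   # not flagged by the detector
    }


# Hypothetical values standing in for real ICLabel / find_bads_eog output.
labels = ["brain", "eye blink", "brain", "heart beat", "other"]
eog_idx = [1, 2]
report = compare_exclusions(labels, eog_idx)
print(report)
```

Components that end up in "only_detector" are the ones worth inspecting by eye, since the two methods disagree about them.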

What I would change is:

  • increase the number of components. Reducing the number of components to 15 will speed up fitting, but increasing it might help separate noise from brain signals.
  • check whether any 50 Hz line noise is still present in your data after filtering. With a (1, 40) Hz bandpass filter, I don’t expect any 50 Hz to remain. If none remains, I would drop the notch filter.
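
One possible way to do the 50 Hz check numerically is sketched below, using plain NumPy on a synthetic signal. The helper `line_noise_ratio`, its band widths, and the test signal are assumptions for illustration; with real data you would more likely just inspect the spectrum, e.g. from `raw.compute_psd()`. A ratio close to 1 suggests there is no residual peak at the line frequency.

```python
import numpy as np


def line_noise_ratio(signal, sfreq, line_freq=50.0, half_width=1.0):
    """Mean spectral power near line_freq divided by power in flanking bands."""
    x = np.asarray(signal, dtype=float)
    freqs = np.fft.rfftfreq(x.size, d=1.0 / sfreq)
    # Hann window to limit spectral leakage into the flanking bands.
    power = np.abs(np.fft.rfft(x * np.hanning(x.size))) ** 2
    dist = np.abs(freqs - line_freq)
    peak = power[dist <= half_width].mean()
    flank = power[(dist > 2 * half_width) & (dist <= 5 * half_width)].mean()
    return peak / flank


# Synthetic single-channel data: 10 s at 250 Hz, with and without a 50 Hz sine.
sfreq = 250.0
t = np.arange(0, 10, 1.0 / sfreq)
rng = np.random.default_rng(42)
clean = rng.standard_normal(t.size)
contaminated = clean + np.sin(2 * np.pi * 50.0 * t)

ratio_clean = line_noise_ratio(clean, sfreq)
ratio_contaminated = line_noise_ratio(contaminated, sfreq)
print(ratio_clean, ratio_contaminated)
```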

Mathieu
