Power bands and missing data

Hi everyone,

I am dealing with a noisy dataset in which some EEG datapoints are np.nan where the signal is bad. I want to compute the power in frequency bands by channel (alpha, theta, beta…) in 10-second windows, so I use mne.Epochs and mne.time_frequency.psd_multitaper.

How does psd_multitaper deal with np.nan values in the data? I do not get any error or warning when passing an input containing NaNs to psd_multitaper (besides a “divide by 0 encountered in log10”), but I notice that the power in some channels is sometimes np.nan. For reference, this is a snippet of the code I am using for one frequency band between freq_min and freq_max.

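import mne
import numpy as np

# epochs, freq_min and freq_max are defined earlier
psds, _ = mne.time_frequency.psd_multitaper(epochs, fmin=freq_min, fmax=freq_max)
psds = 10 * np.log10(psds)  # convert power to dB
# psds has shape (n_epochs, n_channels, n_freqs)
psds = psds.mean(axis=2)  # average over frequencies to get the power by channel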

Also, what would be a good way to deal with bad portions of the EEG data where the signal is missing (np.nan)?

Thank you!

It depends on how much of the data is bad. A common approach is to annotate the bad spans in the continuous data, and use reject_by_annotation when doing the epoching. However, with 10-second epochs you risk losing quite a lot of good data this way, because any epoch that overlaps a bad span is dropped entirely.
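To make the annotation idea a bit more concrete, here is a minimal sketch of how the NaN spans could be located. It only uses NumPy, and the helper name nan_spans and the toy array are made up for illustration; with real data you would start from something like raw.get_data() and raw.info["sfreq"].

```python
import numpy as np

def nan_spans(data, sfreq):
    """Return (onsets, durations) in seconds of runs where any channel is NaN.

    data: array of shape (n_channels, n_times); sfreq: sampling rate in Hz.
    """
    bad = np.isnan(data).any(axis=0)             # True where any channel is NaN
    padded = np.concatenate(([False], bad, [False]))
    edges = np.diff(padded.astype(int))
    starts = np.flatnonzero(edges == 1)          # first bad sample of each run
    stops = np.flatnonzero(edges == -1)          # one past the last bad sample
    return starts / sfreq, (stops - starts) / sfreq

# Toy data (hypothetical): 2 channels, 10 samples at 10 Hz
data = np.zeros((2, 10))
data[0, 2:4] = np.nan
data[1, 7] = np.nan
onsets, durations = nan_spans(data, sfreq=10.0)
```

The onsets and durations could then be passed to mne.Annotations(onsets, durations, description="bad_nan") and attached with raw.set_annotations(...). The label "bad_nan" is just an example; as far as I know, what matters is that the description starts with "bad" so that mne.Epochs(..., reject_by_annotation=True) drops the overlapping epochs.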

If the NaNs occur on just one or two channels at a time, then you could try interpolating. We don’t have an easy solution for this use case, but you could try interpolate_bads. It would probably involve extracting each NaN portion as a separate Raw object, marking the NaN channels as bad, interpolating, then replacing the NaN data in the original raw with the interpolated data. So it’s a rather involved solution, and it might also introduce some discontinuities at the points where the interpolation starts/ends.
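As a rough sketch of the bookkeeping this would need, the snippet below lists each NaN run together with the channels affected in it, and keeps only the runs where few enough channels are missing to be worth interpolating. It is plain NumPy; the helper name nan_channels_per_span, the toy array and the max_bad threshold are assumptions for illustration, not part of MNE.

```python
import numpy as np

def nan_channels_per_span(data):
    """List (start, stop, channels) for each run of samples containing NaN.

    data has shape (n_channels, n_times); stop is exclusive; channels are the
    row indices that contain at least one NaN inside that run.
    """
    mask = np.isnan(data)
    bad = mask.any(axis=0)
    edges = np.diff(np.concatenate(([0], bad.astype(int), [0])))
    starts = np.flatnonzero(edges == 1)
    stops = np.flatnonzero(edges == -1)
    return [
        (int(a), int(b), np.flatnonzero(mask[:, a:b].any(axis=1)).tolist())
        for a, b in zip(starts, stops)
    ]

# Toy data (hypothetical): 4 channels, 12 samples
data = np.zeros((4, 12))
data[1, 2:5] = np.nan      # one channel missing: interpolation candidate
data[:, 8:10] = np.nan     # every channel missing: nothing to interpolate from
spans = nan_channels_per_span(data)
max_bad = 2                # hypothetical threshold for "one or two channels"
candidates = [s for s in spans if len(s[2]) <= max_bad]
```

For each candidate, the idea would be to take that sample range as its own Raw object, put the corresponding channel names in info["bads"], call interpolate_bads, and write the result back into the original data. Runs with more NaN channels than the threshold are probably better handled by annotating and rejecting them.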

Sorry I don’t have any better suggestions :confused: