When computing a TFR using mne.time_frequency.tfr_multitaper (and possibly other related functions), I was surprised that (1) the function reported epochs as bad and (2) these bad epochs were dropped in-place from the mne.Epochs I passed to that function.
This drops two “bad” epochs, but I have no idea why they are bad, and the output doesn’t say which of the 80 epochs they are. Most striking, though, is that after the call to tfr_multitaper, the two bad epochs have also been dropped from epochs.
I did notice that setting tmin and tmax to -1 and 2 (instead of -1.5 and 2.5), respectively, does not create any bad epochs. Again, I have no idea why.
In summary, I’d be glad if someone could explain (1) how bad epochs are defined and (2) why they are also dropped from the original epochs object.
It contains mostly ('IGNORED',) and () entries, but the first one is ('NO_DATA',) and the second-to-last one is ('TOO_SHORT',). Apparently, mne.Epochs does not immediately create (preload) the epoched data even though the raw data has already been preloaded. Computing TFRs with tfr_multitaper then triggers the preload, which leads to the rejection of two epochs. The first one is rejected because it doesn’t have 1.5 s of data before the event. I don’t really understand the second-to-last one: what does 'TOO_SHORT' refer to?
Yep, that’s the intended behavior. By default, preload=False. After creating the Epochs, you can call Epochs.drop_bad() on both preloaded and non-preloaded data. Or simply pass preload=True. Either way, you should get the same dropping behavior as the one you described.
This can happen at the very beginning and very end of your raw data stream: the time-locked event is so close to the beginning or end of your data that there are simply not enough samples to create an epoch of the requested duration.
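To make the arithmetic concrete, here is a simplified pure-Python model of that check. It is only a sketch, not MNE's actual code: the function name classify_epoch, the sampling rate, the recording length, and the event positions are all hypothetical, and the labels simply mirror the drop reasons mentioned in this thread.

```python
def classify_epoch(event_sample, n_samples, sfreq, tmin, tmax):
    """Label an epoch window relative to a recording of n_samples samples."""
    start = event_sample + round(tmin * sfreq)     # first sample of the window
    stop = event_sample + round(tmax * sfreq) + 1  # one past the last sample
    if start < 0:
        return "NO_DATA"    # window begins before the recording starts
    if stop > n_samples:
        return "TOO_SHORT"  # window runs past the end of the recording
    return "OK"


sfreq = 100.0             # hypothetical sampling rate (Hz)
n_samples = 1000          # hypothetical 10 s recording
events = [120, 500, 780]  # hypothetical event samples (1.2 s, 5 s, 7.8 s)

wide = [classify_epoch(e, n_samples, sfreq, -1.5, 2.5) for e in events]
narrow = [classify_epoch(e, n_samples, sfreq, -1.0, 2.0) for e in events]
print(wide)
print(narrow)
```

In this toy setup, the wider window loses the first and last events while the narrower one keeps all three, which is consistent with the observation that tmin=-1 and tmax=2 produced no bad epochs.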
The confusing part for me was that even though I had my raw data already preloaded, creating epochs ignored this setting because it has its own independent preload parameter. This makes sense of course, but an alternative behavior would be to re-use the preload state of the raw object I’m passing to mne.Epochs.
Sure, I can try to improve the docs. I guess we could add a note to the preload description, which currently says:
Load all epochs from disk when creating the object or wait before accessing each epoch (more memory efficient but can be slower).
I guess “from disk” is confusing (because I assumed that if a raw object is already preloaded, data for epoching would be taken from memory). To avoid having to mention implementation details, I’d write “into memory” instead. Then it doesn’t matter whether the data is read from disk or from memory, as long as new memory is allocated:
Load all epochs into memory when creating the object or wait before accessing each epoch (more memory efficient but can be slower).
I’d also add a note:
Epochs are preloaded based on the value of the preload argument only. Even if the raw argument has already been preloaded, the returned Epochs object will not be preloaded by default. This means that any bad epochs (e.g. epochs with not enough data) will not be dropped immediately, so the number of epochs displayed may change once the epochs are preloaded.
I’m also open to discussing switching the default value of preload to None and making it depend on raw.preload. But this would break some existing code, because epoch rejection would then happen on creation by default if the raw data was preloaded.
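Just to illustrate the idea, the proposed default could resolve roughly like this. This is a hypothetical sketch only: resolve_preload is not an existing MNE function, and preload=None is not a supported value today as far as this thread is concerned.

```python
def resolve_preload(preload, raw_preloaded):
    """Hypothetical helper: pick the effective preload setting for Epochs.

    preload=None means "follow the raw object"; an explicit True or False
    always wins over the state of the raw data.
    """
    if preload is None:
        return bool(raw_preloaded)
    return bool(preload)


# With the proposed default, a preloaded raw would imply preloaded epochs,
# so bad epochs would be dropped as soon as the Epochs object is created.
print(resolve_preload(None, raw_preloaded=True))
print(resolve_preload(None, raw_preloaded=False))
print(resolve_preload(False, raw_preloaded=True))
```

The first case is the one that changes behavior for existing code; passing preload=False explicitly would keep the current lazy behavior.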