- MNE version: e.g. 1.7.0
- operating system: macOS 14
Hello MNE community,
I have questions about best practices for cropping data during pre-processing, in conjunction with AutoReject and ICA.
The EEG data I am working with include annotations for pairs of stimuli_start and stimuli_stop events, repeated over 100 times in the recording. Outside these annotated segments, participants were able to relax, move about, and even remove the EEG device. Since these out-of-stimulus segments are arguably of very poor quality and of no value, we thought it would be good to crop them out during pre-processing and stitch the stimulus segments back-to-back. Here is an example of the cropping code:
import math

import mne

# Assume the data has been read as an mne.io.Raw object named `raw`
# Extract events from annotations (stimuli_start=2, stimuli_end=3,
# experiment_end=4 --> occurs only once at the end)
events, _ = mne.events_from_annotations(raw, event_id={'2': 2, '3': 3, '4': 4})
# Create an event list containing only stimulus start-end events
start_event_code = 2
end_event_code = 3
start_end_events = events[(events[:, 2] == start_event_code) | (events[:, 2] == end_event_code)]
# Iterate through stimulus start-end event pairs and extract data segments
# Define a list to store the cropped Raw segments
stimulus_raw = []
sampling_rate = raw.info['sfreq']  # 250 Hz for this recording
# Placeholder for the sample index of the most recent stimulus start
start_idx = None
for ev in start_end_events:
    # Mark the start and end of each stimulus
    if ev[2] == start_event_code:
        start_idx = ev[0]
    elif ev[2] == end_event_code and start_idx is not None:
        end_idx = ev[0]
        # Event samples include raw.first_samp, while crop() times are relative
        # to the start of the recording, so subtract it before converting.
        # Maybe need to round down to 3(?) decimal places to ensure the event
        # is captured when converting from sample to time
        tmin = math.floor((start_idx - raw.first_samp) / sampling_rate * 10000) / 10000
        tmax = (end_idx - raw.first_samp) / sampling_rate  # no need to round
        # Crop a copy of the recording to the stimulus window
        segment = raw.copy().crop(tmin=tmin, tmax=tmax)
        stimulus_raw.append(segment)
        start_idx = None  # wait for the next stimulus start
# Concatenate the cropped segments into a single Raw object
raw_cropped = mne.concatenate_raws(stimulus_raw)
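The sample-to-time conversion inside the loop can be checked on its own, without MNE. The following is only a minimal sketch, not the code run on the real data: `pair_stimulus_windows` is a hypothetical helper, the example events are made up, and it assumes events are `(sample, previous, code)` rows as returned by `mne.events_from_annotations`, whose sample column includes `raw.first_samp`.

```python
def pair_stimulus_windows(events, first_samp, sfreq, start_code=2, end_code=3):
    """Return (tmin, tmax) pairs in seconds, relative to the start of the recording."""
    windows = []
    start_sample = None
    for sample, _, code in events:
        if code == start_code:
            # Remember the most recent stimulus start
            start_sample = sample
        elif code == end_code and start_sample is not None:
            # Event samples are offset by first_samp; crop times are not
            tmin = (start_sample - first_samp) / sfreq
            tmax = (sample - first_samp) / sfreq
            windows.append((tmin, tmax))
            start_sample = None  # ignore stray end events until the next start
    return windows


# Made-up events for illustration: two stimuli in a 250 Hz recording
# whose first sample index is 100.
example_events = [(100, 0, 2), (350, 0, 3), (600, 0, 2), (1100, 0, 3)]
windows = pair_stimulus_windows(example_events, first_samp=100, sfreq=250.0)
print(windows)  # [(0.0, 1.0), (2.0, 4.0)]
```

Each pair could then be passed to `raw.copy().crop(tmin=tmin, tmax=tmax)`, which keeps the pairing logic separate from the (slower) data copying.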
My questions are:
- What would be reasons to avoid cropping the data this way before subsequent pre-processing steps?
- Will typical pre-processing steps such as AutoReject and ICA, as run in MNE, perform sub-optimally as a result of this cropping? Are there better approaches to deal with this situation?
- If the cropping approach is sound but the code could be more efficient, suggestions are welcome! (It is quite computationally heavy at the moment.)
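For comparison, one alternative that might be worth weighing against cropping is to leave the recording intact and mark the out-of-stimulus gaps as bad segments instead. The sketch below only computes those gaps from a list of stimulus windows. `gaps_between` is a hypothetical helper and the numbers are made up, so treat it as an illustration of the idea rather than a tested pipeline.

```python
def gaps_between(windows, total_duration):
    """Return (onset, duration) pairs covering everything outside `windows`."""
    gaps = []
    cursor = 0.0  # end of the last stimulus window seen so far
    for tmin, tmax in sorted(windows):
        if tmin > cursor:
            gaps.append((cursor, tmin - cursor))
        cursor = max(cursor, tmax)
    if cursor < total_duration:
        # Trailing gap after the final stimulus
        gaps.append((cursor, total_duration - cursor))
    return gaps


# Made-up example: two stimulus windows in a 10 s recording
stimulus_windows = [(1.0, 3.0), (5.0, 8.0)]
gaps = gaps_between(stimulus_windows, total_duration=10.0)
print(gaps)  # [(0.0, 1.0), (3.0, 2.0), (8.0, 2.0)]
```

In MNE, such onset/duration pairs could presumably be added as `mne.Annotations` entries whose description starts with `BAD_`, which `ICA.fit` skips by default through `reject_by_annotation=True`. Whether this behaves better than cropping with AutoReject is exactly what I am unsure about.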
Thank you!
Lek Hong