MNE-BIDS-Pipeline error: `raw_task-rest_run-None` with dataset lacking run entity

Environment info:

  • MNE version: 1.11.0
  • MNE-BIDS-Pipeline version: 1.9.0
  • operating system: Ubuntu 24.04.1 LTS

Hi everyone,

I am currently trying to run the MNE-BIDS-Pipeline on an EEG dataset, but I keep encountering the following error:

sub-PD1001 ses-01 run-rest A critical error occurred. The error message was: 'raw_task-rest_run-None'

Aborting pipeline run. The traceback is:

  File "/home/jeongwoo/miniconda3/envs/eeg/lib/python3.11/site-packages/mne_bids_pipeline/steps/preprocessing/_01_data_quality.py", line 80, in assess_data_quality
    bids_path_in = in_files.pop(key)
                   ^^^^^^^^^^^^^^^^^
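My guess from the error string is that the pipeline formats its `in_files` dictionary keys from the BIDS entities, so a dataset without a run entity ends up with the literal string `None` in the key. The key template below is an assumption based on the error message, not the pipeline's actual code:

```python
# Hypothetical illustration of how the failing key could arise.
# The f-string template is an assumption inferred from the error string,
# not the pipeline's actual implementation.
task = "rest"
run = None  # dataset has no run entity

key = f"raw_task-{task}_run-{run}"
print(key)  # -> raw_task-rest_run-None

in_files = {}  # imagine the input files were registered under a different key
try:
    in_files.pop(key)
except KeyError as err:
    # Mirrors the "'raw_task-rest_run-None'" message from the traceback
    print(f"KeyError: {err}")
```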

Dataset structure

My dataset is organized in BIDS format as follows:

/mnt/e/openneuro/pd_mortality/
└── sub-PD1021/
    └── ses-01/
        └── eeg/
            ├── sub-PD1021_ses-01_task-rest_eeg.vhdr
            ├── sub-PD1021_ses-01_task-rest_eeg.eeg
            ├── sub-PD1021_ses-01_task-rest_eeg.vmrk
            ├── sub-PD1021_ses-01_task-rest_events.tsv
            ├── sub-PD1021_ses-01_task-rest_channels.tsv
            ├── sub-PD1021_ses-01_space-CapTrak_electrodes.tsv
            └── sub-PD1021_ses-01_space-CapTrak_coordsystem.json

The dataset contains resting-state EEG data and originally there was no run entity in the filenames.

What I tried

I initially ran the pipeline as-is and encountered the error above.

Based on previous discussions suggesting that missing run entities might be the cause (please check this), I modified the dataset by adding run-01 to all relevant files:

sub-PD1001_ses-01_task-rest_run-01_eeg.vhdr
sub-PD1001_ses-01_task-rest_run-01_eeg.eeg
sub-PD1001_ses-01_task-rest_run-01_eeg.vmrk
sub-PD1001_ses-01_task-rest_run-01_events.tsv
sub-PD1001_ses-01_task-rest_run-01_channels.tsv

I also updated:

  • DataFile= and MarkerFile= in the .vhdr

  • DataFile= in the .vmrk

However, the same error persisted.
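For reference, this renaming plus header patching can be scripted instead of done by hand. Below is a minimal stdlib-only sketch; the function name and stems are placeholders, and it assumes plain-text BrainVision headers where `DataFile=`/`MarkerFile=` reference the old stem verbatim (sidecars like `_events.tsv` and `_channels.tsv` would need the same treatment with their own stems):

```python
from pathlib import Path


def add_run_entity(eeg_dir: Path, old_stem: str, new_stem: str) -> None:
    """Rename BrainVision files and patch their internal cross-references.

    old_stem/new_stem are filename stems without extension, e.g.
    'sub-PD1001_ses-01_task-rest_eeg' ->
    'sub-PD1001_ses-01_task-rest_run-01_eeg'.
    """
    # Rename the three BrainVision files.
    for ext in (".vhdr", ".eeg", ".vmrk"):
        src = eeg_dir / f"{old_stem}{ext}"
        if src.exists():
            src.rename(eeg_dir / f"{new_stem}{ext}")
    # Patch DataFile= / MarkerFile= lines inside the text headers,
    # which still point at the old filenames after renaming.
    for ext in (".vhdr", ".vmrk"):
        path = eeg_dir / f"{new_stem}{ext}"
        if path.exists():
            text = path.read_text(encoding="utf-8")
            path.write_text(text.replace(old_stem, new_stem), encoding="utf-8")
```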

Additional debugging

I then tested reading the data using mne_bids.read_raw_bids() with run="01", which resulted in:

ValueError: PosixPath('eeg/sub-PD1001_ses-01_task-rest_run-01_eeg.vhdr') is not in list.
Did you mean one of ['eeg/sub-PD1001_ses-01_task-rest_eeg.vhdr']?

This suggested that the dataset metadata was still pointing to the original filenames.

After checking sub-PD1001_ses-01_scans.tsv, I found that it still contained the old filename:

eeg/sub-PD1001_ses-01_task-rest_eeg.vhdr

After updating it to:

eeg/sub-PD1001_ses-01_task-rest_run-01_eeg.vhdr

both read_raw_bids() and the MNE-BIDS-Pipeline started working correctly.
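The scans.tsv fix can also be scripted rather than edited by hand, which helps when there are many subjects. A minimal stdlib-only sketch (the function name is a placeholder; the old/new filenames mirror the ones above):

```python
import csv
from pathlib import Path


def update_scans_tsv(scans_path: Path, renames: dict) -> None:
    """Rewrite the filename column of a BIDS scans.tsv after renaming files.

    renames maps old relative paths (as they appear in scans.tsv) to new ones.
    """
    with scans_path.open(newline="") as f:
        rows = list(csv.reader(f, delimiter="\t"))
    for row in rows[1:]:  # skip the header row
        if row and row[0] in renames:
            row[0] = renames[row[0]]
    with scans_path.open("w", newline="") as f:
        csv.writer(f, delimiter="\t", lineterminator="\n").writerows(rows)
```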

My Config

bids_root = "/mnt/e/openneuro/test"
deriv_root = "/mnt/e/openneuro/test/derivatives"

sessions = ["01"]
task = "rest"
task_is_rest = True
runs = 'all'

crop_runs = (5, 235)
subjects = "all"

process_empty_room = False
process_rest = False


ch_types = ["eeg"]
data_type = 'eeg'

eog_channels = None
eeg_reference = 'average'

drop_channels = ['FT9', 'TP9', 'FT10', 'TP10']
analyze_channels = 'all'

plot_psd_for_runs = 'all'
random_state = 42

l_freq = 0.10
h_freq = 40.0

l_trans_bandwidth = 'auto'
h_trans_bandwidth = 'auto'

rest_epochs_duration = 5
rest_epochs_overlap = 0

epochs_tmin = 0
baseline = None

spatial_filter = 'ica'
ica_reject = "autoreject_local"

ica_algorithm = "extended_infomax"

ica_l_freq = 1.0
ica_max_iterations = 3000

ica_n_components = None

ica_ecg_threshold = 0.1
ica_eog_threshold = 3.0

reject = "autoreject_local"
reject_tmin = None
autoreject_n_interpolate = [4, 8, 16]

run_source_estimation = False
n_jobs = 1
log_level = "info"

Question

I would like to better understand the root cause of this issue.

  • What exactly causes the raw_task-rest_run-None error to occur in this situation?

  • Is this behavior expected when a dataset does not include a run entity?

Additionally, is there a way to use MNE-BIDS-Pipeline with the original dataset structure (i.e., without adding a run entity and modifying scans.tsv)?

Or is adding run and updating all related metadata currently required for such datasets?

Thanks in advance!!