Preprocessing multiple .bdf files in one run

Dear all,

I am currently creating a pipeline for my laboratory. The pipeline works well with a single .bdf file, but it needs to be adapted so that I can process 15-30 .bdf files and produce their grand averages (ideally both individually and combined into one plot).

I was told that storing all the .bdf files in a single .pkl file is a good way to go about this, but I am completely stumped as to how to start.

Please note that this pipeline is also set up in BIDS format, so working with that would be ideal.

Many thanks in advance for your assistance.

Hello @Btimm and welcome to the forum!

Whatever you do, don’t use pickles for anything you intend to archive or share. It’s not guaranteed that you’ll be able to open these things even with the next release of MNE-Python (or any other of your dependencies).

I’m not entirely sure as to what your actual question is, though. After processing each participant individually with MNE-Python, simply save the results as a .fif file. Once you have results for all participants, you can e.g. use mne.grand_average() to average evoked data.
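As a rough sketch of that workflow (not a drop-in pipeline): process one participant, save the evoked response as a .fif file, and later load all saved files and pass them to mne.grand_average(). The helper names, the sub-XX-ave.fif naming, the filter band, and the epoch window are assumptions for illustration only; adapt them to your own events and conditions.

```python
from pathlib import Path


def evoked_path(out_dir, subject):
    """Return the output file for one participant, e.g. sub-01-ave.fif."""
    # Hypothetical naming scheme; MNE expects evoked files to end in -ave.fif.
    return Path(out_dir) / f"sub-{subject}-ave.fif"


def process_subject(bdf_file, out_dir, subject):
    """Preprocess one .bdf recording and save its evoked response to disk."""
    import mne  # imported here so the path helper can be used on its own

    raw = mne.io.read_raw_bdf(bdf_file, preload=True)
    raw.filter(l_freq=0.1, h_freq=40.0)  # placeholder filter settings
    events = mne.find_events(raw)  # assumes triggers are on the stim channel
    epochs = mne.Epochs(raw, events, tmin=-0.2, tmax=0.5, preload=True)
    evoked = epochs.average()  # one average over all events in the recording
    fname = evoked_path(out_dir, subject)
    evoked.save(fname)  # may refuse to overwrite an existing file
    return fname


def grand_average_from_disk(out_dir, subjects):
    """Load each participant's saved evoked and average across participants."""
    import mne

    evokeds = [
        mne.read_evokeds(evoked_path(out_dir, subject), condition=0)
        for subject in subjects
    ]
    return mne.grand_average(evokeds)
```

Calling .plot() on each loaded evoked gives the individual plots, and calling it on the result of grand_average_from_disk() gives the combined one.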

For accessing BIDS data, be sure to check out MNE-BIDS (new release with compatibility with MNE-Python 0.24 should be out in a few days).

If you have any further questions, please feel free to ask.

Best wishes,

Hello @richard, thank you for the prompt response and warm welcome!

Duly noted regarding the pickle file; thank you for letting me know. It might explain an issue we were having with one of our scripts, which reported (No module named ‘mne._digitization’) even though we explicitly imported the module for the .pkl file.

My question was more specifically whether I could process all of the participants’ .bdf files at the same time, but I now see it is more appropriate to do it on an individual basis.

I will be sure to read through the release notes in the next few days then.
Many thanks again,

Yes, in general our data objects (Raw, Epochs, etc.) are meant to hold one subject’s worth of data at a time, and our preprocessing functions are designed to handle one data object at a time. So if you have a pipeline step that aggregates over subjects, this is usually done by looping (or parallelizing) over subjects and then, once the loop (or all parallel jobs) has finished, running the aggregation step. It’s probably possible to hack our data structures to include multiple subjects, but it is almost certainly not worth the extra effort and is likely to be error-prone.
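The loop-then-aggregate pattern can be sketched in plain Python as follows. This is only an illustration of the control flow: run_group_analysis, process_one, and aggregate are hypothetical names, not MNE-Python functions. In a real pipeline, process_one would preprocess one subject and save the result, and aggregate would be something like a call to mne.grand_average().

```python
from concurrent.futures import ThreadPoolExecutor


def run_group_analysis(subjects, process_one, aggregate, n_jobs=1):
    """Process each subject independently, then aggregate once at the end.

    process_one: callable taking one subject label and returning its result.
    aggregate:   callable taking the list of per-subject results.
    """
    if n_jobs == 1:
        # Plain loop: one subject at a time.
        results = [process_one(subject) for subject in subjects]
    else:
        # Parallel variant; results come back in the same order as subjects.
        # Threads keep the sketch simple. For CPU-heavy steps a process pool
        # (or a tool such as joblib) is the more common choice.
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            results = list(pool.map(process_one, subjects))

    # The aggregation step runs only after every subject has finished.
    return aggregate(results)
```

Keeping the per-subject step free of any shared state is what makes it safe to switch between the loop and the parallel version.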
