Using `n_jobs=-1` for faster parallel CPU processing does not increase speed

Many MNE functions have the keyword argument `n_jobs` to parallelize processing. My machine (an M1 Mac) has 10 cores, but no matter how I set `n_jobs`, the processing speed stays the same.

What am I doing wrong? Please check this Jupyter notebook for a reproducible example.

Please note that this does not seem to be an M1 issue: I ran the same notebook on an old Intel iMac and likewise observed no performance improvement: Jupyter Notebook Intel Chip.

  • MNE version: 1.3.0
  • operating system: macOS 13.2.1 (arm64)

Hello @moritz-gerster,

generally, it’s important to consider that parallelizing data processing often comes with considerable initialization costs: chunks of data need to be copied to new memory segments, new worker processes need to be spawned, and after processing is done, the results need to be collected and glued back together. For operations that finish quickly on a single core (in less than a few seconds, or sometimes even tens of seconds), trying to parallelize may therefore even lead to a slowdown.
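To make the overhead argument concrete, here is a small standalone sketch (plain Python, not MNE code) that times a trivially cheap task run serially versus through a process pool. The function names and sizes are made up for illustration; the point is that the results are identical, but the parallel version pays a fixed cost for spawning workers and shipping data to them:

```python
import time
from concurrent.futures import ProcessPoolExecutor


def cheap_task(x):
    # Finishes in microseconds; far too cheap to be worth parallelizing.
    return x * x


def run_serial(data):
    return [cheap_task(x) for x in data]


def run_parallel(data, n_jobs=2):
    # Spawning workers and pickling data back and forth is the
    # "initialization cost" described above.
    with ProcessPoolExecutor(max_workers=n_jobs) as pool:
        return list(pool.map(cheap_task, data))


if __name__ == "__main__":
    data = list(range(1000))

    t0 = time.perf_counter()
    serial = run_serial(data)
    t_serial = time.perf_counter() - t0

    t0 = time.perf_counter()
    parallel = run_parallel(data)
    t_parallel = time.perf_counter() - t0

    # Same results either way; the parallel run is typically slower here.
    assert serial == parallel
    print(f"serial: {t_serial:.4f} s, parallel: {t_parallel:.4f} s")
```

Only when each task takes substantially longer than the per-worker setup cost does the parallel version come out ahead.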

In the examples you posted, we’re talking about a processing time of roughly a quarter of a second, which I would say is not suitable for parallelization (at least not with the machinery we currently use in MNE-Python).

Personally, I use parallelization especially when doing MVPA, and there it leads to significant speedups on my M1 Mac.

Best wishes,
Richard


Thank you very much for this instructive answer @richard!

I will try to re-run the script with much larger data.

Short follow-up question: What is MVPA?

“Decoding” / machine learning (MVPA stands for multivariate pattern analysis).

See for example this temporal generalization example:
https://mne.tools/stable/auto_tutorials/machine-learning/50_decoding.html#temporal-generalization
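To illustrate why this kind of decoding parallelizes so well, here is a toy sketch using only NumPy and synthetic data: in temporal-generalization decoding, one classifier is fit per time point, and those fits are completely independent of each other, so they can be distributed across cores. A simple nearest-class-mean classifier stands in for MNE’s real estimators; all names and array shapes here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 40, 8, 20

# Synthetic epochs: (epochs, channels, time points), two classes.
X = rng.normal(size=(n_epochs, n_channels, n_times))
y = rng.integers(0, 2, size=n_epochs)
X[y == 1, :, 10:] += 1.0  # class effect appears after time point 10


def fit_time_point(t):
    """Fit a nearest-class-mean classifier at time t; return train accuracy."""
    Xt = X[:, :, t]
    means = np.stack([Xt[y == c].mean(axis=0) for c in (0, 1)])
    dists = np.linalg.norm(Xt[:, None, :] - means[None, :, :], axis=2)
    pred = dists.argmin(axis=1)
    return float((pred == y).mean())


# Serial loop over time points. Because each fit is independent,
# setting n_jobs > 1 lets MNE hand these fits to worker processes,
# and each fit is expensive enough that the spawning overhead pays off.
scores = [fit_time_point(t) for t in range(n_times)]
print(scores)
```

Each fit here involves real numerical work per time point, so unlike the quarter-second filtering example above, the per-worker setup cost is amortized and parallelization yields a genuine speedup.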

That’s an operation where we’d usually go parallel for a huge speedup. See the (admittedly not super easy-to-read) code in MNE-BIDS-Pipeline: