Update:
The cluster is back after maintenance, and after some back and forth I was able to reproduce the issue with the script from the tutorial as well. However, it seems to be a question of parallel processing, since the settings in the bash script that distributes the job to the cluster make all the difference.
Here's the .py script I use (it's basically just a copy & paste from the tutorial, combined with the set_log_level('WARNING') command):
csp_demo.py
import numpy as np
import matplotlib.pyplot as plt
import mne
from mne import Epochs, create_info, events_from_annotations
from mne.io import concatenate_raws, read_raw_edf
from mne.datasets import eegbci
from mne.decoding import CSP
from mne.time_frequency import AverageTFR
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import LabelEncoder
old_log_level = mne.set_log_level('WARNING', return_old_level=True)
event_id = dict(hands=2, feet=3) # motor imagery: hands vs feet
subject = 1
runs = [6, 10, 14]
raw_fnames = eegbci.load_data(subject, runs)
raw = concatenate_raws([read_raw_edf(f) for f in raw_fnames])
# Extract information from the raw file
sfreq = raw.info['sfreq']
events, _ = events_from_annotations(raw, event_id=dict(T1=2, T2=3))
raw.pick_types(meg=False, eeg=True, stim=False, eog=False, exclude='bads')
raw.load_data()
# Assemble the classifier using scikit-learn pipeline
clf = make_pipeline(CSP(n_components=4, reg=None, log=True, norm_trace=False),
                    LinearDiscriminantAnalysis())
n_splits = 5 # how many folds to use for cross-validation
cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=42)
# Classification & time-frequency parameters
tmin, tmax = -.200, 2.000
n_cycles = 10. # how many complete cycles: used to define window size
min_freq = 5.
max_freq = 25.
n_freqs = 8 # how many frequency bins to use
# Assemble list of frequency range tuples
freqs = np.linspace(min_freq, max_freq, n_freqs) # assemble frequencies
freq_ranges = list(zip(freqs[:-1], freqs[1:])) # make freqs list of tuples
# Infer window spacing from the max freq and number of cycles to avoid gaps
window_spacing = (n_cycles / np.max(freqs) / 2.)
centered_w_times = np.arange(tmin, tmax, window_spacing)[1:]
n_windows = len(centered_w_times)
# Instantiate label encoder
le = LabelEncoder()
# init scores
freq_scores = np.zeros((n_freqs - 1,))
# Loop through each frequency range of interest
for freq, (fmin, fmax) in enumerate(freq_ranges):
    # Infer window size based on the frequency being used
    w_size = n_cycles / ((fmax + fmin) / 2.)  # in seconds
    # Apply band-pass filter to isolate the specified frequencies
    raw_filter = raw.copy().filter(fmin, fmax, n_jobs=1, fir_design='firwin',
                                   skip_by_annotation='edge')
    # Extract epochs from filtered data, padded by window size
    epochs = Epochs(raw_filter, events, event_id, tmin - w_size, tmax + w_size,
                    proj=False, baseline=None, preload=True)
    epochs.drop_bad()
    y = le.fit_transform(epochs.events[:, 2])
    X = epochs.get_data()
    # Save mean scores over folds for each frequency and time window
    freq_scores[freq] = np.mean(cross_val_score(estimator=clf, X=X, y=y,
                                                scoring='roc_auc', cv=cv,
                                                n_jobs=-2), axis=0)
And here is the bash script that I use to have SLURM distribute it to the cluster:
csp_demo.sh
#!/bin/bash -l
# specify the indexes of the job array elements
#SBATCH --array=0-1
# Standard output and error:
#SBATCH -o ./job.out.%j # Standard output, %A = job ID, %a = job array index
#SBATCH -e ./job.err.%j # Standard error, %A = job ID, %a = job array index
# Initial working directory:
#SBATCH -D /ptmp/fklotzsche/Experiments/vMemEcc/
# Job Name:
#SBATCH -J hansi_ftw
# Queue (Partition):
#SBATCH --partition=general
# Number of nodes and MPI tasks per node:
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=10
#SBATCH --time=00:05:00
module load anaconda/3
conda activate mne
# Run the program:
srun python ./vme_analysis/Analyses_Py/csp_demo.py $SLURM_ARRAY_TASK_ID
I found out that if I set --cpus-per-task to something larger than 1, I get the issue (i.e., unwanted output from compute_covariance()). If I have it run on a single CPU only, the output is suppressed.
So I guess it's related to the multi-core processing. I'm quite a newbie to SLURM and to parallelization in general, so I'd be happy about input from someone with more experience, in case there's something I can change on my end (apart from using only 1 CPU, which would probably make things quite a bit slower).
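In case it helps the discussion, here's what I was planning to try next. My (newbie) guess is that the worker processes joblib spawns for cross_val_score(..., n_jobs=-2) don't inherit the log level I set at runtime in the parent process, so they fall back to the default and print the compute_covariance() messages. The sketch below assumes that's the mechanism and just tries to make the WARNING level visible to freshly spawned workers via the MNE_LOGGING_LEVEL environment variable / config key:
import os
import mne
# Make the log level visible to child processes as well: environment
# variables are inherited, so a fresh `import mne` in a joblib worker
# should already start out at WARNING.
os.environ['MNE_LOGGING_LEVEL'] = 'WARNING'
# Alternatively, write it to the MNE config file once, so it persists
# across jobs (and across the SLURM array tasks).
mne.set_config('MNE_LOGGING_LEVEL', 'WARNING')
I guess the same could be done in csp_demo.sh with an export MNE_LOGGING_LEVEL=WARNING before the srun line, but I haven't verified either variant yet.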
Thanks!