Classifier either returns very high (close to 1) or very low accuracies (close to 0)

Hello to everyone,

MNE-Python version: 0.23.0 & MNE_bids version: 0.8.

I am working with the MEG dataset provided by Rathee et al. (2021). In short, it contains 2 sessions of MEG-recordings (each aprox. 30 minutes long) of 17 subjects who were asked to perform MI (hands / feets) and CI (word generation / subtraction) tasks.
Data: Link
Publication describing the data: Link

My goal is to train a binary classifier to distinguish between the different events using the MEG signal. Specifically, this means that for each subject and each possible combination of events (e.g. Hand vs Feet Imagery), I want to train a binary classifier which predicts the label of the event (in this case either “Hand Imagery” or “Feet Imagery”. For training I am using the data from session 1 and for testing the data from session 2.
This procedure has also been conducted by the authors themselves, the link to their code can be found at the end of the post. They performed their analysis in Matlab using fieldtrip, I am trying to reproduce it in mne-python.

I am facing the following issue: across different combinations of subjects, events and filterbanks, my classifier sometimes returns very high accuracies (e.g., close to 100%), and sometimes very low accuracies (e.g., close to 0%). Since this is a binary classification task, I would expect the performance to never drop significantly below 50%. However, since I know the original results (in the paper, they never achieved better accuracy than 94% for inter-class classification), I also assume that 100% accuracy is way too high.
I posted my code below for one subject, one filterbank and one event combination. As mentioned above, the link to the original code provided in Matlab can be found at the end of this post. I would be very grateful for any idea where this weird behavior might be stemming from.

My code:

from mne.decoding import CSP
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score
import mne_bids 
import mne
import numpy as np

# path_to_data = r"D:\data\rathee2021\MEG_BIDS"
path_to_data = r"C:/Users/vof7rng/data/MEG_BIDS"
sessions = ["1", "2"]
sensorSel = "grad"
subject = "6"
combo = ('Both Hand Imagery', 'Subtraction Imagery')
filterbank = {'lowpass_freq': [8, 14], 'highpass_freq': [12, 30]}
features_s1s2 = []
labels_s1s2 = []

## Prepare data, perform FBCSP 
for session in sessions: 
    # read in raw data
    bids_path = mne_bids.BIDSPath(subject=subject,
    raw = (mne_bids.read_raw_bids(bids_path)
    # epoch data 
    events, event_id = mne.events_from_annotations(raw) 
    tmin = -2
    tmax = 5
    baseline = None
    epochs = mne.Epochs(raw.copy(),
                        events = events,
                        tmin = tmin,
                        tmax = tmax,
                        baseline = baseline,
                        event_id = event_id,
                        proj = False,
                        preload = True)
    # subselect event combination of interest 
    epochs = epochs[combo]
    # resample epochs
    epochs = epochs.resample(250)
    # get event labels
    labels =[:, 2]
    # extract spatio-temporal features for each frequency band
    features = []
    for lfreq, hfreq in zip(filterbank["lowpass_freq"], filterbank["highpass_freq"]):
        # filter data
        method = "iir"
        epochs_filt = epochs.copy().load_data().filter(lfreq, hfreq, method=method)
        # crop epochs
        epochs_filt = epochs_filt.crop(tmin=0.5, tmax=3.5)
        # extract csp features
        csp = CSP(n_components=6, log=True, norm_trace=True)
        X_fbcsp = csp.fit_transform(epochs_filt.get_data(), labels)
        # append features
    # concatenate features - only use first and last CSP feature    
    ft = []
    for nBand, band in enumerate(features):
        x = np.column_stack((band[:, 0], band[:, 5]))
    X_tr = np.concatenate(ft, axis=1)
## Intersession Analysis 
X_tr_inter = features_s1s2[0]
X_ts_inter = features_s1s2[1]

y_tr_inter = labels_s1s2[0]
y_ts_inter = labels_s1s2[1]

# svc
svc = SVC(), y_tr_inter)
y_pred = svc.predict(X_ts_inter)
acc_inter = accuracy_score(y_ts_inter, y_pred)


Original Code (provided by the authors using Matlab):
Script 1: Data preparing, filtering, FBCSP*

Script 2: Training of both an intraclass and an interclass classifier (for my problem, only the interclass classifier is of interest

this line:

epochs = epochs.load_data().filter(lfreq, hfreq, method=method)

is problematic. You overwrite the original epochs so in the end you have no signal.

It should be:

epochs_filt = epochs.copy().load_data().filter(lfreq, hfreq, method=method)


1 Like

Thanks for catching that, I updated my code accordingly (also in the post). Sadly, it still returns 0% accuracy, so even though this was definitely an issue, it doesn’t seem to be the only one causing my problem.

what you do is wrong

the CSP should be in a pipeline and not be applied on the full data of all subjects

you need to assemble the full X and y and then only do a cross_val_score with a pipeline


1 Like

I don’t think I am applying the CSP on the full data of all subjects, instead I am only applying it on one session of one individual subject at each time.
Also, I can’t see how I could apply the CSP in a pipeline since I am performing filterbank filtering before it first - i.e. I am extracting CSP features for each frequency range and then concatenate those features for prediction. In mne, is there anyway to perform the filterbank filtering in a pipeline?
In addition, cross_val_score does not seem like the right choice to me for the intersession condition, since I have a clear definition what my training data (data from session 1) and what my testing data (data from session 2) is.

use cross_val_score and a
as cv object to keep one session out.


I will try that, thank you! Could you maybe explain to me why it makes a difference if I apply the CSP in a pipeline with cross validation instead of just doing it manually?

Also, I still can’t see how I could combine extracting csp features for each frequency range individually with a pipeline.

you don’t do it write by doing the CV manually. You should fit_transform on train and transform on test.

I don’t see a transform that this is wrong


Okay, I changed this part of my code to the following:

    csp = CSP(n_components=6, log=True)
    X_fbcsp_tr = csp.fit_transform(epochs_filt_s1.get_data(),[:,2])

    # s2
    X_fbcsp_ts = csp.transform(epochs_filt_s2.get_data())

Is this how you meant it?
In the original MATLAB code, they do the following:

                % Apply CSP
                cfg_csp.method = 'csp';
                cfg_csp.csp.classlabels = data_tr_toi_comb_bp.trialinfo;
                cfg_csp.csp.numfilters  = 6;
                comp_csp_tr = ft_componentanalysis(cfg_csp, data_tr_toi_comb_bp);
                data_ts_toi_comb_bp = rmfield(data_ts_toi_comb_bp,'trialinfo');
                cfg           = [];
                cfg.method = 'csp';
                cfg.unmixing  = comp_csp_tr.unmixing;
                cfg.topolabel = comp_csp_tr.topolabel;
                cfg.topo = comp_csp_tr.topo;
                comp_csp_ts = ft_componentanalysis(cfg, data_ts_toi_comb_bp);

Do you think that my mne implementation is roughly equivalent? I couldn’t find a way to pass the unmixing matrix in mne like they do in fieldtrip,so I am not quite sure.

@tempMEG The documentation of ft_componentanalysis says:

Instead of specifying a component analysis method, you can also specify
a previously computed unmixing matrix, which will be used to estimate the
component timecourses in this data.

So the part of matlab code that you pasted in the post above first computes and then applies the CSP components (transforms the data using the component unmixing matrix). In sklearn API (and mne follows that standard) this is .fit() and .transform().

I couldn’t find a way to pass the unmixing matrix in mne like they do in fieldtrip,so I am not quite sure.

In mne you don’t have to pass unmixing matrix because it is stored in the CSP object whose .transform() method you use (or .fit_transform() if you want to fit and apply the CSP in one line).

1 Like

Thanks @mmagnuski!
Sadly I still can’t replicate their original results, but I guess the issue must lie somewhere else, so I will keep looking :slight_smile:

1 Like