When to perform split of data into training and test sets (classifier is returning below chance accuracies)

tempMEG · September 1, 2021, 8:19am

Hello to everyone,

MNE-Python version: 0.23.0
MNE_bids version: 0.8
operating system: Windows 10
IDE: Spyder

I am working with the MEG dataset provided by Rathee et al. (2021). In short, it contains roughly 60 minutes of MEG recordings (among other channels such as EOG and ECG) for 17 subjects who were asked to perform MI (hands / feet) and CI (word generation / subtraction) tasks.
Data: Link
Publication describing the data: Link

My goal is to train a binary classifier for each subject and each pair of stimuli that can predict the stimulus category (e.g., in epoch X subject 1 was imagining moving their hands), just like the authors did in their publication describing the data (see link above).

Currently, my classifier returns accuracies well below chance (e.g., 0.2 when the chance level is 0.5). My guess is that I am splitting the data into training and test sets at the wrong point in my analysis. Previously, I split it only right before training the SVC model, which resulted in very high accuracies (up to 100%). However, it doesn’t feel right to preprocess the test data together with the training data, since this wouldn’t happen in an online BCI setting either.
I would greatly appreciate any advice/thoughts/input on this. Thanks!

This is my current workflow for training a classifier for a single subject and a single condition:

1. Read in raw data for subject X
2. For each frequency band combination:
   2.1 Filter the raw data in the given frequency range
   2.2 Epoch the raw data
   2.3 Subselect the epochs of interest (i.e., epochs where either event 1 or event 2 was the stimulus presented) and crop the epochs to the 0.5-3.5 second time window
   2.4 Split the filtered and epoched data into training and test sets
   2.5 For training and test data, perform individually:
       - Scale data
       - Perform PCA
       - Perform CSP
       - Add CSP features to the collection of all extracted CSP features from different frequency bands
3. Fit SVC with training data (i.e., the FBCSP features)
4. Predict ylabels for test data with SVC
5. Compare for accuracy

The actual code:

from sklearn.model_selection import train_test_split
from mne.decoding import Scaler, CSP, UnsupervisedSpatialFilter
from sklearn.svm import SVC
from sklearn.decomposition import PCA
import numpy as np
import mne_helperfunctions as hf
from itertools import combinations
from sklearn.metrics import accuracy_score


# read in data
subject = "20"
session = "1"
ChanSel = "grad"
raw = hf.read_in_data(subject, session).pick_types(ChanSel).drop_channels(['MEG1733', 'MEG2333']).load_data()

event_names = ["Both Hand Imagery",
       "Both Feet Imagery",
       "Word Generation Imagery",
       "Subtraction Imagery"]
nComb = list(combinations((event_names), 2))
event = nComb[0]

lowpass_freq = [8, 14]
highpass_freq = [12, 30]


# FBCSP
features_train = []
features_test = []

# extract spatio-temporal features for each frequency band
for iBand in range(len(highpass_freq)):

    # filter data
    lfreq = lowpass_freq[iBand]
    hfreq = highpass_freq[iBand]
    raw_filt = raw.copy().filter(lfreq, hfreq, method="iir")

    # epoch data
    data_events, data_event_id = hf.return_fixed_events(raw_filt)
    epochs = hf.epoch_data(raw_filt, data_events, data_event_id)

    # subselect data
    epochs = epochs[event]
    epochs = epochs.crop(tmin=0.5, tmax=3.5)

    # split data
    X_train, X_test, y_train, y_test = train_test_split(epochs.get_data(), epochs.events[:, 2], test_size=0.2, random_state=43)
    
    # scale data
    scaler = Scaler(epochs.info)
    X_train_scaled = scaler.fit_transform(X_train)
    X_test_scaled = scaler.fit_transform(X_test)
    
    # PCA
    pca = UnsupervisedSpatialFilter(PCA(.99))
    X_train_pca = pca.fit_transform(X_train_scaled)
    X_test_pca = pca.fit_transform(X_test_scaled)

    # extract csp features
    csp = CSP(n_components=6, reg=0.1, log=True)
    X_train_csp = csp.fit_transform(X_train_pca, y_train)
    X_test_csp = csp.fit_transform(X_test_pca, y_test)

    features_train.append(X_train_csp)
    features_test.append(X_test_csp)


# concatenate features from different frequency bands
X_train_feat = np.concatenate(features_train, axis=1)
X_test_feat = np.concatenate(features_test, axis=1)


# SVC
svc = SVC(kernel="rbf")
svc.fit(X_train_feat, y_train)
y_pred = svc.predict(X_test_feat)
print("Accuracy:", accuracy_score(y_test, y_pred))

mpcoll · September 3, 2021, 8:07pm

You are right that it is not adequate to process the training and test data together. But it’s also a problem to fit separate processing parameters (e.g. scaling mean and SD, PCA solution, CSP filters) on each of the splits: the parameters estimated from the test data can differ a lot from those estimated from the training data, so the test features no longer match what the classifier was trained on. You should learn the parameters on the training split and apply them to the test data.

You can do this easily by wrapping all your steps in a sklearn pipeline.

something like:

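from sklearn.pipeline import Pipeline

clf = Pipeline([
    ('scaler', Scaler(epochs.info)),
    ('pca', UnsupervisedSpatialFilter(PCA(.99))),
    ('csp', CSP(n_components=6, reg=0.1, log=True)),
])

# learn all parameters on the training split (CSP is supervised, so pass the labels)
X_train = clf.fit_transform(X_train, y_train)
# apply the already-fitted steps to the test split
X_test = clf.transform(X_test)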
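
For the filter-bank case, the same rule applies per band: fit on the training epochs, then only transform the test epochs. Below is a minimal sketch of that loop, not the original analysis. It assumes synthetic 2D data in place of the real epochs, and it uses scikit-learn's StandardScaler and PCA as stand-ins for the MNE Scaler, UnsupervisedSpatialFilter and CSP steps. The band names and the `bands` dictionary are made up for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Synthetic stand-in for two frequency bands (assumption, not real MEG data):
# 100 "epochs" x 20 features per band, with a class-dependent offset.
n_epochs, n_features = 100, 20
y = np.repeat([0, 1], n_epochs // 2)
bands = {
    "8-12 Hz": rng.normal(size=(n_epochs, n_features)) + 3.0 * y[:, None],
    "14-30 Hz": rng.normal(size=(n_epochs, n_features)) + 3.0 * y[:, None],
}

# Split the epoch indices once so every band uses the same train/test epochs.
train_idx, test_idx = train_test_split(
    np.arange(n_epochs), test_size=0.2, random_state=43, stratify=y
)
y_train, y_test = y[train_idx], y[test_idx]

features_train, features_test = [], []
for name, X in bands.items():
    # One feature extractor per band (stand-in for Scaler -> PCA -> CSP).
    extractor = Pipeline([
        ("scaler", StandardScaler()),
        ("pca", PCA(n_components=5)),
    ])
    # Learn the scaling and PCA parameters on the training epochs only ...
    features_train.append(extractor.fit_transform(X[train_idx], y_train))
    # ... and reuse the fitted parameters on the test epochs (no refitting).
    features_test.append(extractor.transform(X[test_idx]))

# Concatenate the per-band features, as in the FBCSP code above.
X_train_feat = np.concatenate(features_train, axis=1)
X_test_feat = np.concatenate(features_test, axis=1)

svc = SVC(kernel="rbf")
svc.fit(X_train_feat, y_train)
accuracy = svc.score(X_test_feat, y_test)
print("Accuracy:", accuracy)
```

With the real data, each `extractor` would presumably hold the three MNE steps from the pipeline above, and the test labels would only ever be used for the final scoring.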
