Here’s what my decoding approach looks like so far:
# shapes of X and y:
X.shape  # (4797, 64, 226)
y.shape  # (4797,)
rlr = RidgeClassifier(max_iter=1500)
grid = {'alpha': [1e-3, 1e-2, 0.1, 1, 5]}
cv = RepeatedKFold(n_splits=4, n_repeats=2, random_state=1)
rlr = GridSearchCV(rlr, grid, scoring='roc_auc', cv=cv, n_jobs=-1)
clf = make_pipeline(StandardScaler(), rlr)
time_decod = SlidingEstimator(clf, n_jobs=1, scoring='roc_auc', verbose=True)
scores = cross_val_multiscore(time_decod, X, y, cv=4, n_jobs=1)
The performance of the modeling procedure above comes from cross_val_multiscore with 4-fold CV: each fold trains on 75% of the data, tests on the remaining 25%, and the call returns the mean AUC for each fold at each time point. What I would like instead is the predicted probability for every epoch at every time point, gathered across the folds, i.e. an array of shape (X.shape[0], X.shape[2]). Presumably this could be done by ditching the 4 folds in favour of a leave-one-out approach, which would yield a prediction for every epoch, but that would be computationally costly. I would like to keep the model building/testing to 4 rounds rather than X.shape[0].
I could ditch the sliding estimator and, for each time point, divide the data into 4 groups, iterate through these, assigning 75% to train and 25% to test, and run:
clf.fit(X_train, y_train)
predictions = clf.predict(X_test)
and then append the predictions for each test set. But that means I wouldn't take advantage of the awesome SlidingEstimator, which (assuming it can be used for this) would save a lot of code. Is there a way? Thanks!
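For reference, the manual per-time-point fallback I describe above would look roughly like this (again toy shapes, and decision_function standing in for probabilities since RidgeClassifier has none):

```python
import numpy as np
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import RidgeClassifier

# toy (n_epochs, n_channels, n_times) stand-ins
rng = np.random.RandomState(0)
X = rng.randn(40, 8, 10)
y = rng.randint(0, 2, 40)

preds = np.zeros((X.shape[0], X.shape[2]))
cv = KFold(n_splits=4, shuffle=True, random_state=1)
for train, test in cv.split(X):
    for t in range(X.shape[2]):  # refit a fresh model at every time point
        clf = make_pipeline(StandardScaler(), RidgeClassifier())
        clf.fit(X[train, :, t], y[train])
        preds[test, t] = clf.decision_function(X[test, :, t])

print(preds.shape)  # (40, 10)
```

This works, but it is exactly the boilerplate I was hoping SlidingEstimator would spare me.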