Is there a built in way with SlidingEstimator to get the scores for every epoch on the test set when decoding?

Here’s what my decoding approach looks like so far:

# shape of X and y:
X.shape  # (4797, 64, 226)
y.shape  # (4797,)

from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import GridSearchCV, RepeatedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from mne.decoding import SlidingEstimator, cross_val_multiscore

rlr = RidgeClassifier(max_iter=1500)
grid = {'alpha': [1e-3, 1e-2, 0.1, 1, 5]}
cv = RepeatedKFold(n_splits=4, n_repeats=2, random_state=1)
rlr = GridSearchCV(rlr, grid, scoring='roc_auc', cv=cv, n_jobs=-1)
clf = make_pipeline(StandardScaler(), rlr)
# note: no .score(X, y) here -- calling it would replace the estimator
# with an array of scores and break cross_val_multiscore below
time_decod = SlidingEstimator(clf, n_jobs=1, scoring='roc_auc', verbose=True)
scores = cross_val_multiscore(time_decod, X, y, cv=4, n_jobs=1)

In the above, the performance of the modeling procedure comes from cross_val_multiscore with 4-fold CV: for each fold it builds a model on 75% of the data, evaluates it on the remaining 25%, and returns the AUC for each fold and each time point. However, I would like the predicted probabilities for each epoch at each time point, pooled across the folds, i.e. an array of shape (X.shape[0], X.shape[2]). Presumably this could be accomplished by ditching the 4 folds and using a leave-one-out approach, which would return a prediction for every epoch, but that would be computationally costly. I would like to limit the model building/testing to 4 rounds, not X.shape[0] rounds.

I could ditch the sliding estimator and, for each time point, divide the data into 4 folds, iterate through them assigning 75% to train and 25% to test, and run:

clf.fit(X_train, y_train)
predictions = clf.predict(X_test)

and then append the predictions for each test set. But that means giving up the awesome SlidingEstimator, which (assuming it can be used for this) would save a lot of code. Is there a way? Thanks!
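For reference, the manual alternative described above can be sketched roughly like this, using synthetic data in place of the real epochs (the shapes, seed, and plain RidgeClassifier without the grid search are illustrative assumptions):

```python
# Sketch of per-time-point CV predictions without SlidingEstimator.
# Synthetic data stands in for the real (n_epochs, n_channels, n_times) array.
import numpy as np
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 40, 8, 5
X = rng.normal(size=(n_epochs, n_channels, n_times))
y = rng.integers(0, 2, size=n_epochs)

preds = np.empty((n_epochs, n_times))
cv = StratifiedKFold(n_splits=4, shuffle=True, random_state=1)
for t in range(n_times):
    Xt = X[:, :, t]  # (n_epochs, n_channels) slice at this time point
    for train, test in cv.split(Xt, y):
        clf = make_pipeline(StandardScaler(), RidgeClassifier())
        clf.fit(Xt[train], y[train])
        preds[test, t] = clf.predict(Xt[test])

print(preds.shape)  # (40, 5): one prediction per epoch per time point
```

Because every epoch lands in exactly one test fold, the loop fills the full (n_epochs, n_times) array with out-of-fold predictions, which is exactly the shape asked for above.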


I already found the solution:

from sklearn.model_selection import cross_val_predict

rlr = RidgeClassifier(max_iter=1500)
grid = {'alpha': [1e-3, 1e-2, 0.1, 1, 5]}
cv = RepeatedKFold(n_splits=4, n_repeats=2, random_state=1)
rlr = GridSearchCV(rlr, grid, scoring='roc_auc', cv=cv, n_jobs=-1)
clf = make_pipeline(StandardScaler(), rlr)
time_decod = SlidingEstimator(clf, n_jobs=1, scoring='roc_auc', verbose=True)
# scores = cross_val_multiscore(time_decod, X, y, cv=4, n_jobs=1)
# RidgeClassifier has no predict_proba, so ask for the decision function
predictions = cross_val_predict(time_decod, X, y, cv=4, method='decision_function')

Pretty simple in the end!
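One caveat: since RidgeClassifier has no predict_proba, method='decision_function' returns signed distances to the hyperplane, not probabilities. A sketch of two ways around that, using plain 2-D sklearn data rather than SlidingEstimator for brevity (the data and parameters are illustrative assumptions): squash the margins with a sigmoid (uncalibrated), or swap in a classifier that supports predict_proba, such as LogisticRegression.

```python
import numpy as np
from scipy.special import expit
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression, RidgeClassifier
from sklearn.model_selection import cross_val_predict

X2d, y2 = make_classification(n_samples=100, n_features=10, random_state=0)

# Option 1: squash ridge margins into (0, 1) -- NOT calibrated probabilities
margins = cross_val_predict(RidgeClassifier(), X2d, y2, cv=4,
                            method='decision_function')
pseudo_proba = expit(margins)

# Option 2: use an estimator with predict_proba directly
proba = cross_val_predict(LogisticRegression(max_iter=1000), X2d, y2,
                          cv=4, method='predict_proba')[:, 1]
print(pseudo_proba.shape, proba.shape)  # (100,) (100,)
```

If calibrated probabilities matter for the analysis, the decision function alone is enough for AUC, but option 2 (or sklearn's CalibratedClassifierCV) is the safer route.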
