Get the scores of each trial not only each split

YuZhou · April 29, 2021, 3:57am

MNE-Python version: 0.24.dev0
operating system: linux

Hey! guys,

I’m wondering whether there is a way to get a score for each trial, rather than for each split, when I use mne.decoding.cross_val_multiscore(). At first I used leave-one-out cross-validation to get a score per trial, but it is computationally much more expensive than KFold or StratifiedKFold (n_splits = 5 or 10). Is there a way to get per-trial scores when using StratifiedKFold cross-validation?
Here is my code snippet:

cv = StratifiedKFold(n_splits=10)
clf = make_pipeline(StandardScaler(), LinearDiscriminantAnalysis(solver='eigen', shrinkage='auto'), verbose=True)
time_decod_col = SlidingEstimator(clf, n_jobs=-1, scoring=None, verbose=True)
scores_fea_col = cross_val_multiscore(time_decod_col, epos, labels_fea_col, cv=cv, n_jobs=-1)

Best

drammock · April 30, 2021, 3:24pm

@agramfort probably knows all the best tricks for doing this.

agramfort · May 2, 2021, 2:48pm

what you want is what sklearn calls cross_val_predict
unfortunately there is no cross_val_multipredict

if you want this I fear you need to do your for loops by hand
or try to make it happen that we add a cross_val_multipredict in mne

Alex

YuZhou · May 3, 2021, 8:37am

Hi! Alex,

Thank you soooo much for your reply! I tried the cross_val_predict you mentioned; here is the modified code:

> cv = StratifiedKFold(n_splits=10)
> clf = make_pipeline(LinearDiscriminantAnalysis(solver='eigen', shrinkage='auto'), verbose=True)
> time_decod_col = SlidingEstimator(clf, n_jobs=-1, scoring=None, verbose=True)
> time_decod_col.fit(epochs_data_train, labels_fea_col)
> dvalues_fea_col = cross_val_predict(time_decod_col, epos, labels_fea_col, cv=cv, n_jobs=-1, method='decision_function')

The shape of dvalues_fea_col is (n_epochs/samples, n_time_points), which is exactly what I want! I’m going to try the other cross_val_predict() methods, such as 'predict_proba', to see whether the result shapes are also what I need. My other concern is whether the order of the epochs/events in the resulting dvalues .npy file differs from the original order (the order in the experiment), because according to the visualization of the StratifiedKFold behavior:

it seems that the epochs in the test sets are not in the same order as the original ones. KFold might not have this issue:

So maybe I should use KFold instead of StratifiedKFold. Do you have any thoughts on that? Thank you again, your suggestion really helped me a lot!
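On the ordering question, here is a minimal sketch using plain scikit-learn and synthetic data (the array sizes, random seed and variable names are assumptions for illustration, not taken from the thread). It rebuilds the out-of-fold decision values by hand, writing each fold's output back at its original trial indices, and compares that with what cross_val_predict returns under StratifiedKFold. If the two match, the returned array is aligned with the input trial order even though the test folds themselves are not contiguous.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic stand-in for a single time point: 40 trials, 6 features,
# with the two classes interleaved across trials (assumed data).
rng = np.random.default_rng(0)
X = rng.standard_normal((40, 6))
y = np.tile([0, 1], 20)
X[y == 1] += 1.0  # add a class difference so the problem is learnable

clf = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
cv = StratifiedKFold(n_splits=5)

# Manual cross-validation: each fold's decision values are stored at the
# original indices of its test trials.
manual = np.empty(len(y))
for train_idx, test_idx in cv.split(X, y):
    fold_clf = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
    fold_clf.fit(X[train_idx], y[train_idx])
    manual[test_idx] = fold_clf.decision_function(X[test_idx])

# cross_val_predict with the same splitter.
auto = cross_val_predict(clf, X, y, cv=cv, method="decision_function")

same_order = np.allclose(manual, auto)
print("aligned with original trial order:", same_order)
```

Under these assumptions, row i of the output corresponds to trial i of the input, so switching to KFold only for the sake of ordering may not be necessary.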

agramfort · May 4, 2021, 9:30am

@YuZhou it will not work to just call cross_val_predict. That’s why we wrote the cross_val_multiscore.

Either someone has the time to write it or you do your cross-validation by hand

for t_idx in range(n_times):
    y_preds = cross_val_predict(clf, X[:, :, t_idx], y, ...)

something like this

Alex
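A fuller sketch of that hand-written loop, assuming synthetic data in place of the real epochs (the shapes, random seed and the names X, y and d_values are illustrative assumptions, and this is not an MNE function). It runs cross_val_predict once per time point and collects the decision values into an (n_epochs, n_times) array.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data shaped like epochs: (n_epochs, n_channels, n_times).
rng = np.random.default_rng(0)
n_epochs, n_channels, n_times = 60, 8, 5
X = rng.standard_normal((n_epochs, n_channels, n_times))
y = np.repeat([0, 1], n_epochs // 2)
X[y == 1] += 0.5  # small class difference at every time point

clf = make_pipeline(
    StandardScaler(),
    LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto"),
)
cv = StratifiedKFold(n_splits=5)

# One cross-validated decision value per trial and per time point.
d_values = np.empty((n_epochs, n_times))
for t_idx in range(n_times):
    d_values[:, t_idx] = cross_val_predict(
        clf, X[:, :, t_idx], y, cv=cv, method="decision_function"
    )

# Per-trial correctness: a positive decision value means class 1.
correct = (d_values > 0) == (y[:, None] == 1)
print(d_values.shape, correct.mean())
```

Each column is an independent cross-validation, so the loop can be parallelized over time points if it turns out to be slow.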

YuZhou · May 4, 2021, 10:11am

Yeah, I get your point. But before calling cross_val_predict(), I do this: time_decod_col.fit(epochs_data_train, labels_fea_col),
and then
dvalues_fea_col = cross_val_predict(time_decod_col, epos, labels_fea_col, cv=cv, n_jobs=-1, method='decision_function'). The following are the output shapes for the different method arguments:

You can see that 625 is the number of time points, 115 is the number of epochs, and 3 is the number of permutations (which can be ignored). I think the reason I get these time-resolved results is that I fit the SlidingEstimator first and then call cross_val_predict(). If I did not fit the SlidingEstimator first, cross_val_predict() might not work, just like the problem I had with the decision_function method of SlidingEstimator. Back then I was confused about why decision_function would not give me the d values for each time point, even though the documentation states that SlidingEstimator has this method; it turned out that I had not called .fit on the SlidingEstimator. I don’t know if I have made my point clear.
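One way to check the "fit first" idea is the sketch below. It uses a plain scikit-learn classifier on synthetic data (an assumption; it does not test SlidingEstimator itself). cross_val_predict clones the estimator before fitting it on each training fold, so a prior .fit() call would not be expected to change its output, and the estimator passed in is left as it was.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold, cross_val_predict

# Synthetic two-class data (assumed sizes and seed).
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 4))
y = np.repeat([0, 1], 25)
X[y == 1] += 0.8
cv = StratifiedKFold(n_splits=5)

# Case 1: the estimator is never fitted by hand.
unfitted = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
out_unfitted = cross_val_predict(unfitted, X, y, cv=cv, method="decision_function")

# Case 2: the estimator is fitted on all data before cross-validation.
prefitted = LinearDiscriminantAnalysis(solver="eigen", shrinkage="auto")
prefitted.fit(X, y)
coef_before = prefitted.coef_.copy()
out_prefitted = cross_val_predict(prefitted, X, y, cv=cv, method="decision_function")

identical = np.allclose(out_unfitted, out_prefitted)            # same predictions
untouched = np.allclose(coef_before, prefitted.coef_)           # prior fit unchanged
unfitted_still_unfitted = not hasattr(unfitted, "coef_")        # clone was fitted, not the original
print(identical, untouched, unfitted_still_unfitted)
```

Whether the same holds when the estimator is a SlidingEstimator is not verified here; a runnable script on the sample data would settle that.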

agramfort · May 4, 2021, 11:18am

I am a bit lost.

share a script that uses the sample data that I can just run to see what you want to do.

A
