Training and testing classifiers using different datasets

  • MNE version: 0.24.0
  • operating system: Linux

Hey guys,
Recently I've been using the temporal generalization method for some decoding work. Specifically, I need to train classifiers at each time point on data from stage A, then test those classifiers on data from stage B. Here are some code snippets:

n_cvsplits = 8
cv = StratifiedKFold(n_splits=n_cvsplits, shuffle=True, random_state=42)
n_jobs = -2

clf = make_pipeline(
    StandardScaler(), # Z-score data, because gradiometers and magnetometers have different scales
    LogisticRegression(random_state=42, n_jobs=n_jobs, max_iter=1000))

globals()[f'slid_task_{t}_{r}_fea_{f}_{o}'] = GeneralizingEstimator(clf, n_jobs=n_jobs, scoring='roc_auc', verbose=True)

globals()[f'slid_task_{t}_{r}_fea_{f}_{o}'].fit(globals()[f'arr_epo_data_reorder_avg_windows_trans_{r}_{o}'], globals()[f'labels_learn_fea_{f}_obj_{o}'])

globals()[f'dvalue_per_task_{t}_{r}_fea_{f}_{o}'] = globals()[f'slid_task_{t}_{r}_fea_{f}_{o}'].decision_function(globals()[f'arr_epo_data_reorder_avg_windows_trans_per_{r}_{o}'])

The problem I ran into was that the process of transforming the estimators never seemed to finish, yet no error messages were printed and the result's shape was correct. Here is what the estimation process looks like:

Is there anything wrong with it? If there is, what can I do to fix it? :face_with_peeking_eye:
Thank you very much! :sparkling_heart: :sparkling_heart: :sparkling_heart:
Best

Hi, @YuZhou,
do you mean that the progress bar does not seem to complete but the outputs look correct?
Could you create a simple reproducible example of the problem you experience?

BTW, I think it's better to use dictionaries rather than assigning to globals(); your code will be easier to read (for example, slid_task[f'{t}_{r}_fea_{f}_{o}'] = ...). :slight_smile:
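A minimal sketch of that dictionary pattern (the loop values and the placeholder contents here are hypothetical, just mirroring the naming scheme in the original snippet; in real code the values would be the fitted GeneralizingEstimator objects):

```python
# One dict per kind of object, keyed by the loop variables,
# instead of globals()[f'slid_task_{t}_{r}_fea_{f}_{o}'] = ...
slid_task = {}

for t in ('learn', 'test'):          # hypothetical loop values
    for r in ('run1', 'run2'):
        for f in (0, 1):
            for o in (0, 1):
                key = f'{t}_{r}_fea_{f}_{o}'
                # In the real pipeline this would be a fitted estimator:
                # slid_task[key] = GeneralizingEstimator(clf, ...).fit(X, y)
                slid_task[key] = f'estimator for {key}'  # placeholder

# Lookup is now explicit, and the collection is easy to iterate over:
print(slid_task['learn_run1_fea_0_0'])
print(len(slid_task))  # 16 combinations
```

This also makes it trivial to loop over all fitted estimators later (`for key, est in slid_task.items(): ...`), which is awkward with globals().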

Hi, Mikolaj,
Thanks a lot for your reply and suggestions! Your description of the problem is exactly what I meant. I'm not sure how I can upload a data example for you.
After I posted this problem and before you replied, I found an alternative way to get the roc_auc. Instead of computing roc_auc from the values returned by decision_function(), I set scoring='roc_auc' on the estimator and then used the .score method to get the roc_auc values.

clf = make_pipeline(
            StandardScaler(), # Z-score data, because gradiometers and magnetometers have different scales
            LogisticRegression(random_state=42, n_jobs=n_jobs, max_iter=1000))
slid = GeneralizingEstimator(clf, n_jobs=n_jobs, scoring='roc_auc', verbose=True)
slid.fit(training_data, training_labels)
score = slid.score(test_data, test_labels)
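For what it's worth, the two routes should agree: the 'roc_auc' scorer ranks the classifier's decision values, so computing AUC from decision_function output by hand gives the same number as .score. A tiny pure-Python sketch of that ranking definition, with hypothetical toy decision values (no MNE or sklearn needed):

```python
def roc_auc(labels, scores):
    """AUC = probability that a randomly chosen positive trial gets a
    higher decision value than a randomly chosen negative one
    (ties count as half a win)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy decision-function values for 6 test trials (hypothetical):
labels = [1, 1, 1, 0, 0, 0]
scores = [2.1, 0.4, -0.3, 0.5, -1.0, -2.2]
print(roc_auc(labels, scores))  # 7 of 9 pairs ranked correctly -> 0.777...
```

So if .score gives you what you need, there is no reason to go through decision_function yourself.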

As for the dictionary suggestion, I will start using dictionaries instead of globals(). Thank you so much! :grinning: :grinning: :grinning: :heart: :heart: :heart:

Best
Yu