Is it possible to run MNE/SkLearn on a GPU, and if so can you tell me where?

I’m running some decoding models which take about 2 hours per participant. Naturally, I would like this to be faster. I’ve seen mixed reports on whether scikit-learn can run on a GPU, which in turn would affect whether the SlidingEstimator can run on a GPU. Do you know if it is possible, and if so, can you recommend a platform? Thanks!

are you already using n_jobs = -1?

what is your classifier?

Alex

Yes, I’m using n_jobs = -1. My classifier is Ridge; this is my approach:

from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from mne.decoding import SlidingEstimator
from sklearn.linear_model import RidgeClassifier
from sklearn.model_selection import GridSearchCV, cross_val_predict, RepeatedStratifiedKFold


rlr = RidgeClassifier(max_iter=1000)
grid = {'alpha': [1e-3, 1e-2, 0.1, 1, 5]}
cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=2, random_state=1)
rlr = GridSearchCV(rlr, grid, scoring='roc_auc', cv=cv, n_jobs=-1)
clf = make_pipeline(StandardScaler(), rlr)
time_decode = SlidingEstimator(clf, n_jobs=1, scoring='roc_auc', verbose=True)
predictions = cross_val_predict(time_decode, X, y, cv=4, method='decision_function')

You should use RidgeClassifierCV (https://scikit-learn.org/stable/modules/generated/sklearn.linear_model.RidgeClassifierCV.html) and not a grid search for Ridge; it cross-validates alpha internally.

Alex

Thanks! That appears to have sped things up a bit! I also found that wrapping the call in a joblib.parallel_backend context improved the speed further, like this:

import time

import joblib
from sklearn.linear_model import RidgeClassifierCV


def ridgeReturnProbabilitiesCV(X, y):
    # Inner CV used by RidgeClassifierCV to choose alpha
    cv = RepeatedStratifiedKFold(n_splits=3, n_repeats=2, random_state=1)

    rlr = RidgeClassifierCV(alphas=[1e-3, 1e-2, 0.1, 1, 5],
                            scoring='roc_auc',
                            cv=cv)

    clf = make_pipeline(StandardScaler(), rlr)
    time_decode = SlidingEstimator(clf, n_jobs=1, scoring='roc_auc', verbose=True)
    # decision_function returns signed confidence scores, not probabilities
    probabilities = cross_val_predict(time_decode, X, y, cv=4, method='decision_function')

    return probabilities


# Other imports are as in the earlier snippet; `participant` is assumed to be defined elsewhere
tic = time.time()
with joblib.parallel_backend(backend='loky', n_jobs=8):
    probabilities = ridgeReturnProbabilitiesCV(X, y)
toc = time.time() - tic
print(participant, 'took', toc / 60, 'minutes.')

I believe you could achieve a similar result, yet simplify your code, by passing n_jobs to cross_val_predict().
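
A minimal sketch of that idea, using only scikit-learn and synthetic data so it runs on its own. The X and y below are hypothetical placeholders, and the plain pipeline stands in for the SlidingEstimator; in the setup above you would presumably pass time_decode and the 3D epochs array instead, which I have not timed.

```python
import numpy as np
from sklearn.linear_model import RidgeClassifierCV
from sklearn.model_selection import cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data (hypothetical): 80 trials x 20 features, two balanced classes
rng = np.random.default_rng(0)
X = rng.standard_normal((80, 20))
y = np.repeat([0, 1], 40)
X[y == 1] += 0.5  # shift one class so there is something to decode

clf = make_pipeline(StandardScaler(),
                    RidgeClassifierCV(alphas=[1e-3, 1e-2, 0.1, 1, 5]))

# n_jobs here distributes the 4 outer folds across workers,
# so no explicit joblib.parallel_backend context is needed
scores = cross_val_predict(clf, X, y, cv=4, n_jobs=4,
                           method='decision_function')
print(scores.shape)
```

With cv=4 there are only four outer fits to distribute, so asking for more than 4 jobs at this level should not help further.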

Best wishes,
Richard

nice one, thanks!