You are right, the documentation is not good here.
To use any other scorer that ships with scikit-learn, you can simply pass its name as a string, e.g. in your case:
scoring='f1'
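For example, here is a minimal sketch of where that string goes, assuming a binary classification setup with cross_val_score (the data below is just a placeholder):

from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

X, y = make_classification(random_state=0)  # placeholder binary data
clf = LogisticRegression(solver='liblinear')
# the string 'f1' selects the F1 scorer with its default settings
scores = cross_val_score(clf, X, y, scoring='f1')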
To get a list of all scorers that can be used that way, call the sklearn.metrics.get_scorer_names() function, which produces output like this:
['accuracy',
'adjusted_mutual_info_score',
'adjusted_rand_score',
'average_precision',
'balanced_accuracy',
'completeness_score',
'explained_variance',
'f1',
'f1_macro',
'f1_micro',
'f1_samples',
'f1_weighted',
'fowlkes_mallows_score',
'homogeneity_score',
'jaccard',
'jaccard_macro',
'jaccard_micro',
'jaccard_samples',
'jaccard_weighted',
'matthews_corrcoef',
'max_error',
'mutual_info_score',
'neg_brier_score',
'neg_log_loss',
'neg_mean_absolute_error',
'neg_mean_absolute_percentage_error',
'neg_mean_gamma_deviance',
'neg_mean_poisson_deviance',
'neg_mean_squared_error',
'neg_mean_squared_log_error',
'neg_median_absolute_error',
'neg_root_mean_squared_error',
'normalized_mutual_info_score',
'precision',
'precision_macro',
'precision_micro',
'precision_samples',
'precision_weighted',
'r2',
'rand_score',
'recall',
'recall_macro',
'recall_micro',
'recall_samples',
'recall_weighted',
'roc_auc',
'roc_auc_ovo',
'roc_auc_ovo_weighted',
'roc_auc_ovr',
'roc_auc_ovr_weighted',
'top_k_accuracy',
'v_measure_score']
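To reproduce that list yourself and narrow it down, a minimal sketch (the prefix filter is just one way to search it):

from sklearn.metrics import get_scorer_names

# get_scorer_names() returns a sorted list of valid scoring strings
names = get_scorer_names()
print([name for name in names if name.startswith('f1')])
# ['f1', 'f1_macro', 'f1_micro', 'f1_samples', 'f1_weighted']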
When you pass one of these strings to sklearn.metrics.check_scoring(), you can see that, internally, make_scorer() is called:
# %%
from sklearn.linear_model import LogisticRegression
import sklearn.metrics as skm

estimator = LogisticRegression(solver='liblinear')
# resolve the scoring string to the scorer object it maps to
skm.check_scoring(estimator, 'f1')
returns:
make_scorer(f1_score, average=binary)
Passing the string should, I believe, generally be preferred over manually wrapping a metric function with make_scorer(), as it ensures that reasonable defaults are set.
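If you want the scorer object itself (e.g. to call it directly on a fitted estimator), sklearn.metrics.get_scorer() resolves the same strings; a minimal sketch with placeholder data:

import sklearn.metrics as skm
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(random_state=0)  # placeholder binary data
estimator = LogisticRegression(solver='liblinear').fit(X, y)
scorer = skm.get_scorer('f1')  # the scorer that check_scoring resolves 'f1' to
print(scorer(estimator, X, y))  # a scorer is called as scorer(estimator, X, y)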
I hope this clarifies things a bit.
It would be great if you’d be willing to improve our documentation. Let us know if you’re interested, and we can guide you.
Best wishes,
Richard