I assumed that I would get the same result, and that the only difference would
be on the diagonal, where the diagonal training scores would all be 1.
However, the overall results are quite different (although you can still see a
faint underlying pattern that is similar in both).
Any idea why plotting scores using cross-validation vs. plotting only the
fitting/training scores gives such different results?
This is my understanding of what should be going on: in the training case,
without any cross-validation, at each time point a classifier/decoder was
trained on all EEG channels' data over all epochs at that training time
point, so it would give a (near-)perfect score when tested at that same time
point. A different time point (the testing times), however, contains
different data, which can be regarded as a test set for this decoder. Right?
(Even if there is autocorrelation in the EEG data over time, still seeing a
meaningful pattern in the time-generalization matrix would mean that the EEG
data carried task-related information over time, which is still meaningful.)
*You make the cross-validation folds random (shuffle=True). This means*
*that every time you run the code you will get a different value. This is*
*the inherent variance of the statistic that cross-validation reports.*
*To avoid this randomness (which should not be neglected) you can fix the*
*random state, e.g.*
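A minimal check of that suggestion (sklearn only; the toy data here is an assumption): with shuffle=True and a fixed random_state, StratifiedKFold produces identical folds on every run, so the reported score no longer varies between runs.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

X = np.zeros((10, 1))
y = np.array([0, 1] * 5)

def fold_indices(seed):
    # shuffle=True randomizes fold membership; random_state pins the shuffle
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    return [test.tolist() for _, test in cv.split(X, y)]

assert fold_indices(42) == fold_indices(42)  # identical folds every run
# with random_state=None the folds (and hence the scores) change between runs
```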
Which, I gather, switches the reference channel? I can do that. But how do I, instead of switching, create the average of M1 and M2, AND know that it worked?
Also, is there an easy way to find out which reference is currently being used?
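For what "the average of M1 and M2" does numerically, here is a numpy-only sketch (the channel names and random data are assumptions): a linked-mastoid reference subtracts the mean of the two mastoid channels from every channel, which is also what MNE's raw.set_eeg_reference(ref_channels=['M1', 'M2']) applies, assuming those channel names exist in your data.

```python
import numpy as np

rng = np.random.default_rng(1)
ch_names = ['Fz', 'Cz', 'Pz', 'M1', 'M2']
data = rng.standard_normal((5, 1000))  # channels x samples

# reference signal: the average of the two mastoid channels
mastoid = data[[ch_names.index('M1'), ch_names.index('M2')]].mean(axis=0)
rereferenced = data - mastoid          # subtracted from every channel

# sanity check that it "worked": after re-referencing, M1 and M2 are
# mirror images of each other, so their average is zero at every sample
m1, m2 = rereferenced[ch_names.index('M1')], rereferenced[ch_names.index('M2')]
assert np.allclose((m1 + m2) / 2, 0)
```

The assertion is one way to confirm the re-referencing worked on your own data as well.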
Thanks a lot for the helpful information. I have run my code following your
suggestion, but I want to make sure I understood your point correctly.
As you suggested, the difference in results might be due to "shuffle=True".
However, I do not understand why you suggested *StratifiedShuffleSplit*.
What would be the difference between "cv = *StratifiedKFold*(n_splits=5,
shuffle=True, random_state=42)" or "cv = *StratifiedKFold*(n_splits=5,
shuffle=False)" vs "cv = *StratifiedShuffleSplit*(n_splits=1000,
random_state=42)"? Why not just set the random state and use "cv =
*StratifiedShuffleSplit*(n_splits=5, random_state=42)" with the same
number of splits?
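The mechanical difference between the two splitters can be sketched as follows (sklearn only; the toy labels and the explicit test_size=4 are assumptions for illustration): StratifiedKFold partitions the data, so every sample lands in exactly one test fold, while StratifiedShuffleSplit draws each split independently, so across splits some samples can be tested several times and others never.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold, StratifiedShuffleSplit

X = np.zeros((20, 1))
y = np.array([0, 1] * 10)

# K-fold: the test folds are disjoint and together cover the whole dataset
kf = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
kf_test = np.concatenate([test for _, test in kf.split(X, y)])
assert sorted(kf_test) == list(range(20))  # every sample tested exactly once

# shuffle-split: each split is an independent random draw, so the union of
# test sets typically contains repeats and misses some samples entirely
ss = StratifiedShuffleSplit(n_splits=5, test_size=4, random_state=42)
ss_test = np.concatenate([test for _, test in ss.split(X, y)])
```

This is also why StratifiedShuffleSplit is often run with a large n_splits (e.g. 1000): the independent draws average out the split-to-split variance, whereas 5-fold CV gives only the 5 scores its single partition allows.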