I am currently investigating an experimental condition for a single subject across 4 sessions. When I looked at the variance explained values, I noticed strong fluctuations: on 2 of the 4 sessions it is ~2%, and on the other two it is ~60%. What could have gone wrong here? What parameters could I adjust to get better results?
Hi Nils, I suspect there's a problem with the regularization here. Make sure you calculate the covariance based on the noise in your data, i.e., you would want to use the pre-stimulus baseline only:
Hi Richard, I tried that (with different amounts of pre-stimulus baseline), but it doesn't seem to have much influence on the outcome, maybe a few percent up or down.
This is the output for the four different days. The biggest difference between the good sessions (2, 3) and the bad sessions (1, 4) is the drop in noise levels at the higher eigenvalues.
For sessions 1 and 4, the estimated rank seems to be incorrect: the dashed red vertical line should appear just where the signal starts to drop. The “step” we're seeing in these two plots is also somewhat unusual. Did you process all sessions in exactly the same way? Are different electrodes marked as bad in each session? Did you run ICA and remove a different number of components per session?
(All of this would be fine; I'm just trying to narrow down why the ranks differ. The next step would then be to get MNE to use the correct rank for each session in its calculations.)
Let's leave out sessions 2 and 3 for now, since things already appear to be working for them, and focus on 1 and 4.
compute_covariance allows for the explicit specification of a rank. Could you try, maybe starting with session 4 (where the plot is much clearer, IMHO), to find a rank value that places the vertical dashed line in the SVD plot right at the tip of the curve, just before it starts to drop off? Keep adjusting the rank value until the SVD plot looks correct. Once you've gotten there, continue running your source analysis for this session and see if anything changes.
Great! So the explained variance is in a reasonable range now?
Empirical rank estimation is extremely tricky because it involves thresholding. You can use mne.compute_rank() to manually control the thresholding parameters, specifically tol and tol_kind.
In any case, it remains imperative to always check the SVD plots to ensure the estimated rank actually matches the respective data.
Roughly the same range as for the better sessions.
I’ll give that a try, thanks.
Btw: I saw that 56 would also give stable, good results for sessions 2 and 3. Would it be appropriate to use a rank that is lower than needed if the explained variance remains unchanged? Otherwise, one could perhaps use the minimum over all sessions.
I suspect you would risk losing a little bit of information if you assume a rank that is lower than the actual rank of the data; however, unlike using a rank that is too high, it won't totally break your analysis. @larsoner is more knowledgeable on this, so I'd rely on his judgment here.
Yes, by setting the rank too low you throw away the lowest-variance components. By setting it too high you include components that are numerically zero, which can blow up your result.
In your case if you’re going to use different covariances for each day, I would use different (correct) rank values for each day as well.