EEGLAB reader

External Email - Use Caution

Hi all,

I am trying to reproduce in MNE some analyses published in
https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0030135
using EEGLAB and I am slowly tracking down sources of differences between
the results I obtain with these two libraries.

The EEGLAB and the MNE readers for .set files seem to give different
results. I am using the file km81.set for this example, which can be
downloaded here: ftp://sccn.ucsd.edu/pub/mica_release.zip

Python/MNE code:

import mne
import numpy as np

eeg = mne.io.read_epochs_eeglab('/home/christian/Documents/mica_release/datasets/km81.set')
n_epochs, n_chan, n_sample = eeg.get_data().shape
eeg_data = eeg.get_data().reshape((n_chan, n_sample*n_epochs))
eeg_data *= 1e6  # MNE stores EEG in volts; convert to microvolts as in EEGLAB
print(eeg_data[:5, :5])
print(eeg_data.shape)
print(sorted(eeg_data[:, 0]))
print(np.min(eeg_data, axis=1)[:5])
print(np.max(eeg_data, axis=1)[:5])
print(np.min(eeg_data, axis=1).shape)

Python/MNE output:

[[ -30.3018589 -9.46370029 -32.11343384 -80.96838379 -112.64376831]
[ 17.98744011 13.38890266 8.96499634 8.95211601 10.52659035]
[ 1.70007288 1.08449709 -1.92835653 -1.52061117 3.14573812]
[ 22.36253166 19.08609772 14.59503269 14.00534725 16.24861908]
[ -11.30813694 -12.02319813 -9.68249226 -6.36183262 -6.76167011]]
(71, 310450)
[-54.222572326660156, -49.53323745727539, -38.230918884277344,
-30.919052124023434, -30.30185890197754, -28.795389175415036,
-26.802810668945312, -23.46408462524414, -19.828861236572266,
-19.07419776916504, -16.326366424560547, -16.310237884521484,
-14.442460060119629, -14.286537170410154, -13.221666336059569,
-11.308136940002441, -10.6834716796875, -10.423746109008789,
-9.760689735412596, -8.87014102935791, -6.883561611175537,
-6.867288589477539, -5.7860164642333975, -4.9906535148620605,
-3.48313570022583, -3.4533519744873047, -3.3310370445251465,
-2.4730896949768066, -0.9696072936058044, -0.8216205835342406,
-0.40084442496299744, 0.22790522873401642, 0.28669825196266174,
1.3285683393478394, 1.4407401084899902, 1.7000728845596311,
1.7271562814712524, 1.8890516757965088, 3.3240826129913326,
3.567858934402466, 4.892752170562744, 4.926086902618408, 5.629435539245605,
5.7694478034973145, 6.5682663917541495, 7.257050514221191,
7.54573392868042, 7.608102798461913, 7.769186019897461, 7.779174804687499,
8.121392250061033, 9.502474784851074, 9.762967109680174, 9.94691467285156,
10.304577827453613, 10.690375328063965, 11.20960521697998,
13.293575286865234, 14.423436164855955, 15.029093742370604,
16.452854156494137, 17.035533905029297, 17.987440109252926,
18.829505920410156, 22.084123611450195, 22.362531661987305,
24.502143859863278, 26.561006546020504, 36.645851135253906,
48.55062484741211, 121.60424041748047]
[-162.69863892 -201.91339111 -130.68704224 -348.22705078 -198.54916382]
[291.74282837 204.80189514 195.11305237 277.12097168 262.78338623]
(71,)

MATLAB/EEGLAB code:

EEG = pop_loadset('/home/christian/Documents/mica_release/datasets/km81.set');
nchans = EEG.nbchan;  % number of channels
data = reshape(EEG.data,nchans,EEG.pnts*EEG.trials);
size(data)
data(1:5, 1:5)
sort(data(:, 1))'
min(data(1:5, :)')
max(data(1:5, :)')
size(min(data(:, :)'))

MATLAB/EEGLAB output:

ans =
          71 310450
ans =

  5×5 single matrix

  -30.3019 -9.4637 -32.1134 -80.9684 -112.6438
  -10.2374 -5.0511 -1.8697 -1.6044 -1.1974
  -20.4854 -9.7555 -1.0124 4.8327 9.6969
  -25.3829 -13.1146 -3.3851 2.5596 7.3826
   -3.8909 2.9585 6.2366 3.6442 0.2817

ans =

  -30.3019 -26.0869 -25.9951 -25.8994 -25.3829 -24.8852 -24.4983 -23.8505 -23.5649 -23.0836
  -22.0849 -22.0227 -21.2673 -20.4854 -20.4535 -19.2276 -18.0331 -17.7991 -17.4316 -17.3945
  -17.2824 -16.0406 -15.9822 -15.1589 -15.0862 -14.9993 -14.7288 -14.5398 -13.8370 -13.3654
  -13.3466 -13.1214 -12.9615 -11.1150 -10.2374 -9.3882 -8.3259 -8.2251 -7.8012 -7.5449
   -6.9841 -6.8029 -6.7175 -6.5904 -6.5436 -6.0545 -5.6778 -5.5130 -5.0071 -4.9592
   -4.3509 -4.3206 -4.0873 -3.8909 -3.4353 -3.3442 -3.1263 -3.0407 -2.7963 -2.7472
   -2.0239 -1.5894 -1.4584 -1.2007 -1.1818 -0.6294 -0.5136 -0.1163 1.2653 2.0582
    2.5924

ans =
-441.1615 -282.5767 -82.9421 -99.8472 -145.7018

ans =
  449.5460 152.6343 77.8951 76.6568 343.0334

ans =
     1 71

As can be seen, the first samples of the first channel have the same values
in both readers, but the values then diverge (as shown by the max/min values
of this channel differing between the two outputs). The other channels
do not have the same values either, even for their first samples. This is
not due to swapped channels, since the sorted values of the first sample
across the 71 channels are not the same.
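
The channel-swap check described above can be sketched as follows. This is only an illustration on synthetic data: the arrays `a`, `b`, `c` and the helper `same_up_to_channel_order` are made-up names, not part of either library.

```python
import numpy as np

def same_up_to_channel_order(x, y, sample=0):
    """Return True if one sample holds the same values in both arrays, ignoring channel order."""
    return np.allclose(np.sort(x[:, sample]), np.sort(y[:, sample]))

rng = np.random.default_rng(0)
a = rng.standard_normal((6, 10))   # stand-in for (n_channels, n_samples) data
b = a[rng.permutation(6)]          # same data with the channels shuffled
c = rng.standard_normal((6, 10))   # unrelated data

print(same_up_to_channel_order(a, b))  # shuffled channels still match after sorting
print(same_up_to_channel_order(a, c))  # different data does not
```

If the sorted values differ, as they do in the outputs above, a simple channel permutation cannot explain the mismatch.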

At this point, I am not sure if these differences are due to:
- me not using the library correctly (although this code seems pretty
minimal and I made a diligent effort in looking for errors in my code)
- some under-the-hood assumptions that are different between the two
readers (e.g., some preprocessing done automatically like re-referencing or
filtering)
- a bug in one of the two readers

Any ideas?

Best,

Christian

eeg_data = eeg.get_data().reshape((n_chan, n_sample*n_epochs))

...

data = reshape(EEG.data,nchans,EEG.pnts*EEG.trials);

Have you thought about 1) the dimensions of each thing being reshaped, and
2) how `reshape` handles array order in each language?

For the reshape step:

1. NumPy by default uses C order, meaning the last dimension changes the
fastest
2. MATLAB stores data in F/Fortran order, meaning the first dimension
changes the fastest

So, given the same input dimension ordering, the reshaping is not going to
be equivalent here, unless you pass order='F' to np.reshape (or do
something to get MATLAB to change how it reorders).
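
A minimal sketch of the two orders, using a tiny made-up array rather than the real data (the shape 2 x 3 x 4 is arbitrary and only for illustration):

```python
import numpy as np

# Small stand-in array: 2 x 3 x 4, filled with 0..23 so element order is easy to follow
a = np.arange(24).reshape(2, 3, 4)

# NumPy default (order='C'): the last axis varies fastest when elements are read and written
c_flat = a.reshape(3, 8)

# Fortran order, which is how MATLAB's reshape walks an array: the first axis varies fastest
f_flat = a.reshape(3, 8, order='F')

print(c_flat[0])
print(f_flat[0])
print(np.array_equal(c_flat, f_flat))
```

Same input and same target shape, but the two results hold the elements in different positions.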

Regarding input dimension ordering, the dimensions of the Python array are
(n_epochs, n_channels, n_times). Not sure what shape EEGLAB gives you.
Depending on that, you might need to `.transpose([...])` in Python to get
it to the right dimension order, *and* a `np.reshape(..., order='F')` to
then reorder it the same way.
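
As a rough sketch of that combination, assuming EEGLAB lays out epoched data as (channels, points, trials), which should be checked against the actual `EEG.data`. The array below is synthetic and the variable names are illustrative:

```python
import numpy as np

n_epochs, n_chan, n_times = 2, 3, 4
rng = np.random.default_rng(0)
x = rng.standard_normal((n_epochs, n_chan, n_times))  # MNE-style (epochs, channels, times)

# Assumption: EEGLAB layout is (channels, points, trials)
x_matlab_layout = x.transpose(1, 2, 0)

# Emulate MATLAB's reshape(EEG.data, nchans, pnts*trials) by reshaping in Fortran order
flat = x_matlab_layout.reshape(n_chan, n_times * n_epochs, order='F')

# Reference: for each channel, epoch 0 followed by epoch 1, and so on
expected = np.concatenate([x[e] for e in range(n_epochs)], axis=1)

print(flat.shape)
print(np.array_equal(flat, expected))
```

Under that layout assumption, transpose plus `order='F'` reproduces what the MATLAB call computes.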

Maybe you've thought about these things and accounted for them already, but
calling eeg.get_data().reshape(n_channels,n_sample*n_epochs) looks a bit
suspicious -- I'm guessing you want the data for each channel across time
(epochs stacked horizontally, basically). To get this you would want to add
a transpose step:

eeg.get_data().transpose([1, 0, 2]).reshape(n_chan, n_epochs*n_sample)
# or just .reshape(n_chan, -1)

So I suspect there are some dimension-order / reshape-order problems
popping up in your code.
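
To make the difference visible, here is a small sketch with synthetic values (not the km81 data) where each number encodes its origin as channel*100 + epoch*10 + sample:

```python
import numpy as np

n_epochs, n_chan, n_sample = 3, 2, 4

# data[e, c, s] = 100*c + 10*e + s, shape (n_epochs, n_chan, n_sample)
data = (10 * np.arange(n_epochs)[:, None, None]
        + 100 * np.arange(n_chan)[None, :, None]
        + np.arange(n_sample)[None, None, :])

# Direct reshape: rows end up mixing channels
naive = data.reshape(n_chan, n_epochs * n_sample)

# Transpose to (n_chan, n_epochs, n_sample) first, then flatten the last two axes
fixed = data.transpose(1, 0, 2).reshape(n_chan, -1)

print(naive[0])
print(fixed[0])
```

The first `n_sample` values of row 0 agree in both versions, which would explain why only the very first samples of the first channel matched EEGLAB. After that, the direct reshape pulls in values from the other channel.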

Eric

Thanks Eric. I got tripped up by the quasi-direct translation of code
from MATLAB to NumPy. Your intuition was right: it was an issue related to
differences in reshape ordering. It worked well (and I think it makes the
code more readable) with np.hstack(eeg.get_data()) instead of .reshape().
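
A quick sketch of why the two approaches agree, using a random array as a stand-in for `eeg.get_data()` (the shape 5 x 3 x 7 is arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)
epochs_data = rng.standard_normal((5, 3, 7))  # stand-in for (n_epochs, n_chan, n_sample)

# np.hstack treats the array as a sequence of per-epoch (n_chan, n_sample) blocks
# and concatenates them along the time axis
stacked = np.hstack(epochs_data)

# Equivalent transpose-then-reshape form
reshaped = epochs_data.transpose(1, 0, 2).reshape(3, -1)

print(stacked.shape)
print(np.array_equal(stacked, reshaped))
```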

Best wishes,

Christian

In reply to Eric Larson <larson.eric.d at gmail.com>, message of
Fri, 27 Sep 2019 at 12:50.


_______________________________________________
Mne_analysis mailing list
Mne_analysis at nmr.mgh.harvard.edu

hi Christian,

FYI we did this replication effort in:

https://link.springer.com/chapter/10.1007/978-3-319-53547-0_27
https://hal.archives-ouvertes.fr/hal-01451432 (free pdf)

contact me directly if you need some code snippets.

Alex