ERP XDAWN Decoding

Dear MNE experts:

I ran into a problem while computing XDAWN on an ERP dataset. I think the problem is caused by GBK encoding, but I cannot find a way to specify an encoding in fname = os.path.join(eeg_path, '010101_1.set').
May I ask how to solve this 'gbk' problem?

Best wishes
YuTong

import mne
import os
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import MinMaxScaler

from mne import io, pick_types, read_events, Epochs, EvokedArray, create_info
from mne.preprocessing import Xdawn
from mne.decoding import Vectorizer


print(__doc__)

data_path = r'C:\WM_DATA\DATA\DATA'  # raw string keeps the backslashes literal
eeg_path = data_path 
fname = os.path.join(eeg_path,'010101_1.set')
tmin, tmax = -0.1, 0.3
event_id = {'B9(size4_left_noc)/71': 46, 'B5(size2_left_noc)/61': 55, 'B3(size1_right_noc)/53': 56, 'B2(size1_left_c)/52': 59, 'B1(size1_left_noc)/51': 59, 'B8(size2_right_c)/64': 48, 'B4(size1_right_c)/54': 56, 'B6(size2_left_c)/62': 56, 'B12(size4_right_c)/74': 19, 'B10(size4_left_c)/72': 49}
n_filter = 3

# Setup for reading the raw data
raw = mne.io.read_epochs_eeglab(fname)
raw.filter(1, 20, fir_design='firwin')
events = read_events(fname)

picks = pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False,
                   exclude='bads')

epochs = Epochs(raw, events, event_id, tmin, tmax, proj=False,
                picks=picks, baseline=None, preload=True,
                verbose=False)

# Create classification pipeline
clf = make_pipeline(Xdawn(n_components=n_filter),
                    Vectorizer(),
                    MinMaxScaler(),
                    LogisticRegression(penalty='l1', solver='liblinear',
                                       multi_class='auto'))

# Get the labels
labels = epochs.events[:, -1]

# Cross validator
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

# Do cross-validation
preds = np.empty(len(labels))
for train, test in cv.split(epochs, labels):
    clf.fit(epochs[train], labels[train])
    preds[test] = clf.predict(epochs[test])

# Classification report
target_names = ['aud_l', 'aud_r', 'vis_l', 'vis_r']
report = classification_report(labels, preds, target_names=target_names)
print(report)

# Normalized confusion matrix
cm = confusion_matrix(labels, preds)
cm_normalized = cm.astype(float) / cm.sum(axis=1)[:, np.newaxis]

# Plot confusion matrix
fig, ax = plt.subplots(1)
im = ax.imshow(cm_normalized, interpolation='nearest', cmap=plt.cm.Blues)
ax.set(title='Normalized Confusion matrix')
fig.colorbar(im)
tick_marks = np.arange(len(target_names))
plt.xticks(tick_marks, target_names, rotation=45)
plt.yticks(tick_marks, target_names)
fig.tight_layout()
ax.set(ylabel='True label', xlabel='Predicted label')

Module created for script run in IPython
Extracting parameters from C:\WM_DATA\DATA\DATA\010101_1.set…
Not setting metadata
613 matching events found
c:\users\yyt.spyder-py3\xdawn.py:27: RuntimeWarning: At least one epoch has multiple events. Only the latency of the first event will be retained.
raw = mne.io.read_epochs_eeglab(fname)
C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\io\eeglab\eeglab.py:149: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use array.size > 0 to check that an array is not empty.
if d.get("type", None) != 'FID':
C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\io\eeglab\eeglab.py:149: DeprecationWarning: The truth value of an empty array is ambiguous. Returning False, but in future this will result in an error. Use array.size > 0 to check that an array is not empty.
if d.get("type", None) != 'FID':
No baseline correction applied
0 projection items activated
Ready.
Setting up band-pass filter from 1 - 20 Hz

FIR filter parameters

Designing a one-pass, zero-phase, non-causal bandpass filter:

  • Windowed time-domain design (firwin) method
  • Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
  • Lower passband edge: 1.00
  • Lower transition bandwidth: 1.00 Hz (-6 dB cutoff frequency: 0.50 Hz)
  • Upper passband edge: 20.00 Hz
  • Upper transition bandwidth: 5.00 Hz (-6 dB cutoff frequency: 22.50 Hz)
  • Filter length: 3301 samples (3.301 sec)

c:\users\yyt.spyder-py3\xdawn.py:28: RuntimeWarning: filter_length (3301) is longer than the signal (1400), distortion is likely. Reduce filter length or filter a longer signal.
raw.filter(1, 20, fir_design='firwin')
[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done 1 out of 1 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 2 out of 2 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 3 out of 3 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 4 out of 4 | elapsed: 0.0s remaining: 0.0s
[Parallel(n_jobs=1)]: Done 17777 out of 17777 | elapsed: 7.4s finished
c:\users\yyt.spyder-py3\xdawn.py:29: RuntimeWarning: This filename (C:\WM_DATA\DATA\DATA\010101_1.set) does not conform to MNE naming conventions. All events files should end with .eve, -eve.fif, -eve.fif.gz, -eve.lst, -eve.txt, _eve.fif, _eve.fif.gz, _eve.lst, _eve.txt or -annot.fif
events = read_events(fname)
Traceback (most recent call last):

File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
exec(code, globals, locals)

File "c:\users\yyt.spyder-py3\xdawn.py", line 29, in
events = read_events(fname)

File "", line 12, in read_events

File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\event.py", line 266, in read_events
lines = np.loadtxt(filename, dtype=np.float64).astype(int)

File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\numpy\lib\npyio.py", line 1098, in loadtxt
first_line = next(fh)

UnicodeDecodeError: 'gbk' codec can't decode byte 0xca in position 188: illegal multibyte sequence

Hello @hp20072929 and welcome to the forum!

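read_events() cannot work with a .set file. It expects an MNE events file (see the naming warning in your output), so it ends up trying to parse the .set file as plain text, and that is where the 'gbk' decoding error comes from. It is not an encoding setting you need to change.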

The EEGLAB reader automatically creates annotations, which can be converted to events:

raw = mne.io.read_raw_eeglab(fname)
events, event_id = mne.events_from_annotations(raw)

I’m also not sure why you’re using read_epochs_eeglab() to read raw data? I’m surprised it even works…

Best wishes,
Richard

Thank you so much, Richard.
After making this change, I have a new problem with a mat file. May I also ask whether it is possible to run XDAWN on an event-related potential dataset?

Best regards
YuTong

import mne
import os
import numpy as np
import matplotlib.pyplot as plt

from sklearn.model_selection import StratifiedKFold
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.preprocessing import MinMaxScaler

from mne import io, pick_types, read_events, Epochs, EvokedArray, create_info
from mne.preprocessing import Xdawn
from mne.decoding import Vectorizer


print(__doc__)

data_path = r'C:\WM_DATA\raw'  # raw string keeps the backslashes literal
eeg_path = data_path 
fname = os.path.join(eeg_path,'010101_1.vhdr')
tmin, tmax = -0.1, 0.3
event_id = {'B9(size4_left_noc)/71': 46, 'B5(size2_left_noc)/61': 55, 'B3(size1_right_noc)/53': 56, 'B2(size1_left_c)/52': 59, 'B1(size1_left_noc)/51': 59, 'B8(size2_right_c)/64': 48, 'B4(size1_right_c)/54': 56, 'B6(size2_left_c)/62': 56, 'B12(size4_right_c)/74': 19, 'B10(size4_left_c)/72': 49}
n_filter = 3

# Setup for reading the raw data
raw = mne.io.read_raw_eeglab(fname)

events, event_id = mne.events_from_annotations(raw)

picks = pick_types(raw.info, meg=False, eeg=True, stim=False, eog=False,
                   exclude='bads')

epochs = Epochs(raw, events, event_id, tmin, tmax, proj=False,
                picks=picks, baseline=None, preload=True,
                verbose=False)

# Create classification pipeline
clf = make_pipeline(Xdawn(n_components=n_filter),
                    Vectorizer(),
                    MinMaxScaler(),
                    LogisticRegression(penalty='l1', solver='liblinear',
                                       multi_class='auto'))

# Get the labels
labels = epochs.events[:, -1]

# Cross validator
cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=42)

# Do cross-validation
preds = np.empty(len(labels))
for train, test in cv.split(epochs, labels):
    clf.fit(epochs[train], labels[train])
    preds[test] = clf.predict(epochs[test])

# Classification report
target_names = ['aud_l', 'aud_r', 'vis_l', 'vis_r']
report = classification_report(labels, preds, target_names=target_names)
print(report)

# Normalized confusion matrix
cm = confusion_matrix(labels, preds)
cm_normalized = cm.astype(float) / cm.sum(axis=1)[:, np.newaxis]

# Plot confusion matrix
fig, ax = plt.subplots(1)
im = ax.imshow(cm_normalized, interpolation='nearest', cmap=plt.cm.Blues)
ax.set(title='Normalized Confusion matrix')
fig.colorbar(im)
tick_marks = np.arange(len(target_names))
plt.xticks(tick_marks, target_names, rotation=45)
plt.yticks(tick_marks, target_names)
fig.tight_layout()
ax.set(ylabel='True label', xlabel='Predicted label')
runfile('C:/Users/yyt/.spyder-py3/11_15.py', wdir='C:/Users/yyt/.spyder-py3')
Module created for script run in IPython
Traceback (most recent call last):

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)

  File "c:\users\yyt\.spyder-py3\11_15.py", line 27, in <module>
    raw = mne.io.read_raw_eeglab(fname)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\io\eeglab\eeglab.py", line 259, in read_raw_eeglab
    return RawEEGLAB(input_fname=input_fname, preload=preload,

  File "<decorator-gen-277>", line 12, in __init__

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\io\eeglab\eeglab.py", line 358, in __init__
    eeg = _check_load_mat(input_fname, uint16_codec)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\io\eeglab\eeglab.py", line 60, in _check_load_mat
    eeg = _readmat(fname, uint16_codec=uint16_codec)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\io\eeglab\_eeglab.py", line 82, in _readmat
    return read_mat(fname, uint16_codec=uint16_codec)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\pymatreader\pymatreader.py", line 87, in read_mat
    mjv, _ = matfile_version(fid)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\scipy\io\matlab\_miobase.py", line 223, in matfile_version
    return _get_matfile_version(fileobj)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\scipy\io\matlab\_miobase.py", line 251, in _get_matfile_version
    raise ValueError('Unknown mat file type, version %s, %s' % ret)

ValueError: Unknown mat file type, version 10, 68

The file you’re trying to load ends with .vhdr. This is a BrainVision, not an EEGLAB file.

Thanks for the awesome information.

You are still trying to load a BrainVision file (fname = os.path.join(eeg_path, '010101_1.vhdr')) with the EEGLAB reader (raw = mne.io.read_raw_eeglab(fname)), which cannot work. Please read it with the correct reader, in this case raw = mne.io.read_raw_brainvision(fname), or with the general reader raw = mne.io.read_raw(fname), which infers the data format from the file extension.
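
As a rough illustration of what "infers the data format from the file extension" means, the sketch below maps a few extensions to the name of the matching mne.io reader. The table and the helper reader_name_for are hypothetical and written only for this example; mne.io.read_raw() uses its own, more complete logic.

```python
from pathlib import Path

# Hypothetical lookup table for illustration only; it covers just a few
# formats. The values are names of functions that exist in mne.io.
READER_BY_EXTENSION = {
    ".vhdr": "read_raw_brainvision",
    ".set": "read_raw_eeglab",
    ".fif": "read_raw_fif",
    ".edf": "read_raw_edf",
}


def reader_name_for(fname):
    """Return the name of the mne.io reader matching the file extension."""
    suffix = Path(fname).suffix.lower()
    try:
        return READER_BY_EXTENSION[suffix]
    except KeyError:
        raise ValueError(f"No reader known for extension {suffix!r}") from None


print(reader_name_for("010101_1.vhdr"))
print(reader_name_for("010101_1.set"))
```

The point is only that a .vhdr file and a .set file need different readers, so passing one to the other's reader fails before any data are read.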

Thanks so much.
After fixing the file format, I ran into a new problem: MemoryError: Unable to allocate 9.88 GiB for an array with shape (401, 3307889) and data type float64

runfile('C:/Users/yyt/.spyder-py3/11_15.py', wdir='C:/Users/yyt/.spyder-py3')
Module created for script run in IPython
Reading C:\WM_DATA\raw_set\01_1.fdt
Used Annotations descriptions: ['S  5', 'S  6', 'S 11', 'S 12', 'S 21', 'S 22', 'S 23', 'S 24', 'S 31', 'S 32', 'S 33', 'S 34', 'S 41', 'S 42', 'S 43', 'S 44', 'S 51', 'S 52', 'S 53', 'S 54', 'S 61', 'S 62', 'S 63', 'S 64', 'S 71', 'S 72', 'S 73', 'S 74', 'S 77', 'S 88', 'S 99', 'boundary']
c:\users\yyt\.spyder-py3\11_15.py:27: RuntimeWarning: The data contains 'boundary' events, indicating data discontinuities. Be cautious of filtering and epoching around these events.
  raw = mne.io.read_raw_eeglab(fname)
C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\sklearn\model_selection\_split.py:684: UserWarning: The least populated class in y has only 1 members, which is less than n_splits=10.
  warnings.warn(
Computing rank from data with rank='full'
    EEG: rank 31 from info
Reducing data rank from 31 -> 31
Estimating covariance using EMPIRICAL
Done.
Traceback (most recent call last):

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\spyder_kernels\py3compat.py", line 356, in compat_exec
    exec(code, globals, locals)

  File "c:\users\yyt\.spyder-py3\11_15.py", line 54, in <module>
    clf.fit(epochs[train], labels[train])

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\sklearn\pipeline.py", line 378, in fit
    Xt = self._fit(X, y, **fit_params_steps)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\sklearn\pipeline.py", line 336, in _fit
    X, fitted_transformer = fit_transform_one_cached(

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\joblib\memory.py", line 349, in __call__
    return self.func(*args, **kwargs)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\sklearn\pipeline.py", line 870, in _fit_transform_one
    res = transformer.fit_transform(X, y, **fit_params)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\decoding\mixin.py", line 33, in fit_transform
    return self.fit(X, y, **fit_params).transform(X)

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\preprocessing\xdawn.py", line 460, in fit
    filters, patterns, evokeds = _fit_xdawn(

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\preprocessing\xdawn.py", line 171, in _fit_xdawn
    evokeds, toeplitzs = _least_square_evoked(

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\mne\preprocessing\xdawn.py", line 87, in _least_square_evoked
    toeplitz.append(linalg.toeplitz(trig[0:window], trig))

  File "C:\Users\yyt\mne-python\1.1.1_0\lib\site-packages\scipy\linalg\_special_matrices.py", line 199, in toeplitz
    return as_strided(vals[len(c)-1:], shape=out_shp, strides=(-n, n)).copy()

MemoryError: Unable to allocate 9.88 GiB for an array with shape (401, 3307889) and data type float64

Best regards
YuTong

The error is self-explanatory: you ran out of RAM, so it could not allocate the 9.88 GiB needed for this operation. Either reduce your memory consumption (for instance, by making sure you don't keep large arrays you no longer use in a variable), or run your code on a computer with more RAM.

Thanks.
In fact, this computer has 16GB RAM.
I have two ideas:

  1. Increase virtual memory (this did not work).
  2. Have NumPy use lower precision for the arrays, reducing float64 to float32. My concern is whether this would affect the results much.

Best wishes
YuTong
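
The 9.88 GiB in the error message follows directly from the shape and data type it reports. Below is a minimal sketch of that arithmetic in plain Python, including the float32 case; array_gib is a helper made up for this illustration and is not part of NumPy or MNE.

```python
import math


def array_gib(shape, itemsize):
    """Size in GiB of a dense array with the given shape and bytes per value."""
    return math.prod(shape) * itemsize / 2**30


shape = (401, 3307889)            # shape reported in the MemoryError
gib_float64 = array_gib(shape, 8)  # float64 uses 8 bytes per value
gib_float32 = array_gib(shape, 4)  # float32 uses 4 bytes per value

print(f"float64: {gib_float64:.2f} GiB")  # float64: 9.88 GiB
print(f"float32: {gib_float32:.2f} GiB")  # float32: 4.94 GiB
```

So halving the precision would still require close to 5 GiB for this one array, and the traceback shows the array being appended to a list inside MNE's XDAWN code, so more than one such array may be needed. Whether MNE would keep float32 through that computation is not something this sketch checks.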

You should just use a computer with more RAM or use less data. Anything else will just eat up much of your time without a promise of success.

Best wishes,
Richard

There is also a preload=False option in read_raw_brainvision(). It lets you load only the raw metadata first, then crop to something shorter (or perform epoching), and only then load the data samples.

Thanks, everyone.
It is working perfectly.
Best wishes

What did you do to fix it?

Basically, I re-ran the preprocessing.