Found array with dim 3. Estimator expected <= 2.

Version : mne-1.0.3

I get this error every time I run a pipeline with validate=True. The accuracy also changes significantly with this setting, and the results are not accurate.
Here is the code:

‘’’
raw= mne.io.read_raw_edf(‘ST7011J0-PSG.edf’,stim_channel=‘auto’,preload=True)
annot = mne.read_annotations(‘ST7011JP-Hypnogram.edf’)
raw.set_annotations(annot, emit_warning=True)
events, _ = mne.events_from_annotations(raw, event_id=event_id, chunk_duration=30.)
tmax = 30. - 1. / raw.info[‘sfreq’] # tmax in included
epochs=mne.Epochs(raw=raw, events=events,event_id=event_id, tmin=0., tmax=tmax, baseline=None)
‘’’
**

**
def eeg_power_band(epochs):
#EEG relative power band feature extraction.

This function takes an ``mne.Epochs`` object and creates EEG features based
on relative power in specific frequency bands that are compatible with
scikit-learn.

Parameters
----------
epochs : Epochs
    The data.

Returns
-------
X : numpy array of shape [n_samples, 5]
    Transformed data.
"""

# specific frequency bands
FREQ_BANDS = {"delta": [0.1, 4.5],
              "theta": [4.5, 8.5],
              "alpha": [8.5, 11.5],
              "sigma": [11.5, 15.5],
              "beta": [15.5, 30]}

psds, freqs = psd_welch(epochs_train, picks='eeg', fmin=0.5, fmax=30.)
psds, freqs = psd_welch(epochs_test, picks='eeg', fmin=0.5, fmax=30.)
# Normalize the PSDs
psds /= np.sum(psds, axis=-1, keepdims=True)

global X
X = []

for fmin, fmax in FREQ_BANDS.values():
    psds_band = psds[:, :, (freqs >= fmin) & (freqs < fmax)].mean(axis=-1)
    X.append(psds_band.reshape(len(psds), -1))

return np.concatenate(X, axis=1)
return Xer(l_freq=0, h_freq=30)

**

**
pipe=make_pipeline(FunctionTransformer(eeg_power_band,validate=True),RandomForestClassifier(n_estimators=20,criterion=‘entropy’,class_weight=‘balanced’, random_state=42))
**

**
yt_train = epochs_train.events[:, 2]
pipe.fit(epochs_train, yt_train)
**

Could you please try to make this example reproducible, and as short as possible? Right now it loads files without providing download links for others to get those files, and it uses modules/functions/classes that aren’t imported in your posted code. There are also issues with formatting and indentation:

  • the way to start and end code blocks on this forum is three backticks (this: ```)
  • you seem to be defining a function, but lines following the def line aren’t indented
  • your function has two return lines

There may be other things that also need fixing, this is just what I noticed on a quick read through. Try to make it so that users can copy the whole code block, paste it into a new python interpreter, and have it run to completion (or yield the error that you’re asking about).

Sure. Sorry for the incomplete code

Here it is


import os

import mne
import numpy as np
from sklearn.metrics import confusion_matrix
from sklearn.metrics import classification_report
from mne.time_frequency import psd_welch
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
import matplotlib.pyplot as plt

Here is the code for reading the EDF files and creating epochs

raw= mne.io.read_raw_edf(‘ST7011J0-PSG.edf’,stim_channel=‘auto’,preload=True)
annot = mne.read_annotations(‘ST7011JP-Hypnogram.edf’)
raw.set_annotations(annot, emit_warning=True)
events, _ = mne.events_from_annotations(raw, event_id=event_id, chunk_duration=30.)
tmax = 30. - 1. / raw.info[‘sfreq’] # tmax in included
epochs=mne.Epochs(raw=raw, events=events,event_id=event_id, tmin=0., tmax=tmax, baseline=None)

Here is the function for calculating the relative band power from the power spectral density

def eeg_power_band(epochs):
    """EEG relative power band feature extraction.

    This function takes an ``mne.Epochs`` object and creates EEG features based
    on relative power in specific frequency bands that are compatible with
    scikit-learn.

    Parameters
    ----------
    epochs : Epochs
        The data.

    Returns
    -------
    X : numpy array of shape [n_samples, 5]
        Transformed data.
    """
    # specific frequency bands
    FREQ_BANDS = {"delta": [0.5, 4.5],
                  "theta": [4.5, 8.5],
                  "alpha": [8.5, 11.5],
                  "sigma": [11.5, 15.5],
                  "beta": [15.5, 30]}

    psds, freqs = psd_welch(epochs, fmin=0.5, fmax=30.)
    # Normalize the PSDs
    psds /= np.sum(psds, axis=-1, keepdims=True)
    

    X = []
    for fmin, fmax in FREQ_BANDS.values():
        psds_band = psds[:, :, (freqs >= fmin) & (freqs < fmax)].mean(axis=-1)
        X.append(psds_band.reshape(len(psds), -1))

    return np.concatenate(X, axis=1)
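
The band-averaging step of the function above can be checked without MNE or any data files. The following is only a sketch on synthetic data: the random `psds` array and the `freqs` grid stand in for what `psd_welch` would return, and `relative_band_power` is a hypothetical helper name, not part of MNE.

```python
import numpy as np

# Same bands as in the function above.
FREQ_BANDS = {"delta": (0.5, 4.5),
              "theta": (4.5, 8.5),
              "alpha": (8.5, 11.5),
              "sigma": (11.5, 15.5),
              "beta": (15.5, 30.0)}


def relative_band_power(psds, freqs, bands=FREQ_BANDS):
    """Average the normalized PSD within each band.

    psds  : array of shape (n_epochs, n_channels, n_freqs)
    freqs : array of shape (n_freqs,)
    Returns an array of shape (n_epochs, n_channels * n_bands).
    """
    # Normalize so each spectrum sums to 1 (relative power).
    psds = psds / psds.sum(axis=-1, keepdims=True)
    features = []
    for fmin, fmax in bands.values():
        mask = (freqs >= fmin) & (freqs < fmax)
        # Mean over the frequency axis leaves (n_epochs, n_channels).
        features.append(psds[:, :, mask].mean(axis=-1))
    return np.concatenate(features, axis=1)


# Synthetic stand-in for the psd_welch output (assumption, not real EEG).
rng = np.random.default_rng(0)
freqs = np.linspace(0.5, 30.0, 60)
psds = rng.random((4, 2, freqs.size)) + 0.1

X = relative_band_power(psds, freqs)
print(X.shape)  # (4, 10): 4 epochs, 2 channels x 5 bands
```

The point of the check is that the function's output is already a 2D feature matrix, while its input is not an array at all but an Epochs object.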

Here is the pipeline for the function and the data

pipe=make_pipeline(FunctionTransformer(eeg_power_band,validate=True),RandomForestClassifier(n_estimators=20,criterion=‘entropy’,class_weight=‘balanced’, random_state=42))
yt_train = epochs.events[:, 2]
pipe.fit(epochs, yt_train)

Now when I try to fit the pipeline with validate=True, I get this error:

Found array with dim 3. Estimator expected <= 2.

Thank you for the response :)

Thanks! There are still some problems though:

  1. We don’t have access to the files you’re using (`ST7011J0-PSG.edf` and `ST7011JP-Hypnogram.edf`). They look like PhysioNet sleep data; can they be downloaded with `from mne.datasets.sleep_physionet.age import fetch_data`? If so, include the lines of code that would do that. If not, change your example to use files that are built-in to MNE-Python, or provide us with download links.
  2. The second code block has smart-quotes instead of straight quotes, so it’s not copy-paste-runnable.
  3. When I fix the quotes and substitute the files for ones that are already distributed with MNE-Python, running the second code block yields an error: `NameError: name 'event_id' is not defined`.
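
For item 2, here is a minimal sketch of one way to turn typographic quotes back into straight quotes before pasting code. `straighten_quotes` is a hypothetical helper written for this post. Note that it replaces every curly quote it finds, including ones inside comments or string literals.

```python
# Map the four common typographic quote characters to ASCII quotes.
SMART_TO_STRAIGHT = str.maketrans({
    "\u2018": "'",  # left single quotation mark
    "\u2019": "'",  # right single quotation mark
    "\u201c": '"',  # left double quotation mark
    "\u201d": '"',  # right double quotation mark
})


def straighten_quotes(code: str) -> str:
    """Return `code` with smart quotes replaced by straight quotes."""
    return code.translate(SMART_TO_STRAIGHT)


pasted = "raw.info[\u2018sfreq\u2019]"
fixed = straighten_quotes(pasted)
print(fixed)  # raw.info['sfreq']
```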

In light of that, I’ll repeat myself:

You should test this by opening a fresh python interpreter from within a new, empty folder, copy-pasting code directly from your (draft) post, and seeing if the code runs.
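
One possible way to automate that check is sketched below, using only the standard library. `run_in_fresh_interpreter` is a hypothetical helper name, and the one-line snippet stands in for the code from a draft post.

```python
import subprocess
import sys
import tempfile


def run_in_fresh_interpreter(snippet: str) -> subprocess.CompletedProcess:
    """Run `snippet` in a new Python process started from an empty folder."""
    with tempfile.TemporaryDirectory() as empty_dir:
        return subprocess.run(
            [sys.executable, "-c", snippet],
            cwd=empty_dir,          # no local files to rely on by accident
            capture_output=True,
            text=True,
        )


# Stand-in for a posted example that forgot to define `event_id`.
snippet = "print(sorted(event_id))"
result = run_in_fresh_interpreter(snippet)

print(result.returncode)                        # non-zero means it failed
print(result.stderr.strip().splitlines()[-1])   # last line of the traceback
```

A new process has none of the variables left over from an interactive session, so a missing definition shows up right away as a `NameError`.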

As a side note: your code looks fairly similar to this tutorial:
https://mne.tools/dev/auto_tutorials/clinical/60_sleep.html

It would help if you mentioned that up front, and tell us what exactly is different between your code and that tutorial. For example, if the only thing that is different is the data files you’re using, that is really helpful information that makes it much faster for us to figure out what is wrong. Alternatively, if your code fails with the same data files used in the tutorial, then it’s way easier for us to debug the problem with files that we already have (because they’re distributed with MNE-Python).

Yes, event_id was missing.

Here it is:

event_id = {'Sleep stage W': 1,
            'Sleep stage 1': 2,
            'Sleep stage 2': 3,
            'Sleep stage 3': 4,
            'Sleep stage 4': 4,
            'Sleep stage R': 5}
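
As a small aside on this dictionary: two annotation labels share the code 4, so the classifier sees five classes rather than six. The sketch below only inspects the mapping. It makes no claim about how MNE treats duplicate codes.

```python
from collections import defaultdict

event_id = {'Sleep stage W': 1,
            'Sleep stage 1': 2,
            'Sleep stage 2': 3,
            'Sleep stage 3': 4,
            'Sleep stage 4': 4,
            'Sleep stage R': 5}

# Group annotation labels by the integer code they map to.
labels_by_code = defaultdict(list)
for label, code in event_id.items():
    labels_by_code[code].append(label)

# Keep only the codes shared by more than one label.
merged = {code: labels for code, labels in labels_by_code.items()
          if len(labels) > 1}
print(merged)  # {4: ['Sleep stage 3', 'Sleep stage 4']}
```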

The data download command is as follows

wget -r -N -c -np https://physionet.org/files/sleep-edfx/1.0.0/

Yes, the code is the same as in the tutorial you linked.

Sorry for the chaos. I’ll keep these things in mind.

OK, so please clarify then:

  • if you run the tutorial as written does it work?
  • if you run the tutorial as written, but only change the data files used does it work?

Also, I think your wget command will download the entire dataset? Can you provide a command to just get the specific data files needed? (This is why we have things like `mne.datasets.sleep_physionet.age.fetch_data`; why not use that?)

I reran the tutorial with `validate=True` and it gave this error:

ValueError: Expected 2D array, got scalar array instead:
array=<Epochs |  841 events (good & bad), 0 - 29.99 sec, baseline off, ~12 kB, data not loaded,
 'Sleep stage W': 188
 'Sleep stage 1': 58
 'Sleep stage 2': 250
 'Sleep stage 3/4': 220
 'Sleep stage R': 125>.
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Yes, the same error persists with the other data files too.

I had downloaded the dataset using wget.

You can download it manually from the link below, under the sleep-telemetry folder:

https://physionet.org/content/sleep-edfx/1.0.0/
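
For context on both error messages: according to the scikit-learn documentation, `validate=True` makes `FunctionTransformer` check the input and convert it to a 2D array before calling the function. An Epochs object, or the 3D data behind it, cannot pass that check. The sketch below reproduces the behaviour with plain NumPy data, so it needs no EDF files. `flatten_epochs` is a hypothetical stand-in for `eeg_power_band`, and the random array stands in for real epochs.

```python
import numpy as np
from sklearn.preprocessing import FunctionTransformer


def flatten_epochs(X):
    """Stand-in feature extractor: (n_epochs, n_channels, n_times) -> 2D."""
    X = np.asarray(X)
    return X.reshape(len(X), -1)


# Synthetic 3D data shaped like epochs (assumption, not real EEG).
rng = np.random.default_rng(0)
X3d = rng.standard_normal((6, 2, 50))

# validate=False: the 3D input reaches the function untouched.
features = FunctionTransformer(flatten_epochs, validate=False).fit_transform(X3d)
print(features.shape)  # (6, 100)

# validate=True: the input is checked as a 2D array before the function runs.
try:
    FunctionTransformer(flatten_epochs, validate=True).fit_transform(X3d)
    message = None
except ValueError as err:
    message = str(err)
print(message)
```

The exact wording of the message depends on the scikit-learn version. If the goal is only to get the pipeline running, leaving `validate=False` on the transformer that receives the epochs appears to be the simplest option, since the function itself returns the 2D feature matrix that the classifier needs.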