Alpha/Beta ratio (PSD) for single epochs EEG for specific electrodes

Hello!
I wonder if it is possible to do the following:
-divide my raw data (edf) into epochs (I am trying to set 0.2 s interval, but it still returns 4s as seen in the result DF),
-for each epoch:
-get PSD (Welch) in Alpha band (for 2 channels),
-get PSD (Welch) in Beta band (for 2 other channels),
- get PSD (Alpha)/PSD(Beta),
-save all data in a dataframe.
Sounds rather simple but I spent hours trying (and reading rutorials) and now feel stuck and desperate.

I am trying ti use MNE 0.24.1
I will be really thankful for any help.
Best regards,
Arianna

Hello and welcome to the forum.

Could you share some code snippets on how you create the epochs and how you compute the PSD? This way we can spot your mistakes and help you fix them.
In the meantime, I suggest looking into:
https://mne.tools/stable/auto_tutorials/intro/10_overview.html?highlight=epochs#epoching-continuous-data
and
https://mne.tools/stable/generated/mne.time_frequency.psd_welch.html?highlight=psd#mne.time_frequency.psd_welch

Best,
Mathieu

Dear Mathieu, thank you for your answer!
I am trying to write some code according to the tutorials you mentioned, but still get the result I can’t interprete properly and use further. I’ve alredy tried mne.time_frequency.psd_welch() and mne.time_frequency.psd_array_welch() - and I feel I really need help with it.

Here is my code snippets

`#to get epochs
epochs = mne.make_fixed_length_epochs(raw,
                                      duration=1.0, 
                                      preload=True, 
                                      reject_by_annotation=True, 
                                      proj=True, overlap=0.0, id=1, 
                                      verbose=None)`
# what I get
Not setting metadata
Not setting metadata
925 matching events found
No baseline correction applied
0 projection items activated
Loading data for 925 events and 250 original time points ...
0 bad epochs dropped

(Excuse me for this question, but I failed to find any info on what original time points may refer to. Could you please help with it? When I try the duration of an epoch = 1, I get 250 original time points, and when I change the duration to 0.1 I get 25. It obviously is connected with signal length - they seem to be equal. And I have to manipulate n_fft so that it is kept less than signal length - however, I am really far from understanding what I actually do. It only looks like some proportion for me - is it about ms, I mean, I get 250 ms for 1000 ms (as 1 s) and 25 ms for 100 ms (as 0.1 s)?..

In addition, when I try both of the functions above with n_fft = 256 I get the following error (and I couldn’t find any info about signal length and how to deal with it)
ValueError: If n_per_seg is None n_fft is not allowed to be > n_times. If you want zero-padding, you have to set n_per_seg to relevant length. Got n_fft of 256 while signal length is 250.
I wonder what relevant length may be - and how to set it properly if I am going to work with really short times, something like 100 ms as the idea is to use the code for alpha/beta ratios in a neurofeedback project)

# to get data (frequensies and other) for epochs
epochs_data = epochs.get_data()
print(epochs_data)
# what I get
[[ 4.89546426e-06  5.35417966e-06  9.02390286e-06 ... -2.74979240e-06
   -4.89046426e-06 -3.05560266e-06]
  [-7.48985153e-06 -3.66722320e-06 -9.14930799e-07 ... -2.29107700e-06
   -4.73755913e-06 -1.52655133e-06]
  [-8.56018746e-06 -4.43174886e-06 -2.29107700e-06 ... -1.50405133e-07
    3.08310266e-07  2.60188726e-06]
# to get a mean PSD per epoch for alpha and betaI tried  mne.time_frequency.psd_welch()

a_mean_psd_per_epoch = mne.time_frequency.psd_welch(epochs, fmin = 8, fmax = 12, tmin=None, tmax=None, n_fft=128, n_overlap=0, n_per_seg=None, picks=None, proj=False, n_jobs=1, reject_by_annotation=True, average='mean', window='hamming', verbose=None) # alpha band as from 8 to 12

print(a_mean_psd_per_epoch)

I guess I get an array of digits in scientific notation that should be organized as (n_epochs, n_channels, n_freqs), but I feel confused. I expected to see something like (from…time to …time fore epochs, some info on channels for channels, from…Hz to… Hz gfor frequences).
And still I wonder about how to get mean PSD for each epoch (I feel really ashamed when asking, but I’ve spent last several days trying and I feel I realy need help.)

What I tried in addition was mne.time_frequency.psd_array_welch()


a_mean_psd_per_epoch = mne.time_frequency.psd_array_welch(epochs_data, sfreq, fmin=8, fmax=12, n_fft=128, n_overlap=0, n_per_seg=None, n_jobs=1, average='mean', window='hamming', verbose=None) # alpha band as from 8 to 12

b_mean_psd_per_epoch = mne.time_frequency.psd_array_welch(epochs_data, sfreq, fmin=12, fmax=30, n_fft=128, n_overlap=0, n_per_seg=None, n_jobs=1, average='mean', window='hamming', verbose=None) # beta band as from 12 to 30

print(b_mean_psd_per_epoch)

# what I get

(array([[[2.65289317e-13, 1.09139047e-14, 6.59360158e-17, ...,
         2.45742191e-17, 1.94163313e-17, 1.79952153e-17],
        [2.09750394e-13, 1.82353709e-14, 7.30799291e-17, ...,
         2.49602267e-17, 1.98825327e-17, 1.78651880e-17],
        [1.62082601e-13, 2.85777765e-14, 2.15560421e-16, ...,
         1.51339585e-16, 1.22775214e-16, 1.06054944e-16],
        ...,
        [2.24421155e-13, 2.89985202e-15, 3.19432076e-17, ...,
         7.78661742e-18, 6.37864350e-18, 5.88329187e-18],
        [6.17609215e-15, 2.69767717e-15, 2.88291357e-18, ...,
         3.38447309e-18, 3.27000383e-18, 2.41686553e-18],
        [2.10243824e-13, 4.89146983e-15, 3.33343894e-16, ...,
         1.97866715e-16, 1.65439481e-16, 1.47644537e-16]],


And I guess I don’t really get the difference between mne.time_frequency.psd_welch() and mne.time_frequency.psd_array_welch()

All the code above was written to finally get alpha/beta ratio for each epoch - and I hoped to do it like that:

#to get alpha/beta ratio

def get_ab_ratio(a_mean_psd_per_epoch, b_mean_psd_per_epoch):
  return a_mean_psd_per_epoch/ b_mean_psd_per_epoch

for epoch in a_mean_psd_per_epoch
epoch(get_ab_ratio)

I understand this text is already really long and seems to be overloaded with questions, but I have one more. Is it somehow possible to get a dataframe with epochs info, PSD for each of them, and ab ratio for each of them?

my colab
my edf

I will be really thankful for help.
Best regards,
Arianna

1 Like

Thank you for the detailed post. I don’t have time to run any code at the moment, so I will just give you some information and answers.


mne.make_fixed_length_epochs creates as many as possible epochs of length duration (in seconds) from your raw data; starting from the first sample and going all the way to the last. With overlap set to 0, it means each epoch will not overlap with its neighbor. Thus, if you have a raw instance that last 10 seconds and creates epochs of 1 second, you should get 10 epochs.

Now, the number of samples per epoch depends on your sampling rate raw.info['sfreq'], or epochs.info['sfreq'] (same value). Assuming a sampling rate of 1 000 Hz, a 1-second epoch will have 1000 points. You can check the underlying numpy array with epochs.get_data() which returns an array of size (n_epochs, n_channels, n_times). e.g. For 10 epochs on a 64 channels EEG recording sampled at 1 kHz:
epochs.get_data().shape will be equal to (10, 64, 1000).

For the original time points reference, I agree this is not very clear; and I can’t figure out what it exactly means from the top of my head.


For mne.time_frequency.psd_array_welch() and mne.time_frequency.psd_welch(), both do exactly the same operation, but one operates on a Numpy array while the other operates on the Epochs instance from MNE. You want to set the arguments n_fft, n_overlap and n_per_seg based on the number of time points per Epoch. In the example above, I have 1 000 points per epochs.

  • n_overlap:
    If I don’t want any overlap between my welch window (sliding window), n_overlap is set to 0.
    If I want 50% overlap between my welch window (sliding window), n_overlap is set to 500.

  • n_per_seg:
    Honestly, you can leave it at None. It will be equal to n_fft which is perfectly fine.

  • n_fft:
    That’s the length of the FFT window, and except if you want to use zero-padding (in which case you do have to explicitly provide n_per_seg), it has to be smaller than the number of time points per epoch. In my example, it has to be smaller than 1 000.

The error you got comes from this:

ValueError: If n_per_seg is None n_fft is not allowed to be > n_times. If you want zero-padding, you have to set n_per_seg to relevant length. Got n_fft of 256 while signal length is 250.

Each of your epoch seems to have 250 points; and you tried to use an FFT of length 256 which is too large.


I recommend you use larger epochs. 3 to 4 seconds seems more reasonable for frequency analysis. If you use short epochs, the frequency resolution will be really poor (e.g. bins for 8 Hz, 10 Hz, 12 Hz). If you use longer epochs, the frequency resolution will go up (e.g. bins for 8 Hz, 8.5 Hz, 9 Hz, …, 11.5 Hz, 12 Hz).

mne.time_frequency.psd_welch() returns 2 arrays: psds and freqs. If you print the second, you can see exactly which frequency resolution you achieved as it will return the list of frequencies corresponding to each bin.

I recommend using n_fft equal to the epoch length (signal length), e.g. n_fft = epochs.get_data().shape[-1].


Hope this already helps you figure out some of your mistakes, let me know which question(s) have been solved and which still need explanation.

Mathieu

Dear Mathieu, thank you for your answer!
Now I see something wrong was not with the code but with my understanding of the results I get.
I am really sorry for asking more questions, but now it seems I don’t really understand what result I get with mne.time_frequency.psd_welch() for my epochs.

So… How is the data in the array organized? What data refers to epochs, channels, frequences - and where can I get PSDs themselves? My ideal plan was to get something like ‘epoch number - PSDs for channels’, but what I get looks like a list of nested arrays with a couple of numbers in each.

I really feel confused and stuck and cant move further without help.
(Please excuse me if the questions sound silly - and thank you for answering.)

It is much worse with understanding ‘psds’.
In the tutorial it is said it should be shape (n_epochs, n_channels, n_freqs) - but… what about psds themselves? I guess I am missing something really important.

In addition, I can’t get how the info in the arrays is organized - what I see is nested arrays, consisting of pairs of digits, but I really hoped to get something like ‘epoch number - PSDs per channel’.

I tried to list the questions:
0. How is the data in the array organized? What data refers to epochs, channels, frequences - and where can I get PSDs themselves?

  1. Is the data in the first array given for each channel? How can I pick the desired channels then?
    2.How to extract the desired data (‘epoch, alpha PSD for this epoch, beta PSD for this epoch, desired channels’) from the first array? Can I turn it to a dataframe? (I neither can extract the first or the second array from the result of mne.time_frequency.psd_welch() nor extract any of the arrays of the first array (which looks like a list of nested arrays).
  2. For ‘freqs’ array (the second one), it looks like a frequency resolution you wrote about. (And I wonder what resolution is suffiient.)
  3. The last but not the least: if short epochs result in bad resolution with fft, does it mean frequency analysis is not a perfect idea for neurofeedback? Or can overlaps be helpful?
#what I get

(array([[[3.88640304e-12, 2.34495723e-12],
        [1.64162649e-11, 3.24981365e-12],
        [7.71104014e-12, 3.61718362e-12],
        ...,
        [9.71978865e-13, 1.34625189e-12],
        [6.04120394e-13, 1.89692352e-13],
        [6.63983613e-12, 7.76253773e-13]],

       [[2.84966344e-12, 1.00691392e-12],
        [7.99444668e-12, 2.54131756e-12],
        [5.43104050e-12, 1.75066241e-12],
        ...,
        [1.13612984e-12, 7.62474003e-13],
        [7.35408810e-13, 2.79836428e-13],
        [5.77699935e-11, 2.26034312e-11]],

       [[6.04688090e-13, 1.73415782e-13],
        [6.09194583e-14, 1.43521524e-13],
        [9.99039291e-13, 6.43016074e-14],
        ...,
        [2.88156025e-13, 9.30488195e-14],
        [1.58661821e-13, 2.77514934e-13],
        [1.51708381e-11, 7.59987069e-12]],

       ...,

       [[7.72640312e-12, 6.27162454e-12],
        [6.15849439e-12, 4.12478461e-12],
        [8.65730993e-13, 1.56909593e-13],
        ...,
        [7.96897548e-13, 4.13667767e-13],
        [1.16857848e-12, 6.48773318e-13],
        [3.91733006e-11, 2.46771462e-11]],

       [[8.51196114e-14, 1.24946387e-12],
        [5.28042746e-13, 1.31284002e-12],
        [3.49413591e-12, 1.48621219e-12],
        ...,
        [2.47174975e-13, 9.44466065e-13],
        [2.88489966e-14, 6.04828209e-13],
        [5.69097819e-12, 1.30041217e-11]],

       [[2.52572816e-83, 1.33466077e-83],
        [2.52572816e-83, 1.33466077e-83],
        [2.52572816e-83, 1.33466077e-83],
        ...,
        [2.52572816e-83, 1.33466077e-83],
        [2.52572816e-83, 1.33466077e-83],
        [2.52572816e-83, 1.33466077e-83]]]), array([ 9.765625, 11.71875 ]))

I will be really thankful for an answer.
Best regards,
Arianna

I think an example here will be clearer. I loaded one of my dataset (raw, un-preprocessed, 63 EEG channels + 2 AUX considered as EEG for this example, so 65 channels) sampled at 1 kHz.


Let’s start by creating fixed-length epochs:

IN: epochs = mne.make_fixed_length_epochs(raw, duration=1., overlap=0, preload=True)

Not setting metadata
260 matching events found
No baseline correction applied
0 projection items activated
Using data from preloaded Raw for 260 events and 1000 original time points ...
0 bad epochs dropped

Now I also understand the logger line:

Using data from preloaded Raw for 260 events and 1000 original time points …

mne.make_fixed_length_epochs created 260 epochs of 1 second, each with 1 000 samples (time points). Which is coherent with duration=1.0 with a sampling rate at 1 kHz. If I look at the shape of the data array in the Epoch object:

epochs.get_data().shape
Out: (260, 65, 1000)

260 epochs, 65 channels, 1 000 points.
If I retrieve the element in position [10, 4, 80], I get the sample from the 11th epoch, from the 5th channel, at the 81th position. (+1 because numpy arrays are 0-indexed).

epochs.get_data()[10, 4, 80]
Out: -0.003715330036357045

Now let’s run a PSD with the size of the FFT window set at the size of the number of points (1 000):

psds, freqs = psd_welch(epochs, fmin=8., fmax=13., n_fft=1000)
Effective window size : 1.000 (s)

freqs contains the frequency bin computed, which depends on the size of the FFT window n_fft:

freqs
Out: array([ 8.,  9., 10., 11., 12., 13.])

So I got a 1 Hz resolution.


For psds, the shape is:

psds.shape
Out: (260, 65, 6)

I have 260 epochs, 65 channels, and 6 frequency bin (8, 9, 10, 11, 12 and 13).
So now, if I want to retrieve the PSD of the second channel during the 10th epoch, I can retrieve it with:

psds[9, 1, :]
Out[24]: 
array([2.25663920e-12, 1.32043224e-13, 1.10263780e-13, 1.13033751e-13,
       2.29236803e-13, 5.92261626e-13])

N.B. Numpy array are 0-index, so 10th epoch will be at indices 9; 2nd channel will be at indices 1.
I get an array with 6 values, one for each frequency bin 8, 9, 10, 11, 12, 13.

from matplotlib import pyplot as plt
plt.plot(freqs, psds[9, 1, :])

On this example, on this channel and epoch I got a peak in alpha at 8 Hz (not that it means anything since this data is completely un-preprocessed, not even a bandpass filter.)


Now, this is very useful for offline analysis, but not so much for NFB. In NFB, you will have one acquisition window, usually small (0.5s, 1s). Welch PSD computation computes the PSD on multiple windows and averages across them. As you don’t have multiple window available at once for NFB, it isn’t a good fit.

I think what you are looking for is simply a window + regular FFT. Let’s assume your system sends:

  • window of 1 second
  • 64 channels
  • sampled at 1 000 Hz

And on each window, you want to compute a feedback value.

import numpy as np

fs = 1000
n_points = 1000  # 1 second at 1 kHz

window = np.array(...)  # shape (64, 1000)
fft_window = np.hanning(n_points)
window = np.multiply(window, fft_window)
fftval = np.abs(np.fft.rfft(window, axis=1) / n_points)

frequencies = np.fft.rfftfreq(n_points, 1.0/fs)
alpha_band = np.where(np.logical_and(frequencies>=8., frequencies<=13.))[0]
alpha = np.average(np.square(fftval[:, alpha_band]))

Disclaimer, I haven’t tested this, but it should return an average across all channels of the alpha-band power on the given window.

Dear Mathieu, thank you for the detailed answer!
I am feeleng really ashamed, but I still have more questions. In fact, it seems I don’t understand some key points - and get a value error trying to run the code.

  1. How do I get np.array for window = np.array(...) # shape (64, 1000) ? What does 1000 stand for? Time points, epoch duration?..

My idea was shape looks like a (number of channels, number of points - ) so I tried to get the array by reshaping epochs.get_data():
(My next thought was it was about duration, but I have not a slightest idea how to get an array for that, so I decided to try my luck with time points just to test how it would work with some array shaped properly.)

a= epochs.get_data()
a.shape 

# what I get
(925, 19, 250) (and it should mean 925 epochs, 19 channels, 250 points)

a = np.zeros( (925, 19, 250) )
result = a[0, :, :]
print(result.shape)

# what I get
(19, 250) and it should mean 19 channels, 250 points 

Even if the idea about time points was wrong, I still tried to run your code for my data (to test if it would work with some array shaped properly):

fs = 1000
n_points = 250  # 1 second at 1 kHz

window = result  # shape (19, 250)
fft_window = np.hanning(n_points)
window = np.multiply(window, fft_window)

fftval = np.abs(np.fft.rfft(window, axis=1) / n_points)

frequencies = np.fft.rfftfreq(n_points, 1.0/fs)

alpha_band = np.where(np.logical_and(frequencies>=8., frequencies<=13.))[0]
alpha = np.average(np.square(fftval[:, alpha_band]))

#I tried to add some code for beta
beta_band = np.where(np.logical_and(frequencies>=13., frequencies<=30.))[0]
beta = np.average(np.square(fftval[:, beta_band]))

#...and some code for alpha/beta ratio
ab_ratio = alpha/beta

# what I get
ValueError: operands could not be broadcast together with shapes (19,250) (1000,) 
  1. In addition, this should work for a case when I have only one epoch available for analysis at a time, but how to make it work for a number of epochs (if it is necessary to work with the data from EDF, not real time)?

Please excuse me for asking so many silly questions. I really appreciate your answers (and feel really happy and excited when get a responce, though, well, it means I am going to spend next few hours trying tjo understand the answer and producing some strange code that wouldn’t work). But I really hope to finally understand how it all works.
Thank yo very much for your answers.
Arianna
P.S. Coming back to PSDs: could you please explain the result you got?

psds[9, 1, :]
Out[24]: 
array([2.25663920e-12, 1.32043224e-13, 1.10263780e-13, 1.13033751e-13,
       2.29236803e-13, 5.92261626e-13])

I mean, I unfortunately can’t understand why it is an array - I really hoped to see a single number (for a mean PSD per epoch). Please excuse me for asking something that definetely seems obvious and going without saying for you. (My initial plan was to get a single number for alpha and then for beta, and then to find alpha/beta ratio for each epoch - but I see something is really wrong with my understanding of how things work. Is there a simple way to calculate alpha/beta ratio for each epoch still?..)

Let’s go in order.

1000, time points or duration are the same information. One is expressed in samples, the other in seconds. When I wrote window = np.array(...) I was expressing the fact that your neurofeedback window contains (64 channels, 1000 points). This can be called a single epoch, but let’s no do that as it’s confusing with the Epoch object from MNE.

In MNE, and what you don’t have in a neurofeedback system, an Epoch object contains multiple windows of the same size. The shape of the array is now (n_epochs, n_channels, n_points). In a neurofeedback system, typically, you will try in real-time to compute feedback from a single window.

For offline analysis, MNE methods, PSD,Epochs, … are great. But not so much for real-time processing of single window. If you do want to work with MNE structure, then this single window could be a Raw object with the underlying structure (n_channels, n_points).


For the output of the PSD in my example, we have 6 frequency bins: 8, 9, 10, 11, 12 and 13.
The array psds[9, 1, :] contains the corresponding power spectral density for those 6 frequency bins.

8 Hz -> 2.25663920e-12
9 Hz -> 1.32043224e-13
...

If you want an average across all bins:

import numpy as np
np.average(psds[9, 1, :])

I think you should consider joining the next office hours on Discord and ask some of your questions vocally to the devs. Some of your confusion might be resolved this way.

Mathieu

:100:
I think this would make a lot of sense. However, next office hour will only take place in January.

Dear Mathieu,
Thank you for your answer (and your patience)!
The thing is, our plan was to work with sample EDFs and then to apply the result to real time eeg processing (as there is no access to the device right now). That is why I am trying to make some working example for EDF first.
Speaking about fft for a single window, is it a way to make an np.array for working with EDF?.. Should I somehow apply a sliding window to raw?

And I will really appreciate any info on how to reach the devs and what I am expected to do for it.
Thank you very (very!) much.
Arianna

The dev team holds office hours every 2 weeks on Friday evening at 5 pm CET. You can connect and ask your question/share screen/etc… on the MNE Discord. You can find the link at the top right corner of the main website: MNE — MNE 0.24.1 documentation

For now, I feel like it’s a bit difficult to help you further. You have some misunderstandings and misconceptions on what are the different objects you are working with.

The thing is, our plan was to work with sample EDFs and then to apply the result to real time eeg processing

I do not get this sentence. Offline and real-time processing is very different. You can rarely apply the same processing used for offline analysis to an online sliding-window system.

I am not very familiar with the EDF format, but somewhere inside there is an array that holds the data. If you read the EDF file with MNE, it will create a Raw object. You can then access the numpy array with raw.get_data().

EDF reader: mne.io.read_raw_edf — MNE 0.24.1 documentation
Raw: mne.io.Raw — MNE 0.24.1 documentation
raw.get_data(): mne.io.Raw — MNE 0.24.1 documentation