Excluding the Baseline in Exported Data

Hello :slight_smile:

I have epoched EEG data from -0.2 to 3 s and I want to export the data to Excel, with the data points averaged for each second, resulting in three rows (row 1: 0 to 1 s, row 2: 1 to 2 s, row 3: 2 to 3 s). My baseline is supposed to run from -0.2 to 0 s. My problem is that when I export the data, the first row does not only contain the data points from 0 to 1 s, but also includes the baseline period, which makes no sense. I can't set my tmin to 0 instead of -0.2, because I need this period for the baseline correction. Any ideas? Can I tell Python to start the export at 0 instead of -0.2?
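To illustrate with made-up numbers (plain numpy, nothing to do with my real data) why the first averaged second is off:

import numpy as np

# made-up example: "sampling" at 10 Hz, times from -0.2 to 3 s
times = np.round(np.arange(-0.2, 3.0 + 1e-9, 0.1), 10)
values = np.ones_like(times)
values[times < 0] = 100.0  # exaggerated baseline values to make the problem visible

# averaging "the first second" straight from tmin=-0.2 mixes in the baseline samples
with_baseline = values[times <= 1.0].mean()
without_baseline = values[(times > 0) & (times <= 1.0)].mean()
print(with_baseline, without_baseline)  # the first value is inflated by the baseline

Here is the relevant part of my script: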

import os.path as op

import mne
import numpy as np
import pandas as pd

# `subject` and the accumulator DataFrame `baseline_CPR` are defined earlier in the script
session = 'ses-Expo'
data_path = 'C:\\Users\\Admin\\Desktop\\Projekt_Franziska\\Studie 3\\Analyse\\EEG\\BIDS_ScriptNew'
file_name_CSplusRemembered = subject + '_' + session + '_run01_' + 'tfr_data_CS+_remembered_slow.fif'
data_CSplusRemembered_path = op.join(data_path, subject, session, 'eeg', file_name_CSplusRemembered)
data_CSplusRemembered = mne.time_frequency.read_tfrs(data_CSplusRemembered_path)
data_CSplusRemembered = data_CSplusRemembered[0]
data_CSplusRemembered.pick_channels(ch_names=['Fp1', 'Fp2', 'F7', 'F3', 'Fz', 'F4', 'F8',
                                              'FC5', 'FC1', 'FC2', 'FC6'])
# select the alpha band and the full epoch, then baseline-correct
data_alpha_CSplusRemembered = data_CSplusRemembered.crop(tmin=-0.2, tmax=3, fmin=8, fmax=10,
                                                         include_tmax=True)
data_alpha_CSplusRemembered = data_alpha_CSplusRemembered.apply_baseline(baseline=(-0.2, 0),
                                                                         mode='zlogratio', verbose=None)
# average across frequencies, then across channels -> one value per time point
alpha = np.mean(data_alpha_CSplusRemembered.data, axis=1)
alpha = np.mean(alpha, axis=0)
timeS = data_alpha_CSplusRemembered.times
data = np.column_stack((timeS, alpha))  # shape (n_times, 2): time, value
index = data_alpha_CSplusRemembered.times
columns = ['time', 'value']

df_data_CSplusR = pd.DataFrame(data=data, index=index, columns=columns)
df_data_CSplusR['pptID'] = subject
df_data_CSplusR['condition'] = 'CPR'
df_data_CSplusR['shock'] = 'P'
df_data_CSplusR['memory'] = 'R'

baseline_CPR = pd.concat([baseline_CPR, df_data_CSplusR], ignore_index=True)

del data_CSplusRemembered, data_alpha_CSplusRemembered, alpha, df_data_CSplusR

# Average values to 1 value per second
baseline_CPR = baseline_CPR[baseline_CPR['time'] != 0]
baseline_CPR['Sec'] = baseline_CPR['time'].apply(np.ceil)
baseline_CPR_summary = pd.pivot_table(baseline_CPR, values='value', index=['pptID'],
                                      columns=['Sec'], aggfunc=np.mean)

Thank you :slight_smile:
Franzi

I’m not sure I’m following, especially the averaging of data points into 3 rows. And your code is not super helpful in figuring it out.

That said, it seems like you are comfortable working with pandas DataFrames, so I suggest you export your epochs to a DataFrame and take it from there:

import numpy as np
from mne import create_info, EpochsArray


data = np.random.randn(10, 5, 3200)  # 10 epochs, 5 channels, 3200 samples
info = create_info(5, sfreq=1000, ch_types="eeg")  # 1 kHz -> 3.2 s per epoch
epochs = EpochsArray(data, info, tmin=-0.2, baseline=(-0.2, 0))  # epochs span -0.2 to just under 3 s
</epochs>
epochs.crop(0, 3 - 1 / epochs.info["sfreq"])  # remove the first 200 ms before export
df = epochs.to_data_frame()
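
From there the per-second averaging can be done in pandas. A rough sketch continuing from the toy example above — I pass time_format=None explicitly so the 'time' column is in seconds (older MNE versions defaulted to milliseconds), and the channel columns are just the dummy names '0'–'4' produced by create_info(5, ...); with your data you would select your frontal cluster instead:

df = epochs.to_data_frame(time_format=None)   # columns: time, condition, epoch, one per channel
df["value"] = df[epochs.ch_names].mean(axis=1)    # average across the channel cluster
df = df[df["time"] > 0]                           # drop the sample at exactly t = 0
df["second"] = np.ceil(df["time"]).astype(int)    # bin into (0, 1], (1, 2], (2, 3]
per_second = df.groupby("second")["value"].mean() # one value per second, averaged over epochs too
print(per_second)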


So the data are averaged, one row per second. But the baseline probably gets included in the first row, and it should not.

Do you mean you averaged the epochs both over time and over channels to get a single value per second?

Yes, for each subject the data points in each second get averaged over an electrode cluster (in this code, frontal electrodes) in 1-second steps, but I want to avoid the baseline being included in this exported data.

I don’t see a use-case for this, but here is a proposition using numpy (simpler for me than DataFrames :sweat_smile:).

import numpy as np
from mne import create_info, EpochsArray


data = np.random.randn(10, 5, 3200)
info = create_info(5, sfreq=1000, ch_types="eeg")
epochs = EpochsArray(data, info, tmin=-0.2, baseline=(-0.2, 0))
epochs.crop(0, 3 - 1 / epochs.info["sfreq"])

data = epochs.get_data()  # shape (n_epochs, n_channels, n_times)
data = np.average(data, axis=(0, 1))  # average across epochs and channels
# split into one-second chunks
n_samples = int(1 * epochs.info["sfreq"])  # cast to int for index selection
data_split = [data[k * n_samples:(k + 1) * n_samples] for k in range(3)]
# average each one-second chunk
data_avg = [np.average(chunk) for chunk in data_split]

The idea is:

  • Apply the baseline on (-0.2, 0)
  • Crop the epochs to remove the baseline period
  • Export to numpy array
  • Average across epochs / channels
  • Split per second and average over time
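
If you then want those three values in a table you can write out to Excel, here is a minimal sketch; the subject ID and file name are placeholders, and to_excel needs an Excel writer such as openpyxl installed:

import pandas as pd

df = pd.DataFrame({
    "pptID": "sub-01",      # placeholder subject ID
    "second": [1, 2, 3],    # 0-1 s, 1-2 s, 2-3 s
    "value": data_avg,      # the per-second averages computed above
})
df.to_excel("per_second_averages.xlsx", index=False)  # requires openpyxl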

Thank you very much!! I think what already did the trick was to switch the lines, so first applying the baseline, then cropping :see_no_evil:
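
For future reference, in my script (variable names as in the snippet above) that means roughly the following; only the time dimension is shown, the frequency selection can stay where it is:

# baseline-correct first, using the -0.2 to 0 s window ...
data_alpha_CSplusRemembered = data_alpha_CSplusRemembered.apply_baseline(
    baseline=(-0.2, 0), mode='zlogratio')
# ... and only then crop, so the exported times start at 0 s
data_alpha_CSplusRemembered.crop(tmin=0, tmax=3, include_tmax=True)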
