Excluding the Baseline in Exported Data

Hello :slight_smile:

I have epoched EEG data from -0.2 to 3 s and I want to export the data to Excel, with the data points averaged for each second, resulting in three rows (row 1: 0 to 1 s, row 2: 1 to 2 s, row 3: 2 to 3 s). My baseline is supposed to run from -0.2 to 0 s. My problem is that when I export the data, the first row does not only contain the data points from 0 to 1 s, but also includes the baseline period, which makes no sense. I can't set my tmin to 0 instead of -0.2, because I need this period for the baseline correction. Any ideas? Can I tell Python to start the export at 0 instead of -0.2?
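To illustrate with made-up numbers (plain numpy, nothing to do with my real data) why the first averaged second is off:

import numpy as np

# made-up example: "sampling" at 10 Hz, times from -0.2 to 3 s
times = np.round(np.arange(-0.2, 3.0 + 1e-9, 0.1), 10)
values = np.ones_like(times)
values[times < 0] = 100.0  # exaggerated baseline values to make the problem visible

# averaging "the first second" straight from tmin=-0.2 mixes in the baseline samples
with_baseline = values[times <= 1.0].mean()
without_baseline = values[(times > 0) & (times <= 1.0)].mean()
print(with_baseline, without_baseline)  # the first value is inflated by the baseline

Here is the relevant part of my script: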

import os.path as op

import mne
import numpy as np
import pandas as pd

# `subject` and the accumulator DataFrame `baseline_CPR` are defined earlier in the script
session = 'ses-Expo'
data_path = 'C:\\Users\\Admin\\Desktop\\Projekt_Franziska\\Studie 3\\Analyse\\EEG\\BIDS_ScriptNew'
file_name_CSplusRemembered = subject + '_' + session + '_run01_' + 'tfr_data_CS+_remembered_slow.fif'
data_CSplusRemembered_path = op.join(data_path, subject, session, 'eeg', file_name_CSplusRemembered)
data_CSplusRemembered = mne.time_frequency.read_tfrs(data_CSplusRemembered_path)
data_CSplusRemembered = data_CSplusRemembered[0]
data_CSplusRemembered.pick_channels(ch_names=['Fp1', 'Fp2', 'F7', 'F3', 'Fz', 'F4', 'F8',
                                              'FC5', 'FC1', 'FC2', 'FC6'])
# select the alpha band and the full epoch, then baseline-correct
data_alpha_CSplusRemembered = data_CSplusRemembered.crop(tmin=-0.2, tmax=3, fmin=8, fmax=10,
                                                         include_tmax=True)
data_alpha_CSplusRemembered = data_alpha_CSplusRemembered.apply_baseline(baseline=(-0.2, 0),
                                                                         mode='zlogratio', verbose=None)
# average across frequencies, then across channels -> one value per time point
alpha = np.mean(data_alpha_CSplusRemembered.data, axis=1)
alpha = np.mean(alpha, axis=0)
timeS = data_alpha_CSplusRemembered.times
data = np.column_stack((timeS, alpha))  # shape (n_times, 2): time, value
index = data_alpha_CSplusRemembered.times
columns = ['time', 'value']

df_data_CSplusR = pd.DataFrame(data=data, index=index, columns=columns)
df_data_CSplusR['pptID'] = subject
df_data_CSplusR['condition'] = 'CPR'
df_data_CSplusR['shock'] = 'P'
df_data_CSplusR['memory'] = 'R'

baseline_CPR = pd.concat([baseline_CPR, df_data_CSplusR], ignore_index=True)

del data_CSplusRemembered, data_alpha_CSplusRemembered, alpha, df_data_CSplusR

# Average values to 1 value per second
baseline_CPR = baseline_CPR[baseline_CPR['time'] != 0]
baseline_CPR['Sec'] = baseline_CPR['time'].apply(np.ceil)
baseline_CPR_summary = pd.pivot_table(baseline_CPR, values='value', index=['pptID'],
                                      columns=['Sec'], aggfunc=np.mean)

Thank you :slight_smile:
Franzi

I’m not sure I’m following, especially the averaging of data points into 3 rows. And your code is not super helpful in figuring it out.

That said, it seems like you are comfortable working with pandas DataFrames, so I suggest you export your epochs to a DataFrame and take it from there:

import numpy as np
from mne import create_info, EpochsArray


data = np.random.randn(10, 5, 3200)  # 10 epochs, 5 channels, 3200 samples
info = create_info(5, sfreq=1000, ch_types="eeg")  # 1 kHz -> 3.2 s per epoch
epochs = EpochsArray(data, info, tmin=-0.2, baseline=(-0.2, 0))  # epochs span -0.2 to just under 3 s
</epochs>
epochs.crop(0, 3 - 1 / epochs.info["sfreq"])  # remove the first 200 ms before export
df = epochs.to_data_frame()
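
From there the per-second averaging can be done in pandas. A rough sketch continuing from the toy example above — I pass time_format=None explicitly so the 'time' column is in seconds (older MNE versions defaulted to milliseconds), and the channel columns are just the dummy names '0'–'4' produced by create_info(5, ...); with your data you would select your frontal cluster instead:

df = epochs.to_data_frame(time_format=None)   # columns: time, condition, epoch, one per channel
df["value"] = df[epochs.ch_names].mean(axis=1)    # average across the channel cluster
df = df[df["time"] > 0]                           # drop the sample at exactly t = 0
df["second"] = np.ceil(df["time"]).astype(int)    # bin into (0, 1], (1, 2], (2, 3]
per_second = df.groupby("second")["value"].mean() # one value per second, averaged over epochs too
print(per_second)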


So the data are averaged, one row per second. But the baseline probably gets included in the first row, and it should not.

Do you mean you averaged the epochs both over time and over channels to get a single value per second?

Yes, for each subject the data points in each second get averaged over an electrode cluster (in this code, frontal electrodes) in 1-second steps, but I want to avoid the baseline being included in this exported data.

I don’t see a use-case for this, but here is a proposition using numpy (simpler for me than DataFrames :sweat_smile:).

import numpy as np
from mne import create_info, EpochsArray


data = np.random.randn(10, 5, 3200)
info = create_info(5, sfreq=1000, ch_types="eeg")
epochs = EpochsArray(data, info, tmin=-0.2, baseline=(-0.2, 0))
epochs.crop(0, 3 - 1 / epochs.info["sfreq"])

data = epochs.get_data()  # shape (n_epochs, n_channels, n_times)
data = np.average(data, axis=(0, 1))  # average across epochs and channels
# split into one-second chunks
n_samples = int(1 * epochs.info["sfreq"])  # cast to int for index selection
data_split = [data[k * n_samples:(k + 1) * n_samples] for k in range(3)]
# average each one-second chunk
data_avg = [np.average(chunk) for chunk in data_split]

The idea is:

  • Apply the baseline on (-0.2, 0)
  • Crop the epochs to remove the baseline period
  • Export to numpy array
  • Average across epochs / channels
  • Split per second and average over time
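
If you then want those three values in a table you can write out to Excel, here is a minimal sketch; the subject ID and file name are placeholders, and to_excel needs an Excel writer such as openpyxl installed:

import pandas as pd

df = pd.DataFrame({
    "pptID": "sub-01",      # placeholder subject ID
    "second": [1, 2, 3],    # 0-1 s, 1-2 s, 2-3 s
    "value": data_avg,      # the per-second averages computed above
})
df.to_excel("per_second_averages.xlsx", index=False)  # requires openpyxl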

Thank you very much!! I think what already did the trick was to switch the lines, so first applying the baseline, then cropping :see_no_evil:
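
For future reference, in my script (variable names as in the snippet above) that means roughly the following; only the time dimension is shown, the frequency selection can stay where it is:

# baseline-correct first, using the -0.2 to 0 s window ...
data_alpha_CSplusRemembered = data_alpha_CSplusRemembered.apply_baseline(
    baseline=(-0.2, 0), mode='zlogratio')
# ... and only then crop, so the exported times start at 0 s
data_alpha_CSplusRemembered.crop(tmin=0, tmax=3, include_tmax=True)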
