applying baseline extracted from another epoched object

Hi community and mostly @drammock,

I’d like to apply a baseline to my epoched object that is actually the baseline of another epoched object. I saw that someone asked this two years ago and looked at the code you posted in a reply, but I’m pretty new to both EEG analysis and coding, so I’m getting lost at the most important part.

As far as I understand, you make one epoched object with really long epochs that include the desired baseline, the actual time point of interest, and everything that falls in between. You apply the baseline to this epoched object. Then you make an epoched object with the actual time point of interest at 0 and you DON’T baseline it. Then you erase the ._data of the latter object. And then… you take the baselined data of the former object and insert it into the correct time frame of the latter one? I get lost at line 135, where you begin the most vital part of the process. I’d really appreciate it if you could help me understand your script :raised_hands:

My stimuli are three-word phrases. I’m interested in the last word, and I have triggers for the onset of those critical words. I want to baseline during silence, i.e. before the onset of the audio. I also have triggers for that. I can provide data and code, but I think if I manage to understand what you did in your script, I should be able to adapt it to my project.

Thanks for your help!
bissera

  • line 135: get the consonant, vowel, and word durations in the correct order (order is determined by the ev variable)
  • line 137: subset that duration dataframe to have only the non-dropped epochs
  • lines 138ff: for each line in that dataframe (AKA for each epoch), compute:
    • 140: the sample number of time=0 (the stim onset) in the baselined (stim-onset-aligned) epochs
    • 141-2: the sample number of the last sample we care about (stim onset + word duration + brain response period) in the baselined epochs
    • 143: the sample number of stim onset for the non-baselined (in your case, final-word-aligned) epochs
    • 144: the sample number of last sample we care about for the non-baselined epochs
    • 146-7: compute total number of samples-to-be-retained in each case
    • 148-9: make sure total number of samples-to-be-retained is within +/-1 for baselined vs non-baselined
    • 151-2: actually insert the baselined data into the unbaselined epochs object (see the sketch just below this list)
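
In very rough pseudocode (a sketch, not the actual script; epochs_base, epochs_word, word_onsets, word_durs and the 0.7 s response window are placeholder names/values), that insertion step looks something like this:

# Sketch of the insertion step (placeholder names, not the original script).
#   epochs_base : long, stim-onset-aligned epochs that HAVE been baselined (t=0 at stim onset)
#   epochs_word : final-word-aligned epochs that have NOT been baselined (t=0 at word onset)
#   word_onsets : per-epoch latency of the final word relative to stim onset, in seconds
#   word_durs   : per-epoch duration of the final word, in seconds
resp = 0.7  # assumed post-word "brain response" window, in seconds

base_data = epochs_base.get_data()  # baselined data, shape (n_epochs, n_channels, n_times)

for i, (onset, dur) in enumerate(zip(word_onsets, word_durs)):
    # span of interest in the baselined epochs: stim onset up to word offset + response window
    b_start = epochs_base.time_as_index(0.0)[0]
    b_stop = epochs_base.time_as_index(onset + dur + resp)[0]
    # the same span in the unbaselined epochs, where stim onset sits at t = -onset
    w_start = epochs_word.time_as_index(-onset)[0]
    w_stop = epochs_word.time_as_index(dur + resp)[0]
    # both spans should contain the same number of samples (allow one sample of rounding error)
    assert abs((b_stop - b_start) - (w_stop - w_start)) <= 1
    n = min(b_stop - b_start, w_stop - w_start)
    # overwrite the unbaselined data for that span with the baselined data
    epochs_word._data[i, :, w_start:w_start + n] = base_data[i, :, b_start:b_start + n]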

Is that enough to get you moving? If not, feel free to ask more specific follow-up questions.


Hi Dan,

Thanks so much for describing what you do in your code! So sorry it’s taking me forever to get to this project. I think I have the advantage that within my long epochs I have the trigger used to make the epoch, which is at 0 ms, and then I ALSO have the trigger for the time point of interest. I can see that because when I call temp.get_annotations_per_epoch() I get the output

[(-5.175781257094059e-05, 0.0, 'onset/stim/adv/dist/v/bigarré'),
 (0.9800507812499291, 0.0, 'CW/stim/adv/dist/v/bigarré')],
[(4.6874999952706276e-05, 0.0, 'onset/stim/adj/close/mg/brillante'),
 (1.1560527343749527, 0.0, 'CW/stim/adj/close/mg/brillante')],

where the condition 'onset' is the onset of my phrase and the condition 'CW' is the onset of the critical word I’m interested in. So I’ll try to somehow loop over the epochs object (I imagine that’s what next() does) and for each individual epoch make a new epoch based on the latter trigger (I hope with crop()). I’ll update this post once I have something that works.

bissera


Hi Dan and others who might find this helpful,

Here’s what I did in the end.

First I import my long epochs, which have a 500 ms baseline in the silent period before the onset of the audio stimulus and last as long as my longest stimulus. They contain the trigger for the onset of the stimulus (at 0 ms) and the trigger for the onset of the critical word (CW), which comes at a different time point for each stimulus. I also accidentally have long epochs with the CW trigger at 0 ms, which doesn’t really make sense but ended up being useful for creating the events for the new epochs.
Then I loop over the epochs and crop them by finding the time T at which the CW trigger comes in in the current epoch and cropping at T-400 ms and T+700 ms. Then I create the events for the new epochs (because otherwise the new epochs would keep the information of the old ones, i.e. the trigger codes and timing details of the onset epochs). Then I create the new epoched object using the above information and finally time-shift it so that the onset of the critical word (which is now at 400 ms after the beginning of the epoch) is at 0 ms.

import os
import numpy as np
import pandas as pd
import mne

# (subject, session and the various directory paths are defined earlier in my script)

###############################
# open epochs and raw files
###############################
epopath = os.path.join(epodir + '/sub-' + subject + '_ses-' + session + '-long-epo.fif')
epochs = mne.read_epochs(epopath, preload=True, verbose=True)
epochsOn = epochs['onset']
epochsCW = epochs['CW']
datapath = os.path.join(eegdir + '/sub-' + subject + '_ses-' + session + '_postica_eeg.fif')
data = mne.io.read_raw_fif(datapath, allow_maxshield=True, preload=True, verbose=True)


###############################
# create a new epoched object
###############################

# create an empty array
cropped_epochs_array = np.ndarray(shape=(0, 258, 551)) # 0 epochs, 258 channels, 551 samples (400 + 700 = 1100 ms at 500 Hz, both endpoints included)

# crop current epochs
for i, _ in enumerate(epochsOn):
    CW_onset = epochsOn[i].get_annotations_per_epoch()[0][1][0] # use the epochs annotations object to find the time in the current epoch at which the CW begins
    tmin = CW_onset - .4 # 400ms before onset of CW
    tmax = CW_onset + .7 # 700ms after
    new_epoch = epochsOn[i].crop(tmin, tmax)._data # crop the current epoch to above times
    print(i, new_epoch.shape, tmin, CW_onset, tmax)
    cropped_epochs_array = np.concatenate((cropped_epochs_array, new_epoch)) # add cropped epoch to array


#### events for new epochs

# take the CW events for epochs that also exist in the onset events
listCW = epochsCW.events.tolist()
listOn = epochsOn.events.tolist()

# remove events that don't match between On and CW epochs
while len(listCW) != len(listOn):
    for idx, (eventCW, eventOn) in enumerate(zip(listCW, listOn)):
        if eventCW[-1] != eventOn[-1] + 1000:
            if eventCW[-1] == listOn[idx + 1][-1] + 1000:
                listOn.pop(idx)
            elif eventOn[-1] == listCW[idx + 1][-1] - 1000:
                listCW.pop(idx)
            break

# event lists have to be arrays 
eventsCW = np.array(listCW.copy())

# define event ids
# dictionary with event codes
evdf = pd.read_excel(os.path.join(datadir,'trigDict.xlsx'))
eventsdict = {}
for index, row in evdf.iterrows():
    eventsdict[row['condition']] = row['code']


# EpochsArray needs an event_id dictionary that covers every event code present, with no unused entries
evCodes = [code[2] for code in eventsCW]
evDictMod = {key: value for key, value in epochs.event_id.items() if value in evCodes}

# create the epochs object
cropped = mne.EpochsArray(cropped_epochs_array, info=epochsOn.info, events=eventsCW, event_id=evDictMod)

# shift time so that 0ms is at CW onset
cropped.shift_time(tshift=-0.4, relative=True)


###############################
# save work
###############################
silentpath = os.path.join(eegdir + '/sub-' + subject + '_ses-' + session + '-SB-epo.fif')
cropped.save(silentpath, overwrite=True)

Hi bissera,

Thank you for this post; your code is super helpful! I’m trying to do something similar to what you did with your EEG data. I have a couple of questions about this code snippet:

#### events for new epochs

# take the CW events for epochs that also exist in the onset events
listCW = epochsCW.events.tolist()
listOn = epochsOn.events.tolist()

# remove events that don't match between On and CW epochs
while len(listCW) != len(listOn):
    for idx, (eventCW, eventOn) in enumerate(zip(listCW, listOn)):
        if eventCW[-1] != eventOn[-1] + 1000:
            if eventCW[-1] == listOn[idx + 1][-1] + 1000:
                listOn.pop(idx)
            elif eventOn[-1] == listCW[idx + 1][-1] - 1000:
                listCW.pop(idx)
            break

Could you explain why the +1000 / -1000 is needed?

For me, the final list eventsCW has three columns. I’m assuming the first one is the sampling point, the second one the time point within the epoch (0 for the CW), and the third one is the name of the event. Is it the same for you?

Thank you very much in advance!
Best,
Elisabeth

Hi Elisabeth,

The +1000 thing has to do with how I’ve coded my triggers. When I get the raw file I only have the triggers for the onset of my auditory linguistic stimuli, and I have a coding system for them from 1 to 255. However, for each of them the critical word begins at a different time point after the onset of the auditory stimulus, so I’ve inserted those ‘new’ triggers at the correct time points and given them the code of the stimulus-onset trigger + 1000.

The three columns in the events structure, as far as I understand, stand for the sample number, something you mostly don’t care about (the value of the trigger channel just before the event; mine are always 0), and the trigger code (i.e. the numerical code, which in the annotations structure becomes the name of the event). Read here. I found this and this tutorial very helpful in understanding events and annotations respectively.
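
Just as an illustration (a placeholder snippet, not from my actual script), this is how the onset and CW codes relate in the events array:

# each row of an MNE events array is [sample, previous trigger-channel value, event code];
# in my coding scheme the CW trigger code equals the stimulus-onset code + 1000
events = mne.find_events(data, verbose=False)  # `data` is the post-ICA raw file
onset_events = events[events[:, 2] <= 255]     # original stimulus-onset triggers (codes 1 to 255)
cw_events = events[events[:, 2] > 1000]        # the inserted critical-word triggers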

I’ve updated my code a bit since the last post on this thread, but in this code snippet I’m basically making sure that the annotations structure, which I’m copying from another epochs file, has identical events. I have two lists of events, and I want to be certain they correspond; if an event is missing from one (e.g. because I had to reject an epoch), I remove it from the other too.

Below is the current code. I’m not sure how useful it is, but I’m happy to clarify, as I realise my code is not very clear…

Best,
bissera

import os
import numpy as np
import pandas as pd
import mne

###############################
# where to find the data
###############################
study_root = os.path.join("/", "Users", "ivanova", "Documents", "pythonProjects", "eegProc")
outdir = os.path.join(study_root, "out")
eventsdir = os.path.join(outdir, 'EventFiles')
datadir = os.path.join(study_root, "Data")
studydir = os.path.join(outdir, "DynSyn_EEG_PC")
epochsdir = os.path.join(studydir, 'epochs-50+700/_final')
SBdir = (os.path.join(epochsdir, 'SB'))
CWALdir = (os.path.join(epochsdir, 'CW-aligned'))
Longdir = (os.path.join(epochsdir, 'Long'))
beforeZero = -.05         # CW-aligned : -.5      regular: .05  .1
afterZero = .7
type = 'SB'
savedir = SBdir


for ppt in [1]: # 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 22, 23, 24, 25

    ###############################
    # which subject to work with
    ###############################
    subject = str(ppt)
    session = '1'

    subdir = os.path.join(studydir + '/sub-' + subject)
    sesdir = os.path.join(subdir + '/ses-' + session)
    eegdir = os.path.join(sesdir + '/eeg')

    ###############################
    # open epochs and raw files
    ###############################
    epopath = os.path.join(Longdir + '/sub-' + subject + '_ses-' + session + '-long-epo.fif')
    epochs = mne.read_epochs(epopath, preload=True, verbose=True)['onset']
    datapath = os.path.join(eegdir + '/sub-' + subject + '_ses-' + session + '_postica_eeg.fif')
    data = mne.io.read_raw_fif(datapath, allow_maxshield=True, preload=True, verbose=True)


    ###############################
    # create a new epoched object
    ###############################

    # create an empty array
    cropped_epochs_array = np.ndarray(shape=(0, np.shape(epochs.get_data())[1], round(((-beforeZero+afterZero)*1000 / (1000/epochs.info['sfreq']))) +1 ))
    # cropped_epochs_array = np.ndarray(shape=(0, 73, 666)) # for special epochs BIOSEMI
    annotCw = []

    # crop current epochs
    for i, _ in enumerate(epochs):
        CW_onset = epochs[i].get_annotations_per_epoch()[0][1][0] # use the epochs annotations object to find the time in the current epoch at which the CW begins
        annotCw.append(epochs[i].get_annotations_per_epoch()[0][1])
        # tmin = CW_onset - .4 # 400ms before onset of CW
        tmin = CW_onset + beforeZero # beforeZero seconds relative to CW onset (700ms before CW was used for the special epochs)
        tmax = CW_onset + afterZero # afterZero seconds after CW onset
        new_epoch = epochs[i].copy().crop(tmin, tmax)._data # crop the current epoch to above times
        print(" cropping epoch ", i+1, "of ", len(epochs))
        cropped_epochs_array = np.concatenate((cropped_epochs_array, new_epoch)) # add cropped epoch to array
        # if tmin < 0:
        #     issueEpochs.append([i, tmin, epochs[i].get_annotations_per_epoch()[0][1]])
    np.shape(cropped_epochs_array)

    # remove events that don't match between On and CW epochs
    # i want to keep the sample number and trigger code of the CW events because
    # this is what i'm actually cropping around. im using the events from the data and not from the epochs
    # because in the epochs there isnt a perfect overlap between CW/onset epochs that were rejected.
    evCw = [event for event in mne.find_events(data, verbose=False).tolist() if event[2] > 1000] # ALL CW events
    evOn = epochs.events.tolist() # onset events that exist in the epochs

    print("re-aligning event structure for new epochs")
    newCw = []
    k = 0
    for i, item in enumerate(evCw):
        i_on = i - k
        if i_on < len(evOn):
            if item[-1] == (evOn[i_on][-1] + 1000):
                newCw.append(item)
            else:
                print("misalignment removed for index ", i, "for events", item[-1], evOn[i_on][-1])
                k = k + 1
    # input("are the above equal? enter to continue")

    # check everything is running ok
    print("checking alignment of event structures:")
    temp = [[onset, cw] for onset, cw in zip(evOn, newCw)]
    for i, item in enumerate(temp):
        if item[0][2] != (item[1][2] - 1000):
            print("!!!!! MISALIGNMENT !!!!! at index ", i, "for events", item[0][2], item[1][2])
        else:
            print("good match at index ", i)
    # input("Happy? Press Enter to continue... ")
    eventsCW = np.array(newCw.copy())
    print(np.shape(cropped_epochs_array), np.shape(eventsCW), np.shape(annotCw))


    # define event ids
    # dictionary with event codes
    evdf = pd.read_excel(os.path.join(datadir,'trigDict.xlsx'))
    eventsdict = {}
    for index, row in evdf.iterrows():
        eventsdict[row['condition']] = row['code']

    # EpochsArray needs an event_id dictionary that covers every event code present, with no unused entries
    conds = [ev[2] for ev in annotCw]
    evDictMod = {condition: eventsdict[condition[:len(condition)-2]] for condition in conds}
    # evCodes = []
    # for ev in evConds:
    #     event = ev.copy()
    #     event[2] = evDictMod[event[2]]
    #     evCodes.append(event)
    # evCodes = np.array(evCodes)
    # # evDictMod = {key: value for key, value in eventsdict.items() if value in evCodes}

    # convert the above array to an epochs object
    cropped = mne.EpochsArray(cropped_epochs_array, info=epochs.info, events=eventsCW, event_id=evDictMod)
    cropped.shift_time(tshift=beforeZero, relative=True) # adjust t=0 at onset of CW
    fig = cropped.average().plot()  # Evoked.plot() returns a matplotlib figure
    fig.savefig(os.path.join(SBdir, "TEMP_%s.png" % ppt))
    # input("is time shift ok on the figure? press enter to continue...")

    ###############################
    # save work
    ###############################
    silentpath = os.path.join(savedir + '/sub-' + subject + '_ses-' + session + f'-{type}-epo.fif')
    cropped.save(silentpath, overwrite=True)
    print('saved epochs for subject', subject)