Advice needed on salvaging data with unusual evoked baseline activity

Hi everyone,

This is not so much an MNE-module-related question, but more of a general MEG/EEG technical one. I am hoping to tap into the expertise and opinions of some of the more experienced EEG/MEG data handlers here.

Here, I have a sample clean data set from one participant, comprising evoked plots of three different conditions:


volblinks and sponblinks are test conditions, whereas blanks is a control condition generated by extracting random data segments from the sponblinks condition when nothing is happening (i.e. resting-state baseline).

However, I have a small subset of participants producing unusually high baseline activity:


At first, I thought the sensors might have been picking up an unusual level of noise that was distorting the data. But if that were the case, I would expect the sponblinks evoked plot to be as noisy as the blanks control. I am very puzzled as to why this is happening: visually inspecting the raw data, there aren't any obvious bad data segments that could explain it.

My question here is: is there a way to salvage this data set with some sort of scaling function that diminishes the blanks activity to match the average amplitude of the sponblinks condition? Is this a legitimate way to clean the data set? Or does anyone here have prior experience with this scenario and a solution to it?

Was the data high-pass filtered during acquisition for all subjects?

MEG channels have DC offsets. High-pass filters are the way to fix this.

Hope this helps
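To illustrate the point about DC offsets, here is a minimal NumPy-only sketch (the 300 Hz sampling rate and all amplitudes are made up for illustration). It shows how a constant offset on a channel inflates the baseline RMS, and how removing it (mean subtraction is the crudest limit of a high-pass filter) brings the baseline back toward zero:

```python
import numpy as np

rng = np.random.default_rng(0)
sfreq = 300  # Hz, assumed sampling rate

# Simulated MEG channel: a small 10 Hz oscillation plus a large DC offset
t = np.arange(5 * sfreq) / sfreq
signal = 1e-13 * np.sin(2 * np.pi * 10 * t)
dc_offset = 5e-12
data = signal + dc_offset + 1e-14 * rng.standard_normal(t.size)

# Without removing the offset, the RMS is dominated by the DC term
rms_raw = np.sqrt(np.mean(data ** 2))

# A high-pass filter removes DC; subtracting the mean is the crudest equivalent
rms_hp = np.sqrt(np.mean((data - data.mean()) ** 2))
# rms_raw is dominated by the offset and is far larger than rms_hp
```

In real data you would of course use a proper high-pass filter rather than mean subtraction, but the effect on a baseline RMS trace is the same in kind.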

The data was filtered using raw.filter(1,149).

Is 1 Hz not a good enough high pass filter?

It should clearly be enough. With such a high-pass, your data should clearly
be around 0 during baseline. There is something here to understand.


Did you apply some sort of cleaning operation before constructing the Evoked?

Could you share the corresponding butterfly plots too?


Yes, I used ICA for artefact removal. These evoked plots were generated at the end of everything. But it is crucial to note that both the first and second plots above were generated using the same code and went through the same cleaning procedure. The only difference is that the two plots come from two different participants. So I have a feeling it has something to do with the individual data set rather than the process.

As for the butterfly plots:



Sorry, just for clarification, are these the butterfly plots that correspond to the problematic GFP figure above?

Yup, the butterfly plots are for the problematic evoked plot in the first post.

Hold on, sorry. The problematic plot above is based on all gradiometer sensors. Please refer to this one here, which reflects the 12 gradiometers of interest shown in the butterfly plots.


Ok, thanks! This all looks good to me. The GFP (actually, RMS! We fixed this in the latest development version) plots are correct. For the condition with the high GFP values, the variability between sensors is largest, as can be seen in the butterfly plots. This is correctly reflected in the GFP traces.

Some people like to baseline-correct their GFP plots, to allow for a better visual inspection of differences in the time courses between conditions. But generally, the data you showed seems fine.
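As a sketch of what baseline correction does to an RMS/GFP trace, here is a NumPy-only toy example (the array shapes, amplitudes, and epoch window are made up; on an actual MNE `Evoked` you would call `evoked.apply_baseline((None, 0))` instead):

```python
import numpy as np

rng = np.random.default_rng(42)
n_channels, n_times = 12, 300
times = np.linspace(-0.5, 0.5, n_times)  # assumed epoch window in seconds

# Fake evoked data with a per-channel offset that inflates the RMS trace
data = rng.standard_normal((n_channels, n_times)) * 1e-13
data += rng.standard_normal((n_channels, 1)) * 5e-13  # channel offsets

# Baseline-correct: subtract each channel's mean over the pre-stimulus period
baseline = times < 0
data_bc = data - data[:, baseline].mean(axis=1, keepdims=True)

# "GFP" as plotted for gradiometers is the RMS across channels per time point
rms = np.sqrt((data ** 2).mean(axis=0))
rms_bc = np.sqrt((data_bc ** 2).mean(axis=0))
# rms_bc sits close to the noise floor; the uncorrected rms is inflated
```

This is only about presentation: the offsets are still in the data, but the corrected trace makes between-condition differences in the time courses easier to see.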

I see.

But do you have any idea why the variability is so drastic for the high-GFP condition (blanks/3) but not in sponblinks/3? The data from these two conditions were collected using the same sensors in the same MEG session. Trials for blanks/3 were generated from blank (baseline) segments of the sponblinks/3 condition. The baseline set for both conditions is similar during the epoch extraction phase.

The only logical conclusion I can draw from this is that during the baseline inter-trial interval, these outlier participants are consistently doing something, I'm just not sure what. Are there any other possible explanations? (e.g. resting-state baseline noise that I am not aware of; MEG sensors occasionally getting funky). That will help me decide whether I'm better off discarding these data.

Can you be more explicit about the exact time periods used, please? Specifically, how large was the overlap between those epochs? What is the time-locked event of the blanks/3 epochs?

There is minimal-to-zero overlap between the time periods of these two conditions.

Trials for blanks/3 were sampled every 6-9 seconds of the sponblinks/3 data set (time locking is based on this time stamp). If a prospective blanks/3 time stamp has no sponblinks/3 event occurring within 2 seconds before or 3 seconds after it, it is considered a valid blanks/3 trial.
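For reference, that selection rule can be written compactly in NumPy. This is only a sketch: the 300 Hz sampling rate, the blink onsets, the candidate-spacing cycle, and the event code `95` are all made up for illustration.

```python
import numpy as np

sfreq = 300  # assumed sampling rate in Hz
blink_samples = np.array([700, 2500, 6100, 9000])  # hypothetical blink onsets

# Candidate blank time stamps every 6-9 s (here: a simple 6/7/8/9 s cycle)
candidates = []
t = 6 * sfreq
while t < 12000:
    candidates.append(t)
    t += int((6 + len(candidates) % 4) * sfreq)
candidates = np.array(candidates)

# Keep candidates with no blink from 2 s before to 3 s after the time stamp
def is_blank(ts):
    return not np.any((blink_samples >= ts - 2 * sfreq) &
                      (blink_samples <= ts + 3 * sfreq))

valid = np.array([ts for ts in candidates if is_blank(ts)])

# Assemble an MNE-style events array: [sample, 0, event_id]
event_id = 95  # hypothetical event code for blanks
events = np.column_stack([valid,
                          np.zeros_like(valid),
                          np.full_like(valid, event_id)])
```

With the toy blink onsets above, only the candidates at samples 3900 and 10800 survive the exclusion window, so `events` ends up with two rows.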

In the butterfly plots you showed above, blanks/3 seems to be based on a single epoch only (Nave = 1), which could explain the larger variability, no?

Ah, all these conditions have gone through mne.grand_average beforehand, so nave = 1 is actually the average of all epochs for these conditions.

Could you share the code you’re using to generate these results?

Ah, in the process of trying to extract and post the relevant code here, I found an error that prevented baseline correction from being applied correctly to the GFP plots. After rectifying it, things look better:


Nevertheless, it is still puzzling that the control baseline oscillatory activity is 2-3 times more intense than the experimental conditions for these outlier participants, especially when the blanks/3 epochs were generated from sponblinks/3 data, whose amplitude is a lot more subdued.

Would you attribute this to noise?

In case you were wondering how I extracted the blanks/3 data from sponblinks/3, here is the relevant code excerpt:

import numpy as np
import mne

## Add EOG events (blinks) as spon blink data
eog_events_copied = mne.preprocessing.find_eog_events(
    raw, 998, ch_name='EOG001', reject_by_annotation=True)
eog_events_copied[:, 2] = int(str(block) + '96')  # np.str was removed from NumPy
raw.add_events(eog_events_copied, stim_channel=SBStimChn)

## Create an identical events array for Neuromag time stamp
## raw.first_samp correction
eog_events_copied2 = eog_events_copied.copy()

## Rectify lagged time stamps for eog_events collected by the Neuromag system
eog_events_copied2[:, 0] -= raw.first_samp

## Add arbitrarily generated baseline trials to the raw data.
## Attempt to create baseline events every 6-9 seconds (300 Hz sampling rate).
sfreq = 300
jitter = 6
blankAt = 6 * sfreq
blankInterval = 6 * sfreq
generated_blanks = None
tempcount = 0
for createBlanks in range(len(stim_raw)):
    ## Only consider the current candidate time stamp.
    if createBlanks == blankAt:
        ## Reject candidates with a blink 2000 ms before or 3000 ms after.
        appendIt = True
        for onset in eog_events_copied2[:, 0]:
            if blankAt - 2 * sfreq <= onset <= blankAt + 3 * sfreq:
                appendIt = False
                tempcount += 1
        ## If the time stamp (createBlanks == blankAt) had no blinks
        ## 2 s before or 3 s after, append it as a blanks/3 event.
        if appendIt:
            thisArray = np.array([blankAt, 0, int(str(block) + '95')])
            if generated_blanks is None:
                generated_blanks = thisArray[np.newaxis, :]
            else:
                generated_blanks = np.vstack((generated_blanks, thisArray))
            # Slightly jitter the next interval so blanks/3 trials are not
            # extracted at a fixed interval.
            jitter += 0.2
            if jitter > 9:
                jitter = 6
            blankInterval = int(jitter * sfreq)
        blankAt = blankAt + blankInterval

Thanks! Since this seems to be proving difficult, maybe it could make sense for you to drop by at tomorrow's Office Hour? Maybe a couple of brains and pairs of eyes combined, plus real-time communication, will help figure out what might be going on!

But is the number of trials the same across conditions per participant? That was not clear to me from your explanation of how you obtain the "blanks" epochs.
