When is baseline correction applied?

I’m struggling with understanding when baseline correction is applied (to Epochs or Evokeds). As @agramfort points out in this related question, baseline correction is performed when accessing the data (and only if baseline is set appropriately when constructing the Epochs object).

How do I know when this actually happens? For example, let’s assume I have an Epochs object named epochs and I want to average it. I do:

epochs.average()

This creates an Evoked object, and its repr states for my example data:

<Evoked | '2' (average, N=158), -0.25 – 0.75 sec, baseline off, 9 ch, ~50 kB>

Notice the “baseline off” part, which indicates to me that no baseline correction was performed. So I tried this:

epochs.average().apply_baseline()

Now the object is represented as:

<Evoked | '2' (average, N=158), -0.25 – 0.75 sec, baseline -0.25 – 0 sec, 9 ch, ~50 kB>

This time, it shows the baseline that I have provided when creating epochs.

However, both Evoked objects contain identical data, so the baseline has been applied even in the first example, despite the repr stating that baseline is off. Am I misunderstanding something?

Here’s a reprex:

import mne
import numpy as np

n_epochs, n_chans, n_samples = 100, 32, 1001
rng = np.random.default_rng()
epochs = mne.EpochsArray(
    data=rng.standard_normal(size=(n_epochs, n_chans, n_samples)),
    info=mne.create_info(n_chans, 500, "eeg"),
    tmin=-1
)

print(epochs)
e1 = epochs.average()
print(e1)
e2 = epochs.average().apply_baseline()
print(e2)

Could you please provide an MWE so I can take a look?

I touched that “baseline info” code a while ago, such that you still get information about baseline correction after cropping the data. I’d be interested to take a look at what you’re describing here. Maybe I caused an undesired side effect somewhere :bug:

I added a reprex in my initial post. I really don’t understand what “baseline off” means e.g. for epochs, which I clearly constructed using the default pre-stimulus baseline (None, 0).

baseline is None in the mne.EpochsArray constructor

and apply_baseline defaults to (None, 0)

does that explain the behavior?

Alex

It’s off until you load the data, I believe. But cannot check right now.

What’s a reprex? :thinking: I see you added an example though.

I didn’t notice that Epochs and EpochsArray have different defaults for their baseline parameters – this explains my reprex at least.

But with my actual data, I still have the output I posted previously: one Evoked says “baseline off”, even though I created my epochs with mne.Epochs (and not mne.EpochsArray). It seems like the following line is the culprit:

epochs = mne.channels.combine_channels(epochs, rois)

This function apparently drops the baseline information, because before combining channels I have:

<Epochs |  437 events (all good), -0.25 - 0.75 sec, baseline -0.25 – 0 sec, ~109.6 MB, data loaded, with metadata,
 'onset': 437>

And afterwards:

<EpochsArray |  437 events (all good), -0.25 - 0.75 sec, baseline off, ~15.4 MB, data loaded, with metadata,
 '2': 437>

Seems like the baseline should also be -0.25 – 0 sec in the second case, right?

That’s a reproducible example aka minimal working example (MWE), only in R (Tidyverse) speak. I kind of like that term.

Urgs, I hate it :sweat_smile: but to each their own I suppose.

I like the term, the abbreviation not so much (easy to confuse with regex). But MWE is not better, easy to confuse with MNE… :smile:

Yes MWE is super ugly too, even when spelled out I suppose many people wouldn’t know what it means.

To focus on the issue here, I assume this is a bug in mne.channels.combine_channels()?

I will take a look tonight, and we can discuss this at tomorrow’s dev meeting too :slight_smile:

I don’t think I can reproduce (or understand) what you’re seeing.

I had to adjust your example code because I was getting a ValueError about the baseline period being only a single sample. So I’m now passing tmin to EpochsArray.

# %%
import mne
import numpy as np

n_epochs, n_chans, n_samples = 100, 32, 1001
rng = np.random.default_rng()
data = rng.standard_normal(size=(n_epochs, n_chans, n_samples))
info = mne.create_info(n_chans, 500, "eeg")

epochs = mne.EpochsArray(data=data, info=info, tmin=-0.2)
evoked_no_baseline = epochs.average()
evoked_baseline = epochs.average().apply_baseline()

print(epochs)
print(evoked_no_baseline)
print(evoked_baseline)

produces:

<EpochsArray |  100 events (all good), -0.2 - 1.8 sec, baseline off, ~24.5 MB, data loaded,
 '1': 100>
<Evoked | '1' (average, N=100), -0.2 – 1.8 sec, baseline off, 32 ch, ~285 kB>
<Evoked | '1' (average, N=100), -0.2 – 1.8 sec, baseline -0.2 – 0 sec, 32 ch, ~285 kB>

Which is exactly what I’d expected.

The data is not identical either:

np.allclose(evoked_baseline.data, evoked_no_baseline.data)

returns False.

So what is it you believe should be different? I’m failing to follow here…

Thanks,
Richard

Sorry @richard for the confusion. You are right, there is no problem with the (adapted) example. I did not know that Epochs and EpochsArray have different defaults for baseline. However, as I’ve mentioned previously, combining channels with mne.channels.combine_channels() seems to forget the baseline (although the data is of course still baseline-corrected). I think this is a bug.
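
To illustrate why the combined data is still baseline-corrected, here is a NumPy-only sketch. It assumes that channels are combined by averaging and that baseline correction subtracts each channel’s mean over the baseline interval; `baseline_correct` is a hypothetical helper written for this example, not an MNE function. Because both operations are linear, their order does not matter:

```python
import numpy as np


def baseline_correct(data, times, tmin=None, tmax=0.0):
    """Subtract each channel's mean over [tmin, tmax] (hypothetical helper)."""
    start = times[0] if tmin is None else tmin
    mask = (times >= start) & (times <= tmax)
    return data - data[..., mask].mean(axis=-1, keepdims=True)


def combine_by_mean(data, groups):
    """Average the channels of each group; the result has one channel per group."""
    return np.stack([data[:, list(g)].mean(axis=1) for g in groups], axis=1)


rng = np.random.default_rng(0)
times = np.arange(-125, 376) / 500.0  # -0.25 ... 0.75 s at 500 Hz
epochs = rng.standard_normal(size=(20, 8, times.size))
groups = [range(0, 4), range(4, 8)]

# Correct first, then combine (what happens in the thread) ...
corrected_then_combined = combine_by_mean(baseline_correct(epochs, times), groups)
# ... versus combine first, then correct
combined_then_corrected = baseline_correct(combine_by_mean(epochs, groups), times)

same = np.allclose(corrected_then_combined, combined_then_corrected)
print(same)
```

So only the bookkeeping (the baseline interval shown in the repr) is lost, not the correction itself.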

Ah! Now I understand! Ok. Yes this was a design choice, as the assumption was that if somebody goes through the pain of creating an array manually, we probably shouldn’t alter it by default through baseline correction.

As for combine_channels(), could you provide a tiny … what did you call it (scrolls up) … ah yes, a tiny REPREX please? :smiley: I can then try to look into this tomorrow! Thanks!

Of course, here you go:

import mne
import numpy as np

n_epochs, n_chans, n_samples = 100, 32, 1001
rng = np.random.default_rng()
epochs = mne.EpochsArray(
    data=rng.standard_normal(size=(n_epochs, n_chans, n_samples)),
    info=mne.create_info(n_chans, 500, "eeg"),
    tmin=-1,
    baseline=(None, 0)
)

print(epochs)  # baseline -1 – 0 sec (OK)
print(epochs.average())  # baseline -1 – 0 sec (OK)

combined = mne.channels.combine_channels(
    epochs,
    {"first": range(16), "second": range(16, 32)}
)

print(combined)  # baseline off (WRONG)
print(combined.average())  # baseline off (WRONG)

I confirm the bug. combine_channels should copy the baseline information

can you send a PR?

thx
A
