Question about ERP interpretation

Hello,

I obtained the following figure, which highlights a significant cluster. However, I have a question: why is the significant cluster located where there appears to be the least visual difference? This pattern is repeated across multiple figures. I would have understood if there were no difference, but finding a difference where none is visually apparent seems strange.

here is the figure (in grey the signigicative cluster, at about 100 to 200 ms, why not at 50 ms??? it would have been clearer)

here is the code:

Alpha = 0.05
N_Conditions = 2
N_Observations = len(EpochsCondition1[0])
DegreeOfFreedomNumerator = N_Conditions - 1
DegreeOdFreedomDenominator = N_Observations - N_Conditions
Threshold = scipy.stats.f.ppf(1 - Alpha, dfn = DegreeOfFreedomNumerator, dfd = DegreeOdFreedomDenominator) 

T_Obs, Clusters, Clusters_P_Values, H0 = mne.stats.permutation_cluster_test([EpochsCondition1, EpochsCondition2],
                                                                  n_permutations = 1000,
                                                                  threshold = Threshold,
                                                                  tail = 0,
                                                                  seed = 1)
print(Clusters, Clusters_P_Values)

# Now I want to plot the stats
Color = {Condition2: "Blue", Condition1: "red"}
Linestyles = {Condition2: "-", Condition1: "--"}

EvokedsToCompare_List = []
for Evoked in Condition2Evokeds:
    EvokedsToCompare_List.append(Evoked)
for Evoked in Condition1Evokeds:
    EvokedsToCompare_List.append(Evoked)
EvokedsToCompare_List = [Evoked[0] for Evoked in EvokedsToCompare_List]

EvokedsToCompare_Dict = {Condition2: [], Condition1: []}

for Evoked in EvokedsToCompare_List:
    if Evoked.comment == Condition2:
        EvokedsToCompare_Dict[Condition2].append(Evoked)
    else:
        EvokedsToCompare_Dict[Condition1].append(Evoked)

plt.style.use('default')
fig, ax = plt.subplots()

LengthOfData = len(EvokedsToCompare_Dict[Condition1][0].get_data()[0]) 

# each point is for 0.002 s
StartTime = - 1000 # ms
X_Axis = []
for i in range(len(EvokedsToCompare_Dict[Condition1][0].get_data()[0])):
    X_Axis.append(StartTime + (i * 2))
    continue

ChannelList = []
for Channel in ROI:
    ChannelList.append(EvokedsToCompare_Dict[Condition1][0].info["ch_names"].index(Channel))
    continue

DataCondition2 = []
for Index in ChannelList:
    DataCondition2.append(EvokedsToCompare_Dict[Condition2][0].get_data()[Index])
    continue
DataCondition2 = np.array(DataCondition2)
DataCondition2 = DataCondition2 * 1000000

DataCondition1 = []
for Index in ChannelList:
    DataCondition1.append(EvokedsToCompare_Dict[Condition1][0].get_data()[Index])
    continue
DataCondition1 = np.array(DataCondition1)
DataCondition1 = DataCondition1 * 1000000

MeanCondition2 = np.mean(DataCondition2, axis = 0)
MeanCondition1 = np.mean(DataCondition1, axis = 0)

SEMCondition2 = scipy.stats.sem(DataCondition2[0])
SEMCondition1 = scipy.stats.sem(DataCondition1[0])

ax.spines['top'].set_visible(False)
ax.spines['right'].set_visible(False)
ax.spines.bottom.set_bounds(-1000, 1500)
ax.spines.left.set_bounds(-10, 10)
Ticks = [-10.0, -7.5, -5.0, -2.5, 0.0, 2.5, 5.0, 7.5, 10.0]
ax.set_yticks(Ticks)
ax.plot(X_Axis, MeanCondition2, 'r')
ax.plot(X_Axis, MeanCondition1, 'b')
ax.fill_between(X_Axis, DataCondition2[0] + SEMCondition2, DataCondition2[0] - SEMCondition2, color='r', alpha=.1)
ax.fill_between(X_Axis, DataCondition1[0] + SEMCondition1, DataCondition1[0] - SEMCondition1, color='b', alpha=.1)
ax.axhline(y = 0, color = 'k', linestyle = '-')
ax.axvline(x = 0, color = 'k', linestyle = '--')

ClustersToHighlight = []
for i in range(len(Clusters_P_Values)):
    if Clusters_P_Values[i] < 0.05:
        ClustersToHighlight.append(Clusters[i])
        continue
    else:
        continue

for i in range(len(ClustersToHighlight)):
    Inf = min(ClustersToHighlight[i][0]) * 2 # each point is 2 ms
    print("Inferior time limit for the cluster {} = {}".format(i, Inf))
    InfBorder = -1000 + Inf 
    Up = max(ClustersToHighlight[i][0]) * 2
    print("Superior time limit for the cluster {} = {}".format(i, Up))
    UpBorder = -1000 + Up
    Ymin, Ymax = ax.get_ylim()
    ax.fill_betweenx((Ymin, Ymax), InfBorder, UpBorder, color = 'grey', alpha = 0.5)
    continue
ax.legend([Condition2Title, Condition1Title], loc = "lower right")
plt.xlabel("Duration of the epoch (in ms)")
plt.ylabel("$\mu$V")

Unfortunately, I really don’t have the time to go through your code, so I can’t rule out that there’s something wrong with it. However, I wanted to point out that you should be very careful with how to interpret the results of a cluster permutation test; the gray area typically does not indicate the location of significant differences (see https://onlinelibrary.wiley.com/doi/abs/10.1111/psyp.13335).

Also, you might want to apply a lowpass filter (e.g. at 35 Hz) for a smoother ERP visualization.

1 Like

I agree with Clemens and I also didn’t go through the code in great detail, but I would suggest plotting standard errors/confidence intervals around the ERPs. this gives you a better idea of where the signal might differ before running a statistical test. You can also run a permutation t-test which is a bit easier to interpret if you are not sure about the cluster test.

Hope this helps,

Carina

2 Likes

TRhank you for your answer,
I have to say I did not understand that it did not indicate what is significant (it indicates the cluster associated with a p value < 0.05 in my code but maybe the significance is about something else than the difference in the signal). I will read the paper you indicated cautiously to understant.
I will print the CI, I plotted the SEM but it is so small that it did not appear on the graph. I will use t_test as you indicate if I fail to interpret the results of the cluster test.

Thank you for your help!!

Best!

For diagnostic purposes, I’d also suggest plotting the T values over time alongside the cluster-forming threshold you set. At least I always find this helpful!

Richard

@Hugues1273 Here is a maximally working example (i.e. a long script) :stuck_out_tongue: that incorporates the suggestions so far with some simulated data, to give you an idea of how to do it. Note that I am using seaborn to plot the CI’s across my observations, and I am z-scoring the data and F-statistics so that I can plot everything onto a single figure.

In general, when I’m having trouble with an analysis step I sometimes like to simulate data because it gives me a bit more control. From here I would try to work my way back up to my actual data. If I were you I would:

  • Remove code that does “extra stuff”, like appending times to your x-axis… These are points where bugs might be getting introduced.
  • add some assert statements to your script to check that the state of your variables are what you expect them to be. You can see how I do this in the code below.
code
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import scipy.stats as stats
import seaborn as sns
import mne

N_OBS = 20 # observations per condition. For example, 20 subjects or 20 epochs.

def generate_signal():
    """Generate a simple signal via convolution."""
    signal = np.concatenate([
        np.zeros(30), np.ones(2),
        np.zeros(20), np.ones(30),
        2*np.ones(10), np.zeros(30),
        -np.ones(10), np.zeros(40)
    ])
    
    kernel = np.exp(-np.linspace(-2, 2, 20)**2)
    return np.convolve(signal, kernel / np.sum(kernel), mode='same')

def create_observations(signal, n_obs=20, noise_level=0.5):
    """Repeat a given signal n_obs times with some added noise."""
    return signal[None, :] + noise_level * np.random.randn(n_obs, len(signal))

# Generate signals
signal_1 = generate_signal()
# Here we force this signal to have a more restricted range
signal_2 = np.clip(signal_1, -1, 1)

# Create observations
data_1 = create_observations(signal_1, N_OBS)
data_2 = create_observations(signal_2, N_OBS)

# Check that our data our in the shape expected by MNE
assert data_1.shape == (N_OBS, len(signal_1))
assert data_2.shape == (N_OBS, len(signal_2))

# Perform cluster permutation test

F_obs, clusters, cluster_pv, _ = mne.stats.permutation_cluster_test([data_1, data_2])

# Z-score the data so that we can plot everything onto a single figure
z_data_1 = stats.zscore(data_1, axis=1)
z_data_2 = stats.zscore(data_2, axis=1)
z_F_obs = stats.zscore(F_obs, nan_policy='omit')


# Prepare data for plotting
def prepare_dataframe(data):
    """Convert our toy data into a DataFrame for use with seaborn."""
    df = pd.DataFrame.from_records(data).rename_axis(index="obs")
    return df.reset_index().melt(id_vars="obs", var_name="time", value_name="amplitude")

df_1, df_2 = [prepare_dataframe(d) for d in [z_data_1, z_data_2]]
F_obs_df = pd.DataFrame(z_F_obs, columns=["F statistic"])

# Plot signals and clusters
sns.set_style("darkgrid")
fig, ax = plt.subplots()

sns.scatterplot(data=F_obs_df, ax=ax, s=10)
sns.lineplot(data=df_1, x='time', y='amplitude', ax=ax, label="Condition 1")
sns.lineplot(data=df_2, x='time', y='amplitude', ax=ax, label="Condition 2")

# Highlight significant clusters
for i_c, c in enumerate(clusters):
    c = c[0]
    color = 'green' if cluster_pv[i_c] < 0.01 else 'yellow' if cluster_pv[i_c] < 0.05 else 'red'
    ax.axvspan(c[0], c[-1], color=color, alpha=0.3)
ax.set_title("Green = p < 0.01, Yellow = p < 0.05, Red = p > 0.05")
ax.set_xlabel("Time")
ax.set_ylabel("Amplitude (z-scored)")

fig.show()

2 Likes

I’d like to add a note of caution here. In general, I do not recommend to determine regions of statistically significant differences between conditions (i.e., peaks where the conditions differ) from your data, as explained in this article by Luck and Gaspelin: How to get statistically significant effects in any ERP experiment (and why you shouldn’t). One possible solution is to only use grand average data collapsed across conditions to identify components (peaks) of interest, but looking at the conditions separately is asking for trouble. Another solution is to correct for multiple comparisons (i.e., if you have an F-statistic for every time point, you need to correct for these multiple comparisons you’re making).

1 Like

that depends on whether you are doing exploratory or confirmatory analysis. Exploratory: look at the data where differences are. Confirmatory: you should already know where you want to look for differences and you should not look at the data for this.

You have to pick one or the other. You cannot do both with the same data.

1 Like