Join data with different timestamps in MNE Raw object

Hi everyone,

I have 2 small questions concerning timestamps in raw objects in MNE - can anyone help me?

I have an xdf file containing a stream with EEG data from 128 channels, one with data from an Arduino that’s measuring grip strength (1 channel), and one with triggers from my PsychoPy experiment.
I’d like to turn my xdf file into a raw object. As I’m new to Python MNE & not familiar with the structure of raw objects, I tried building one from scratch so I can see how everything works.

Now my questions:

My xdf files contain timestamps for all recorded values in my different streams, but I don’t know how to include them in my Raw object. I looked at some examples of Raw objects and they don’t seem to contain timestamps but rather seem to work with sample numbers instead - is that correct?

If so: My timestamps don’t really match (even if I round them a bit). I added snippets from my data below so you know what I mean. I think using sample numbers won’t make sense in this case. Is there a workaround for that?

Thanks in advance for your help / ideas!
~ Merle

# Example timestamp values (values measured in seconds)

`You can see that I cut the snippets from different parts of the streams. 
The EEG recording started earlier than the rest & has a higher sampling rate, 
so the indices are way higher. 
I tried getting timestamps from the same range 
so you can see they do not match.`

# stream 1: timestamps for EEG data with sample freq = 500 Hz
streams[1]["time_stamps"][20900:21000] 
# output: 
array([6999.93109028, 6999.93309023, 6999.93509019, 6999.93709014,
       6999.93909009, 6999.94109004, 6999.94309 , 6999.94508995,
       6999.9470899 , 6999.94908986, 6999.95108981, 6999.95308976,
       6999.95508972, 6999.95708967, 6999.95908962, 6999.96108958,
       6999.96308953, 6999.96508948, 6999.96708944, 6999.96908939,
       6999.97108934, 6999.9730893 , 6999.97508925, 6999.9770892 ,
       6999.97908915, 6999.98108911, 6999.98308906, 6999.98508901,
       6999.98708897, 6999.98908892, 6999.99108887, 6999.99308883,
       6999.99508878, 6999.99708873, 6999.99908869, 7000.00108864,
       7000.00308859, 7000.00508855, 7000.0070885, 7000.00908845,
       7000.01108841, 7000.01308836, 7000.01508831, 7000.01708827,
       7000.01908822, 7000.02108817, 7000.02308812, 7000.02508808,
       7000.02708803, 7000.02908798, 7000.03108794, 7000.03308789,
       7000.03508784, 7000.0370878, 7000.03908775, 7000.0410877,
       7000.04308766, 7000.04508761, 7000.04708756, 7000.04908752,
       7000.05108747, 7000.05308742, 7000.05508738, 7000.05708733,
       7000.05908728, 7000.06108724, 7000.06308719, 7000.06508714,
       7000.06708709, 7000.06908705, 7000.071087, 7000.07308695,
       7000.07508691, 7000.07708686, 7000.07908681, 7000.08108677,
       7000.08308672, 7000.08508667, 7000.08708663, 7000.08908658,
       7000.09108653, 7000.09308649, 7000.09508644, 7000.09708639,
       7000.09908635, 7000.1010863 , 7000.10308625, 7000.1050862 ,
       7000.10708616, 7000.10908611, 7000.11108606, 7000.11308602,
       7000.11508597, 7000.11708592, 7000.11908588, 7000.12108583,
       7000.12308578, 7000.12508574, 7000.12708569, 7000.12908564])

# stream 2: timestamps for experiment triggers
streams[2]["time_stamps"][0:10] 
# output: 
array([6963.5034906 , 6964.75858024, 6991.82214163, 6992.82910679,
       6994.33896652, 6995.53916588, 7001.53491572, 7001.53492682,
       7002.53810648, 7004.04580771])

# stream 3: timestamps for Arduino data with sample freq = 45 Hz
streams[3]["time_stamps"][350:400] 
# output: 
array([6999.93935424, 6999.95185334, 6999.96437993, 6999.97685173,
       6999.98938732, 7000.00185342, 7000.01435351, 7000.02685491,
       7000.0393514, 7000.05185349, 7000.06436069, 7000.07685378,
       7000.08937208, 7000.10185407, 7000.11435287, 7000.12685416,
       7000.13935495, 7000.15185235, 7000.16435454, 7000.17685314,
       7000.18935483, 7000.20185523, 7000.21435302, 7000.22685212,
       7000.23935521, 7000.2518539 , 7000.2643561, 7000.27685459,
       7000.28935469, 7000.30185588, 7000.31436148, 7000.32685387,
       7000.33935346, 7000.35185446, 7000.36435425, 7000.37685245,
       7000.38935254, 7000.40185414, 7000.41435253, 7000.42685332,
       7000.43935442, 7000.45185551, 7000.46435401, 7000.476852,
       7000.489355, 7000.50185619, 7000.51435259, 7000.52685678,
       7000.53935587, 7000.55186167])

Hello @Merle and welcome to the forum!

Your observation is correct: when it comes to electrophysiological data, MNE works on the basis of samples, not time points. That said, I believe there’s an easy way for you to work with the data you have.

First, create a Raw object from the EEG data (for which you know the exact sampling frequency – 500 Hz). If you need help doing so, please ask.

Then, you can add the triggers and Arduino time stamps as annotations, which conveniently are time-based!

If at a later processing stage (e.g., if you wish to create epochs) you need (sample-based) events, simply use the events_from_annotations() function, and MNE will do its best to match the time points in the annotations with the nearest sample numbers in the EEG data.

Good luck, and don’t hesitate to ask if you get stuck!

Richard


Thanks so much @richard, that’s exactly what I needed! :smiley:


Hi @richard,

I’m afraid I got stuck! Could you maybe point me in the right direction?

I built annotation objects for my Arduino data and for the triggers, but when I try to add them to my Raw object containing the EEG data, I get the following error message:

TypeError: expected str, bytes or os.PathLike object, not NoneType

This is how I tried to add the annotations:
eeg_Raw.set_annotations(triggers_annot)

And this is what my annotations object looks like:

[screenshot of the annotations object]

I tried googling the error message but I don’t really get what happened here. Do you know what I have to do?

Thanks a lot in advance!

Merle

Hello, could you please share the code you’re using? It’s difficult to tell what’s going on just from the screenshot you shared. Thanks!

Sorry, sure! I read in the XDF files in a loop, then created a Raw Object for the EEG data and added annotations for Arduino (GSS) and Trigger data.

import pyxdf
import mne
import glob
import os
import numpy as np
import math
from itertools import chain

# --------------------------------------------------------

# working directory
os.chdir("/Users/merle/Desktop/Masterarbeit/Master_Testdaten/")

# get list of all xdf files in my directory 
file_list = glob.glob("/Users/merle/Desktop/Masterarbeit/Master_Testdaten/*.xdf")

# set number of subjects as number of xdf files in directory
subj_n = len(file_list)

# loop xdf file names in file_list aka participants:
for file_name in file_list:
    
    """ read in XDF data """
    streams, header = pyxdf.load_xdf(file_name)
    
    """ Build MNE data object from scratch """
    # stream 1: Actichamp - EEG data
    # stream 2: PsychoPyMarkers - Experiment markers
    # stream 3: Arduino - Grip strength sensor data    

    # each stream contains timestamps (measured in seconds)
    
    """ Create info for Raw Object for EEG data"""
    # Sampling rate: 500 Hz
    sampling_freq = float(streams[1]["info"]["nominal_srate"][0]) # in Hertz
    
    # name and classify channels
    ch_names = [f'EEG_{n:03}' for n in range(1, 129)]
    ch_types = ['eeg'] * 128 
    n_channels = len(ch_names)  # 128

    # combine information 
    info_eeg = mne.create_info(ch_names, 
                               ch_types = ch_types, 
                               sfreq = sampling_freq)
  
    # add name of the current dataset
    # (26 characters of the file name, ending just before the ".xdf" extension)
    info_eeg['description'] = file_name[-30:-4]
   
    # look at the info
    #print(info_eeg)
  
    """ Get EEG data for Raw object""" 
    
    # get EEG data from stream 1:
    # 128 arrays (1 for each electrode), 186013 sampling points
    data_eeg = np.array(streams[1]["time_series"].T) 

    # convert all values in data_eeg from microvolts to volts
    data_eeg *= 1e-6
  
    """ Create Raw object for EEG data""" 
    # combine info & eeg data
    eeg_Raw = mne.io.RawArray(data_eeg, info_eeg)

    """ Add Events & GSS data as Annotations to Raw Object"""
       
    # Set onset of EEG stream to None (that's the default anyway)
    # and subtract onset of EEG stream from GSS & Trigger data. 
    # This way the timestamps are relative to the EEG onset.
       
    # get difference between EEG onset and onset of Triggers
    eeg_onset = streams[1]["time_stamps"][0] 
    trigger_timestamps = streams[2]["time_stamps"] - eeg_onset
    gss_timestamps = streams[3]["time_stamps"] - eeg_onset
    
    # get names of triggers (it's a nested list in the xdf file)    
    trigger_descriptions = streams[2]["time_series"]
    # turn nested list into "normal" one dimensional list
    trigger_descriptions = list(chain.from_iterable(trigger_descriptions)) 
    
    # get gss values (it's a nested list in the xdf file)    
    gss_values = streams[3]["time_series"]
    # turn nested list into "normal" one dimensional list
    gss_values = list(chain.from_iterable(gss_values)) 

    # save trigger descriptions & their onsets as annotations for our Raw object
    triggers_annot = mne.Annotations(onset = trigger_timestamps, duration = .001, description = trigger_descriptions)
    
    # save GSS values & their onsets as annotations
    gss_annot = mne.Annotations(onset = gss_timestamps, duration = 0.001, description = gss_values) 
    
    # Add trigger and gss annotations to the Raw object that's already containing the EEG data
    #eeg_Raw.set_annotations(triggers_annot + gss_annot)
    eeg_Raw.set_annotations(triggers_annot)

I think the following code, with a few adaptations, could fix your issue.
Here is some code I wrote for a case where I saved both EEG data and some LSL string messages in an XDF file using LabRecorder.

Load the file using mnelab

import pyxdf
import pylsl
from mnelab.io import read_raw

xdf_filename = "path_to_your_datafile.xdf"

raw = read_raw(xdf_filename, stream_id=None)

Locate your streams

Basically, you have a continuous data stream and a non-continuous one that starts at the first stimulation. If I’m correct, both carry timestamps in seconds set by the LSL protocol, so you can’t just add the stimulations with their absolute timestamps. The idea here is to subtract the very first EEG timestamp from the timestamps of your stimulation stream, so that you get the position of each stimulus relative to sample 0. Afterwards it is convenient to turn those stimuli into MNE annotations, with onsets expressed in seconds.

First locate them.
Here you must change the match value for “type”, or match on the “name” of the stream instead.

eeg_stream = None
msg_stream = None

streams, header = pyxdf.load_xdf(xdf_filename)
# detect the EEG stream
states_list_eeg = pyxdf.match_streaminfos(pyxdf.resolve_streams(xdf_filename),
                                          [{"type": "EEG"}])
for states_stream_id in states_list_eeg:
    for stream in streams:
        if stream["info"]["stream_id"] == states_stream_id:
            eeg_stream = stream
            print('Found eeg stream {}'.format(states_stream_id))
            break
assert eeg_stream is not None, 'EEG stream not found'

# detect the message stream
states_list_ids = pyxdf.match_streaminfos(pyxdf.resolve_streams(xdf_filename),
                                          [{"type": "msg_states"}])
for states_stream_id in states_list_ids:
    for stream in streams:
        if stream["info"]["stream_id"] == states_stream_id:
            msg_stream = stream
            print('Found message stream {}'.format(states_stream_id))
            break
assert msg_stream is not None, 'LSL message stream not found in xdf file'

Locate the very first timestamp in the continuous EEG channel

first_samp = eeg_stream["time_stamps"][0]

Use it as a correction to get relative timestamps for the message channel

print('first time stamp correction {}'.format(first_samp))
onsets = msg_stream["time_stamps"] - first_samp
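
To make the correction concrete, here is a small sketch of the arithmetic in plain Python. The timestamps are copied from the snippets earlier in this thread, but the reference value is an assumption for illustration only: it is an EEG timestamp from the middle of the recording, whereas a real script would use the very first EEG timestamp. The helper `to_nearest_sample` is hypothetical, not part of MNE or pyxdf.

```python
sfreq = 500.0  # EEG sampling rate in Hz

# Assumed reference for illustration: streams[1]["time_stamps"][20900].
# In a real script this would be the first timestamp of the EEG stream.
eeg_reference = 6999.93109028

# Three trigger timestamps from streams[2]["time_stamps"].
trigger_times = [7001.53491572, 7002.53810648, 7004.04580771]


def to_nearest_sample(timestamp, reference, sfreq):
    """Return the index of the sample closest to `timestamp`."""
    return round((timestamp - reference) * sfreq)


samples = []
for t in trigger_times:
    onset = t - eeg_reference                     # seconds since the reference
    sample = to_nearest_sample(t, eeg_reference, sfreq)
    offset_ms = (onset - sample / sfreq) * 1000   # rounding error in milliseconds
    samples.append(sample)
    print(f"onset {onset:.6f} s -> sample {sample} ({offset_ms:+.3f} ms off)")
```

At 500 Hz the samples are 2 ms apart, so rounding to the nearest sample shifts an onset by at most 1 ms.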

Manually generate and add timestamped annotations to your RawArray

descriptions = [item for sub in msg_stream["time_series"] for item in sub]
raw.annotations.append(onsets, [0] * len(onsets), descriptions)

Plot your data

raw.plot()

As you can see, in my case this correctly synchronized annotations that I knew started at 1 s and 2 s.
Please tell me if you’re successful.

PS: please note that in my case events with a duration of 0 were ignored by MNE, so I’d recommend testing with [1] * len(onsets) as the durations argument when appending annotations.


Thanks for sharing the code. Could you please share the entire traceback (error message) also?

Sure, I hope this is what you mean?

eeg_Raw.set_annotations(triggers_annot)

Traceback (most recent call last):

  File "/Users/merle/opt/anaconda3/lib/python3.8/site-packages/IPython/core/formatters.py", line 345, in __call__
    return method()

  File "/Users/merle/opt/anaconda3/lib/python3.8/site-packages/mne/io/base.py", line 1763, in _repr_html_
    basenames = [os.path.basename(f) for f in self._filenames]

  File "/Users/merle/opt/anaconda3/lib/python3.8/site-packages/mne/io/base.py", line 1763, in <listcomp>
    basenames = [os.path.basename(f) for f in self._filenames]

  File "/Users/merle/opt/anaconda3/lib/python3.8/posixpath.py", line 142, in basename
    p = os.fspath(p)

TypeError: expected str, bytes or os.PathLike object, not NoneType

@Merle It happened to me a few weeks ago when making a mne.io.RawArray from a numpy array.
A quick fix would probably be the following:

eeg_Raw._filenames = ['real_filename_or_dummy_name', 'filename2_if_concatenated']
eeg_Raw.set_annotations(triggers_annot)
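
Judging from the traceback, the error is raised in `_repr_html_`, so when IPython tries to display the object returned by `set_annotations()`, not while the annotations are being set. Here is a minimal sketch of that mechanism, assuming a hypothetical stand-in class `FakeRaw` (this is not MNE code, it only mimics the one line shown in the traceback):

```python
import os


class FakeRaw:
    """Hypothetical stand-in for a Raw object built from an array."""

    def __init__(self, filenames):
        self._filenames = filenames
        self.annotations = None

    def set_annotations(self, annotations):
        self.annotations = annotations
        return self  # the returned object is what IPython then tries to display

    def _repr_html_(self):
        # Same expression as in the traceback; fails if an entry is None.
        basenames = [os.path.basename(f) for f in self._filenames]
        return ", ".join(basenames)


raw = FakeRaw([None])              # no file on disk, so the file name is None
raw.set_annotations(["trigger"])   # setting the annotations itself succeeds

try:
    raw._repr_html_()              # what the notebook/console display triggers
except TypeError as err:
    message = str(err)
    print(message)

raw._filenames = ["dummy_name"]    # the quick fix from above
html = raw._repr_html_()
print(html)
```

If that reading is right, assigning the result to a variable (for example `_ = eeg_Raw.set_annotations(triggers_annot)`) should also avoid the message, because nothing is displayed then.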

Interesting, this seems to be a bug in MNE then! Care to file a bug report on our GitHub issue tracker?

Aaah it works @lokinou! Thank you so much for your help, everyone! :heart_eyes: