Extremely Large Numbers in Pandas Dataframe Output

Hi there,

I’m totally new to analyzing electrophysiological data and I’m just trying to wrap my head around how best to examine it for my purposes.

I’ve got resting state data from depth electrodes (sEEG) and I’m hoping to break it up into events based on when a specific channel enters a specific frequency. I’m not entirely sure how to do this using MNE but the dataframes option seemed like a good start.

The problem is that my dataframe numbers don’t make sense. Most of the outputs for each channel at any given time frame are in the millions, and according to the Algorithms MNE documentation page (Algorithms and other implementation details — MNE 1.6.0.dev195+gfa0e8cfc4 documentation), the units in the dataframe are supposed to be recorded in volts.

I have two questions: first, what are the numbers in this dataframe supposed to represent if not volts? Second, is there an easier way to create events based on frequency activity through the tools that MNE already has?

Thanks again!

  • MNE version 1.4.2

I’m not sure if exporting the data to a data frame is the best way to do analysis in MNE-Python. If you are new, I recommend going through some tutorials to get a feeling for how things are typically done.

In any case, can you show some entries of the data frame? You are correct that the values should be in volts, so with EEG you can expect numbers like -2.233259463389e-05. Depth electrodes should result in signals that are a little larger, but definitely not millions of volts.

Hi there,

I understand that working through the dataframe is probably not the best way to analyze the data. However, based on what I’ve seen so far, there doesn’t seem to be a way to generate events based on the frequency of specific electrodes using the MNE analysis tools. If I am wrong about this, please correct me.

As to the problem with the dataframe, here is an example snapshot of what I am getting when I convert the raw data into a dataframe.

As you can see, the numbers are extremely large for all of the electrodes. This is true across sessions and across participants. Any information on what might be going on here would be greatly appreciated!

Maybe something went wrong during import. Can you share your code that leads to this dataframe? Usually, the problem is that your data are stored as µV, but MNE thinks it’s in V. This would explain the values.

events based on the frequency of specific electrodes

This is very vague. How exactly would you define your event onset?

For the unit, MNE expects data to be in SI units (volts for EEG), but in the end it’s up to the user to provide a correctly formatted file and/or to create a Raw object with the correct unit. Some manufacturers/systems export to a “standard” format without respecting the unit or certain fields, and MNE is then unable to correctly parse/load this information.

You can correct the scaling by calling raw.apply_function(lambda x: x*1e-6, channel_wise=False); but you need to know the scaling factor (1e-6 in my example).

How did you load this dataset, from which system was it recorded and in which format?

Note also that raw.to_data_frame has a scalings argument which defaults to dict(eeg=1e6, mag=1e15, grad=1e13), converting EEG data from volts to µV (or at least applying a 1e6 scaling factor on export to dataframe). Can you share the code snippet which generates the dataframe?

To your first question, ideally, the event onset should be a transition into a specific bandwidth of a specific electrode. Event offset would be a transition out of this bandwidth. I may be going at this completely wrong (again, I’m totally new at this), but my ultimate goal is to determine what is occurring in a subset of my electrodes during the events in which a specific electrode is in a specific frequency (if this makes sense). If there are tools in MNE to tackle this question, I would love to explore them.

As for the dataframe, here is the code I have used to generate it. Let me know if anything looks off:

import mne

# -- Establishing the Raw Object -- #
# preload=True copies the data to RAM and allows operations like filtering
raw = mne.io.read_raw_edf(file_name, preload=True)  # Raw data file
start, stop = pull_start_stop(sub, mov1_trig)       # Time of onset/end for current subject; see sub_channel_lists for timings/events
raw.crop(tmin=start, tmax=stop)                     # Crop raw file to event timings

# Saving the Pandas Dataframe
df = raw.to_data_frame(start=start, stop=stop)
df.to_csv(sub + 'test.csv')

Thanks again for taking the time to sort this out!

event onset should be a transition into a specific bandwidth of a specific electrode.

This is still very vague. How do you define the transition within the bandwidth?

If I reformulate/extrapolate, we could define the onset as the moment when a given electrode’s dominant power band changes from one frequency range to another, e.g. when this electrode’s dominant band power changes from delta (1, 4) Hz to alpha (8, 13) Hz. I’m not sure how this kind of onset/event would be useful for analyzing a resting-state recording.

Note that this kind of analysis requires an estimate of the frequency content over time, i.e. a time-frequency representation (TFR). You can find functions to compute the TFR here:

Even with those, you still won’t get a ‘precise’ onset, as a time-frequency representation sacrifices time resolution in favor of frequency resolution (bandwidth) and vice versa.

It’s not trivial to compute, so I’m a bit surprised you want to convert the raw object to a DataFrame and then implement those methods manually.

With the code snippet you shared, raw.to_data_frame applied a 1e6 scaling factor to export a DataFrame in microvolts. The values you got are obviously not plausible microvolts, so the data in the EDF file you load are not in volts. You should first figure out which scaling needs to be applied here and then use raw.apply_function(...) to correct the scaling.
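One rough way to narrow down the scaling is to look at what the typical amplitude would be, in volts, under each candidate unit and see which one is physiologically plausible (tens of microvolts, i.e. around 1e-5 V, for EEG, somewhat larger for depth electrodes). The helper below is a hypothetical heuristic, not an MNE function, and the random array stands in for what raw.get_data() would return; the recording system's documentation remains the authoritative source for the unit.

```python
import numpy as np

# Candidate interpretations of the stored numbers (assumed list, extend as needed).
CANDIDATE_UNITS = {"V": 1.0, "mV": 1e-3, "uV": 1e-6, "nV": 1e-9}


def typical_amplitude_in_volts(data):
    """Return the median absolute deviation in volts for each candidate unit."""
    data = np.asarray(data, dtype=float)
    mad = np.median(np.abs(data - np.median(data)))
    return {unit: mad * factor for unit, factor in CANDIDATE_UNITS.items()}


# Hypothetical values standing in for the output of raw.get_data().
rng = np.random.default_rng(0)
loaded = 50.0 * rng.standard_normal((4, 2000))

amplitudes = typical_amplitude_in_volts(loaded)
for unit, amp in amplitudes.items():
    print(f"if stored as {unit}: typical amplitude {amp:.2e} V")
```

For this made-up array, only the microvolt interpretation lands in the tens-of-microvolts range, which would point to a 1e-6 factor for raw.apply_function.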