.dat file is not read correctly

My data are in a .dat file. Opened in a text editor, the file shows two columns of values. I tried every reader available under mne.io.read_raw_*, and the only one that opens the file is read_raw_eximia, which is strange. I then plotted the result, and the plot is clearly wrong. The program also printed the following output:
chs: 3 Stimulus, 1 EOG, 60 EEG
highpass: 0.0 Hz
lowpass: 725.0 Hz
nchan: 64
sfreq: 1450.0 Hz
I doubt the file was read correctly. What do I need to do to read it properly and then plot it?
I am new to the MNE library, so I would appreciate any help. Thank you.

MNE version: 1.3.1
OS: Windows 10

@averinkit Hi,

Perhaps the solution in the link below can help?

You may have to use the mne.io.read_raw_persyst(fname, preload=False, verbose=None) function to read the data.

One other thing: please make sure that your file is not corrupt or broken. I would also try a second data set to see whether the issue can be replicated, just in case!

best,
Dip

Hello,
I have already tried opening the file with the Persyst reader, but, as in the problem you pointed to, it requires a .lay file. Unlike that situation, however, my file opens in a text editor. I will attach my file in both .dat and .txt format.
As you can see, the data do not seem to be corrupted.
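
A quick way to confirm that a file is plain text with numeric columns, rather than a binary recording, is to try parsing its first few lines. The sketch below is only an illustration: the helper name `peek_numeric_columns` is hypothetical (it is not part of MNE), and it assumes whitespace-separated values with no header line.

```python
def peek_numeric_columns(path, n_lines=5):
    """Return the column count if the first lines are purely numeric text, else None.

    Hypothetical helper for illustration; assumes whitespace-separated columns.
    """
    widths = set()
    try:
        with open(path, "r", encoding="utf-8") as fh:
            for _ in range(n_lines):
                line = fh.readline()
                if not line:
                    break  # reached the end of the file
                fields = line.split()
                if not fields:
                    continue  # ignore blank lines
                for field in fields:
                    float(field)  # raises ValueError for non-numeric text
                widths.add(len(fields))
    except ValueError:
        # Covers non-numeric fields and undecodable (binary) content,
        # because UnicodeDecodeError is a subclass of ValueError.
        return None
    # Only report a column count when every inspected line agrees.
    return widths.pop() if len(widths) == 1 else None


# Example call (the file name is the one used in this thread):
# n_cols = peek_numeric_columns("1_class_1.dat")
```

If this returns 2 for the shared file, that supports the view that it is a simple two-column text export and not a format that a dedicated MNE reader would understand.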

@averinkit Thanks for sharing the data.

I had a quick look. I am copying @richard’s comment from the other post:
In the Notes section for that reader, it’s stated that there also needs to be a .lay file. Since you don’t have one, I doubt that your data is in the Persyst format. .dat is just a very generic extension and could basically mean anything …

I have used the following code snippet to test your data.

# -*- coding: utf-8 -*-
"""
@author: diptyajit das <bmedasdiptyajit@gmail.com>
Created on April 28, 2023

comment: data recording is wrong!
Task: test EEG data file?
"""
print(__doc__)

# import packages
from mne.io import read_raw_eximia
import matplotlib.pyplot as plt


# file name
fname = '/home/dip_meg/Downloads/1_class_1.dat'

# read raw data (preload=True loads it into memory, which filtering requires)
raw = read_raw_eximia(fname, preload=True)
print('original raw info', raw.info)

# down sample the data to 1000 Hz
raw_resamp = raw.copy().resample(sfreq=1000)
raw_resamp.filter(1., 40.)  # filter the data from 1-40 Hz
raw_resamp.plot()
print('downsampled raw info', raw_resamp.info)

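# take only the first 1000 time points
times = raw_resamp.times[:1000]  # raw.times is in seconds
data = raw_resamp.get_data()[:, :1000]  # shape (64 channels, 1000 samples), i.e. 1 s at 1000 Hz

# plot the data for testing
# get_data() returns EEG in volts, so scale by 1e6 to show microvolts
# (the stimulus channels are not voltages, so their scale is not meaningful here)
plt.plot(times, data.T * 1e6)
plt.xlabel('time (s)')
plt.ylabel('amplitude (µV)')
plt.show()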

Test outputs:

original raw info <Info | 7 non-empty values
 bads: []
 ch_names: GateIn, Trig1, Trig2, EOG, Fp1, Fpz, Fp2, AF1, AFz, AF2, F7, F3, ...
 chs: 3 Stimulus, 1 EOG, 60 EEG
 custom_ref_applied: False
 highpass: 0.0 Hz
 lowpass: 725.0 Hz
 meas_date: unspecified
 nchan: 64
 projs: []
 sfreq: 1450.0 Hz
>
Trigger channel has a non-zero initial value of 12 (consider using initial_event=True to detect this event)
Trigger channel has a non-zero initial value of 18 (consider using initial_event=True to detect this event)
10 events found
Event IDs: [19 20 21 22]
Trigger channel has a non-zero initial value of 18 (consider using initial_event=True to detect this event)
471 events found
Event IDs: [19 20 21 22]
Trigger channel has a non-zero initial value of 12 (consider using initial_event=True to detect this event)
Trigger channel has a non-zero initial value of 18 (consider using initial_event=True to detect this event)
10 events found
Event IDs: [19 20 21 22]
Trigger channel has a non-zero initial value of 18 (consider using initial_event=True to detect this event)
464 events found
Event IDs: [19 20 21 22]
Filtering raw data in 1 contiguous segment
Setting up band-pass filter from 1 - 40 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 1.00
- Lower transition bandwidth: 1.00 Hz (-6 dB cutoff frequency: 0.50 Hz)
- Upper passband edge: 40.00 Hz
- Upper transition bandwidth: 10.00 Hz (-6 dB cutoff frequency: 45.00 Hz)
- Filter length: 3301 samples (3.301 sec)

[Parallel(n_jobs=1)]: Using backend SequentialBackend with 1 concurrent workers.
[Parallel(n_jobs=1)]: Done   1 out of   1 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   2 out of   2 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   3 out of   3 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done   4 out of   4 | elapsed:    0.0s remaining:    0.0s
[Parallel(n_jobs=1)]: Done  60 out of  60 | elapsed:    0.2s finished
downsampled raw info <Info | 7 non-empty values
 bads: []
 ch_names: GateIn, Trig1, Trig2, EOG, Fp1, Fpz, Fp2, AF1, AFz, AF2, F7, F3, ...
 chs: 3 Stimulus, 1 EOG, 60 EEG
 custom_ref_applied: False
 highpass: 1.0 Hz
 lowpass: 40.0 Hz
 meas_date: unspecified
 nchan: 64
 projs: []
 sfreq: 1000.0 Hz
>
Channels marked as bad:
none

Test results: [figures: original, additional]

Comment:
It seems to me that there are some technical issues with your data; this could be due to faulty connections between the electrodes and the amplifier of the EEG device. Most of the channels are flat or not recorded. However, I am not familiar with this particular file extension (*.dat). @richard, can you please follow up and check whether the data need specific headers in order to be read correctly?
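
The "flat channels" impression can be put into numbers by looking at the peak-to-peak range of each channel. This is a minimal sketch on synthetic data: the helper `find_flat_channels`, the channel subset, and the threshold are illustrative assumptions, and flat traces may equally mean that the file was parsed with the wrong reader rather than that the recording failed.

```python
import numpy as np


def find_flat_channels(data, ch_names, min_ptp):
    """Return names of channels whose peak-to-peak range is below min_ptp.

    Hypothetical helper; `data` has shape (n_channels, n_samples).
    """
    ptp = np.ptp(data, axis=1)  # max minus min, computed per channel
    return [name for name, value in zip(ch_names, ptp) if value < min_ptp]


# Synthetic stand-in for raw.get_data(): three channels, one of them flat.
rng = np.random.default_rng(0)
data = rng.normal(scale=1e-5, size=(3, 500))
data[1] = 0.0  # simulate a channel that carries no signal

# The threshold is an arbitrary choice for this example.
flat = find_flat_channels(data, ["Fp1", "Fpz", "Fp2"], min_ptp=1e-7)
print(flat)  # ['Fpz']
```

With a real recording, `data` and `ch_names` would come from `raw.get_data()` and `raw.ch_names`.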

best,
Dip


Do you have any information about the origin of this data file? Like what system it was recorded from? The .dat file you shared appears to have a single time series (first column time?, second column value) and no information about sensor name, position, units, etc. A file like this can be read into Python like so:

import numpy as np
import matplotlib.pyplot as plt

data = np.loadtxt("1_class_1.dat")  # two columns: time and value
data = data.T
plt.plot(*data)
plt.show()


…and in theory you could then create an MNE object from it using RawArray:

import mne
times = data[0]
values = data[1]
sfreq = 1 / np.diff(times).mean()  # 250.0000000000236, maybe round this to 250?
info = mne.create_info(['mystery channel'], sfreq=sfreq)
raw = mne.io.RawArray(data=np.atleast_2d(values), info=info)

…but again, without knowing at least the units of measurement it’s hard to know what you might do with that.
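
Before building a RawArray from such a file, it may help to check that the time column is regularly sampled and to decide on a unit conversion. The sketch below uses synthetic data in place of the real file. The helper `check_sampling` is hypothetical, and treating the values as microvolts is purely an assumption for illustration; the thread does not establish the actual units. What is certain is that MNE stores EEG data in volts.

```python
import numpy as np


def check_sampling(times, rtol=1e-3):
    """Estimate the sampling rate and report whether the time steps are uniform.

    Hypothetical helper; `times` is a 1-D array of time stamps in seconds.
    """
    steps = np.diff(times)
    sfreq = 1.0 / steps.mean()
    is_regular = bool(np.allclose(steps, steps.mean(), rtol=rtol, atol=0.0))
    return sfreq, is_regular


# Synthetic stand-in for the two columns of the file (250 Hz, made-up values).
times = np.arange(1000) / 250.0
values_uv = 20.0 * np.sin(2 * np.pi * 10 * times)  # ASSUMED to be microvolts

sfreq, is_regular = check_sampling(times)
if is_regular:
    sfreq = float(round(sfreq))  # e.g. 250.0000000000236 becomes 250.0

# MNE expects EEG in volts, so microvolt values would be scaled by 1e-6.
values_v = values_uv * 1e-6
```

If the unit assumption holds, `values_v` and `sfreq` could then be passed to `mne.create_info` and `mne.io.RawArray` as in the snippet above, with `ch_types='eeg'` instead of the default channel type.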
