Hello!
I am trying to learn how to create scripts that I could use to analyze EMG data for a future student research. This was my first time, so I used a lot of chatgpt to help me solve my problems. I had some problems making pycharm read it as 2 collumns, so that’s why the really complicated start.
Well, I finished it, but I am not sure it shows anything correctly, the whole thing seems odd. I think I failed badly.
Here is my code:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import mne
# Load the data without headers to inspect it
data = pd.read_csv(r'D:\Pycharm Projects\Neurophysiological Workspace\EMG\emg_healthy.txt', header=None,
encoding='latin1')
# Try reading the file with different delimiters
delimiters = ['\t', ' ', ',', ';']
for delim in delimiters:
try:
data = pd.read_csv(r'D:\Pycharm Projects\Neurophysiological Workspace\EMG\emg_healthy.txt', delimiter=delim,
header=None, encoding='latin1')
if data.shape[1] == 2: # Check if the data is split into two columns
break
except Exception as e:
print(f"Failed with delimiter '{delim}': {e}")
# Assign column names if data is split into two columns
if data.shape[1] == 2:
data.columns = ['Time', 'EMG']
else:
print("Data is not split into two columns. Please check the delimiter and data file.")
# Clean the EMG data column
data['EMG'] = pd.to_numeric(data['EMG'], errors='coerce') # Convert non-numeric values to NaN
data = data.dropna(subset=['EMG']) # Drop rows with NaN in 'EMG'
# Check the cleaned data
print(data.head())
print(data.info())
######################################
# Create MNE Info object
info = mne.create_info(ch_names=['EMG'], sfreq=1000, ch_types='eeg') # Adjust sfreq (sampling frequency) accordingly
# Create MNE Raw object
raw = mne.io.RawArray(np.array([data['EMG'].values]), info)
# Inspect the Raw object
print(raw.info)
################################
########FILTERING####
# Apply a band-pass filter to the EMG data
raw.filter(l_freq=20, h_freq=450)
# Extract the cleaned EMG signal
emg_cleaned = raw.get_data()[0]
# Plot both raw and cleaned EMG signals for comparison
plt.figure(figsize=(14, 6))
plt.subplot(2, 1, 1)
plt.plot(data['Time'], data['EMG'], label='Raw EMG', color='blue')
plt.title('Raw EMG Signal')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()
plt.subplot(2, 1, 2)
plt.plot(data['Time'], emg_cleaned[:len(data['Time'])], label='Cleaned EMG', color='orange')
plt.title('Cleaned EMG Signal')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()
plt.tight_layout()
plt.show()
###########################
#Event detection
# Example: Manual threshold-based event detection for EMG data (Example threshold 0.1)
threshold = 0.1
events = np.where(emg_cleaned > threshold)[0]
print(f"Number of events detected: {len(events)}")
# Plot the cleaned EMG signal with detected events
plt.figure(figsize=(12, 6))
plt.plot(data['Time'], emg_cleaned[:len(data['Time'])], label='Cleaned EMG', color='orange')
plt.vlines(data['Time'].iloc[events], ymin=min(emg_cleaned), ymax=max(emg_cleaned), colors='r', linestyles='dashed', label='Events')
plt.title('Cleaned EMG Signal with Detected Events')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()
plt.show()
########################################
#Statistical analysis
# Example: Extract segments around the first detected event
if len(events) > 0:
event_index = events[0] # Using the first event for demonstration
segment_before = emg_cleaned[event_index-100:event_index]
segment_after = emg_cleaned[event_index:event_index+100]
# Plot the segments
plt.figure(figsize=(12, 6))
plt.plot(data['Time'].iloc[event_index-100:event_index], segment_before, label='Before Event', color='blue')
plt.plot(data['Time'].iloc[event_index:event_index+100], segment_after, label='After Event', color='orange')
plt.axvline(x=data['Time'].iloc[event_index], color='red', linestyle='--', label='Event')
plt.title('EMG Signal Segments Around an Event')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()
plt.show()
from scipy import stats
# Ensure the segments have enough data points
if len(segment_before) > 1 and len(segment_after) > 1:
t_stat, p_value = stats.ttest_ind(segment_before, segment_after)
print(f"T-statistic: {t_stat}, P-value: {p_value}")
else:
print("Not enough data to perform statistical test.")
##########################################
#Feature Extraction
import numpy as np
# Compute some basic features
mean_emg = np.mean(emg_cleaned)
std_emg = np.std(emg_cleaned)
max_emg = np.max(emg_cleaned)
min_emg = np.min(emg_cleaned)
# Print extracted features
print(f"Mean EMG: {mean_emg}")
print(f"Standard Deviation of EMG: {std_emg}")
print(f"Maximum EMG: {max_emg}")
print(f"Minimum EMG: {min_emg}")
# Print the detected event timings and count
print("Detected event timings (in samples):")
print(events)
And here are the screenshots of some of the results it gave me. I just find it totally out of place, even though I don’t really understand it yet to know.
And this is the statistical result:
Number of events detected: 1544
Not enough data to perform statistical test.
Mean EMG: -4.744358267801703e-07
Standard Deviation of EMG: 0.0547782648369939
Maximum EMG: 0.8504640036288111
Minimum EMG: -0.3843236059195138
Detected event timings (in samples):
[ 95 99 100 ... 50579 50580 50710]
So this is just an introduction to how much of a failure this attempt was.
If you’ve analyzed this up to here, I would really like to ask you, if you could give me some tips where to start and some work order, so I don’t get lost like this time. Thank you very much!