First try at making a script for EMGs

Hello!
I am trying to learn how to create scripts that I could use to analyze EMG data for a future student research. This was my first time, so I used a lot of chatgpt to help me solve my problems. I had some problems making pycharm read it as 2 collumns, so that’s why the really complicated start.

Well, I finished it, but I am not sure it shows anything correctly, the whole thing seems odd. I think I failed badly.

Here is my code:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import mne

# Load the data without headers to inspect it
data = pd.read_csv(r'D:\Pycharm Projects\Neurophysiological Workspace\EMG\emg_healthy.txt', header=None,
                   encoding='latin1')

# Try reading the file with different delimiters
delimiters = ['\t', ' ', ',', ';']
for delim in delimiters:
    try:
        data = pd.read_csv(r'D:\Pycharm Projects\Neurophysiological Workspace\EMG\emg_healthy.txt', delimiter=delim,
                           header=None, encoding='latin1')
        if data.shape[1] == 2:  # Check if the data is split into two columns
            break
    except Exception as e:
        print(f"Failed with delimiter '{delim}': {e}")

# Assign column names if data is split into two columns
if data.shape[1] == 2:
    data.columns = ['Time', 'EMG']
else:
    print("Data is not split into two columns. Please check the delimiter and data file.")

# Clean the EMG data column
data['EMG'] = pd.to_numeric(data['EMG'], errors='coerce')  # Convert non-numeric values to NaN
data = data.dropna(subset=['EMG'])  # Drop rows with NaN in 'EMG'

# Check the cleaned data
print(data.head())
print(data.info())

######################################
# Create MNE Info object
info = mne.create_info(ch_names=['EMG'], sfreq=1000, ch_types='eeg')  # Adjust sfreq (sampling frequency) accordingly

# Create MNE Raw object
raw = mne.io.RawArray(np.array([data['EMG'].values]), info)

# Inspect the Raw object
print(raw.info)

################################
########FILTERING####
# Apply a band-pass filter to the EMG data
raw.filter(l_freq=20, h_freq=450)

# Extract the cleaned EMG signal
emg_cleaned = raw.get_data()[0]

# Plot both raw and cleaned EMG signals for comparison
plt.figure(figsize=(14, 6))
plt.subplot(2, 1, 1)
plt.plot(data['Time'], data['EMG'], label='Raw EMG', color='blue')
plt.title('Raw EMG Signal')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()

plt.subplot(2, 1, 2)
plt.plot(data['Time'], emg_cleaned[:len(data['Time'])], label='Cleaned EMG', color='orange')
plt.title('Cleaned EMG Signal')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()

plt.tight_layout()
plt.show()
###########################
#Event detection
# Example: Manual threshold-based event detection for EMG data (Example threshold 0.1)
threshold = 0.1
events = np.where(emg_cleaned > threshold)[0]
print(f"Number of events detected: {len(events)}")

# Plot the cleaned EMG signal with detected events
plt.figure(figsize=(12, 6))
plt.plot(data['Time'], emg_cleaned[:len(data['Time'])], label='Cleaned EMG', color='orange')
plt.vlines(data['Time'].iloc[events], ymin=min(emg_cleaned), ymax=max(emg_cleaned), colors='r', linestyles='dashed', label='Events')
plt.title('Cleaned EMG Signal with Detected Events')
plt.xlabel('Time')
plt.ylabel('EMG')
plt.legend()
plt.show()

########################################
#Statistical analysis
# Example: Extract segments around the first detected event
if len(events) > 0:
    event_index = events[0]  # Using the first event for demonstration
    segment_before = emg_cleaned[event_index-100:event_index]
    segment_after = emg_cleaned[event_index:event_index+100]

    # Plot the segments
    plt.figure(figsize=(12, 6))
    plt.plot(data['Time'].iloc[event_index-100:event_index], segment_before, label='Before Event', color='blue')
    plt.plot(data['Time'].iloc[event_index:event_index+100], segment_after, label='After Event', color='orange')
    plt.axvline(x=data['Time'].iloc[event_index], color='red', linestyle='--', label='Event')
    plt.title('EMG Signal Segments Around an Event')
    plt.xlabel('Time')
    plt.ylabel('EMG')
    plt.legend()
    plt.show()

from scipy import stats

# Ensure the segments have enough data points
if len(segment_before) > 1 and len(segment_after) > 1:
    t_stat, p_value = stats.ttest_ind(segment_before, segment_after)
    print(f"T-statistic: {t_stat}, P-value: {p_value}")
else:
    print("Not enough data to perform statistical test.")

##########################################
#Feature Extraction

import numpy as np

# Compute some basic features
mean_emg = np.mean(emg_cleaned)
std_emg = np.std(emg_cleaned)
max_emg = np.max(emg_cleaned)
min_emg = np.min(emg_cleaned)

# Print extracted features
print(f"Mean EMG: {mean_emg}")
print(f"Standard Deviation of EMG: {std_emg}")
print(f"Maximum EMG: {max_emg}")
print(f"Minimum EMG: {min_emg}")

# Print the detected event timings and count
print("Detected event timings (in samples):")
print(events)

And here are the screenshots of some of the results it gave me. I just find it totally out of place, even though I don’t really understand it yet to know.

And this is the statistical result:

Number of events detected: 1544
Not enough data to perform statistical test.
Mean EMG: -4.744358267801703e-07
Standard Deviation of EMG: 0.0547782648369939
Maximum EMG: 0.8504640036288111
Minimum EMG: -0.3843236059195138
Detected event timings (in samples):
[   95    99   100 ... 50579 50580 50710]

So this is just an introduction to how much of a failure this attempt was.
If you’ve analyzed this up to here, I would really like to ask you, if you could give me some tips where to start and some work order, so I don’t get lost like this time. Thank you very much!