Getting Started¶
This guide walks you through installing ECGDataKit, parsing your first ECG file, processing signals, and creating visualizations.
Installation¶
From PyPI¶
pip install ecgdatakit
From GitHub (latest stable version)¶
pip install "ecgdatakit @ git+https://github.com/UMMISCO/ECGDataKit.git#subdirectory=lib"
From GitLab (development version)¶
pip install "ecgdatakit @ git+https://git.ummisco.fr/open/ecgdatakit.git#subdirectory=lib"
From source (local clone)¶
# GitHub (stable)
git clone https://github.com/UMMISCO/ECGDataKit.git
# GitLab (development)
git clone https://git.ummisco.fr/open/ecgdatakit.git
cd ECGDataKit/lib
pip install .
Optional extras¶
ECGDataKit ships several optional dependency groups. Install only what you need:
Extra |
Command |
Description |
|---|---|---|
|
|
Signal filtering, peak detection, HRV analysis |
|
|
Static Matplotlib-based plots |
|
|
Interactive Plotly-based plots |
|
|
ISHNE Holter format CRC validation |
|
|
DICOM waveform parsing via pydicom |
|
|
BioSPPy + NeuroKit2 ECG cleaning backends |
|
|
DeepFADE denoising autoencoder (torch) |
|
|
Everything above (except torch) |
Parsing an ECG file¶
FileParser auto-detects the file format and returns a unified ECGRecord object.
from ecgdatakit.parsing import FileParser
record = FileParser().parse("path/to/ecg_file.xml")
Patient information¶
patient = record.patient
print(patient.first_name) # First name
print(patient.last_name) # Last name
print(patient.patient_id) # Patient ID
print(patient.birth_date) # Date of birth
print(patient.sex) # Sex ("M", "F", or "U")
Recording information¶
recording = record.recording
print(recording.date) # Acquisition date
print(recording.acquisition.signal.sampling_rate) # Sampling frequency in Hz
print(recording.duration) # Duration as timedelta
Lead data¶
leads = record.leads
for lead in leads:
print(lead.label, lead.samples[:5])
Measurements and device info¶
# Automated measurements (if present in the file)
measurements = record.measurements
print(measurements.heart_rate)
print(measurements.pr_interval)
print(measurements.qrs_duration)
# Device that recorded the ECG
device = record.device
print(device.manufacturer)
print(device.model)
Serialization¶
# Export to JSON string
json_str = record.to_json()
# Export to Python dict
data = record.to_dict()
Processing signals¶
The ecgdatakit.processing module provides filtering, peak detection, heart-rate analysis, HRV metrics, signal quality assessment, and lead derivation.
Filtering¶
from ecgdatakit.processing import diagnostic_filter, remove_baseline
lead = record.leads[0] # Lead object from a parsed record
# Apply a diagnostic-grade bandpass filter (0.05 - 150 Hz)
filtered = diagnostic_filter(lead)
# Remove baseline wander
corrected = remove_baseline(lead)
You can also pass numpy arrays directly with fs=:
filtered = diagnostic_filter(my_numpy_array, fs=500)
Peak detection and heart rate¶
from ecgdatakit.processing import detect_r_peaks, heart_rate, rr_intervals
# Detect R-peaks (returns a numpy array of sample indices)
peaks = detect_r_peaks(filtered)
# Compute heart rate in BPM
hr = heart_rate(filtered, peaks)
# Get RR intervals in milliseconds
rr = rr_intervals(filtered, peaks)
Heart-rate variability (HRV)¶
from ecgdatakit.processing import time_domain
# Time-domain HRV metrics (SDNN, RMSSD, pNN50, etc.)
hrv = time_domain(rr)
print(hrv)
Signal quality¶
from ecgdatakit.processing import signal_quality_index
sqi = signal_quality_index(filtered)
print(f"Signal quality: {sqi}")
Lead derivation¶
from ecgdatakit.processing import derive_augmented, find_lead
lead_i = find_lead(record.leads, "I")
lead_ii = find_lead(record.leads, "II")
# Derive augmented limb leads (aVR, aVL, aVF) from I and II
avr, avl, avf = derive_augmented(lead_i, lead_ii)
ECG cleaning¶
from ecgdatakit.processing import clean_ecg
# Built-in cleaning (bandpass + notch, no extra deps)
cleaned = clean_ecg(lead)
# NeuroKit2 backend (pip install ecgdatakit[cleaning])
cleaned = clean_ecg(lead, method="neurokit2")
Neural-net denoising¶
DeepFADE is a denoising autoencoder developed as part of ECGDataKit. It uses a symmetric DenseNet encoder-decoder architecture trained on a large private ECG database with extensive noise augmentations. Pre-trained weights are bundled with the package.
from ecgdatakit.processing import clean_ecg
# DeepFADE denoising autoencoder (pip install ecgdatakit[denoising])
denoised = clean_ecg(lead, method="deepfade")
# Use MPS acceleration on Apple Silicon
denoised = clean_ecg(lead, method="deepfade", device="mps")
Visualizing¶
ECGDataKit provides both static (Matplotlib) and interactive (Plotly) plotting functions.
Static plots¶
Static plots auto-display by default. Pass show=False to get the figure without displaying.
from ecgdatakit.plotting import plot_12lead, plot_peaks, plot_lead, plot_hrv_summary
# Displays automatically
plot_12lead(record)
plot_peaks(filtered, peaks)
plot_hrv_summary(rr)
# Suppress display to save to file
fig = plot_12lead(record, show=False)
fig.savefig("ecg_12lead.png", dpi=150)
# Use sample indices instead of time on x-axis
plot_lead(filtered, x_axis="samples")
Interactive plots¶
from ecgdatakit.plotting import iplot_lead, iplot_12lead
# Interactive single-lead viewer with zoom and pan
iplot_lead(filtered, peaks)
# Interactive 12-lead viewer
iplot_12lead(record)
Batch processing¶
Parse multiple files in parallel with parse_batch:
from ecgdatakit.parsing import parse_batch
from pathlib import Path
files = list(Path("data/").glob("*.xml"))
records = parse_batch(files, max_workers=4)
for rec in records:
print(rec.patient.first_name, rec.recording.date)
Adding a new parser¶
To add support for a new ECG format:
Create a new file in
src/ecgdatakit/parsing/parsers/.Subclass
Parserand implement two methods:can_parseandparse.
from ecgdatakit.parsing.parsers.base import Parser
from ecgdatakit.parsing.models import ECGRecord
class MyFormatParser(Parser):
"""Parser for the MyFormat ECG file type."""
def can_parse(self, path: str) -> bool:
"""Return True if this parser can handle the given file."""
with open(path, "rb") as f:
magic = f.read(4)
return magic == b"MYFT"
def parse(self, path: str) -> ECGRecord:
"""Parse the file and return an ECGRecord."""
# Read and decode the file
# Build and return an ECGRecord
...
The FileParser will automatically discover your new parser and try it during format detection.