Getting Started

This guide walks you through installing ECGDataKit, parsing your first ECG file, processing signals, and creating visualizations.

Installation

Install the core library with pip:

pip install .

ECGDataKit ships several optional dependency groups. Install only what you need:

ExtraCommandDescription
processingpip install ".[processing]"Signal filtering, peak detection, HRV analysis
plottingpip install ".[plotting]"Static Matplotlib-based plots
plotting-interactivepip install ".[plotting-interactive]"Interactive Plotly-based plots
holterpip install ".[holter]"ISHNE Holter format CRC validation
dicompip install ".[dicom]"DICOM waveform parsing via pydicom
cleaningpip install ".[cleaning]"BioSPPy + NeuroKit2 ECG cleaning backends
denoisingpip install ".[denoising]"DeepFADE neural-net denoiser (torch)
allpip install ".[all]"Everything above (except torch)

Parsing an ECG file

FileParser auto-detects the file format and returns a unified ECGRecord object.

from ecgdatakit.parsing import FileParser

record = FileParser().parse("path/to/ecg_file.xml")

Patient information

patient = record.patient
print(patient.name)        # Patient name
print(patient.id)          # Patient ID
print(patient.birth_date)  # Date of birth
print(patient.sex)         # Sex

Recording information

recording = record.recording
print(recording.date)         # Acquisition date
print(recording.sample_rate)  # Sampling frequency in Hz
print(recording.duration)     # Duration in seconds

Lead data

leads = record.leads
for lead in leads:
    print(lead.name, lead.samples[:5])

Measurements and device info

# Automated measurements (if present in the file)
measurements = record.measurements
print(measurements.heart_rate)
print(measurements.pr_interval)
print(measurements.qrs_duration)

# Device that recorded the ECG
device = record.device
print(device.manufacturer)
print(device.model)

Serialization

# Export to JSON string
json_str = record.to_json()

# Export to Python dict
data = record.to_dict()

Processing signals

The ecgdatakit.processing module provides filtering, peak detection, heart-rate analysis, HRV metrics, signal quality assessment, and lead derivation.

Filtering

from ecgdatakit.processing import diagnostic_filter, remove_baseline

# Apply a diagnostic-grade bandpass filter (0.05 - 150 Hz)
filtered = diagnostic_filter(lead.samples, fs=recording.sample_rate)

# Remove baseline wander
corrected = remove_baseline(lead.samples, fs=recording.sample_rate)

Peak detection and heart rate

from ecgdatakit.processing import detect_r_peaks, heart_rate, rr_intervals

# Detect R-peaks
peaks = detect_r_peaks(filtered, fs=recording.sample_rate)

# Compute heart rate in BPM
hr = heart_rate(peaks, fs=recording.sample_rate)

# Get RR intervals in milliseconds
rr = rr_intervals(peaks, fs=recording.sample_rate)

Heart-rate variability (HRV)

from ecgdatakit.processing import time_domain

# Time-domain HRV metrics (SDNN, RMSSD, pNN50, etc.)
hrv = time_domain(rr)
print(hrv)

Signal quality

from ecgdatakit.processing import signal_quality_index

sqi = signal_quality_index(filtered, fs=recording.sample_rate)
print(f"Signal quality: {sqi}")

Lead derivation

from ecgdatakit.processing import derive_augmented

# Derive augmented limb leads (aVR, aVL, aVF) from I and II
augmented = derive_augmented(lead_I.samples, lead_II.samples)

ECG cleaning

from ecgdatakit.processing import clean_ecg

# Built-in cleaning (bandpass + notch, no extra deps)
cleaned = clean_ecg(lead)

# NeuroKit2 backend (pip install ecgdatakit[cleaning])
cleaned = clean_ecg(lead, method="neurokit2")

Neural-net denoising

from ecgdatakit.processing import clean_ecg

# DeepFADE DenseNet denoiser (pip install ecgdatakit[denoising])
denoised = clean_ecg(lead, method="deepfade")

# Use MPS acceleration on Apple Silicon
denoised = clean_ecg(lead, method="deepfade", device="mps")

Visualizing

ECGDataKit provides both static (Matplotlib) and interactive (Plotly) plotting functions.

Static plots

from ecgdatakit.plotting import plot_12lead, plot_peaks, plot_hrv_summary

# Standard 12-lead ECG layout
plot_12lead(record)

# Overlay detected R-peaks on a filtered signal
plot_peaks(filtered, peaks)

# HRV summary plot (Poincare, histogram, tachogram)
plot_hrv_summary(rr)

Interactive plots

from ecgdatakit.plotting import iplot_lead, iplot_12lead

# Interactive single-lead viewer with zoom and pan
iplot_lead(filtered, peaks)

# Interactive 12-lead viewer
iplot_12lead(record)

Batch processing

Parse multiple files in parallel with parse_batch:

from ecgdatakit.parsing import parse_batch
from pathlib import Path

files = list(Path("data/").glob("*.xml"))
records = parse_batch(files, max_workers=4)

for rec in records:
    print(rec.patient.name, rec.recording.date)

Adding a new parser

To add support for a new ECG format:

  1. Create a new file in src/ecgdatakit/parsing/parsers/.
  2. Subclass Parser and implement two methods: can_parse and parse.
from ecgdatakit.parsing.parsers.base import Parser
from ecgdatakit.parsing.models import ECGRecord

class MyFormatParser(Parser):
    """Parser for the MyFormat ECG file type."""

    def can_parse(self, path: str) -> bool:
        """Return True if this parser can handle the given file."""
        with open(path, "rb") as f:
            magic = f.read(4)
        return magic == b"MYFT"

    def parse(self, path: str) -> ECGRecord:
        """Parse the file and return an ECGRecord."""
        # Read and decode the file
        # Build and return an ECGRecord
        ...

The FileParser will automatically discover your new parser and try it during format detection.