Data Models¶
All models are Python dataclasses defined in ecgdatakit.models.
Import: from ecgdatakit import ECGRecord, Lead, PatientInfo, RecordingInfo, ...
ECGRecord¶
- class ecgdatakit.models.ECGRecord
Bases: object
Unified ECG record returned by all parsers.
Every parser in ECGDataKit produces an ECGRecord. Use to_dict() or to_json() to obtain a format-agnostic, JSON-serialisable representation that is identical regardless of the original file format.
Samples are stored as raw ADC values by default. Call to_physical() to convert all leads to physical voltage units, then convert_units() to switch between uV, mV, or V.
- __init__(patient=<factory>, recording=<factory>, leads=<factory>, interpretation=<factory>, measurements=<factory>, median_beats=<factory>, annotations=<factory>, source_format='', raw_metadata=<factory>)
- Parameters:
patient (PatientInfo)
recording (RecordingInfo)
interpretation (Interpretation)
measurements (GlobalMeasurements)
source_format (str)
raw_metadata (dict)
- Return type:
None
- patient: PatientInfo¶
Patient demographics.
- recording: RecordingInfo¶
Recording session metadata (includes device and acquisition setup).
- interpretation: Interpretation¶
Machine or physician interpretation.
- measurements: GlobalMeasurements¶
Global ECG interval/axis measurements.
- to_physical()
Convert all leads and median beats from raw ADC to physical units.
Returns a new ECGRecord where every Lead has is_raw=False. Leads already in physical units are unchanged.
- Return type:
ECGRecord
- convert_units(target)
Convert all leads and median beats to the specified voltage unit.
- Parameters:
target (str) – Target unit ("uV", "mV", "V").
- Raises:
RawSamplesError – If any lead is still raw ADC.
- Return type:
- plot(show=True, rows=None, cols=None, **kwargs)[source]¶
Plot the ECG record with patient/device header and all leads.
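As a rough illustration of what a format-agnostic, JSON-serialisable representation can look like, the sketch below serialises two nested stand-in dataclasses with the standard library. MiniPatient, MiniRecord, and record_to_json are hypothetical names invented for this example; this is not how ECGDataKit implements to_dict() or to_json().

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class MiniPatient:
    """Hypothetical stand-in for PatientInfo (one field only)."""
    patient_id: str = ""


@dataclass
class MiniRecord:
    """Hypothetical stand-in for ECGRecord (two fields only)."""
    patient: MiniPatient = field(default_factory=MiniPatient)
    source_format: str = ""


def record_to_json(record: MiniRecord) -> str:
    """Serialise nested dataclasses to a JSON string (illustrative only)."""
    # asdict() recurses into nested dataclasses, yielding plain dicts.
    return json.dumps(asdict(record), sort_keys=True)


record = MiniRecord(patient=MiniPatient(patient_id="001"), source_format="ishne")
text = record_to_json(record)
restored = json.loads(text)
```

Because the output is built only from dicts and strings, it can be compared or stored without reference to the file format the record came from.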
Lead¶
- class ecgdatakit.models.Lead
Bases: object
Single ECG lead with signal data.
Resolution and scaling
ECG file formats store a raw ADC resolution value in format-specific units (e.g. nV/count for ISHNE and SCP-ECG, µV/count for Sierra XML). The parser converts this to a normalised scale factor stored in resolution, expressed in the unit given by resolution_unit:
physical_value = samples * resolution + offset (in resolution_unit)
The original, unconverted value from the file is preserved in adc_resolution for reference.
Example: ISHNE file with ampl_res = 153 (nV/count):
- adc_resolution = 153.0: raw file value (nV/count)
- resolution = 0.153: converted, 153 / 1000 (µV/count)
- resolution_unit = "uV"
Auto-detection of is_raw
Parsers set is_raw automatically. When resolution == 1.0 and offset == 0.0 the samples are already in physical units (is_raw=False); otherwise they are raw ADC counts (is_raw=True) that need scaling via to_physical().
- __init__(label, samples, sampling_rate, resolution=1.0, resolution_unit='', offset=0.0, units='', is_raw=True, adc_resolution=0.0, adc_resolution_unit='', quality=None, transducer='', prefiltering='', annotations=<factory>)
- Parameters:
- Return type:
None
- samples: ndarray[tuple[Any, ...], dtype[float64]]¶
Signal sample values (raw ADC or physical, depending on is_raw).
- resolution: float = 1.0¶
Normalised scale factor for ADC-to-physical conversion, in the unit given by resolution_unit. Computed from adc_resolution by the parser (e.g. adc_resolution / 1000 for nV → µV). Used by to_physical(): physical = samples * resolution + offset.
- resolution_unit: str = ''¶
Unit of the resolution scale factor (e.g. "uV", "mV"). After to_physical(), the resulting samples are in this unit. Set by the parser based on the format specification.
- offset: float = 0.0¶
Additive offset for ADC-to-physical conversion (default 0.0). Used by to_physical(): physical = samples * resolution + offset.
- units: str = ''¶
Current unit of samples. Empty when is_raw=True (samples are dimensionless ADC counts). Set to the physical unit after to_physical() or convert_units() is called (e.g. "uV", "mV").
- is_raw: bool = True¶
True if samples are raw ADC counts needing scaling, False if samples are already in physical units. Parsers set this automatically: is_raw = not (resolution == 1.0 and offset == 0.0).
- adc_resolution: float = 0.0¶
Original ADC resolution exactly as stored in the source file, before any unit conversion. For example, ISHNE stores nV/count and SCP-ECG stores nV/unit; this field preserves that raw value (e.g. 153.0 for 153 nV/count). The converted value used for scaling is in resolution.
- adc_resolution_unit: str = ''¶
Unit of adc_resolution as defined by the source format (e.g. "nV" for ISHNE and SCP-ECG).
- to_physical()
Convert raw ADC samples to physical voltage units.
Applies physical = samples * resolution + offset and returns a new Lead with is_raw=False. If this lead is already in physical units, returns self unchanged.
- Raises:
ValueError – If resolution is zero (conversion undefined).
- Return type:
Lead
- convert_units(target)
Convert between physical voltage units (uV, mV, V).
- Parameters:
target (str) – Target unit string ("uV", "mV", "V" and common aliases like "µV").
- Returns:
A new Lead with samples scaled to target.
- Return type:
Lead
- Raises:
RawSamplesError – If samples are still raw ADC (is_raw=True).
ValueError – If the current or target unit is not a recognized voltage unit.
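The behaviour documented for to_physical() and convert_units() can be sketched with a small stand-in class. LeadSketch and the locally defined RawSamplesError are hypothetical simplifications written for this example (samples are a plain tuple rather than a numpy array, and unit aliases are omitted); they mirror the documented contract, not the library's actual code.

```python
from dataclasses import dataclass, replace


class RawSamplesError(Exception):
    """Local stand-in for the library's RawSamplesError."""


# Scale of each supported unit relative to volts (aliases omitted here).
_VOLTS = {"uV": 1e-6, "mV": 1e-3, "V": 1.0}


@dataclass(frozen=True)
class LeadSketch:
    """Hypothetical, simplified stand-in for Lead."""
    samples: tuple
    resolution: float = 1.0
    resolution_unit: str = ""
    offset: float = 0.0
    units: str = ""
    is_raw: bool = True

    def to_physical(self) -> "LeadSketch":
        if not self.is_raw:
            return self  # already physical: return self unchanged
        if self.resolution == 0:
            raise ValueError("resolution is zero; conversion undefined")
        scaled = tuple(s * self.resolution + self.offset for s in self.samples)
        return replace(self, samples=scaled, units=self.resolution_unit, is_raw=False)

    def convert_units(self, target: str) -> "LeadSketch":
        if self.is_raw:
            raise RawSamplesError("samples are raw ADC; call to_physical() first")
        if self.units not in _VOLTS or target not in _VOLTS:
            raise ValueError("unrecognized voltage unit")
        factor = _VOLTS[self.units] / _VOLTS[target]
        scaled = tuple(s * factor for s in self.samples)
        return replace(self, samples=scaled, units=target)


raw = LeadSketch(samples=(0, 1000, -2000), resolution=0.153, resolution_unit="uV")
physical = raw.to_physical()           # new object, units "uV"
in_mv = physical.convert_units("mV")   # new object, units "mV"
```

Each call returns a new object and leaves its input untouched, which is why raw can still be inspected after the conversion.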
PatientInfo¶
RecordingInfo¶
- class ecgdatakit.models.RecordingInfo
Bases: object
Recording session metadata.
- __init__(date=None, end_date=None, duration=None, technician='', referring_physician='', room='', location='', device=<factory>, acquisition=<factory>)¶
- Parameters:
date (datetime | None)
end_date (datetime | None)
duration (timedelta | None)
technician (str)
referring_physician (str)
room (str)
location (str)
device (DeviceInfo)
acquisition (AcquisitionSetup)
- Return type:
None
- device: DeviceInfo¶
Acquisition device info.
- acquisition: AcquisitionSetup¶
Signal acquisition setup (signal characteristics + filters).
DeviceInfo¶
FilterSettings¶
AcquisitionSetup¶
- class ecgdatakit.models.AcquisitionSetup
Bases: object
Signal acquisition configuration: characteristics and filter settings.
- __init__(signal=<factory>, filters=<factory>)¶
- Parameters:
signal (SignalCharacteristics)
filters (FilterSettings)
- Return type:
None
- signal: SignalCharacteristics¶
Technical signal encoding and acquisition metadata.
- filters: FilterSettings¶
Filter settings applied during acquisition.
SignalCharacteristics¶
- class ecgdatakit.models.SignalCharacteristics
Bases: object
Technical signal encoding and acquisition metadata.
- __init__(sampling_rate=0, resolution=0.0, bits_per_sample=None, signal_offset=None, signal_signed=None, number_channels_allocated=None, number_channels_valid=None, electrode_placement='', compression='', data_encoding='', acsetting=None, filtered=None, downsampled=None, upsampled=None, waveform_modified=None, downsampling_method='', upsampling_method='')¶
- Parameters:
sampling_rate (int)
resolution (float)
bits_per_sample (int | None)
signal_offset (int | None)
signal_signed (bool | None)
number_channels_allocated (int | None)
number_channels_valid (int | None)
electrode_placement (str)
compression (str)
data_encoding (str)
acsetting (int | None)
filtered (bool | None)
downsampled (bool | None)
upsampled (bool | None)
waveform_modified (bool | None)
downsampling_method (str)
upsampling_method (str)
- Return type:
None
Interpretation¶
- class ecgdatakit.models.Interpretation
Bases: object
Machine or physician ECG interpretation.
GlobalMeasurements¶
- class ecgdatakit.models.GlobalMeasurements
Bases: object
Global ECG interval and axis measurements.
- __init__(heart_rate=None, rr_interval=None, pr_interval=None, qrs_duration=None, qt_interval=None, qtc_bazett=None, qtc_fridericia=None, p_axis=None, qrs_axis=None, t_axis=None, qrs_count=None)¶
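For readers unfamiliar with the two corrected-QT fields, the sketch below shows the standard Bazett and Fridericia formulas. The helper names and the choice of milliseconds for both inputs are assumptions for this example; the units ECGDataKit uses for qt_interval and rr_interval are not stated here, and parsers may simply copy these values from the source file rather than compute them.

```python
import math


def correct_qt_bazett(qt_ms: float, rr_ms: float) -> float:
    """Bazett: QT divided by the square root of RR expressed in seconds."""
    return qt_ms / math.sqrt(rr_ms / 1000.0)


def correct_qt_fridericia(qt_ms: float, rr_ms: float) -> float:
    """Fridericia: QT divided by the cube root of RR expressed in seconds."""
    return qt_ms / (rr_ms / 1000.0) ** (1.0 / 3.0)


# At 60 bpm the RR interval is 1000 ms (1 s), so both corrections
# divide by 1 and leave the QT interval unchanged.
bazett = correct_qt_bazett(400.0, 1000.0)
fridericia = correct_qt_fridericia(400.0, 1000.0)
```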
Resolution Pipeline (ADC → Physical Units)¶
ECG hardware digitises analogue signals into integer ADC counts. The
Lead dataclass carries the metadata needed to
convert those counts back to physical voltage values.
Fields involved¶
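| Field | Example | Meaning |
|---|---|---|
| adc_resolution | 153.0 | Raw value from the file (e.g. 153 nV/count for ISHNE) |
| adc_resolution_unit | "nV" | Unit of adc_resolution |
| resolution | 0.153 | Scale factor normalised to resolution_unit |
| resolution_unit | "uV" | Unit of the resolution scale factor |
| offset | 0.0 | Additive offset: physical = samples * resolution + offset |
| units | "" | Current unit of samples (empty while is_raw is True) |
| is_raw | True | True if samples are raw ADC counts, False if already in physical units |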
Conversion formula¶
physical_value = samples × resolution + offset
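The arithmetic can be checked by hand with a short sketch. The two helper functions below are hypothetical and use plain Python lists instead of numpy arrays; they show the nV-to-µV normalisation a parser performs, followed by the conversion formula.

```python
def normalise_nv_to_uv(adc_resolution_nv):
    """Convert a resolution in nV/count to µV/count (1 µV = 1000 nV)."""
    return adc_resolution_nv / 1000.0


def scale_counts(samples, resolution, offset=0.0):
    """Apply physical = samples * resolution + offset to each ADC count."""
    return [count * resolution + offset for count in samples]


resolution_uv = normalise_nv_to_uv(153.0)                  # µV per count
physical_uv = scale_counts([0, 100, -200], resolution_uv)  # values in µV
shifted_uv = scale_counts([0, 100], resolution_uv, offset=5.0)
```

With a resolution of 0.153 µV/count, 100 counts correspond to roughly 15.3 µV; a non-zero offset shifts every sample by the same amount.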
Auto-detection by parsers¶
Parsers compute is_raw automatically:
is_raw = not (resolution == 1.0 and offset == 0.0)
If resolution is 1.0 and offset is 0.0, the data is already in physical
units — no scaling is needed, and units is set directly. Otherwise,
units stays empty until to_physical() is called.
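The rule above amounts to a two-input truth table. The helper below is a hypothetical restatement of that rule for illustration, not the parsers' code.

```python
def detect_is_raw(resolution: float, offset: float) -> bool:
    """Samples count as physical only when no scaling and no offset apply."""
    return not (resolution == 1.0 and offset == 0.0)


already_physical = detect_is_raw(1.0, 0.0)   # identity scaling
needs_scaling = detect_is_raw(0.153, 0.0)    # non-unit resolution
needs_offset = detect_is_raw(1.0, 5.0)       # unit resolution, non-zero offset
```

Note that a non-zero offset alone is enough to mark a lead as raw, even when resolution is 1.0.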
Example: ISHNE Holter (153 nV/count)¶
record = FileParser().parse("holter.ecg", auto_scale=False)
lead = record.leads[0]
# lead.adc_resolution → 153.0 (raw file value)
# lead.adc_resolution_unit → "nV" (file stores nV/count)
# lead.resolution → 0.153 (153 nV ÷ 1000 = 0.153 µV)
# lead.resolution_unit → "uV" (resolution is in µV/count)
# lead.units → "" (raw ADC, no unit yet)
# lead.is_raw → True
physical = lead.to_physical()
# physical.samples → original × 0.153
# physical.units → "uV"
# physical.is_raw → False
in_mv = physical.convert_units("mV")
# in_mv.units → "mV"
Using auto_scale¶
FileParser().parse(path, auto_scale=True) (default) calls to_physical()
then convert_units("mV") automatically on every lead that has scaling
metadata.
Working with Data Models¶
ECGDataKit processing and plotting functions accept both Lead objects and raw numpy arrays. When passing a numpy array, provide the sample rate via fs.
Using numpy arrays¶
import numpy as np
from ecgdatakit.processing import diagnostic_filter, detect_r_peaks
from ecgdatakit.plotting import plot_lead
signal = np.array([0.12, 0.15, 0.13, ...], dtype=np.float64)
filtered = diagnostic_filter(signal, fs=500)
# filtered is assumed to be a plain numpy array here, so fs is passed again
peaks = detect_r_peaks(filtered, fs=500)
fig = plot_lead(filtered, fs=500, peaks=peaks)
Note:
fs is required when passing a numpy array; omitting it raises a TypeError. When passing a Lead, fs is ignored.
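One way to picture this dual-input behaviour is a small dispatch helper. LeadLike and resolve_signal are hypothetical names for this sketch; ECGDataKit's internal handling may differ.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple, Union


@dataclass
class LeadLike:
    """Hypothetical stand-in for Lead with only the fields this sketch needs."""
    samples: Sequence[float]
    sampling_rate: int


def resolve_signal(
    signal: Union[LeadLike, Sequence[float]], fs: Optional[int] = None
) -> Tuple[Sequence[float], int]:
    """Return (samples, sampling rate) for either input style."""
    if isinstance(signal, LeadLike):
        # A Lead carries its own sampling rate, so any fs argument is ignored.
        return signal.samples, signal.sampling_rate
    if fs is None:
        raise TypeError("fs is required when passing a raw array")
    return signal, fs


array_samples, array_rate = resolve_signal([0.12, 0.15, 0.13], fs=500)
lead_samples, lead_rate = resolve_signal(LeadLike([0.1, 0.2], 250), fs=500)
```

In the second call the fs=500 argument has no effect: the rate stored on the lead (250) wins.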
Using Lead objects¶
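import numpy as np
from ecgdatakit import Lead
from ecgdatakit.processing import diagnostic_filter
# Placeholder signal for illustration: 10 s of zeros at 500 Hz
samples = np.zeros(5000, dtype=np.float64)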
lead = Lead(
label="II",
samples=samples,
sampling_rate=500,
units="mV",
is_raw=False,
)
# No need for fs= when using Lead objects
filtered = diagnostic_filter(lead)
Extracting numpy arrays¶
raw_array = lead.samples # NDArray[np.float64]
fs = lead.sampling_rate # int (Hz)
Building a Lead from external data¶
import numpy as np
from ecgdatakit import Lead
# Synthetic sine wave (10 s at 500 Hz)
fs = 500
t = np.arange(fs * 10, dtype=np.float64) / fs
signal = np.sin(2 * np.pi * 1.2 * t)
lead = Lead(label="II", samples=signal, sampling_rate=fs, units="mV", is_raw=False)
# From a pandas DataFrame
import pandas as pd
df = pd.read_csv("ecg_data.csv")
lead = Lead(
label="V1",
samples=df["voltage"].to_numpy(dtype=np.float64),
sampling_rate=250,
units="mV",
is_raw=False,
)
Building an ECGRecord from scratch¶
from ecgdatakit import ECGRecord, Lead, PatientInfo, RecordingInfo
import numpy as np
leads = [
Lead(label=name, samples=np.random.randn(5000).astype(np.float64),
sampling_rate=500, units="mV", is_raw=False)
for name in ["I", "II", "III", "aVR", "aVL", "aVF",
"V1", "V2", "V3", "V4", "V5", "V6"]
]
rec = RecordingInfo()
rec.acquisition.signal.sampling_rate = 500
record = ECGRecord(
patient=PatientInfo(patient_id="001", first_name="Jane", last_name="Doe"),
recording=rec,
leads=leads,
)
All ECGRecord fields are optional and have sensible defaults.