PPG pipeline

Warning

Include code to show how to create a pipeline to clean an ECG signal. It is probably in one of the Jupyter notebooks (maybe ecg_qc.ipynb).

Load the data

 # First, let's load the PPG sample data

 # Libraries
 import pandas as pd

 # Load data (`path` points to the PPG sample file)
 #data = pd.read_csv(path)

Preprocessing

Wearable devices need some time to pick up a stable signal. For this reason, it is common practice to trim the data. In the following example, the first and last 5 minutes of each recording are trimmed to exclude unstable signals.

 # Trim data
 #data = trim(data, start=5, end=5)
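
The `trim` call above is a placeholder rather than a confirmed library function. As a rough sketch, assuming the recording carries a DatetimeIndex, the trimming could look like this (`trim_minutes` is a hypothetical helper introduced here for illustration):

 import pandas as pd

 def trim_minutes(df, start=5, end=5):
     """Drop the first `start` and last `end` minutes of a recording."""
     t0 = df.index[0] + pd.Timedelta(minutes=start)   # end of unstable head
     t1 = df.index[-1] - pd.Timedelta(minutes=end)    # start of unstable tail
     return df.loc[t0:t1]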

Now, let's remove the following noise:

  • PLETH is 0 or unchanged for xxx time

  • SpO2 < 80

  • Pulse > 200 bpm or Pulse < 40 bpm

  • Perfusion < 0.2

  • Lost connection: the sampling rate drops because the Bluetooth connection was (possibly) lost. The timestamp column then shows missing timepoints. If the missing duration is longer than one cycle (xxx ms), the recording is split; otherwise, the missing timepoints are interpolated. A sketch of this rule follows the code block below.

 # Remove invalid PLETH values
 #idxs_1 = data.PLETH == 0
 #idxs_2 = unchanged(period=xxxx)
 #data = data[~(idxs_1 | idxs_2)]

 # Remove invalid ranges
 #data = data[data.SpO2 >= 80]
 #data = data[data.Pulse.between(40, 200)]
 #data = data[data.Perfusion >= 0.2]

 # Remove lost connection (hypothetical helper; see the sketch below)
 #data = data[~lost_connection(data, min_fs, max_fs)]

 # The recording is then split into files.
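
For the lost-connection rule, a rough sketch of gap detection and splitting is given below. `split_on_gaps` and the `timestamp` column name are assumptions, and the one-cycle threshold is passed in explicitly because its exact value (xxx ms) is not fixed here:

 import numpy as np

 def split_on_gaps(df, max_gap_ms):
     """Split a recording wherever the timestamp gap exceeds max_gap_ms."""
     # Gap between consecutive timestamps, in milliseconds
     gaps_ms = df['timestamp'].diff().dt.total_seconds() * 1000
     # Rows that start right after a too-long gap
     break_idx = np.flatnonzero(gaps_ms.to_numpy() > max_gap_ms)
     bounds = [0, *break_idx, len(df)]
     return [df.iloc[a:b] for a, b in zip(bounds[:-1], bounds[1:])]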

Let's filter the data with a band-pass filter, starting with the high-pass stage (cut-off at 1 Hz).
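
A minimal sketch with scipy, assuming a sampling rate of fs = 100 Hz; the toy signal stands in for the PLETH column:

 import numpy as np
 from scipy import signal

 fs = 100  # assumed PPG sampling rate in Hz
 t = np.arange(120 * fs) / fs
 # Toy stand-in for PLETH: a ~1.2 Hz pulse plus slow baseline drift
 ppg = np.sin(2 * np.pi * 1.2 * t) + 0.5 * np.sin(2 * np.pi * 0.05 * t)

 # First-order Butterworth high-pass with a 1 Hz cut-off
 b, a = signal.butter(N=1, Wn=1, btype='highpass', fs=fs)
 filtered = signal.filtfilt(b, a, ppg)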

Let's detrend the signal.
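
scipy also provides a simple linear detrend; a minimal sketch, continuing from the filtered signal above:

 # Remove any remaining linear trend
 detrended = signal.detrend(filtered)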

 # Let's split the data
 #4.1. Cut the data by time domain: split it into sub-segments of 30 seconds
 #4.2. Apply the peak and trough detection methods in peak_approaches.py to get single PPG cycles in each segment
 #4.3. Shift the baseline above 0 and taper each single PPG cycle to compute the mean template
 #Notes: the described process is implemented in split_to_segments.py
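
A rough sketch of step 4.1, the 30-second segmentation, reusing the assumed fs from above (the library's own implementation lives in split_to_segments.py):

 segment_len = 30 * fs  # samples per 30-second segment
 segments = [detrended[i:i + segment_len]
             for i in range(0, len(detrended) - segment_len + 1, segment_len)]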

SQI scores

Let's compute the SQI scores.
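
As a stand-in for the library's own SQI functions, two widely used statistical SQIs (skewness and kurtosis) can be computed per segment; a minimal sketch:

 from scipy import stats

 # One (skewness, kurtosis) pair per 30-second segment
 sqi_scores = [(stats.skew(seg), stats.kurtosis(seg)) for seg in segments]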

Visualization
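
A minimal matplotlib sketch for eyeballing one of the segments (purely illustrative):

 import matplotlib.pyplot as plt

 plt.plot(segments[0])
 plt.xlabel('Sample')
 plt.ylabel('PLETH (a.u.)')
 plt.title('First 30-second PPG segment')
 plt.show()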

Final notes

Initially, the pipeline might involve a lot of code. If the library grows, it might be possible, as Stefan mentioned, to create a generalised pipeline from sub-methods or classes, or to allow the user to configure the process and then automate the whole run.

 # Create steps
 #step1 = Trim(start=5, end=5)
 #step2 = Unchanged(param1=x, param2=y)
 #step3 = LostConnection(param1=20, param2=30)

 # Create pipeline
 #pipe = Pipeline(steps=[('step1', step1),
 #                       ('step2', step2),
 #                       ('step3', step3)])

 # Run the pipeline on the data
 #data = pipe.fit_transform(data)
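
One way such a pipeline could be realised is with scikit-learn-style transformers; a minimal sketch, where Trim is a hypothetical stand-in for a library class, not an existing API:

 from sklearn.base import BaseEstimator, TransformerMixin
 from sklearn.pipeline import Pipeline

 class Trim(BaseEstimator, TransformerMixin):
     """Hypothetical step: drop the first/last minutes of a recording."""
     def __init__(self, start=5, end=5, fs=100):
         self.start, self.end, self.fs = start, end, fs

     def fit(self, X, y=None):
         return self

     def transform(self, X):
         n_start = self.start * 60 * self.fs
         n_end = self.end * 60 * self.fs
         return X[n_start:len(X) - n_end]

 pipe = Pipeline(steps=[('trim', Trim(start=5, end=5))])
 #data = pipe.fit_transform(data)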
