
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "_examples/pandas/plot_format04_therapy_all.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download__examples_pandas_plot_format04_therapy_all.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr__examples_pandas_plot_format04_therapy_all.py:


04. Format MIMIC therapy (all)
===============================

Description...

.. GENERATED FROM PYTHON SOURCE LINES 7-13

.. code-block:: default
    :lineno-start: 7


    # Generic libraries
    import pandas as pd

    # Show in terminal
    TERMINAL = False


.. GENERATED FROM PYTHON SOURCE LINES 14-15

First, let's load the data and apply some basic formatting.

.. GENERATED FROM PYTHON SOURCE LINES 15-54

.. code-block:: default
    :lineno-start: 16


    # -----------------------------
    # Constants
    # -----------------------------
    # Path
    path = './data/mimic-therapy/ICU_diagnoses_antibiotics.csv'

    # -----------------------------
    # Load data
    # -----------------------------
    # Read data
    data = pd.read_csv(path,
        dayfirst=True,
        parse_dates=['starttime',
                     'stoptime'])

    # Keep only useful columns
    data = data[['subject_id',
                 'hadm_id',
                 'stay_id',
                 'icd_code',
                 'antibiotic',
                 'route',
                 'starttime',
                 'stoptime']]

    # Reformat (time info and str)
    data.starttime = data.starttime.dt.date
    data.stoptime = data.stoptime.dt.date
    data.antibiotic = data.antibiotic \
        .str.lower() \
        .str.strip()

    # Show
    if TERMINAL:
        print("\nData:")
        print(data)
    data


.. raw:: html

    <div class="output_subarea output_html rendered_html output_result">
    <div>
    <style scoped>
        .dataframe tbody tr th:only-of-type {
            vertical-align: middle;
        }

        .dataframe tbody tr th {
            vertical-align: top;
        }

        .dataframe thead th {
            text-align: right;
        }
    </style>
    <table border="1" class="dataframe">
      <thead>
        <tr style="text-align: right;"> <th></th> <th>subject_id</th> <th>hadm_id</th> <th>stay_id</th> <th>icd_code</th> <th>antibiotic</th> <th>route</th> <th>starttime</th> <th>stoptime</th> </tr>
      </thead>
      <tbody>
        <tr> <th>0</th> <td>10656173</td> <td>25778760</td> <td>30001555</td> <td>J95851</td> <td>ceftriaxone</td> <td>IV</td> <td>2177-09-27</td> <td>2177-10-02</td> </tr>
        <tr> <th>1</th> <td>10656173</td> <td>25778760</td> <td>37985659</td> <td>J95851</td> <td>cefepime</td> <td>IV</td> <td>2177-09-19</td> <td>2177-09-23</td> </tr>
        <tr> <th>2</th> <td>10656173</td> <td>25778760</td> <td>37985659</td> <td>J95851</td> <td>cefepime</td> <td>IV</td> <td>2177-09-11</td> <td>2177-09-12</td> </tr>
        <tr> <th>3</th> <td>10656173</td> <td>25778760</td> <td>37985659</td> <td>J95851</td> <td>cefepime</td> <td>IV</td> <td>2177-09-10</td> <td>2177-09-11</td> </tr>
        <tr> <th>4</th> <td>10656173</td> <td>25778760</td> <td>37985659</td> <td>J95851</td> <td>cefepime</td> <td>IV</td> <td>2177-09-12</td> <td>2177-09-12</td> </tr>
        <tr> <th>...</th> <td>...</td> <td>...</td> <td>...</td> <td>...</td> <td>...</td> <td>...</td> <td>...</td> <td>...</td> </tr>
        <tr> <th>21541</th> <td>15689523</td> <td>23914765</td> <td>39918058</td> <td>99731</td> <td>metronidazole (flagyl)</td> <td>IV</td> <td>2159-07-13</td> <td>2159-07-25</td> </tr>
        <tr> <th>21542</th> <td>15689523</td> <td>23914765</td> <td>39918058</td> <td>99731</td> <td>metronidazole (flagyl)</td> <td>IV</td> <td>2159-06-27</td> <td>2159-06-27</td> </tr>
        <tr> <th>21543</th> <td>15689523</td> <td>23914765</td> <td>39918058</td> <td>99731</td> <td>sulfamethoxazole-trimethoprim</td> <td>IV</td> <td>2159-06-25</td> <td>2159-06-26</td>
        </tr>
        <tr> <th>21544</th> <td>15689523</td> <td>23914765</td> <td>39918058</td> <td>99731</td> <td>sulfamethoxazole-trimethoprim</td> <td>IV</td> <td>2159-06-26</td> <td>2159-07-02</td> </tr>
        <tr> <th>21545</th> <td>15689523</td> <td>23914765</td> <td>39918058</td> <td>99731</td> <td>sulfamethoxazole-trimethoprim</td> <td>IV</td> <td>2159-06-29</td> <td>2159-06-29</td> </tr>
      </tbody>
    </table>
    <p>21546 rows × 8 columns</p>
    </div>
    </div>
    <br />
    <br />

.. GENERATED FROM PYTHON SOURCE LINES 55-63

Let's transform the data.

.. note:: You might need to add ``NaN`` entries for days on which a
          patient has no therapy recorded. The single-patient example
          included in this repository,
          :ref:`sphx_glr__examples_pandas_plot_format04_therapy_one.py`,
          achieves this with the following code:

            ``aux = aux.asfreq('1D')``

          Note that it needs to be applied per patient!

.. GENERATED FROM PYTHON SOURCE LINES 63-92

.. code-block:: default
    :lineno-start: 64


    # -----------------------------
    # Transform data
    # -----------------------------
    # .. note: The closed parameter controls whether the endpoints
    #          of the range are included. None keeps both, 'left'
    #          keeps the start date and drops the end date, and
    #          'right' drops the start date and keeps the end date.
    #          In pandas >= 1.4 this parameter is named inclusive
    #          (with 'both' in place of None); closed was removed
    #          in pandas 2.0.

    # Create column with date range
    data['startdate'] = data.apply(lambda x:
        pd.date_range(start=x['starttime'],
                      end=x['stoptime'],
                      closed='left',         # ignoring right
                      freq='D'), axis=1)

    # Explode that column (one row per day)
    data = data.explode('startdate')

    # Grouping keys
    groupby = ['subject_id',
               'hadm_id',
               'stay_id',
               'startdate']

    # Create daily therapies
    aux = data.groupby(groupby) \
        .apply(lambda x: sorted(x.antibiotic \
            .unique().tolist()))


.. GENERATED FROM PYTHON SOURCE LINES 93-94

Let's see the formatted data.

.. GENERATED FROM PYTHON SOURCE LINES 94-102

.. code-block:: default
    :lineno-start: 95


    # Show
    if TERMINAL:
        print("\nFormatted:")
        print(aux)
    aux


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    subject_id  hadm_id   stay_id   startdate 
    10004733    27411876  39635619  2174-12-04    [piperacillin-tazobactam, vancomycin]
                                    2174-12-05    [piperacillin-tazobactam, vancomycin]
                                    2174-12-06    [piperacillin-tazobactam, vancomycin]
                                    2174-12-07                [piperacillin-tazobactam]
                                    2174-12-08    [piperacillin-tazobactam, vancomycin]
                                                                  ...
    19997367    20617667  35616526  2126-05-06                            [ceftriaxone]
                                    2126-05-07                            [ceftriaxone]
                                    2126-05-08                            [ceftriaxone]
                                    2126-05-09                            [ceftriaxone]
                                    2126-05-10                            [ceftriaxone]
    Length: 22936, dtype: object


.. GENERATED FROM PYTHON SOURCE LINES 103-104

Let's count the number of patient-days on which each therapy
(combination of antibiotics) appears.

.. GENERATED FROM PYTHON SOURCE LINES 104-112

.. code-block:: default
    :lineno-start: 105


    # Show
    if TERMINAL:
        print("\nTherapies (number of days)")
        print(aux.value_counts())
    aux.value_counts()


.. rst-class:: sphx-glr-script-out

 Out:

 .. code-block:: none

    [cefepime, vancomycin]                                                                             2630
    [cefepime]                                                                                         1802
    [vancomycin]                                                                                       1732
    [piperacillin-tazobactam, vancomycin]                                                              1490
    [meropenem]                                                                                        1378
                                                                                                       ...
    [cefepime, gentamicin, gentamicin sulfate, vancomycin]                                                1
    [ciprofloxacin, ciprofloxacin iv, vancomycin]                                                         1
    [doxycycline hyclate, metronidazole (flagyl)]                                                         1
    [cefepime, ciprofloxacin iv, metronidazole (flagyl), sulfamethoxazole-trimethoprim, vancomycin]       1
    [ceftolozane-tazobactam, ceftolozane-tazobactam *nf*]                                                 1
    Length: 707, dtype: int64


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 5.921 seconds)


.. _sphx_glr_download__examples_pandas_plot_format04_therapy_all.py:


.. only :: html

 .. container:: sphx-glr-footer
    :class: sphx-glr-footer-example


  .. container:: sphx-glr-download sphx-glr-download-python

     :download:`Download Python source code: plot_format04_therapy_all.py <plot_format04_therapy_all.py>`


  .. container:: sphx-glr-download sphx-glr-download-jupyter

     :download:`Download Jupyter notebook: plot_format04_therapy_all.ipynb <plot_format04_therapy_all.ipynb>`


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_
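
The note in the transform step says that ``asfreq('1D')`` has to be
applied per patient, but the example does not show how. The following
is a minimal sketch of one way to do it, not the method used by the
single-patient example. It assumes a small hand-made ``aux`` series
with the same four index levels as above; the identifiers, dates and
the helper ``fill_days`` are hypothetical and only serve as
illustration.

.. code-block:: python

    import pandas as pd

    # Toy stand-in for ``aux`` (hypothetical values): stay 100 has a
    # gap on 2020-01-03 and 2020-01-04, stay 200 has no gap.
    index = pd.MultiIndex.from_tuples([
        (1, 10, 100, pd.Timestamp('2020-01-01')),
        (1, 10, 100, pd.Timestamp('2020-01-02')),
        (1, 10, 100, pd.Timestamp('2020-01-05')),
        (2, 20, 200, pd.Timestamp('2020-01-01')),
        (2, 20, 200, pd.Timestamp('2020-01-02')),
    ], names=['subject_id', 'hadm_id', 'stay_id', 'startdate'])

    aux = pd.Series([['cefepime'],
                     ['cefepime', 'vancomycin'],
                     ['vancomycin'],
                     ['meropenem'],
                     ['meropenem']], index=index)

    # Levels that identify one stay of one patient
    keys = ['subject_id', 'hadm_id', 'stay_id']

    def fill_days(s):
        """Reindex one stay to daily frequency (hypothetical helper)."""
        # Drop the id levels so that only the dates remain in the
        # index, then insert a NaN row for every missing day between
        # the first and the last date of this stay.
        return s.droplevel(keys).asfreq('1D')

    # Apply the helper to each stay separately; the group keys are
    # prepended again to the resulting index.
    filled = aux.groupby(level=keys, group_keys=True).apply(fill_days)

    print(filled)

Grouping first keeps the filling within each stay, so days between the
end of one stay and the start of another are not added.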