
.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "_examples\indexes\plot_sari_d_temporal.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        :ref:`Go to the end `
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr__examples_indexes_plot_sari_d_temporal.py:


``SARI`` - Compute timeseries
-------------------------------

.. |1D30| replace:: 1D\ :sub:`30`
.. |1M1| replace:: 1M\ :sub:`1`
.. |3M1| replace:: 3M\ :sub:`1`
.. |1M30| replace:: 1M\ :sub:`30`
.. |7D4| replace:: 7D\ :sub:`4`
.. |1M12| replace:: 1M\ :sub:`12`
.. |1M6| replace:: 1M\ :sub:`6`
.. |1M3| replace:: 1M\ :sub:`3`
.. |12M1| replace:: 12M\ :sub:`1`
.. |SP| replace:: SHIFT\ :sub:`period`

To study the temporal evolution of AMR, it is necessary to generate a
resistance time series from the susceptibility test data. This is often
achieved by computing the resistance index on consecutive partitions of
the data, where each partition contains the susceptibility tests required
to compute one resistance index.

The traditional strategy considers independent time intervals (see the
yearly, monthly or weekly time series in Table 4.2). Unfortunately, this
strategy forces a trade-off between granularity (level of detail) and
accuracy. On the one hand, weekly time series are highly granular but
inaccurate. On the other hand, yearly time series are accurate but coarse.
Note that granularity is represented by the number of observations in a
time series, while accuracy is closely related to the number of
susceptibility tests used to compute each resistance index.

Conversely, the overlapping time intervals strategy removes this dependence
by defining a window of fixed size which is moved across time. The length
of the window is denoted as ``period`` and the time step as ``shift``.
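
The roles of ``period`` and ``shift`` can be illustrated with a minimal
sketch in plain pandas. It does not use the ``pyamr`` API: the helper
``windowed_resistance`` and the toy records are hypothetical, and the plain
proportion of resistant tests stands in for the resistance index. Each
window covers the ``period`` that ends at a right edge, and consecutive
edges are spaced ``shift`` apart.

.. code-block:: python

    import pandas as pd


    def windowed_resistance(df, shift, period):
        """Proportion of resistant tests per window (hypothetical helper)."""
        shift, period = pd.Timedelta(shift), pd.Timedelta(period)
        first, last = df["date"].min(), df["date"].max()
        rows = []
        # Right edges of the windows, spaced `shift` apart.
        for edge in pd.date_range(first + shift, last + shift, freq=shift):
            # Each window covers the `period` ending (exclusively) at the edge.
            in_window = (df["date"] >= edge - period) & (df["date"] < edge)
            tests = df.loc[in_window, "resistant"]
            rows.append({"window_end": edge,
                         "freq": len(tests),
                         "resistance": tests.mean()})
        return pd.DataFrame(rows)


    # Toy data (an assumption): one test per day for twelve days.
    records = pd.DataFrame({
        "date": pd.date_range("2020-01-01", periods=12, freq="D"),
        "resistant": [0, 0, 0, 1, 0, 0, 1, 1, 0, 1, 1, 1],
    })

    # Independent intervals: the window is as long as the step.
    iti = windowed_resistance(records, shift="4D", period="4D")

    # Overlapping intervals: the window spans two steps.
    oti = windowed_resistance(records, shift="4D", period="8D")

    print(iti)
    print(oti)

Both series have the same number of points because they share the same
``shift``. With ``period`` equal to ``shift`` every window holds 4 records
in this toy data, whereas with a ``period`` twice as long the later windows
hold 8 records each.
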

For instance, three time series obtained using the overlapping time
intervals strategy with a monthly shift (1M) and window lengths of 12, 6
and 3 months are shown for clarity (see |1M12|, |1M6| and |1M3| in
Table 4.2).

.. image:: ../../_static/imgs/timeseries-generation.png
   :width: 500
   :align: center
   :alt: Generation of Time-Series

The notation used to define the time series generation methodology (|SP|)
is described with various examples in Table 4.2. For instance, |7D4|
defines a time series with weekly resistance indexes (7D) calculated using
the microbiology records available for the previous four weeks (4x7D). It
is important to note that some notations are equivalent representations of
the same susceptibility data at different granularities, hence their trends
are comparable. As an example, the trend estimated for |1M1| should be
approximately thirty times the trend estimated for |1D30|, since one
monthly step spans roughly thirty daily steps.

.. GENERATED FROM PYTHON SOURCE LINES 47-51

Let's see how to compute SARI time series with examples.

We first load the data and select a single pair for clarity.

.. GENERATED FROM PYTHON SOURCE LINES 52-109

.. code-block:: default
    :lineno-start: 53


    # Libraries
    import numpy as np
    import pandas as pd
    import seaborn as sns
    import matplotlib as mpl
    import matplotlib.pyplot as plt

    # Import own libraries
    from pyamr.core.sari import sari
    from pyamr.datasets.load import load_data_nhs

    # -------------------------
    # Configuration
    # -------------------------
    # Configure seaborn style (context=talk)
    sns.set(style="white")

    # Set matplotlib
    mpl.rcParams['xtick.labelsize'] = 9
    mpl.rcParams['ytick.labelsize'] = 9
    mpl.rcParams['axes.titlesize'] = 11
    mpl.rcParams['legend.fontsize'] = 9

    # Pandas configuration
    pd.set_option('display.max_colwidth', 40)
    pd.set_option('display.width', 300)
    pd.set_option('display.precision', 4)

    # Numpy configuration
    np.set_printoptions(precision=2)

    # -------------------------------------------
    # Load data
    # -------------------------------------------
    # Load data
    data, antimicrobials, microorganisms = load_data_nhs()

    # Show
    print("\nData:")
    print(data)
    print("\nColumns:")
    print(data.columns)
    print("\nDtypes:")
    print(data.dtypes)

    # Filter
    idxs_spec = data.specimen_code.isin(['URICUL'])
    idxs_orgs = data.microorganism_code.isin(['ECOL'])
    idxs_abxs = data.antimicrobial_code.isin(['AAUG'])

    # Filter
    data = data[idxs_spec & idxs_orgs & idxs_abxs]

    # Filter dates (2016-2018 missing)
    data = data[data.date_received.between('2008-01-01', '2016-12-31')]


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Data:
                date_received         date_outcome  patient_id  laboratory_number  specimen_code                   specimen_name  ...  antimicrobial_code       antimicrobial_name  sensitivity_method  sensitivity  mic  reported
    0     2009-01-03 00:00:00                  NaN       20091            X428501         BLDCUL                             NaN  ...                AAMI                 amikacin                 NaN    sensitive  NaN       NaN
    1     2009-01-03 00:00:00                  NaN       20091            X428501         BLDCUL                             NaN  ...                AAMO              amoxycillin                 NaN    resistant  NaN       NaN
    2     2009-01-03 00:00:00                  NaN       20091            X428501         BLDCUL                             NaN  ...                AAUG                augmentin                 NaN    sensitive  NaN       NaN
    3     2009-01-03 00:00:00                  NaN       20091            X428501         BLDCUL                             NaN  ...                AAZT                aztreonam                 NaN    sensitive  NaN       NaN
    4     2009-01-03 00:00:00                  NaN       20091            X428501         BLDCUL                             NaN  ...                ACAZ              ceftazidime                 NaN    sensitive  NaN       NaN
    ...                   ...                  ...         ...                ...            ...                             ...  ...                 ...                      ...                 ...          ...  ...       ...
    7929  2021-01-21 23:56:00  2021-01-22 00:00:00      199863           H2230229          FBCUL  Fluid in Blood Culture Bottles  ...                AGEN               gentamicin                  DD    resistant  NaN         Y
    7930  2021-01-21 23:56:00  2021-01-22 00:00:00      199863           H2230229          FBCUL  Fluid in Blood Culture Bottles  ...                AMER                meropenem                  DD    sensitive  NaN         Y
    7931  2021-01-21 23:56:00  2021-01-22 00:00:00      199863           H2230229          FBCUL  Fluid in Blood Culture Bottles  ...                ATAZ  piperacillin-tazobactam                  DD    sensitive  NaN         N
    7932  2021-01-21 23:56:00  2021-01-22 00:00:00      199863           H2230229          FBCUL  Fluid in Blood Culture Bottles  ...                ATEM               temocillin                  DD    sensitive  NaN         N
    7933  2021-01-21 23:56:00  2021-01-22 00:00:00      199863           H2230229          FBCUL  Fluid in Blood Culture Bottles  ...                ATIG              tigecycline                  DD    sensitive  NaN         N

    [3770034 rows x 15 columns]

    Columns:
    Index(['date_received', 'date_outcome', 'patient_id', 'laboratory_number',
           'specimen_code', 'specimen_name', 'specimen_description',
           'microorganism_code', 'microorganism_name', 'antimicrobial_code',
           'antimicrobial_name', 'sensitivity_method', 'sensitivity', 'mic',
           'reported'],
          dtype='object')

    Dtypes:
    date_received           datetime64[ns]
    date_outcome                    object
    patient_id                      object
    laboratory_number               object
    specimen_code                   object
    specimen_name                   object
    specimen_description            object
    microorganism_code              object
    microorganism_name              object
    antimicrobial_code              object
    antimicrobial_name              object
    sensitivity_method              object
    sensitivity                     object
    mic                             object
    reported                        object
    dtype: object




.. GENERATED FROM PYTHON SOURCE LINES 110-116

Independent Time Intervals (ITI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This is the traditional method used in antimicrobial surveillance systems,
where the time spans considered are independent; that is, they do not
overlap (e.g. monthly time series - |1M1| or yearly time series - |12M1|).

.. GENERATED FROM PYTHON SOURCE LINES 117-176

.. code-block:: default
    :lineno-start: 118


    # -------------------------------------------
    # Compute ITI sari (temporal)
    # -------------------------------------------
    from pyamr.core.sari import SARI

    # Create SARI instance
    sar = SARI(groupby=['specimen_code',
                        'microorganism_code',
                        'antimicrobial_code',
                        'sensitivity'])

    # Create constants
    shift, period = '30D', '30D'

    # Compute sari timeseries
    iti = sar.compute(data, shift=shift,
        period=period, cdate='date_received')

    # Reset index
    iti = iti.reset_index()

    # --------------
    # Plot
    # --------------
    # Create figure
    fig, axes = plt.subplots(2, 1, sharex=True,
        gridspec_kw={'height_ratios': [2, 1]})
    axes = axes.flatten()

    # Plot line
    sns.lineplot(x=iti.date_received, y=iti.sari,
        palette="tab10", linewidth=0.75, linestyle='--',
        marker='o', markersize=3, markeredgecolor='k',
        markeredgewidth=0.5, markerfacecolor=None,
        alpha=0.5, ax=axes[0])

    # Compute widths
    widths = [d.days for d in np.diff(iti.date_received.tolist())]

    # Plot bars
    axes[1].bar(x=iti.date_received, height=iti.freq,
        width=.8*widths[0], linewidth=0.75, alpha=0.5)

    # Configure
    axes[0].set(ylim=[-0.1, 1.1],
        title='Time-series $%s_{%s}$' % (shift, period))

    # Despine
    sns.despine(bottom=True)

    # Tight layout
    plt.tight_layout()

    # Show
    print("\nTemporal (ITI):")
    print(iti)



.. image-sg:: /_examples/indexes/images/sphx_glr_plot_sari_d_temporal_001.png
   :alt: Time-series $30D_{30D}$
   :srcset: /_examples/indexes/images/sphx_glr_plot_sari_d_temporal_001.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Temporal (ITI):
        specimen_code  microorganism_code  antimicrobial_code  date_received  intermediate  resistant  sensitive    freq    sari
    0          URICUL                ECOL                AAUG     2009-01-03           0.0        5.0       32.0    37.0  0.1351
    1          URICUL                ECOL                AAUG     2009-02-02           0.0        7.0      107.0   114.0  0.0614
    2          URICUL                ECOL                AAUG     2009-03-04           0.0       26.0      904.0   930.0  0.0280
    3          URICUL                ECOL                AAUG     2009-04-03           0.0       23.0      812.0   835.0  0.0275
    4          URICUL                ECOL                AAUG     2009-05-03           0.0       48.0      839.0   887.0  0.0541
    ..            ...                 ...                 ...            ...           ...        ...        ...     ...     ...
    81         URICUL                ECOL                AAUG     2015-08-30           3.0       72.0      606.0   681.0  0.1101
    82         URICUL                ECOL                AAUG     2015-09-29           0.0       97.0      586.0   683.0  0.1420
    83         URICUL                ECOL                AAUG     2015-10-29           2.0      117.0      606.0   725.0  0.1641
    84         URICUL                ECOL                AAUG     2015-11-28           0.0       85.0      658.0   743.0  0.1144
    85         URICUL                ECOL                AAUG     2015-12-28           0.0       32.0      196.0   228.0  0.1404

    [86 rows x 9 columns]




.. GENERATED FROM PYTHON SOURCE LINES 177-189

Overlapping Time Intervals (OTI)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

This method defines a window of fixed size which is moved across time to
compute consecutive resistance indexes. It is described by two parameters
(|SP|), where ``period`` denotes the length of the window and ``shift``
the distance between consecutive windows.

This approach is more versatile and allows a larger number of
susceptibility tests to be included when computing ``sari``. It is
therefore useful in scenarios in which pairs do not have a large number of
records in the dataset.

Note how each ``sari`` value is now computed from many more records (see
``freq``) than in the previous example.

.. GENERATED FROM PYTHON SOURCE LINES 190-240

.. code-block:: default
    :lineno-start: 191


    # -------------------------------------------
    # Compute OTI sari (temporal)
    # -------------------------------------------
    # Variables
    shift, period = '30D', '180D'

    # Compute sari timeseries
    oti = sar.compute(data, shift=shift,
        period=period, cdate='date_received')

    # Reset index
    oti = oti.reset_index()

    # --------------
    # Plot
    # --------------
    # Create figure
    fig, axes = plt.subplots(2, 1, sharex=True,
        gridspec_kw={'height_ratios': [2, 1]})
    axes = axes.flatten()

    # Plot line
    sns.lineplot(x=oti.date_received, y=oti.sari,
        palette="tab10", linewidth=0.75, linestyle='--',
        marker='o', markersize=3, markeredgecolor='k',
        markeredgewidth=0.5, markerfacecolor=None,
        alpha=0.5, ax=axes[0])

    # Compute widths
    widths = [d.days for d in np.diff(oti.date_received.tolist())]

    # Plot bars
    axes[1].bar(x=oti.date_received, height=oti.freq,
        width=.8*widths[0], linewidth=0.75, alpha=0.5)

    # Configure
    axes[0].set(ylim=[-0.1, 1.1],
        title='Time-series $%s_{%s}$' % (shift, period))

    # Despine
    sns.despine(bottom=True)

    # Tight layout
    plt.tight_layout()

    # Show
    print("\nTemporal (OTI):")
    print(oti)



.. image-sg:: /_examples/indexes/images/sphx_glr_plot_sari_d_temporal_002.png
   :alt: Time-series $30D_{180D}$
   :srcset: /_examples/indexes/images/sphx_glr_plot_sari_d_temporal_002.png
   :class: sphx-glr-single-img


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    Temporal (OTI):
        specimen_code  microorganism_code  antimicrobial_code  date_received  intermediate  resistant  sensitive    freq    sari
    0          URICUL                ECOL                AAUG     2009-01-03           0.0        5.0       32.0    37.0  0.1351
    1          URICUL                ECOL                AAUG     2009-02-02           0.0       12.0      139.0   151.0  0.0795
    2          URICUL                ECOL                AAUG     2009-03-04           0.0       38.0     1043.0  1081.0  0.0352
    3          URICUL                ECOL                AAUG     2009-04-03           0.0       61.0     1855.0  1916.0  0.0318
    4          URICUL                ECOL                AAUG     2009-05-03           0.0      109.0     2694.0  2803.0  0.0389
    ..            ...                 ...                 ...            ...           ...        ...        ...     ...     ...
    81         URICUL                ECOL                AAUG     2015-08-30           8.0      479.0     2999.0  3486.0  0.1397
    82         URICUL                ECOL                AAUG     2015-09-29           5.0      479.0     3006.0  3490.0  0.1387
    83         URICUL                ECOL                AAUG     2015-10-29           7.0      494.0     3098.0  3599.0  0.1392
    84         URICUL                ECOL                AAUG     2015-11-28           7.0      474.0     3279.0  3760.0  0.1279
    85         URICUL                ECOL                AAUG     2015-12-28           7.0      452.0     3103.0  3562.0  0.1289

    [86 rows x 9 columns]




.. GENERATED FROM PYTHON SOURCE LINES 241-275

Important considerations
~~~~~~~~~~~~~~~~~~~~~~~~

.. warning ::

    ``rolling(window=w)`` simply applies the function on windows of ``w``
    consecutive rows; it does not take the dates into consideration. Thus,
    it could be applying the mean operation on the months Jan, Feb and Apr
    if there was no data, and therefore no entry in the dataframe, for
    March. This issue can be addressed in two different ways:

    - setting ``date_received`` as the index and using a time-based window
      (e.g. ``w='90D'``; pandas requires a fixed-frequency offset here, so
      ``'3M'`` is not accepted).
    - resampling the dataframe so that all date entries appear. Note that,
      depending on the function applied, we might want to fill gaps with
      different values (e.g. NaN, 0, ...) ``Implemented``

As mentioned before, the |SP| notation can express both strategies for
generating time series (ITI and OTI). The ITI strategy is limited in the
number of samples that can be used to compute each index, so you have to
trade granularity against accuracy, whereas the OTI strategy is more
flexible. For instance, in examples with a low number of records ``sari``
might go up from barely 0.1 (in ITI) to 0.4 (in OTI) when more records are
used. The most noticeable increase occurs from |1M1| to |1M3|, that is,
when records for three months are considered instead of records for one
month, and the values become roughly stable at around |1M6|.

.. note ::

    - Ideally, ``shift`` and ``period`` should use the same unit (e.g. ``D``).
    - ``period`` should always be at least as large as ``shift``.

.. GENERATED FROM PYTHON SOURCE LINES 276-338

.. code-block:: default
    :lineno-start: 278


    # --------------------------------
    # Comparison
    # --------------------------------
    # Configuration
    shift = '30D'

    # Create figure
    f, axes = plt.subplots(2, 2, figsize=(10, 6), sharey=True)
    axes = axes.flatten()

    # Loop
    for period in ['30D', '90D', '180D', '365D']:

        # Compute sari time-series
        iti = sar.compute(data, shift=period,
            period=period, cdate='date_received')
        oti = sar.compute(data, shift=shift,
            period=period, cdate='date_received')

        # Compute rolling mean
        iti['sari_rolling'] = iti.sari.rolling(window=3,
            win_type='gaussian', min_periods=1).mean(std=3)
        oti['sari_rolling'] = oti.sari.rolling(window=3,
            win_type='gaussian', min_periods=1).mean(std=3)

        # Plot (the period already carries its unit, e.g. 30D)
        sns.lineplot(data=iti, x='date_received', y='sari',
            linewidth=0.75, linestyle='--', ax=axes[0],
            marker='o', markersize=3, markeredgecolor='k',
            markeredgewidth=0.5, markerfacecolor=None,
            alpha=0.5, label='$%s_{%s}$' % (period, 1))

        sns.lineplot(data=oti, x='date_received', y='sari',
            linewidth=0.75, linestyle='--', ax=axes[1],
            marker='o', markersize=3, markeredgecolor='k',
            markeredgewidth=0.5, markerfacecolor=None,
            alpha=0.5, label='$%s_{%s}$' % (shift, period))

        sns.lineplot(data=iti, x='date_received', y='sari_rolling',
            linewidth=0.75, ax=axes[2],
            label='$%s_{%s}$ - smooth' % (period, 1))

        sns.lineplot(data=oti, x='date_received', y='sari_rolling',
            linewidth=0.75, ax=axes[3],
            label='$%s_{%s}$ - smooth' % (shift, period))

    # Configure
    sns.despine(bottom=True)

    # Configure axes
    axes[0].set(title='Independent Time Intervals')
    axes[1].set(title='Overlapping Time Intervals')

    # Adjust
    plt.tight_layout()

    # Show
    plt.show()



.. image-sg:: /_examples/indexes/images/sphx_glr_plot_sari_d_temporal_003.png
   :alt: Independent Time Intervals, Overlapping Time Intervals
   :srcset: /_examples/indexes/images/sphx_glr_plot_sari_d_temporal_003.png
   :class: sphx-glr-single-img


.. GENERATED FROM PYTHON SOURCE LINES 339-341

Plotting multiple pairs using FacetGrid.

.. GENERATED FROM PYTHON SOURCE LINES 342-377

.. code-block:: default
    :lineno-start: 343


    """
    # ----------------------
    # Facet Grid
    # ----------------------
    # Show
    print(aux)

    # Create palette
    pal = sns.cubehelix_palette(50, rot=-.25, light=.7)

    # Create
    g = sns.FacetGrid(data=aux, col="antimicrobial_code",
        hue="antimicrobial_code", aspect=1.2, height=2,
        sharex=True, sharey=True, palette=pal, col_wrap=8)

    # Plot line
    g.map(sns.lineplot, "date_received", 'sari', alpha=1, linewidth=1.5)
    """

    """
    g.map(plt.fill_between,
        aux.date_received.values,
        aux.sari.values)
    """

    """
    #g.map(label, "x")
    #g.set_titles("")
    #g.set(yticks=[])
    g.set(xlabel='date')
    g.despine(bottom=True, left=True)

    # Show
    plt.show()
    """


.. rst-class:: sphx-glr-script-out

 .. code-block:: none


    '\n#g.map(label, "x")\n#g.set_titles("")\n#g.set(yticks=[])\ng.set(xlabel=\'date\')\ng.despine(bottom=True, left=True)\n\n# Show\nplt.show()\n'



.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 15.742 seconds)


.. _sphx_glr_download__examples_indexes_plot_sari_d_temporal.py:

.. only:: html

  .. container:: sphx-glr-footer sphx-glr-footer-example

    .. container:: sphx-glr-download sphx-glr-download-python

      :download:`Download Python source code: plot_sari_d_temporal.py `

    .. container:: sphx-glr-download sphx-glr-download-jupyter

      :download:`Download Jupyter notebook: plot_sari_d_temporal.ipynb `


.. only:: html

 .. rst-class:: sphx-glr-signature

    `Gallery generated by Sphinx-Gallery `_