Step 02 - Temporal evolution

In this example, we will explore how to compute a time series from susceptibility test data and examine the different indexes that can be used. Through this analysis, we will gain a deeper understanding of how these indexes can be used to evaluate the evolving patterns of bacterial susceptibility and to guide effective antimicrobial therapy strategies.

In order to study the temporal evolution of AMR, it is necessary to generate a resistance time series from the susceptibility test data. This is often achieved by computing the resistance index on consecutive partitions of the data; each partition contains the susceptibility tests required to compute one resistance index. The traditional strategy of dealing with partitions considers independent time intervals (see yearly, monthly or weekly time series in Table 4.2). Unfortunately, this strategy forces a trade-off between granularity (level of detail) and accuracy: weekly time series are highly granular but inaccurate, whereas yearly time series are accurate but coarse. Note that granularity is represented by the number of observations in a time series, while accuracy is closely related to the number of susceptibility tests used to compute the resistance index. Conversely, the overlapping time intervals strategy avoids this trade-off by defining a window of fixed size which is moved across time. The length of the window is denoted as period and the time step as shift. For instance, three time series obtained using the overlapping time intervals strategy with a monthly shift (1M) and window lengths of 12, 6 and 3 months are presented for the sake of clarity (see 1M12, 1M6 and 1M3 in Table 4.2).

Generation of Time-Series

The notation used to define the time series generation methodology (SHIFTperiod) is described with various examples in Table 4.2. For instance, 1M12 defines a time series with monthly resistance indexes (1M shift) calculated using the microbiology records available for the previous twelve months (12x1M window). It is important to note that some notations are equivalent representations of the same susceptibility data at different granularities, hence their trends are comparable. As an example, the trend (change per step) estimated for 1M1 should be approximately thirty times the trend estimated for 1D30, since each monthly step spans roughly thirty daily steps.
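To make the two strategies concrete, here is a minimal pandas sketch on synthetic data. The toy DataFrame, its column names and the ~10% resistance rate are illustrative assumptions and not part of the pyAMR API; the SARI class used later in this example provides this functionality through its shift and period arguments.

import numpy as np
import pandas as pd

# Synthetic susceptibility tests: one row per test (illustrative only).
rng = np.random.default_rng(0)
n = 5000
tests = pd.DataFrame({
    'date': pd.to_datetime('2009-01-01')
            + pd.to_timedelta(rng.integers(0, 365, n), unit='D'),
    'resistant': rng.random(n) < 0.1,           # ~10% resistant
}).set_index('date').sort_index()

# Independent intervals: one resistance index per calendar month (1M1).
counts = tests.resample('MS')['resistant'].agg(['sum', 'count'])
counts.columns = ['resistant', 'total']
counts['sari_1M1'] = counts.resistant / counts.total

# Overlapping intervals: monthly shift, three-month window (1M3).
# Summing the raw monthly counts over the last three bins and taking the
# ratio is the same as recomputing the index on the pooled 3-month window.
window = counts[['resistant', 'total']].rolling(window=3, min_periods=3).sum()
window['sari_1M3'] = window.resistant / window.total

print(counts[['sari_1M1']].join(window[['sari_1M3']]))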

Note

Computing the index directly over overlapping time intervals is not equivalent to smoothing an independent-interval series with a moving average. Pooling the raw susceptibility tests within each window gives every test the same weight, whereas a moving average of pre-computed interval indexes gives every interval the same weight regardless of how many tests it contains. When the number of tests varies over time, the pooled computation therefore represents the underlying data more faithfully.
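As a toy illustration (the counts below are made up): suppose month A yields 1 resistant isolate out of 10 tests and month B yields 90 out of 100. A two-month moving average of the monthly indexes gives 0.50, while recomputing the index on the pooled window gives 91/110 ≈ 0.83, which is much closer to what the 110 underlying tests actually show.

# Toy counts: month A -> 1/10 resistant, month B -> 90/100 resistant.
resistant = [1, 90]
total = [10, 100]

# Moving average of per-month indexes: each month weighted equally.
moving_average = sum(r / t for r, t in zip(resistant, total)) / 2   # 0.50

# Index recomputed on the pooled (overlapping) window: each test weighted equally.
pooled = sum(resistant) / sum(total)                                # ~0.827

print(moving_average, pooled)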

Loading data

A small dataset will be used for this example.

# Libraries
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib as mpl
import matplotlib.pyplot as plt

# Import from pyAMR
from pyamr.datasets.load import make_susceptibility

# -------------------------------------------
# Load data
# -------------------------------------------
# Load data
data = make_susceptibility()
data = data.drop_duplicates()

# Convert date to datetime
data.date_received = pd.to_datetime(data.date_received)

# Filter (speeds up the execution)
idxs_spec = data.specimen_code.isin(['URICUL'])
idxs_abxs = data.antimicrobial_name.isin(['augmentin'])

# Filter
data = data[idxs_spec & idxs_abxs]

# Show
print("\nData:")
print(data)
print("\nColumns:")
print(data.dtypes)
Data:
       date_received  date_outcome  patient_id laboratory_number specimen_code  specimen_name specimen_description  ... microorganism_name antimicrobial_code antimicrobial_name sensitivity_method  sensitivity mic  reported
113       2009-01-03           NaN       20099           X428892        URICUL            NaN                urine  ...           coliform               AAUG          augmentin                NaN    sensitive NaN       NaN
119       2009-01-03           NaN       20100           X429141        URICUL            NaN       catheter urine  ...           coliform               AAUG          augmentin                NaN    sensitive NaN       NaN
164       2009-01-03           NaN       22571           X429323        URICUL            NaN     mid stream urine  ...           coliform               AAUG          augmentin                NaN    sensitive NaN       NaN
268       2009-01-03           NaN       22576           X428467        URICUL            NaN     mid stream urine  ...   escherichia coli               AAUG          augmentin                NaN    sensitive NaN       NaN
293       2009-01-03           NaN       24017           X429325        URICUL            NaN    clean catch urine  ...   escherichia coli               AAUG          augmentin                NaN    sensitive NaN       NaN
...              ...           ...         ...               ...           ...            ...                  ...  ...                ...                ...                ...                ...          ...  ..       ...
318940    2009-12-31           NaN       20088          H2011867        URICUL            NaN     mid stream urine  ...   escherichia coli               AAUG          augmentin                NaN    sensitive NaN       NaN
318946    2009-12-31           NaN       20089          H2012653        URICUL            NaN     mid stream urine  ...   escherichia coli               AAUG          augmentin                NaN    sensitive NaN       NaN
318983    2009-12-31           NaN       22565          F1741389        URICUL            NaN     mid stream urine  ...   escherichia coli               AAUG          augmentin                NaN    sensitive NaN       NaN
319048    2009-12-31           NaN       24013          H2012150        URICUL            NaN                urine  ...       enterococcus               AAUG          augmentin                NaN    sensitive NaN       NaN
319070    2009-12-31           NaN       24015          H2012340        URICUL            NaN     mid stream urine  ...   escherichia coli               AAUG          augmentin                NaN    sensitive NaN       NaN

[15269 rows x 15 columns]

Columns:
date_received           datetime64[ns]
date_outcome                   float64
patient_id                       int64
laboratory_number               object
specimen_code                   object
specimen_name                  float64
specimen_description            object
microorganism_code              object
microorganism_name              object
antimicrobial_code              object
antimicrobial_name              object
sensitivity_method             float64
sensitivity                     object
mic                            float64
reported                       float64
dtype: object

Computing SARI timeseries

In order to study the temporal evolution of AMR, it is necessary to generate a resistance time series from the susceptibility test data. This is often achieved by calculating the resistance index (here, SARI) on consecutive partitions of the data. Note that each partition contains the susceptibility tests that will be used to compute the resistance index.

For more information see: pyamr.core.sari.SARI

First, let's compute the time series.

# -----------------------------------------
# Compute sari (temporal)
# -----------------------------------------
from pyamr.core.sari import SARI

# Create SARI instance
sar = SARI(groupby=['specimen_code',
                    'microorganism_name',
                    'antimicrobial_name',
                    'sensitivity'])

# Create constants
shift, period = '30D', '30D'

# Compute sari timeseries
iti = sar.compute(data, shift=shift,
    period=period, cdate='date_received')

# Reset index
iti = iti.reset_index()

# Show
#print("\nSARI (temporal):")
#print(iti)

iti.head(10)
  specimen_code   microorganism_name  antimicrobial_name date_received  resistant  sensitive  freq      sari
0        URICUL        acinetobacter           augmentin    2009-03-04        0.0        1.0   1.0  0.000000
1        URICUL        acinetobacter           augmentin    2009-05-03        0.0        1.0   1.0  0.000000
2        URICUL        acinetobacter           augmentin    2009-08-01        0.0        1.0   1.0  0.000000
3        URICUL  acinetobacter ba...          augmentin    2009-02-02        1.0        0.0   1.0  1.000000
4        URICUL  acinetobacter ba...          augmentin    2009-04-03        1.0        1.0   2.0  0.500000
5        URICUL  acinetobacter ba...          augmentin    2009-05-03        0.0        1.0   1.0  0.000000
6        URICUL  acinetobacter ba...          augmentin    2009-07-02        1.0        0.0   1.0  1.000000
7        URICUL  acinetobacter ba...          augmentin    2009-08-31        1.0        0.0   1.0  1.000000
8        URICUL          citrobacter           augmentin    2009-06-02        2.0        1.0   3.0  0.666667
9        URICUL          citrobacter           augmentin    2009-07-02        1.0        0.0   1.0  1.000000


Let’s plot the evolution of a single combination.

# --------------
# Filter
# --------------
# Constants
s, o, a = 'URICUL', 'escherichia coli', 'augmentin'

# Filter
idxs_spec = iti.specimen_code == s
idxs_orgs = iti.microorganism_name == o
idxs_abxs = iti.antimicrobial_name == a
aux = iti[idxs_spec & idxs_orgs & idxs_abxs]

# --------------
# Plot
# --------------
# Create figure
fig, axes = plt.subplots(2, 1, sharex=True,
    gridspec_kw={'height_ratios': [2, 1]})
axes = axes.flatten()

# Plot line
sns.lineplot(x=aux.date_received, y=aux.sari,
    linewidth=0.75, linestyle='--', #palette="tab10",
    marker='o', markersize=3, markeredgecolor='k',
    markeredgewidth=0.5, markerfacecolor=None,
    alpha=0.5, ax=axes[0])

# Compute widths
widths = [d.days for d in np.diff(aux.date_received.tolist())]

# Plot bars
axes[1].bar(x=aux.date_received, height=aux.freq,
    width=.8*widths[0], linewidth=0.75, alpha=0.5)

# Configure
axes[0].set(ylim=[-0.1, 1.1],
    title='[%s, %s, %s] with $%s_{%s}$' % (
        s, o.upper(), a.upper(), shift, period))

# Despine
sns.despine(bottom=True)

# Tight layout
plt.tight_layout()

# Show
#print("\nTemporal (ITI):")
#print(aux)
aux
[Figure: SARI evolution (top) and number of isolates (bottom) for [URICUL, ESCHERICHIA COLI, AUGMENTIN] with $30D_{30D}$]

   specimen_code microorganism_name antimicrobial_name date_received  resistant  sensitive   freq      sari
52        URICUL   escherichia coli          augmentin    2009-01-03        5.0       32.0   37.0  0.135135
53        URICUL   escherichia coli          augmentin    2009-02-02        7.0      107.0  114.0  0.061404
54        URICUL   escherichia coli          augmentin    2009-03-04       26.0      904.0  930.0  0.027957
55        URICUL   escherichia coli          augmentin    2009-04-03       23.0      812.0  835.0  0.027545
56        URICUL   escherichia coli          augmentin    2009-05-03       48.0      839.0  887.0  0.054115
57        URICUL   escherichia coli          augmentin    2009-06-02       64.0      861.0  925.0  0.069189
58        URICUL   escherichia coli          augmentin    2009-07-02       55.0      815.0  870.0  0.063218
59        URICUL   escherichia coli          augmentin    2009-08-01       58.0      811.0  869.0  0.066743
60        URICUL   escherichia coli          augmentin    2009-08-31       59.0      928.0  987.0  0.059777
61        URICUL   escherichia coli          augmentin    2009-09-30       53.0      924.0  977.0  0.054248
62        URICUL   escherichia coli          augmentin    2009-10-30       99.0      795.0  894.0  0.110738
63        URICUL   escherichia coli          augmentin    2009-11-29       98.0      648.0  746.0  0.131367
64        URICUL   escherichia coli          augmentin    2009-12-29        7.0       85.0   92.0  0.076087


Computing ASAI timeseries

Warning

It is important to take into account that computing this index, especially over a period of time, requires a large amount of consistent data. Ideally, all the species for the genus of interest should appear in all the time periods.
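A minimal sketch of such a coverage check, using the iti DataFrame computed in the previous section (the crosstab layout is just one convenient way to inspect it):

# Count how many isolates each microorganism contributes to each window.
# Empty (NaN) cells reveal organisms that are absent from some periods,
# which makes the temporal ASAI for their genus less reliable there.
coverage = pd.crosstab(iti.microorganism_name,
                       iti.date_received,
                       values=iti.freq, aggfunc='sum')
print(coverage)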

Once we have computed SARI in a temporal fashion, it is possible to use such information to compute ASAI in a temporal fashion too. However, as explained in the previous tutorial, in order to compute ASAI we need at least columns with the following information:

  • antimicrobial

  • microorganism genus

  • microorganism species

  • resistance

Moreover, in this example we will compute the ASAI for each gram_stain category independently, so we will also need the microorganism gram stain information. This information is available in the registries: pyamr.datasets.registries.

Let's include all this information using the MicroorganismRegistry.

# ------------------------------
# Include gram stain
# ------------------------------
# Libraries
from pyamr.datasets.registries import MicroorganismRegistry

# Load registry
mreg = MicroorganismRegistry()

# Format sari dataframe
dataframe = iti.copy(deep=True)
dataframe = dataframe.reset_index()

# Create genus and species
dataframe[['genus', 'species']] = \
    dataframe.microorganism_name \
        .str.capitalize() \
        .str.split(expand=True, n=1)

# Combine with registry information
dataframe = mreg.combine(dataframe, on='microorganism_name')

# Fill missing gram stain
dataframe.gram_stain = dataframe.gram_stain.fillna('u')

# Show
dataframe.head(4).T
                                      0                    1                    2                    3
index                                 0                    1                    2                    3
specimen_code                    URICUL               URICUL               URICUL               URICUL
microorganism_name        acinetobacter        acinetobacter        acinetobacter  acinetobacter ba...
antimicrobial_name            augmentin            augmentin            augmentin            augmentin
date_received       2009-03-04 00:00:00  2009-05-03 00:00:00  2009-08-01 00:00:00  2009-02-02 00:00:00
resistant                           0.0                  0.0                  0.0                  1.0
sensitive                           1.0                  1.0                  1.0                  0.0
freq                                1.0                  1.0                  1.0                  1.0
sari                                0.0                  0.0                  0.0                  1.0
genus                     Acinetobacter        Acinetobacter        Acinetobacter        Acinetobacter
species                            None                 None                 None            baumannii
domain                         Bacteria             Bacteria             Bacteria             Bacteria
phylum                   Proteobacteria       Proteobacteria       Proteobacteria       Proteobacteria
class               Gammaproteobacteria  Gammaproteobacteria  Gammaproteobacteria  Gammaproteobacteria
order                   Pseudomonadales      Pseudomonadales      Pseudomonadales      Pseudomonadales
family                    Moraxellaceae        Moraxellaceae        Moraxellaceae        Moraxellaceae
acronym                   ACINETOBACTER        ACINETOBACTER        ACINETOBACTER            ACIN_BAUM
gram_stain                            n                    n                    n                    n
exists_in_registry                 True                 True                 True                 True


Now that we have the genus, species and gram_stain information, let's see how to compute ASAI in a temporal fashion with an example. It is important to highlight that the date (date_received) is now also included in the groupby parameter when calling the compute method.

For more information see: pyamr.core.asai.ASAI

# -------------------------------------------
# Compute ASAI
# -------------------------------------------
# Import specific libraries
from pyamr.core.asai import ASAI

# Create asai instance
asai = ASAI(column_genus='genus',
            column_specie='species',
            column_resistance='sari',
            column_frequency='freq')

# Compute
scores = asai.compute(dataframe,
    groupby=['date_received',
             'specimen_code',
             'antimicrobial_name',
             'gram_stain'],
    weights='uniform',
    threshold=0.5,
    min_freq=0)

# Stack
scores = scores

# Show
print("\nASAI (overall):")
print(scores.unstack())
scores.unstack()
c:\users\kelda\desktop\repositories\github\pyamr\main\pyamr\core\asai.py:572: UserWarning:


                 Extreme resistances [0, 1] were found in the DataFrame. These
                 rows should be reviewed since these resistances might correspond
                 to pairs with low number of records.


c:\users\kelda\desktop\repositories\github\pyamr\main\pyamr\core\asai.py:583: UserWarning:


                 There are NULL values in columns that are required. These
                 rows will be ignored to safely compute ASAI. Please review
                 the DataFrame and address this inconsistencies. See below
                 for more information:

                        date_received          0
                        specimen_code          0
                        antimicrobial_name     0
                        gram_stain             0
                        GENUS                  0
                        SPECIE                89
                        RESISTANCE             0



ASAI (overall):
                                               N_GENUS      N_SPECIE      ASAI_SCORE
gram_stain                                           n    p        n    p          n     p
date_received specimen_code antimicrobial_name
2009-01-03    URICUL        augmentin              2.0  2.0      2.0  3.0   0.500000  1.00
2009-02-02    URICUL        augmentin              2.0  2.0      2.0  3.0   0.500000  0.75
2009-03-04    URICUL        augmentin              1.0  2.0      1.0  5.0   1.000000  1.00
2009-04-03    URICUL        augmentin              3.0  2.0      3.0  6.0   0.333333  1.00
2009-05-03    URICUL        augmentin              3.0  2.0      3.0  5.0   0.666667  1.00
2009-06-02    URICUL        augmentin              3.0  2.0      3.0  4.0   0.333333  1.00
2009-07-02    URICUL        augmentin              4.0  2.0      4.0  6.0   0.250000  1.00
2009-08-01    URICUL        augmentin              1.0  2.0      1.0  5.0   1.000000  1.00
2009-08-31    URICUL        augmentin              2.0  2.0      2.0  4.0   0.500000  1.00
2009-09-30    URICUL        augmentin              1.0  2.0      1.0  6.0   1.000000  1.00
2009-10-30    URICUL        augmentin              3.0  2.0      3.0  5.0   0.333333  0.75
2009-11-29    URICUL        augmentin              2.0  2.0      2.0  4.0   0.500000  1.00
2009-12-29    URICUL        augmentin              1.0  2.0      1.0  4.0   1.000000  1.00


Let's plot the evolution for both gram stains.

# Libraries
import calendar

# Month numbers to abbr
def month_abbr(v):
    return [calendar.month_abbr[x] for x in v]

# --------------
# Filter
# --------------
#
s, a = 'URICUL', 'augmentin'
# Filter and drop index.
scores = scores.filter(like=s, axis=0)
scores = scores.filter(like=a, axis=0)
scores.index = scores.index.droplevel(level=[1, 2])

# ----------
# Plot
# ----------
# Initialize the matplotlib figure
f, ax = plt.subplots(1, figsize=(10, 5))

# Show
sns.lineplot(data=scores, x='date_received', y='ASAI_SCORE',
             hue='gram_stain', palette="tab10", linewidth=0.75,
             linestyle='--', marker='o', markersize=3,
             markeredgecolor='k', markeredgewidth=0.5,
             markerfacecolor=None, alpha=0.5, ax=ax)

# Create aux table for visualization
aux = scores[['N_GENUS', 'N_SPECIE']] \
    .unstack().T.round(0) \
    .astype(str).replace({'nan': '-'})

# Rename columns
#aux.columns = month_abbr(range(1, len(aux.columns)+1))

# Draw table
table = ax.table(cellText=aux.to_numpy(),
                 rowLabels=aux.index,
                 colLabels=aux.columns.date,
                 cellLoc='center',
                 loc='bottom')
table.auto_set_font_size(False)
table.set_fontsize(7.5)
table.scale(1, 1.2)

# Sns config
sns.despine(left=True, bottom=True)

# Add a legend and informative axis label
ax.set(xlabel='', ylabel='ASAI', xticks=[],
       title="[%s, %s] with $%s_{%s}$" % (
        s, a.upper(), shift, period))

# Tight layout
plt.tight_layout()

# Show
plt.show()

# Show
#print("\nASAI (overall):")
#print(scores.unstack())

scores.unstack()
[Figure: ASAI evolution by gram stain for [URICUL, AUGMENTIN] with $30D_{30D}$, with the per-window N_GENUS/N_SPECIE counts shown as a table below the axis]

                  N_GENUS      N_SPECIE      ASAI_SCORE
gram_stain         n    p       n    p            n     p
date_received
2009-01-03       2.0  2.0     2.0  3.0     0.500000  1.00
2009-02-02       2.0  2.0     2.0  3.0     0.500000  0.75
2009-03-04       1.0  2.0     1.0  5.0     1.000000  1.00
2009-04-03       3.0  2.0     3.0  6.0     0.333333  1.00
2009-05-03       3.0  2.0     3.0  5.0     0.666667  1.00
2009-06-02       3.0  2.0     3.0  4.0     0.333333  1.00
2009-07-02       4.0  2.0     4.0  6.0     0.250000  1.00
2009-08-01       1.0  2.0     1.0  5.0     1.000000  1.00
2009-08-31       2.0  2.0     2.0  4.0     0.500000  1.00
2009-09-30       1.0  2.0     1.0  6.0     1.000000  1.00
2009-10-30       3.0  2.0     3.0  5.0     0.333333  0.75
2009-11-29       2.0  2.0     2.0  4.0     0.500000  1.00
2009-12-29       1.0  2.0     1.0  4.0     1.000000  1.00


Further considerations

Warning

Pending!
