Note
Click here to download the full example code
Drug Resistance Index
The DRI
measures changes through time in the proportion of disease-causing
pathogens that are resistant to the antibiotics commonly used to treat them.
The annual percentage change in the DRI
is a measure of the rate of depletion
of antibiotic effectiveness.
Since antibiotic use may change over time in response to changing levels of
antibiotic resistance, we compare trends in the index with the counterfactual
case, where antibiotic use remains fixed to a baseline year. A static-use DRI
allows assessment of the extent to which drug use has adapted in response to
resistance and the burden that this resistance would have caused if antibiotic
use patterns had not changed. Changing antibiotic use patterns over time may
mitigate the burden of antibiotic resistance. To incorporate changing trends
in antibiotic use, we also construct an adaptive version of the DRI
.
where
t is time
i is the microorganism
k is the antimicrobial
p is the proportion of resistance among organism i to drug k at time t
q is the frequency of drug k used to treat organism in at time t.
See below for a few resources.
51 # Libraries
52 import numpy as np
53 import pandas as pd
54 import matplotlib.pyplot as plt
55
56 from pathlib import Path
57
58 def print_example_heading(n, t=''):
59 print("\n" + "=" * 80 + "\nExample %s\n"%n + "=" * 80)
60
61 try:
62 __file__
63 TERMINAL = True
64 except:
65 TERMINAL = False
a) Example using CDDEP (tutorial)
The Drug Resistance Index or ``DRI``can be calculated at the facility or individual ward level to measure the performance of stewardship efforts. The index can be calculated in three simple steps using the pharmacy and antibiogram data available in most hospitals.
The CDDEP recommends to aggregate susceptibility tests results by species and class of antimicrobials (see note). Laboratories and hospital antibiograms report susceptibility results by individual antibiotic compound. However, certain drugs may be widely tested to proxy for the effectiveness of an entire class in vitro, although they are rarely used in clinical practice (e.g. oxacillin tests for all b-lactam activity against staphylococci, nalidixic axic for quinolone activity in Gram-negative uropathogens). In addition, not all drugs on formulary may be included in laboratory testing. Therefore, when matching resistance data to antibiotic utilization, we recommend grouping results by broader therapeutic class. The optimal grouping will depend on many factors such as the species being included in the index, facility formulary and testing practices etc.
89 # This sample data was copied from the CDDEP reference (R1) to
90 # evaluate whether the Drug Resistance Index is being computed
91 # correctly.
92 cddep = [
93 ['Q2 2011', 'E. Coli', 'AMINOPENICILLINS', 329, 139, 300, 1000],
94 ['Q2 2011', 'E. Coli', 'B-LACTAM, INCREASED ACTIVITY', 554, 72, 250, 1000],
95 ['Q2 2011', 'E. Coli', 'G3 CEPHALOSPORINS', 287, 3, 100, 1000],
96 ['Q2 2011', 'E. Coli', 'QUINOLONES, SYSTEMIC', 293, 41, 250, 1000],
97 ['Q2 2011', 'E. Coli', 'SULFA & TRIMETH COMB', 334, 75, 100, 1000],
98 ['Q3 2011', 'E. Coli', 'AMINOPENICILLINS', 231, 101, 250, 1100],
99 ['Q3 2011', 'E. Coli', 'B-LACTAM, INCREASED ACTIVITY', 408, 54, 300, 1100],
100 ['Q3 2011', 'E. Coli', 'G3 CEPHALOSPORINS', 211, 3, 150, 1100],
101 ['Q3 2011', 'E. Coli', 'QUINOLONES, SYSTEMIC', 218, 36, 300, 1100],
102 ['Q3 2011', 'E. Coli', 'SULFA & TRIMETH COMB', 236, 55, 100, 1100]
103 ]
104
105 # Create DataFrame
106 df1 = pd.DataFrame(cddep,
107 columns=['time', 'org', 'abx', 'isolates', 'R', 'use', 'use_period'])
108
109 # Format DataFrame
110 df1['S'] = df1.isolates - df1.R
111 df1['r_rate'] = (df1.R / (df1.R + df1.S)) #.round(decimals=2)
112 df1['u_weight'] = (df1.use / df1.use_period) #.round(decimals=2)
113 df1['w_rate'] = (df1.r_rate * df1.u_weight) #.round(decimals=3)
114 df1['dri'] = df1.groupby(by='time').w_rate.transform(lambda x: x.sum())
Lets see the results
118 if TERMINAL:
119 print_example_heading(n=1)
120 print('\nResult (CDDEP):')
121 print(df1)
122 df1.round(decimals=3)
b) Example using CDDEP (raw)
In the previous example we computed the Drug Resistance Index or DRI
using the summary table provided by CDDEP. While this table is useful
to understand the implementation of the Drug Resistance Index, the
values contained in the columns (e.g. isolates, R, use or use_period)
need to be computed from the raw susceptibility test data. This example
shows how to go from raw susceptibility test and prescription data to
the summary table in order to compute the DRI
.
First, let’s create the raw data from the summary table provided by CDDEP.
141 def create_raw_data_from_summary_table(df):
142 """Create raw susceptibility and prescription data from summary.
143
144 Parameters
145 ----------
146 df: pd.DataFrame
147 DataFrame with the summary data
148
149 Returns
150 -------
151
152 """
153 # Variable
154 keep = ['time', 'org', 'abx']
155
156 # Indexes
157 idxr = df.index.repeat(df.R)
158 idxs = df.index.repeat(df.S)
159 idxu = df.index.repeat(df.use)
160
161 # Compute partial DataFrames
162 r = df[keep].reindex(idxr).assign(sensitivity='resistant')
163 s = df[keep].reindex(idxs).assign(sensitivity='sensitive')
164 u = df[keep].reindex(idxu).assign(dose='standard')
165
166 # Return
167 return pd.concat([r, s]), u
168
169
170 # .. note:: Uncomment if you need to create a simulation of
171 # the raw susceptibility and prescription data based
172 # on the aggregated measurements.
173
174 # Create raw data.
175 susceptibility, prescription = \
176 create_raw_data_from_summary_table(df1)
177
178 # Save
179 #susceptibility.to_csv('./data/cddep/susceptibility.csv')
180 #prescription.to_csv('./data/cddep/prescription.csv')
Lets visualise microbiology data
184 if TERMINAL:
185 print_example_heading(n=2)
186 print('\nSusceptibility:')
187 print(susceptibility)
188 susceptibility.head(5)
Lets visualise the prescription data
192 if TERMINAL:
193 print('\nPrescription:')
194 print(prescription)
195 prescription.head(5)
Lets compute the DRI
200 # -------------------------------------------
201 # Compute SARI
202 # -------------------------------------------
203 # Libraries
204 from pyamr.core.sari import SARI
205
206 # Create sari instance
207 sari = SARI(groupby=['time',
208 'org',
209 'abx',
210 'sensitivity'])
211
212 # Compute SARI
213 df2 = sari.compute(susceptibility, return_frequencies=True)
214
215 # -------------------------------------------
216 # Compute DRI
217 # -------------------------------------------
218
219 def compute_drug_resistance_index(summry, groupby,
220 cu='use', cr='sari', ct='time'):
221 """Computes the Drug Resistance Index
222
223 Parameters
224 ----------
225 summry: pd.DataFrame
226 The summary DataFrame with columns including use,
227 resistance and time.
228 groupby:
229 The elements to groupby as described in pandas.
230 cu: str
231 Column name with use
232 cr: str
233 Column name with resistance
234 ct: str
235 Column name with time
236
237 Returns
238 -------
239 """
240 # Clone matrix
241 m = summry.copy(deep=True)
242
243 # Compute
244 m['use_period'] = m \
245 .groupby(level=groupby)[cu] \
246 .transform(lambda x: x.sum())
247 m['u_weight'] = (m[cu] / m.use_period) # .round(decimals=2)
248 m['w_rate'] = (m[cr] * m.u_weight) # .round(decimals=3)
249 m['dri'] = m \
250 .groupby(by=ct).w_rate \
251 .transform(lambda x: x.sum())
252
253 # Return
254 return m
255
256
257
258 # Compute the number of prescriptions
259 aux = prescription \
260 .groupby(by=['time', 'org', 'abx']) \
261 .dose.count().rename('use')
262
263 # Merge susceptibility and prescription data
264 df2 = df2.merge(aux, how='inner',
265 left_on=['time', 'org', 'abx'],
266 right_on=['time', 'org', 'abx'])
267
268 # Compute drug resistance index
269 df2 = compute_drug_resistance_index(df2, groupby=0)
Lets see the results
274 if TERMINAL:
275 print('\nResult (CDDEP):')
276 print(df1)
277 df2.round(decimals=3)
c) Example using MIMIC
In this example, we will compute the Drug Resistance Index or DRI
using
the freely-available MIMIC
database which provides both susceptibility
test data and prescriptions for a large number of patients admitted from
2008 to 2019. Please note that for de-identification purpose the dates were
shifted into the future.
Note
In this example we will be counting each entry in the prescription data as one ‘standard’ dose for clarity. However, this could be implemented more accurately by computing for example the complete dosage.
First, lets load the susceptibility and prescription data
300 # subset
301 subset = [
302 'time',
303 'laboratory_number',
304 'specimen_code',
305 'microorganism_code',
306 'antimicrobial_code',
307 'sensitivity'
308 ]
309
310 # -----------------------------
311 # Load susceptibility test data
312 # -----------------------------
313 # Load data
314 path = Path('./data/mimic')
315 filename = 'microbiologyevents.csv'
316
317 if (path / filename).exists():
318 data = pd.read_csv(path / 'microbiologyevents.csv')
319
320 # Rename columns
321 data = data.rename(columns={
322 'chartdate': 'time',
323 'micro_specimen_id': 'laboratory_number',
324 'spec_type_desc': 'specimen_code',
325 'org_name': 'microorganism_code',
326 'ab_name': 'antimicrobial_code',
327 'interpretation': 'sensitivity'
328 })
329
330 # Format data
331 data = data[subset]
332 data = data.dropna(subset=subset, how='any')
333 data.time = pd.to_datetime(data.time)
334 data.sensitivity = data.sensitivity.replace({
335 'S': 'sensitive',
336 'R': 'resistant',
337 'I': 'intermediate',
338 'P': 'pass'
339 })
340
341
342 # ------------------
343 # Load prescriptions
344 # ------------------
345 # Load prescription data (limited to first nrows).
346 use = pd.read_csv(path / 'prescriptions.csv', nrows=100000)
347
348 # Keep prescriptions which have also been tested
349 aux = use.copy(deep=True)
350 aux.drug = aux.drug.str.upper()
351 aux.starttime = pd.to_datetime(aux.starttime)
352 aux = aux[aux.drug.isin(data.antimicrobial_code.unique())]
353
354 # Rename variables
355 susceptibility, prescription = data, aux
356
357 #%%
358 # Lets visualise microbiology data
359 if TERMINAL:
360 print_example_heading(n=3)
361 print('\nSusceptibility:')
362 print(susceptibility)
363 susceptibility.head(5)
364
365 #%%
366 # Lets visualise the prescription data
367 if TERMINAL:
368 print('\nPrescription:')
369 print(prescription)
370 prescription.head(5)
371
372
373 # %%
374 # Lets compute the DRI
375
376 # -------------------------------------------
377 # Compute SARI
378 # -------------------------------------------
379
380 # .. note:: These are some possible options to group
381 # data by a datetime column.
382 #
383 # by year: data.time.dt.year
384 # by quarter: data.time.dt.to_period('Q')
385 # by decade: pd.Grouper(key='time', freq='10AS')
386 # by century: data.time.dt.year // 100
387
388
389 # Libraries
390 from pyamr.core.sari import SARI
391
392 # Create sari instance
393 sari = SARI(groupby=[data.time.dt.year,
394 #'specimen_code',
395 'microorganism_code',
396 'antimicrobial_code',
397 'sensitivity'])
398
399 # Compute SARI
400 df3 = sari.compute(susceptibility, return_frequencies=True)
401
402 # .. note: Uncomment this line to create random use if the
403 # prescription data is not available and we want
404 # to simulate it.
405
406 # Create random use
407 #df3['use'] = np.random.randint(10, 600, df3.shape[0])
408
409 # Compute the number of prescriptions
410 aux = prescription \
411 .groupby(by=[aux.starttime.dt.year, 'drug']) \
412 .drug.count().rename('use')
413 aux.index.names = ['time', 'antimicrobial_code']
414
415 # Merge susceptibility and prescription data
416 df3 = df3.reset_index().merge(aux.reset_index(),
417 how='inner',
418 left_on=['time', 'antimicrobial_code'],
419 right_on=['time', 'antimicrobial_code'])
420
421 # Format the summary DataFrame
422 df3 = df3.set_index(['time', 'microorganism_code', 'antimicrobial_code'])
423
424 # Compute drug resistance index
425 df3 = compute_drug_resistance_index(df3, groupby=0)
426
427 #%%
428 # Lets see the results
429 if TERMINAL:
430 print('\nResult (MIMIC):')
431 print(df1)
432 df3[['freq',
433 'sari',
434 'use',
435 'use_period',
436 'u_weight',
437 'w_rate',
438 'dri']].round(decimals=3)
439
440
441
442 #%%
443 # Lets display DRI over time
444 #
445
446 # Libraries
447 import matplotlib.pyplot as plt
448 import seaborn as sns
449
450 # Get first element by level.
451 df4 = df3.groupby(level=[0]).first().reset_index()
452
453 # Display using lineplot
454 #sns.lineplot(data=df4.dri)
455
456 # Display using plt.plot.
457 #plt.plot(df4.index, df4.dri, linewidth=0.75, markersize=3, marker='o')
458
459 # Display using relplot
460 sns.relplot(data=df4.reset_index(), x='time', y='dri',
461 #hue='event', style='event', col='region, palette='palette',
462 height=4, aspect=3.0, kind='line',
463 linewidth=0.75, markersize=3, marker='o'
464 )
465
466 # Show
467 plt.show()
468
469 #%%
470 # .. note:: In ``MIMIC``, the deidentification process for structured data required
471 # the removal of dates. In particular, dates were shifted into the future
472 # by a random offset for each individual patient in a consistent manner to
473 # preserve intervals, resulting in stays which occur sometime between the
474 # years 2100 and 2200. Time of day, day of the week, and approximate
475 # seasonality were conserved during date shifting.
476
477 ########################################################################
478 # d) Extra code
479
480
481 """
482 # Define data
483 data = [
484
485 ['o1', 'a1', 2000, 0, 100, 10],
486 ['o1', 'a2', 2000, 0, 100, 9],
487 ['o1', 'a3', 2000, 0, 100, 8],
488
489 ['o1', 'a1', 2001, 50, 50, 10],
490 ['o1', 'a2', 2001, 60, 40, 9],
491 ['o1', 'a3', 2001, 70, 30, 8],
492
493 ['o1', 'a1', 2002, 50, 50, 10],
494 ['o1', 'a2', 2002, 60, 40, 10],
495 ['o1', 'a3', 2002, 70, 30, 10],
496
497 ['o1', 'a1', 2003, 50, 50, 10],
498 ['o1', 'a2', 2003, 60, 40, 11],
499 ['o1', 'a3', 2003, 70, 30, 12],
500
501 ['o1', 'a1', 2004, 50, 50, 16],
502 ['o1', 'a2', 2004, 60, 40, 22],
503 ['o1', 'a3', 2004, 70, 30, 30],
504
505 ['o1', 'a1', 2005, 30, 70, 16],
506 ['o1', 'a2', 2005, 30, 70, 22],
507 ['o1', 'a3', 2005, 30, 70, 30],
508
509 ['o1', 'a1', 2006, 50, 50, 16],
510 ['o1', 'a2', 2006, 60, 40, 22],
511 ['o1', 'a3', 2006, 70, 30, 22],
512
513 ['o1', 'a1', 2007, 100, 0, 10],
514 ['o1', 'a2', 2007, 100, 0, 9],
515 ['o1', 'a3', 2007, 100, 0, 8],
516
517 ['o2', 'a1', 2001, 50, 50, 16],
518 ['o2', 'a2', 2001, 60, 40, 22],
519 ['o2', 'a3', 2001, 70, 30, 22],
520 ]
521
522
523 # Create DataFrame
524 df = pd.DataFrame(data,
525 columns=['o', 'a', 'year', 'R', 'S', 'dose'])
526
527 # Format DataFrame
528 df.year = pd.to_datetime(df.year, format='%Y')
529 df['r_prop'] = df.R / (df.R + df.S) # resistance proportion
530 df['d_prop'] = df.dose / df.groupby(by=['o', 'year']).dose.transform('sum') # dose proportion
531
532 # Show
533 print("Data:")
534 print(df)
535
536 # Example 1: Fixed
537 # ----------------
538
539 # Define p and q
540 p = df.r_prop
541 q = df.d_prop
542
543 # Compute
544 r0 = (p * q).sum()
545 r1 = np.dot(p, q)
546 r2 = np.matmul(p, q)
547
548 # Show
549 print("\n" + "="*80 + "\nExample 1\n" + "="*80)
550 print(r0)
551 print(r1)
552 print(r2)
553
554
555
556 # Example 2: Manual
557 # -----------------
558
559 # Initialize
560 qik0 = None
561 v_fixed = []
562 v_evolv = []
563
564 # Loop
565 for i,g in df.groupby(by=['o', 'year']):
566 if qik0 is None:
567 qik0 = list(g.groupby(by='year'))[0][1].d_prop
568
569 v_fixed.append({'o': i[0], 't': i[1], 'v': np.dot(g.r_prop, qik0.T)})
570 v_evolv.append({'o': i[0], 't': i[1], 'v': np.dot(g.r_prop, g.d_prop)})
571
572 # Format
573 v_fixed = pd.DataFrame(v_fixed)
574 v_evolv = pd.DataFrame(v_evolv)
575 v_combn = v_evolv.merge(v_fixed, how='outer',
576 on=['o', 't'], suffixes=('_evolv', '_fixed'))
577
578 # Show
579 print("\n" + "="*80 + "\nExample 2\n" + "="*80)
580 print(v_combn)
581
582
583
584
585 # Example 3: Matrix
586 # -----------------
587 #
588
589 print("\n" + "="*80 + "\nExample 3\n" + "="*80)
590
591
592 m1 = pd.pivot_table(df, index='o', columns='a', values='r_prop')
593 m2 = pd.pivot_table(df, index='o', columns='a', values='d_prop')
594 print(m1)
595 print(m2)
596 print(m1*m2)
597
598 # Loop
599 for i,g in df.groupby(by='year'):
600 m1 = pd.pivot_table(g, index='o', columns='a', values='r_prop')
601 m2 = pd.pivot_table(g, index='o', columns='a', values='d_prop')
602 dri = np.dot(m1, m2.T)
603 print(i, dri)
604
605
606 """
607
608 """
609 # Compute
610 df3['use_period'] = df3 \
611 .groupby(level=0).use \
612 .transform(lambda x: x.sum())
613 df3['u_weight'] = (df3.use / df3.use_period) #.round(decimals=2)
614 df3['w_rate'] = (df3.sari * df3.u_weight) #.round(decimals=3)
615 df3['dri'] = df3 \
616 .groupby(by='time').w_rate \
617 .transform(lambda x: x.sum())
618 """
Total running time of the script: ( 0 minutes 0.051 seconds)