pyamr.datasets.microbiology package

Submodules

pyamr.datasets.microbiology.create_quickimport module

Functions:

create_registry(data[, keyword, keep])

Creates a registry from data.

pyamr.datasets.microbiology.create_quickimport.create_registry(data, keyword=None, keep=None)[source]

Creates a registry from data.

Parameters:
  • data (pd.DataFrame) – The data

  • keyword (string) – The keyword for the columns. All columns whose names start with this keyword are kept and used for the registry.

  • keep (list) – The list of columns to keep for the registry
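
The keyword and keep parameters can be read as two column selectors. The following is a minimal sketch of that selection in plain pandas, not the pyamr implementation: select_registry_columns is a hypothetical helper, the column names are made up, and both the combination of the two selectors and the dropping of duplicate rows are assumptions.

```python
import pandas as pd


def select_registry_columns(data, keyword=None, keep=None):
    """Hypothetical stand-in for the column selection described above."""
    columns = []
    if keyword is not None:
        # Columns whose name starts with the keyword.
        columns += [c for c in data.columns if c.startswith(keyword)]
    if keep is not None:
        # Explicitly requested columns (assumed to extend the selection).
        columns += [c for c in keep if c not in columns]
    # Assumption: a registry holds one row per unique combination.
    return data[columns].drop_duplicates().reset_index(drop=True)


# Made-up example data.
data = pd.DataFrame({
    'microorganism_code': ['ECOL', 'ECOL', 'SAUR'],
    'microorganism_name': ['escherichia coli', 'escherichia coli',
                           'staphylococcus aureus'],
    'specimen_code': ['URICUL', 'BLDCUL', 'BLDCUL'],
})

registry = select_registry_columns(data, keyword='microorganism')
print(registry)
```

With real data, the call would presumably be create_registry(data, keyword='microorganism'), using the signature documented above.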

pyamr.datasets.microbiology.create_susceptibility module

Functions:

create_antimicrobials_lookup_table(abxs)

Creates the lookup table for the antimicrobials.

create_microorganisms_lookup_table(orgs)

Creates the lookup table for the organisms.

pyamr.datasets.microbiology.create_susceptibility.create_antimicrobials_lookup_table(abxs)[source]

Creates the lookup table for the antimicrobials.

This method uses the information in the antibiotics dataframe and the information in the default antimicrobials registry to create a unique lookup table for the data.

Parameters:

abxs (pd.DataFrame) – The DataFrame with … The DataFrame must contain the following columns:

Returns:

Lookup table DataFrame with the following columns:

Return type:

pd.DataFrame

pyamr.datasets.microbiology.create_susceptibility.create_microorganisms_lookup_table(orgs)[source]

Creates the lookup table for the organisms.

This method uses the information in the organisms dataframe and the information in the default microorganisms registry to create a unique lookup table for the data.

Parameters:

orgs (pd.DataFrame) – The DataFrame with the organism genus and organism species for which the lookup table should be created. The DataFrame must contain the following columns:

  • microorganism_name
  • genus
  • species

Returns:

Lookup table DataFrame with the following columns:

  • domain
  • phylum
  • class
  • order
  • family
  • genus
  • species
  • acronym
  • exists_in_registry
  • gram_stain
  • microorganism_code
  • microorganism_name

Return type:

pd.DataFrame
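
The idea of combining the input with a registry can be illustrated with a left merge in plain pandas. This is only a sketch under stated assumptions: the toy registry below, its gram_stain values and the choice of genus and species as join keys are all hypothetical, and the real function may build the table differently.

```python
import pandas as pd

# Toy registry (hypothetical); the default pyamr registry is not shown here.
registry = pd.DataFrame({
    'genus': ['escherichia', 'staphylococcus'],
    'species': ['coli', 'aureus'],
    'gram_stain': ['n', 'p'],
})

# Input with the three required columns.
orgs = pd.DataFrame({
    'microorganism_name': ['escherichia coli', 'staphylococcus aureus',
                           'paenibacillus sp'],
    'genus': ['escherichia', 'staphylococcus', 'paenibacillus'],
    'species': ['coli', 'aureus', None],
})

# Left merge keeps every input organism; the indicator column records
# whether a matching registry row was found.
lookup = orgs.drop_duplicates().merge(
    registry, how='left', on=['genus', 'species'], indicator=True)
lookup['exists_in_registry'] = lookup['_merge'] == 'both'
lookup = lookup.drop(columns='_merge')
print(lookup)
```

Organisms missing from the registry keep their row, with empty registry fields and exists_in_registry set to False.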

pyamr.datasets.microbiology.quickimport_create module

pyamr.datasets.microbiology.quickimport_run module

Functions:

quickimport(path, connection)

pyamr.datasets.microbiology.quickimport_run.quickimport(path, connection)[source]
Parameters:
  • path

  • connection

Returns:

pyamr.datasets.microbiology.test module

import pandas as pd

# -------------------------------------------------
# Test to_csv date_format
# -------------------------------------------------
# Create DataFrame
a = pd.DataFrame()

# Create dates
a['dates'] = ['23/01/2015 19:37',
              '23/01/2015 20:08']

# Format dates
a.dates = pd.to_datetime(a.dates)

# Save
print("\nDF:")
print(a)
print("Saving...")
#a.to_csv('test-v0.1.csv')
#a.to_csv('test-v0.2.csv', date_format='%Y-%m-%d %H:%M:%S')

# -------------------------------------------------
# Test cleaning / replacing
# -------------------------------------------------
# Create DataFrame
a = pd.DataFrame()

# Create regexp map
REGEX_MAP = {
    r'\([^)]*\)': '',                      # Remove everything between ().
    r'species': '',                        # Rename species for next regexp
    r'sp(\.)?(\s|$)+': ' ',                # Remove sp from word.
    r'strep(\.|\s|$)': 'streptococcus ',   # Complete
    r'staph(\.|\s|$)': 'staphylococcus ',  # Complete
    r'\s+': ' '                            # Remove duplicated spaces.
}

# Create data
a['spaces'] = [' in between ', ' sides ', 'end ', ' start', None]
a['species'] = [' sp.', ' sp', 'sp ', ' sp. ', 'species']
a['occus'] = ['haemolytic streptococcus',
              'haemolytic strep',
              'haemolytic strep.',
              'haemolytic strep. aureus',
              'haemolytic strep aureus']
a['occus2'] = ['strep.aureus',
               'staph.aureus',
               'methicillin resistant staph.aureus',
               'feo strepococcus',
               'streptococcus feo']

# Cleaned
cleaned = a.copy(deep=True)
cleaned = cleaned.replace(regex=REGEX_MAP)
cleaned = cleaned.apply(lambda x: x.str.strip() if x.dtype == "object" else x)

# Show
print("-"*80)
print("\nOriginal")
print(a)
print("\nCleaned")
print(cleaned)

# ---------------------------------------------------
# Haemolytic
# ---------------------------------------------------
# .. note: https://regex101.com/r/KFXCCM/1

# ----------------------------------------------------
# Test genus at the beginning
# ----------------------------------------------------
# Import regular expressions
import re

# Import function
from pyamr.datasets.clean import word_to_start

# Species examples
series = pd.Series(['is viridians, enteroccocus en',
                    'is viridians, enteroccocus',
                    'viridians enteroccocus',
                    'enterococcus viridians',
                    'non haemolytic feo enteroccocus',
                    'non-haemolytic enteroccocus',
                    'non haemolytic enteroccocus',
                    'vancomycin resistant enteroccocus',
                    None])

# Corrected
corrected = series.apply(word_to_start, w='enteroccocus')

# Show
print("-"*80)
print("\nRaw")
print(series)
print("\nCorrected")
print(corrected)

# ----------------------------------------------------
# Test hyphen
# ----------------------------------------------------
# Import
from pyamr.datasets.clean import hyphen_before

# Haemolytic examples
series = pd.Series(['non haemolytic something',
                    ' non haemolytic something',
                    ' when beta haemolytic something',
                    ' whn gamma haemolytic',
                    'non haemolytic'])

# Correct it
corrected = series.apply(hyphen_before, w='haemolytic')

# Show
print("-"*80)
print("\nRaw:")
print(series)
print("\nCorrected:")
print(corrected)

# ----------------------------------------------------
# Full test
# ----------------------------------------------------
# In this section, we do a full test, especially for
# those examples that have proven to be problematic in
# the final microorganisms.csv outcome.
# Import clean common
from pyamr.datasets.clean import clean_common
from pyamr.datasets.registries import _clean_microorganism

# Create dataframe
df = pd.DataFrame()

# Add microorganism names.
df['microorganism_name'] = [
    'non haemolytic streptococcus aureus',
    'this non haemolytic streptococcus',
    'beta-haemolytic streptococcus group a',
    'beta-haemolytic streptococcus group b',
    'beta-haemolytic streptococcus group c',
    'beta-haemolytic streptococcus group c/g',
    'beta-haemolytic streptococcus group g',
    'this is haemolyticus',
    'perestreptococcus',
    'Coagulase negative staphylococcus',
    'Methicillin Resistant Staph.aureus',
    'mixed streptococcus alpha-haemolytic',
    'non-coliform lactose fermenting',
    'Non-haemolytic streptococcus',
    'Non-lactose Fermenting Coliform',
    'Vancomycin Resistant Enterococcus',
    ' non- lactose fermenting coliform',
    'non-lactose fermenting coliform',
    'mixed lactose fermenting coliform',
    'paenibacillus sp',
    'paenibacillus sp.',
    'paenibacillus sp..',
    'escherichia coli o157',
    '* mrsa * isolated',
    'streptococcus milleri group',
    'aspergillus fumigatus'
]

# Any alterations
df.microorganism_name = df.microorganism_name.str.upper()

# Clean
#aux = clean_common(df.copy(deep=True))
aux = _clean_microorganism(df.microorganism_name)

# Show
print("-"*80)
print("\nData:")
print(df)
print("\nCorrected:")
print(aux)
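
The hyphen and genus tests above depend on helpers imported from pyamr.datasets.clean, whose code is not shown on this page. To try the same ideas without pyamr installed, the sketch below uses two hypothetical stand-ins, hyphen_before_sketch and word_to_start_sketch. Their behaviour is an assumption inferred from the helper names and the test inputs, so the real helpers may return different results.

```python
import re

import pandas as pd


def hyphen_before_sketch(x, w):
    """Join w to the preceding word with a hyphen (hypothetical stand-in)."""
    if not isinstance(x, str):
        return x
    # Replace the whitespace between the previous word and w with a hyphen.
    return re.sub(r'(\w)\s+(%s\b)' % re.escape(w), r'\1-\2', x)


def word_to_start_sketch(x, w):
    """Move the whole word w to the front (hypothetical stand-in)."""
    pattern = r'\b%s\b' % re.escape(w)
    if not isinstance(x, str) or not re.search(pattern, x):
        return x
    # Remove w, then rebuild the string with w as the first word.
    rest = re.sub(pattern, ' ', x)
    return ' '.join([w] + rest.split())


series = pd.Series(['non haemolytic something',
                    'viridians enteroccocus',
                    None])

print(series.apply(hyphen_before_sketch, w='haemolytic'))
print(series.apply(word_to_start_sketch, w='enteroccocus'))
```

Both stand-ins pass non-string values such as None through unchanged, so they can be applied to a Series with missing entries.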

Module contents