.. DO NOT EDIT. .. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY. .. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE: .. "_examples/pandas/plot_main01.py" .. LINE NUMBERS ARE GIVEN BELOW. .. only:: html .. note:: :class: sphx-glr-download-link-note Click :ref:`here <sphx_glr_download__examples_pandas_plot_main01.py>` to download the full example code .. rst-class:: sphx-glr-example-title .. _sphx_glr__examples_pandas_plot_main01.py: 99. Basic Example ================== .. GENERATED FROM PYTHON SOURCE LINES 5-41 .. code-block:: default :lineno-start: 6 # Library import numpy as np import pandas as pd # Show in terminal TERMINAL = False # Create data data = [ ['p1', '1/5/2021', 1, 2, 3], ['p1', '2/5/2021', 3, 3, 3], ['p1', '3/5/2021', 4, 4, 4], ['p1', '5/5/2021', 5, 5, 5], ['p2', '11/5/2021', 5, 3, 3], ['p2', '12/5/2021', 4, 3, None], ['p2', '16/5/2021', None, 1, None], # unordered ['p2', '15/5/2021', 5, 2, 4], ] # Load DataFrame data = pd.DataFrame(data, columns=['patient', 'date', 'plt', 'hct', 'bil']) # Format datetime # Date will be a datetime64[ns] instead of string data.date = pd.to_datetime(data.date, dayfirst=True) data.date = data.date.dt.normalize() # Show if TERMINAL: print("\nData:") print(data) data .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>date</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>p1</td> <td>2021-05-01</td> <td>1.0</td> <td>2</td> <td>3.0</td> </tr> <tr> <th>1</th> <td>p1</td> <td>2021-05-02</td> <td>3.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>2</th> <td>p1</td> <td>2021-05-03</td> <td>4.0</td> <td>4</td> <td>4.0</td> </tr> <tr> <th>3</th> <td>p1</td> <td>2021-05-05</td> <td>5.0</td> <td>5</td> <td>5.0</td> </tr> <tr> <th>4</th> <td>p2</td> <td>2021-05-11</td> <td>5.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>5</th> <td>p2</td> <td>2021-05-12</td> <td>4.0</td> <td>3</td> <td>NaN</td> </tr> <tr> <th>6</th> <td>p2</td> <td>2021-05-16</td> <td>NaN</td> <td>1</td> <td>NaN</td> </tr> <tr> <th>7</th> <td>p2</td> <td>2021-05-15</td> <td>5.0</td> <td>2</td> <td>4.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 42-43 Let's sort the values .. GENERATED FROM PYTHON SOURCE LINES 43-54 .. code-block:: default :lineno-start: 44 # Sort rows by plt, then hct; rows with NaN in plt are placed last. # Note that set_index() does not sort on its own; use sort_index() # to order rows by the index (e.g. the datetime). aux = data.sort_values(by=['plt', 'hct']) # Show if TERMINAL: print("\nOut:") print(aux) aux .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>date</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>p1</td> <td>2021-05-01</td> <td>1.0</td> <td>2</td> <td>3.0</td> </tr> <tr> <th>1</th> <td>p1</td> <td>2021-05-02</td> <td>3.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>5</th> <td>p2</td> <td>2021-05-12</td> <td>4.0</td> <td>3</td> <td>NaN</td> </tr> <tr> <th>2</th> <td>p1</td> <td>2021-05-03</td> <td>4.0</td> <td>4</td> <td>4.0</td> </tr> <tr> <th>7</th> <td>p2</td> <td>2021-05-15</td> <td>5.0</td> <td>2</td> <td>4.0</td> </tr> <tr> <th>4</th> <td>p2</td> <td>2021-05-11</td> <td>5.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>3</th> <td>p1</td> <td>2021-05-05</td> <td>5.0</td> <td>5</td> <td>5.0</td> </tr> <tr> <th>6</th> <td>p2</td> <td>2021-05-16</td> <td>NaN</td> <td>1</td> <td>NaN</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 55-56 Let's select columns .. GENERATED FROM PYTHON SOURCE LINES 56-66 .. code-block:: default :lineno-start: 57 # Select a subset of columns from the DataFrame aux = data[['patient', 'date', 'plt']] # Show if TERMINAL: print("\nOut:") print(aux) aux .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>date</th> <th>plt</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>p1</td> <td>2021-05-01</td> <td>1.0</td> </tr> <tr> <th>1</th> <td>p1</td> <td>2021-05-02</td> <td>3.0</td> </tr> <tr> <th>2</th> <td>p1</td> <td>2021-05-03</td> <td>4.0</td> </tr> <tr> <th>3</th> <td>p1</td> <td>2021-05-05</td> <td>5.0</td> </tr> <tr> <th>4</th> <td>p2</td> <td>2021-05-11</td> <td>5.0</td> </tr> <tr> <th>5</th> <td>p2</td> <td>2021-05-12</td> <td>4.0</td> </tr> <tr> <th>6</th> <td>p2</td> <td>2021-05-16</td> <td>NaN</td> </tr> <tr> <th>7</th> <td>p2</td> <td>2021-05-15</td> <td>5.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 67-69 Let's filter rows with boolean indexing (plt is not NaN) .. GENERATED FROM PYTHON SOURCE LINES 69-80 .. code-block:: default :lineno-start: 70 # Keep rows where plt is not NaN aux = data[data.plt.notna()] # Show if TERMINAL: print("\nOut:") print(aux) aux .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>date</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>p1</td> <td>2021-05-01</td> <td>1.0</td> <td>2</td> <td>3.0</td> </tr> <tr> <th>1</th> <td>p1</td> <td>2021-05-02</td> <td>3.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>2</th> <td>p1</td> <td>2021-05-03</td> <td>4.0</td> <td>4</td> <td>4.0</td> </tr> <tr> <th>3</th> <td>p1</td> <td>2021-05-05</td> <td>5.0</td> <td>5</td> <td>5.0</td> </tr> <tr> <th>4</th> <td>p2</td> <td>2021-05-11</td> <td>5.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>5</th> <td>p2</td> <td>2021-05-12</td> <td>4.0</td> <td>3</td> <td>NaN</td> </tr> <tr> <th>7</th> <td>p2</td> <td>2021-05-15</td> <td>5.0</td> <td>2</td> <td>4.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 81-83 Let's drop rows with NaN (in a subset of columns) .. GENERATED FROM PYTHON SOURCE LINES 83-94 .. code-block:: default :lineno-start: 84 # Keep rows with no NaN in the plt and bil columns aux = data.dropna(how='any', subset=['plt', 'bil']) # Show if TERMINAL: print("\nOut:") print(aux) aux .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>date</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>p1</td> <td>2021-05-01</td> <td>1.0</td> <td>2</td> <td>3.0</td> </tr> <tr> <th>1</th> <td>p1</td> <td>2021-05-02</td> <td>3.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>2</th> <td>p1</td> <td>2021-05-03</td> <td>4.0</td> <td>4</td> <td>4.0</td> </tr> <tr> <th>3</th> <td>p1</td> <td>2021-05-05</td> <td>5.0</td> <td>5</td> <td>5.0</td> </tr> <tr> <th>4</th> <td>p2</td> <td>2021-05-11</td> <td>5.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>7</th> <td>p2</td> <td>2021-05-15</td> <td>5.0</td> <td>2</td> <td>4.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 95-97 Let's drop rows with NaN (in any column) .. GENERATED FROM PYTHON SOURCE LINES 97-107 .. code-block:: default :lineno-start: 98 # Keep rows with no NaN in any column aux = data.dropna(how='any') # Show if TERMINAL: print("\nOut:") print(aux) aux .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>date</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> </thead> <tbody> <tr> <th>0</th> <td>p1</td> <td>2021-05-01</td> <td>1.0</td> <td>2</td> <td>3.0</td> </tr> <tr> <th>1</th> <td>p1</td> <td>2021-05-02</td> <td>3.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>2</th> <td>p1</td> <td>2021-05-03</td> <td>4.0</td> <td>4</td> <td>4.0</td> </tr> <tr> <th>3</th> <td>p1</td> <td>2021-05-05</td> <td>5.0</td> <td>5</td> <td>5.0</td> </tr> <tr> <th>4</th> <td>p2</td> <td>2021-05-11</td> <td>5.0</td> <td>3</td> <td>3.0</td> </tr> <tr> <th>7</th> <td>p2</td> <td>2021-05-15</td> <td>5.0</td> <td>2</td> <td>4.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 108-110 Let's resample to a daily frequency .. GENERATED FROM PYTHON SOURCE LINES 110-120 .. code-block:: default :lineno-start: 111 # Resample to a daily frequency; days without records become NaN rows aux = data.set_index('date').resample('D').asfreq() # Show if TERMINAL: print("\nOut:") print(aux) aux .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> <tr> <th>date</th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>2021-05-01</th> <td>p1</td> <td>1.0</td> <td>2.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-02</th> <td>p1</td> <td>3.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-03</th> <td>p1</td> <td>4.0</td> <td>4.0</td> <td>4.0</td> </tr> <tr> <th>2021-05-04</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-05</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-06</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-07</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-08</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-09</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-10</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-11</th> <td>p2</td> <td>5.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-12</th> <td>p2</td> <td>4.0</td> <td>3.0</td> <td>NaN</td> </tr> <tr> <th>2021-05-13</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-14</th> <td>NaN</td> <td>NaN</td> <td>NaN</td> <td>NaN</td> </tr> <tr> <th>2021-05-15</th> <td>p2</td> <td>5.0</td> <td>2.0</td> <td>4.0</td> </tr> <tr> <th>2021-05-16</th> <td>p2</td> <td>NaN</td> <td>1.0</td> <td>NaN</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 121-123 Let's fill missing values (forward fill) .. GENERATED FROM PYTHON SOURCE LINES 123-133 .. 
code-block:: default :lineno-start: 124 # ffill() propagates the last valid observation forward. The older # pad() method is a deprecated alias that newer pandas releases no longer provide. aux = data.set_index('date').resample('D').asfreq().ffill() # Show if TERMINAL: print("\nOut:") print(aux) aux .. raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>patient</th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> <tr> <th>date</th> <th></th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>2021-05-01</th> <td>p1</td> <td>1.0</td> <td>2.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-02</th> <td>p1</td> <td>3.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-03</th> <td>p1</td> <td>4.0</td> <td>4.0</td> <td>4.0</td> </tr> <tr> <th>2021-05-04</th> <td>p1</td> <td>4.0</td> <td>4.0</td> <td>4.0</td> </tr> <tr> <th>2021-05-05</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-06</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-07</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-08</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-09</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-10</th> <td>p1</td> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-11</th> <td>p2</td> <td>5.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-12</th> <td>p2</td> <td>4.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-13</th> <td>p2</td> <td>4.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-14</th> <td>p2</td> <td>4.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-15</th> <td>p2</td> <td>5.0</td> <td>2.0</td> <td>4.0</td> </tr> <tr> <th>2021-05-16</th> <td>p2</td> <td>5.0</td> 
<td>1.0</td> <td>4.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 134-135 Let's group by patient and sum .. GENERATED FROM PYTHON SOURCE LINES 135-145 .. code-block:: default :lineno-start: 136 # Group by patient and sum agg = aux.groupby('patient').sum() # Show if TERMINAL: print("\nOut:") print(agg) agg .. raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> <tr> <th>patient</th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th>p1</th> <td>42.0</td> <td>43.0</td> <td>44.0</td> </tr> <tr> <th>p2</th> <td>27.0</td> <td>15.0</td> <td>20.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. GENERATED FROM PYTHON SOURCE LINES 146-147 Let's group by patient in 2-day bins and compute the mean. .. GENERATED FROM PYTHON SOURCE LINES 147-160 .. code-block:: default :lineno-start: 148 # A single function name computes that statistic only; passing a # list, e.g. .agg(['mean', 'max']), would return one column per statistic. agg = aux.groupby(by=['patient', pd.Grouper(freq='2D')]) \ .agg('mean') #.agg({'idx': ['first', 'last'], # 0: [skew, kurtosis, own], # 1: [skew, kurtosis, own], # '0_hr': [own], # '0_rr': [own]}) # Show if TERMINAL: print("\nOut:") print(agg) agg .. 
raw:: html <div class="output_subarea output_html rendered_html output_result"> <div> <style scoped> .dataframe tbody tr th:only-of-type { vertical-align: middle; } .dataframe tbody tr th { vertical-align: top; } .dataframe thead th { text-align: right; } </style> <table border="1" class="dataframe"> <thead> <tr style="text-align: right;"> <th></th> <th></th> <th>plt</th> <th>hct</th> <th>bil</th> </tr> <tr> <th>patient</th> <th>date</th> <th></th> <th></th> <th></th> </tr> </thead> <tbody> <tr> <th rowspan="5" valign="top">p1</th> <th>2021-05-01</th> <td>2.0</td> <td>2.5</td> <td>3.0</td> </tr> <tr> <th>2021-05-03</th> <td>4.0</td> <td>4.0</td> <td>4.0</td> </tr> <tr> <th>2021-05-05</th> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-07</th> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th>2021-05-09</th> <td>5.0</td> <td>5.0</td> <td>5.0</td> </tr> <tr> <th rowspan="3" valign="top">p2</th> <th>2021-05-11</th> <td>4.5</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-13</th> <td>4.0</td> <td>3.0</td> <td>3.0</td> </tr> <tr> <th>2021-05-15</th> <td>5.0</td> <td>1.5</td> <td>4.0</td> </tr> </tbody> </table> </div> </div> <br /> <br /> .. rst-class:: sphx-glr-timing **Total running time of the script:** ( 0 minutes 0.046 seconds) .. _sphx_glr_download__examples_pandas_plot_main01.py: .. only :: html .. container:: sphx-glr-footer :class: sphx-glr-footer-example .. container:: sphx-glr-download sphx-glr-download-python :download:`Download Python source code: plot_main01.py <plot_main01.py>` .. container:: sphx-glr-download sphx-glr-download-jupyter :download:`Download Jupyter notebook: plot_main01.ipynb <plot_main01.ipynb>` .. only:: html .. rst-class:: sphx-glr-signature `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_