.. DO NOT EDIT.
.. THIS FILE WAS AUTOMATICALLY GENERATED BY SPHINX-GALLERY.
.. TO MAKE CHANGES, EDIT THE SOURCE PYTHON FILE:
.. "_examples/pandas/plot_main01.py"
.. LINE NUMBERS ARE GIVEN BELOW.

.. only:: html

    .. note::
        :class: sphx-glr-download-link-note

        Click :ref:`here <sphx_glr_download__examples_pandas_plot_main01.py>`
        to download the full example code

.. rst-class:: sphx-glr-example-title

.. _sphx_glr__examples_pandas_plot_main01.py:

99. Basic Example
==================

.. GENERATED FROM PYTHON SOURCE LINES 5-41

.. code-block:: default
    :lineno-start: 6

    # Library
    import numpy as np
    import pandas as pd

    # Show in terminal
    TERMINAL = False

    # Create data
    data = [
        ['p1', '1/5/2021', 1, 2, 3],
        ['p1', '2/5/2021', 3, 3, 3],
        ['p1', '3/5/2021', 4, 4, 4],
        ['p1', '5/5/2021', 5, 5, 5],
        ['p2', '11/5/2021', 5, 3, 3],
        ['p2', '12/5/2021', 4, 3, None],
        ['p2', '16/5/2021', None, 1, None], # unordered
        ['p2', '15/5/2021', 5, 2, 4],
    ]

    # Load DataFrame
    data = pd.DataFrame(data,
        columns=['patient', 'date', 'plt', 'hct', 'bil'])

    # Format datetime
    # Date will be a datetime64[ns] instead of string
    data.date = pd.to_datetime(data.date, dayfirst=True)
    data.date = data.date.dt.normalize()

    # Show
    if TERMINAL:
        print("\nData:")
        print(data)
    data

.. code-block:: none

      patient        date  plt  hct  bil
    0      p1  2021-05-01  1.0    2  3.0
    1      p1  2021-05-02  3.0    3  3.0
    2      p1  2021-05-03  4.0    4  4.0
    3      p1  2021-05-05  5.0    5  5.0
    4      p2  2021-05-11  5.0    3  3.0
    5      p2  2021-05-12  4.0    3  NaN
    6      p2  2021-05-16  NaN    1  NaN
    7      p2  2021-05-15  5.0    2  4.0
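The ``dayfirst=True`` argument matters because a string such as ``'1/5/2021'`` can be read as either 1 May or 5 January. The following is a minimal sketch of the difference, using date strings from the data above; the variable names are illustrative only.

```python
import pandas as pd

# '1/5/2021' is ambiguous: day-first parsing reads it as 1 May,
# month-first parsing reads it as 5 January.
day_first = pd.to_datetime('1/5/2021', dayfirst=True)
month_first = pd.to_datetime('1/5/2021', dayfirst=False)

print(day_first.date())
print(month_first.date())

# An explicit format string removes the ambiguity altogether.
explicit = pd.to_datetime('16/5/2021', format='%d/%m/%Y')
print(explicit.date())
```

When the input format is known in advance, passing ``format`` is usually the safer choice, since it fails loudly on strings that do not match.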


.. GENERATED FROM PYTHON SOURCE LINES 42-43

Let's sort the values.

.. GENERATED FROM PYTHON SOURCE LINES 43-54

.. code-block:: default
    :lineno-start: 44

    # Note that setting a column as the index (e.g. the datetime)
    # does not sort the rows; call sort_index() for that.
    aux = data.sort_values(by=['plt', 'hct'])

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

      patient        date  plt  hct  bil
    0      p1  2021-05-01  1.0    2  3.0
    1      p1  2021-05-02  3.0    3  3.0
    5      p2  2021-05-12  4.0    3  NaN
    2      p1  2021-05-03  4.0    4  4.0
    7      p2  2021-05-15  5.0    2  4.0
    4      p2  2021-05-11  5.0    3  3.0
    3      p1  2021-05-05  5.0    5  5.0
    6      p2  2021-05-16  NaN    1  NaN
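In the sorted output the row with a missing ``plt`` value ends up last, which is the default placement for NaN. The sketch below, on a small hypothetical frame, shows how ``na_position`` changes that and how ``sort_index()`` gives chronological order once the date is the index.

```python
import pandas as pd

# Small illustrative frame (not the full example data).
df = pd.DataFrame({
    'date': pd.to_datetime(['2021-05-16', '2021-05-15', '2021-05-11']),
    'plt': [None, 5.0, 5.0],
    'hct': [1, 2, 3],
})

# NaN goes last by default; na_position='first' moves it to the top.
nan_first = df.sort_values(by=['plt', 'hct'], na_position='first')

# set_index() keeps the existing row order ...
indexed = df.set_index('date')

# ... so an explicit sort_index() is needed for chronological order.
chronological = indexed.sort_index()

print(nan_first)
print(chronological)
```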


.. GENERATED FROM PYTHON SOURCE LINES 55-56

Let's select columns.

.. GENERATED FROM PYTHON SOURCE LINES 56-66

.. code-block:: default
    :lineno-start: 57

    # Select columns from DataFrame
    aux = data[['patient', 'date', 'plt']]

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

      patient        date  plt
    0      p1  2021-05-01  1.0
    1      p1  2021-05-02  3.0
    2      p1  2021-05-03  4.0
    3      p1  2021-05-05  5.0
    4      p2  2021-05-11  5.0
    5      p2  2021-05-12  4.0
    6      p2  2021-05-16  NaN
    7      p2  2021-05-15  5.0
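The double brackets in the selection above pass a list of column names, which returns a DataFrame. As a brief sketch on a hypothetical two-row frame, single brackets return a Series instead, and ``.loc`` offers an equivalent label-based form.

```python
import pandas as pd

# Small illustrative frame (not the full example data).
df = pd.DataFrame({
    'patient': ['p1', 'p2'],
    'plt': [1.0, 5.0],
    'hct': [2, 3],
})

one_column = df['plt']                    # single brackets: Series
one_column_frame = df[['plt']]            # list of names: DataFrame
by_label = df.loc[:, ['patient', 'plt']]  # same selection via .loc

print(type(one_column).__name__)
print(type(one_column_frame).__name__)
print(list(by_label.columns))
```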


.. GENERATED FROM PYTHON SOURCE LINES 67-69

Let's filter rows with boolean indexing (``plt`` is not NaN).

.. GENERATED FROM PYTHON SOURCE LINES 69-80

.. code-block:: default
    :lineno-start: 70

    # Keep rows where plt is not nan
    aux = data[data.plt.notna()]

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

      patient        date  plt  hct  bil
    0      p1  2021-05-01  1.0    2  3.0
    1      p1  2021-05-02  3.0    3  3.0
    2      p1  2021-05-03  4.0    4  4.0
    3      p1  2021-05-05  5.0    5  5.0
    4      p2  2021-05-11  5.0    3  3.0
    5      p2  2021-05-12  4.0    3  NaN
    7      p2  2021-05-15  5.0    2  4.0
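Boolean masks like the one above can be combined with ``&`` (and), ``|`` (or) and ``~`` (not). A minimal sketch on a hypothetical frame follows; note the parentheses around each comparison, which are needed because of operator precedence.

```python
import pandas as pd

# Small illustrative frame (not the full example data).
df = pd.DataFrame({
    'patient': ['p1', 'p2', 'p2', 'p2'],
    'plt': [1.0, 4.0, None, 5.0],
    'bil': [3.0, None, None, 4.0],
})

# Rows of patient p2 that have a plt value.
mask = df.plt.notna() & (df.patient == 'p2')
subset = df[mask]

# Negating the mask selects the rows with a missing plt value,
# which is the same as df[df.plt.isna()].
missing_plt = df[~df.plt.notna()]

print(subset)
print(missing_plt)
```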


.. GENERATED FROM PYTHON SOURCE LINES 81-83

Let's drop rows with NaN (in a subset of columns).

.. GENERATED FROM PYTHON SOURCE LINES 83-94

.. code-block:: default
    :lineno-start: 84

    # Keep rows without any nan in subset
    aux = data.dropna(how='any', subset=['plt', 'bil'])

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

      patient        date  plt  hct  bil
    0      p1  2021-05-01  1.0    2  3.0
    1      p1  2021-05-02  3.0    3  3.0
    2      p1  2021-05-03  4.0    4  4.0
    3      p1  2021-05-05  5.0    5  5.0
    4      p2  2021-05-11  5.0    3  3.0
    7      p2  2021-05-15  5.0    2  4.0


.. GENERATED FROM PYTHON SOURCE LINES 95-97

Let's drop rows with NaN (in any column).

.. GENERATED FROM PYTHON SOURCE LINES 97-107

.. code-block:: default
    :lineno-start: 98

    # Keep rows without any nan at all
    aux = data.dropna(how='any')

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

      patient        date  plt  hct  bil
    0      p1  2021-05-01  1.0    2  3.0
    1      p1  2021-05-02  3.0    3  3.0
    2      p1  2021-05-03  4.0    4  4.0
    3      p1  2021-05-05  5.0    5  5.0
    4      p2  2021-05-11  5.0    3  3.0
    7      p2  2021-05-15  5.0    2  4.0
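Both ``dropna`` calls above return the same rows here, because the only missing values are in ``plt`` and ``bil``. The following sketch, on a hypothetical three-row frame, contrasts ``how='any'`` with ``how='all'`` and with the ``thresh`` parameter, which keeps rows that have at least a given number of non-missing values.

```python
import pandas as pd

# Small illustrative frame (not the full example data).
df = pd.DataFrame({
    'plt': [5.0, 4.0, None],
    'hct': [3, 3, 1],
    'bil': [3.0, None, None],
})

# Drop a row if any column is missing.
any_nan = df.dropna(how='any')

# Drop a row only if plt and bil are both missing.
all_nan = df.dropna(how='all', subset=['plt', 'bil'])

# Keep rows with at least two non-missing values.
at_least_two = df.dropna(thresh=2)

print(any_nan)
print(all_nan)
print(at_least_two)
```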


.. GENERATED FROM PYTHON SOURCE LINES 108-110

Let's resample to a daily frequency.

.. GENERATED FROM PYTHON SOURCE LINES 110-120

.. code-block:: default
    :lineno-start: 111

    # Resample
    aux = data.set_index('date').resample('D').asfreq()

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

               patient  plt  hct  bil
    date
    2021-05-01      p1  1.0  2.0  3.0
    2021-05-02      p1  3.0  3.0  3.0
    2021-05-03      p1  4.0  4.0  4.0
    2021-05-04     NaN  NaN  NaN  NaN
    2021-05-05      p1  5.0  5.0  5.0
    2021-05-06     NaN  NaN  NaN  NaN
    2021-05-07     NaN  NaN  NaN  NaN
    2021-05-08     NaN  NaN  NaN  NaN
    2021-05-09     NaN  NaN  NaN  NaN
    2021-05-10     NaN  NaN  NaN  NaN
    2021-05-11      p2  5.0  3.0  3.0
    2021-05-12      p2  4.0  3.0  NaN
    2021-05-13     NaN  NaN  NaN  NaN
    2021-05-14     NaN  NaN  NaN  NaN
    2021-05-15      p2  5.0  2.0  4.0
    2021-05-16      p2  NaN  1.0  NaN
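The resampling above treats all rows as a single time series, so the days between the last ``p1`` record and the first ``p2`` record also appear as empty rows. If each patient should get its own daily range instead, one possible approach (an assumption about the intent, not part of the original example) is to build a daily frame per patient and stack the results.

```python
import pandas as pd

# Small illustrative frame (not the full example data).
df = pd.DataFrame({
    'patient': ['p1', 'p1', 'p2', 'p2'],
    'date': pd.to_datetime(['2021-05-01', '2021-05-03',
                            '2021-05-11', '2021-05-12']),
    'plt': [1.0, 4.0, 5.0, 4.0],
})

# Build one daily-frequency frame per patient, then stack them
# under an outer index level named 'patient'.
per_patient = pd.concat(
    {
        patient: group.drop(columns='patient')
                      .set_index('date')
                      .sort_index()
                      .asfreq('D')
        for patient, group in df.groupby('patient')
    },
    names=['patient'],
)

print(per_patient)
```

Each patient now spans only the days between its own first and last record, so no rows are created in the gap between patients.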


.. GENERATED FROM PYTHON SOURCE LINES 121-123

Let's fill missing values (forward fill).

.. GENERATED FROM PYTHON SOURCE LINES 123-133

.. code-block:: default
    :lineno-start: 124

    # ffill() replaces pad(), which is deprecated in recent pandas.
    aux = data.set_index('date').resample('D').asfreq().ffill()

    # Show
    if TERMINAL:
        print("\nOut:")
        print(aux)
    aux

.. code-block:: none

               patient  plt  hct  bil
    date
    2021-05-01      p1  1.0  2.0  3.0
    2021-05-02      p1  3.0  3.0  3.0
    2021-05-03      p1  4.0  4.0  4.0
    2021-05-04      p1  4.0  4.0  4.0
    2021-05-05      p1  5.0  5.0  5.0
    2021-05-06      p1  5.0  5.0  5.0
    2021-05-07      p1  5.0  5.0  5.0
    2021-05-08      p1  5.0  5.0  5.0
    2021-05-09      p1  5.0  5.0  5.0
    2021-05-10      p1  5.0  5.0  5.0
    2021-05-11      p2  5.0  3.0  3.0
    2021-05-12      p2  4.0  3.0  3.0
    2021-05-13      p2  4.0  3.0  3.0
    2021-05-14      p2  4.0  3.0  3.0
    2021-05-15      p2  5.0  2.0  4.0
    2021-05-16      p2  5.0  1.0  4.0
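Forward filling carries the last known value across every gap, however long, which is why the ``p1`` values are repeated up to 2021-05-10 in the output above. The sketch below uses a hypothetical two-point series to show the ``limit`` parameter, which caps how many consecutive missing entries are filled.

```python
import pandas as pd

# Two observations four days apart, expanded to daily frequency.
s = pd.Series(
    [1.0, 5.0],
    index=pd.to_datetime(['2021-05-01', '2021-05-05']),
).asfreq('D')

# Every gap takes the last known value.
filled_all = s.ffill()

# Fill at most one consecutive missing entry; longer gaps stay NaN.
filled_one = s.ffill(limit=1)

print(filled_all)
print(filled_one)
```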


.. GENERATED FROM PYTHON SOURCE LINES 134-135

Let's group by patient and sum.

.. GENERATED FROM PYTHON SOURCE LINES 135-145

.. code-block:: default
    :lineno-start: 136

    # Group by patient and sum (aux is the forward-filled daily frame)
    agg = aux.groupby('patient').sum()

    # Show
    if TERMINAL:
        print("\nOut:")
        print(agg)
    agg

.. code-block:: none

              plt   hct   bil
    patient
    p1       42.0  43.0  44.0
    p2       27.0  15.0  20.0
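The sums above are computed on the forward-filled daily frame, so carried-forward values are counted once per filled day. On data that still contains NaN, ``sum()`` skips the missing values, and a group with no valid values sums to 0. A brief sketch on a hypothetical frame shows how ``min_count`` changes that.

```python
import pandas as pd

# Small illustrative frame: p2 has no valid plt value at all.
df = pd.DataFrame({
    'patient': ['p1', 'p1', 'p2', 'p2'],
    'plt': [1.0, 3.0, None, None],
    'hct': [2, 3, 3, 1],
})

# Default: NaN is skipped, so an all-NaN group sums to 0.
totals = df.groupby('patient').sum()

# min_count=1 requires at least one valid value, else the sum is NaN.
strict = df.groupby('patient').sum(min_count=1)

print(totals)
print(strict)
```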


.. GENERATED FROM PYTHON SOURCE LINES 146-147

Let's group by patient in 2-day bins and compute the mean.

.. GENERATED FROM PYTHON SOURCE LINES 147-160

.. code-block:: default
    :lineno-start: 148

    # Group into 2-day bins per patient and average each bin.
    # To compute several statistics at once, pass a list such
    # as ['mean', 'max'] to agg() instead.
    agg = aux.groupby(by=['patient', pd.Grouper(freq='2D')]) \
        .agg('mean')

    # Show
    if TERMINAL:
        print("\nOut:")
        print(agg)
    agg

.. code-block:: none

                        plt  hct  bil
    patient date
    p1      2021-05-01  2.0  2.5  3.0
            2021-05-03  4.0  4.0  4.0
            2021-05-05  5.0  5.0  5.0
            2021-05-07  5.0  5.0  5.0
            2021-05-09  5.0  5.0  5.0
    p2      2021-05-11  4.5  3.0  3.0
            2021-05-13  4.0  3.0  3.0
            2021-05-15  5.0  1.5  4.0
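The output above contains a single value per column, the mean of each 2-day bin. To report several statistics per bin, such as both the mean and the max, a list of function names can be passed to ``agg``. The sketch below uses the four ``p1`` records without forward filling; the result has two column levels (column name, statistic).

```python
import pandas as pd

# The p1 records from the example, indexed by date.
df = pd.DataFrame({
    'patient': ['p1', 'p1', 'p1', 'p1'],
    'date': pd.to_datetime(['2021-05-01', '2021-05-02',
                            '2021-05-03', '2021-05-05']),
    'plt': [1.0, 3.0, 4.0, 5.0],
}).set_index('date')

# A list of function names yields one output column per statistic.
stats = (
    df.groupby(['patient', pd.Grouper(freq='2D')])
      .agg(['mean', 'max'])
)

# Statistics of the first 2-day bin (2021-05-01 and 2021-05-02).
first_bin = stats.loc[('p1', pd.Timestamp('2021-05-01'))]

print(stats)
```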


.. rst-class:: sphx-glr-timing

   **Total running time of the script:** ( 0 minutes 0.046 seconds)


.. _sphx_glr_download__examples_pandas_plot_main01.py:

.. only :: html

  .. container:: sphx-glr-footer
     :class: sphx-glr-footer-example

     .. container:: sphx-glr-download sphx-glr-download-python

        :download:`Download Python source code: plot_main01.py <plot_main01.py>`

     .. container:: sphx-glr-download sphx-glr-download-jupyter

        :download:`Download Jupyter notebook: plot_main01.ipynb <plot_main01.ipynb>`


.. only:: html

  .. rst-class:: sphx-glr-signature

     `Gallery generated by Sphinx-Gallery <https://sphinx-gallery.github.io>`_