Impyute¶
Impyute is a library of missing data imputation algorithms written in Python 3. This library was designed to be super lightweight, here’s a sneak peak at what impyute can do.
>>> from impyute.datasets import random_uniform
>>> raw_data = random_uniform(shape=(5, 5), missingness="mcar", th=0.2)
>>> print(raw_data)
[[ 1. 0. 4. 0. 1.]
[ 1. nan 6. 4. nan]
[ 5. 0. nan 1. 3.]
[ 2. 1. 5. 4. 6.]
[ 2. 1. 0. 0. 6.]]
>>> from impyute.imputations.cs import mean_imputation
>>> complete_data = random_imputation(raw_data)
>>> print(complete_data)
[[ 1. 0. 4. 0. 1. ]
[ 1. 0.5 6. 4. 4. ]
[ 5. 0. 3.75 1. 3. ]
[ 2. 1. 5. 4. 6. ]
[ 2. 1. 0. 0. 6. ]]
Feature Support¶
- Imputation of Cross Sectional Data
- K-Nearest Neighbours
- Multivariate Imputation by Chained Equations
- Expectation Maximization
- Mean Imputation
- Mode Imputation
- Median Imputation
- Random Imputation
- Imputation of Time Series Data
- Last Observation Carried Forward
- Moving Window
- Autoregressive Integrated Moving Average (WIP)
- Diagnostic Tools
- Loggers
- Distribution of Null Values
- Comparison of imputations
- Little’s MCAR Test (WIP)
Versions¶
Currently tested on 2.7, 3.4, 3.5, 3.6 and 3.7
Installation¶
To install impyute, run the following:
$ pip3 install impyute
Or to get the most latest build:
$ git clone https://github.com/eltonlaw/impyute
$ cd impyute
$ python setup.py install
Documentation¶
Documentation is available here: http://impyute.readthedocs.io/
How to Contribute¶
Check out CONTRIBUTING