Impyute

https://travis-ci.org/eltonlaw/impyute.svg?branch=master https://img.shields.io/pypi/v/impyute.svg

Impyute is a library of missing data imputation algorithms written in Python 3. This library was designed to be super lightweight, here’s a sneak peak at what impyute can do.

>>> from impyute.datasets import random_uniform
>>> raw_data = random_uniform(shape=(5, 5), missingness="mcar", th=0.2)
>>> print(raw_data)
[[  1.   0.   4.   0.   1.]
 [  1.  nan   6.   4.  nan]
 [  5.   0.  nan   1.   3.]
 [  2.   1.   5.   4.   6.]
 [  2.   1.   0.   0.   6.]]
>>> from impyute.imputations.cs import mean_imputation
>>> complete_data = random_imputation(raw_data)
>>> print(complete_data)
[[ 1.    0.    4.    0.    1.  ]
 [ 1.    0.5   6.    4.    4.  ]
 [ 5.    0.    3.75  1.    3.  ]
 [ 2.    1.    5.    4.    6.  ]
 [ 2.    1.    0.    0.    6.  ]]

Feature Support

  • Imputation of Cross Sectional Data
    • K-Nearest Neighbours
    • Multivariate Imputation by Chained Equations
    • Expectation Maximization
    • Mean Imputation
    • Mode Imputation
    • Median Imputation
    • Random Imputation
  • Imputation of Time Series Data
    • Last Observation Carried Forward
    • Moving Window
    • Autoregressive Integrated Moving Average (WIP)
  • Diagnostic Tools
    • Loggers
    • Distribution of Null Values
    • Comparison of imputations
    • Little’s MCAR Test (WIP)

Versions

Currently tested on 2.7, 3.4, 3.5, 3.6 and 3.7

Installation

To install impyute, run the following:

$ pip3 install impyute

Or to get the most latest build:

$ git clone https://github.com/eltonlaw/impyute
$ cd impyute
$ python setup.py install

Documentation

Documentation is available here: http://impyute.readthedocs.io/

How to Contribute

Check out CONTRIBUTING