Time Series Imputation

Imputations for time-series data.

impyute.imputation.ts.locf(data, axis=0)

Last Observation Carried Forward

For each missing value, use the value from the previous row in the same column. If the missing value is in the first row, look ahead instead: scan down the column until you find a row that is not NaN, then fill every row before it with that value.

Parameters:
data: numpy.ndarray

Data to impute.

axis: {0, 1} (optional)

0 if the time series is in row format (e.g. data[0, :] is the first data point). 1 if the time series is in column format (e.g. data[:, 0] is the first data point).

Returns:
numpy.ndarray

Imputed data.
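The fill rule described above can be sketched in plain NumPy. This is a minimal illustration, not the library's implementation: the name locf_sketch is hypothetical, and the handling of axis=1 (transposing so that time runs down the rows) is an assumption based on the parameter description.

```python
import numpy as np

def locf_sketch(data, axis=0):
    """Hypothetical helper: carry the last observation forward, column by column."""
    arr = np.array(data, dtype=float)       # float copy, the input is left untouched
    work = arr if axis == 0 else arr.T      # view in which each row is one time step
    n_rows, n_cols = work.shape
    for col in range(n_cols):
        column = work[:, col]               # view, so writes land in arr
        observed = np.flatnonzero(~np.isnan(column))
        if observed.size == 0:
            continue                        # nothing to carry, column stays all NaN
        first = observed[0]
        column[:first] = column[first]      # leading NaNs take the first observed value
        for row in range(first + 1, n_rows):
            if np.isnan(column[row]):
                column[row] = column[row - 1]   # carry the previous row forward
    return arr

series = np.array([[np.nan, 1.0],
                   [np.nan, np.nan],
                   [3.0,    5.0],
                   [np.nan, 6.0]])
filled = locf_sketch(series)
# Column 0 becomes 3, 3, 3, 3 and column 1 becomes 1, 1, 5, 6.
```

The leading NaNs in column 0 are back-filled from the first observed value, while every other gap takes the value from the row above it.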

impyute.imputation.ts.moving_window(data, nindex=None, wsize=5, errors='coerce', func=np.mean, inplace=False)

Interpolate the missing values based on nearby values.

For example, with an array like this:

array([[-1.24940, -1.38673 , -0.03214945,  0.08255145, -0.007415],
       [ 2.14662,  0.32758 , -0.82601414,  1.78124027,  0.873998],
       [-0.41400, -0.977629,         nan, -1.39255344,  1.680435],
       [ 0.40975,  1.067599,  0.29152388, -1.70160145, -0.565226],
       [-0.54592, -1.126187,  2.04004377,  0.16664863, -0.010677]])

With a window size (k) of 3, the one missing value would be set to -1.18509122, the mean of its two horizontal neighbors (-0.977629 and -1.39255344). The window operates along the horizontal axis.
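The arithmetic behind this example can be checked by hand with NumPy. The snippet below is only a sketch of the calculation for this one cell. It assumes the window is centered on the missing value and that the NaN itself is skipped when averaging (done here with np.nanmean); it does not call the library.

```python
import numpy as np

data = np.array([
    [-1.24940, -1.38673,  -0.03214945,  0.08255145, -0.007415],
    [ 2.14662,  0.32758,  -0.82601414,  1.78124027,  0.873998],
    [-0.41400, -0.977629,  np.nan,     -1.39255344,  1.680435],
    [ 0.40975,  1.067599,  0.29152388, -1.70160145, -0.565226],
    [-0.54592, -1.126187,  2.04004377,  0.16664863, -0.010677],
])

row, col = 2, 2          # position of the single missing value
wsize = 3                # window size, counting the missing value itself
half = wsize // 2

# Horizontal window centered on the missing value: columns 1..3 of row 2.
window = data[row, col - half: col + half + 1]

# Average the window while ignoring the NaN in the middle.
imputed = np.nanmean(window)
print(round(float(imputed), 8))
```

The result agrees with the documented value of -1.18509122.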

Parameters:
data: numpy.ndarray

2D matrix to impute.

nindex: int

Null index. Index of the null value inside the moving average window. Use this to skew the window toward the left or right side of the missing value: 0 takes the average of values to the right only, and -1 takes the average of values to the left only.

wsize: int

Window size. Size of the moving average window/area of values being used for each local imputation. This number includes the missing value.

errors: {“raise”, “coerce”, “ignore”}

Indexing errors occur when the window extends past the edge of the array, for example if there is a NaN at data[x][0] and nindex is set to -1, or a NaN at data[x][-1] and nindex is set to 0. “raise” raises an error, “coerce” tries again with nindex set to the middle of the window, and “ignore” leaves the value as NaN.

inplace: {True, False}

Whether to run on the passed-in array (True) or return a copy (False).

Returns:
numpy.ndarray

Imputed data.
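To make the interaction between nindex, wsize and errors concrete, here is a small sketch of how a window could be positioned around a missing value. The helper window_bounds is hypothetical and is not part of the library. It assumes the window must fit entirely inside the row, and that a failed “coerce” retry leaves the value untouched; the source does not specify either detail.

```python
def window_bounds(col, n_cols, wsize=5, nindex=None, errors="coerce"):
    """Hypothetical helper: return (start, stop) column bounds of the window, or None."""
    if errors not in ("raise", "coerce", "ignore"):
        raise ValueError("errors must be 'raise', 'coerce' or 'ignore'")

    def bounds(pos):
        # pos is where the missing value sits inside the window.
        start = col - pos
        return start, start + wsize

    def fits(start, stop):
        return 0 <= start and stop <= n_cols

    middle = wsize // 2
    # nindex=None centers the window; -1 wraps to the last slot of the window.
    pos = middle if nindex is None else nindex % wsize
    start, stop = bounds(pos)
    if fits(start, stop):
        return start, stop
    if errors == "raise":
        raise IndexError("window does not fit inside the row")
    if errors == "coerce":
        start, stop = bounds(middle)    # retry with the null value centered
        if fits(start, stop):
            return start, stop
    return None                          # "ignore", or a retry that still does not fit

centered = window_bounds(2, 5, wsize=3)                                 # (1, 4)
right_only = window_bounds(0, 5, wsize=3, nindex=0)                     # (0, 3)
coerced = window_bounds(1, 5, wsize=3, nindex=-1)                       # (0, 3)
skipped = window_bounds(0, 5, wsize=3, nindex=-1, errors="ignore")      # None
```

With nindex=-1 at column 1, the left-only window would start at column -1, so “coerce” falls back to a centered window covering columns 0 to 2.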