Declaring Missings

Impute.declaremissings - Function
Impute.declaremissings(data; values)

Declares (or replaces) various representations of missing data, such as NaN, -9999, or "NULL", as missing.

Keyword Arguments

  • values::Tuple: A tuple of values that should be considered missing
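
The effect of values can be illustrated with a small sketch in plain Julia. This is not Impute's implementation: declare_missing_sketch is a hypothetical helper, and it assumes values are matched with isequal, which is what allows NaN to match itself.

```julia
# Hypothetical helper for illustration only; not part of Impute.jl.
function declare_missing_sketch(data::AbstractArray, values::Tuple)
    # `isequal` is used instead of `==` because `NaN == NaN` is false,
    # while `isequal(NaN, NaN)` is true.
    return [any(v -> isequal(x, v), values) ? missing : x for x in data]
end

col_a = [1.1, 2.2, NaN, NaN, 5.5]
col_b = [1, 2, 3, -9999, 5]
col_c = ["v", "w", "x", "y", "NULL"]
sentinels = (NaN, -9999, "NULL")

# Each sentinel only affects the columns in which it actually occurs.
new_a = declare_missing_sketch(col_a, sentinels)
new_b = declare_missing_sketch(col_b, sentinels)
new_c = declare_missing_sketch(col_c, sentinels)
```

A single tuple can mix sentinels of different types because a value that never occurs in a given column simply never matches there.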

Example

julia> using DataFrames, Impute


julia> df = DataFrame(
           :a => [1.1, 2.2, NaN, NaN, 5.5],
           :b => [1, 2, 3, -9999, 5],
           :c => ["v", "w", "x", "y", "NULL"],
       )
5×3 DataFrame
 Row │ a        b      c
     │ Float64  Int64  String
─────┼────────────────────────
   1 │     1.1      1  v
   2 │     2.2      2  w
   3 │   NaN        3  x
   4 │   NaN    -9999  y
   5 │     5.5      5  NULL

julia> Impute.declaremissings(df; values=(NaN, -9999, "NULL"))
5×3 DataFrame
 Row │ a          b        c
     │ Float64?   Int64?   String?
─────┼─────────────────────────────
   1 │       1.1        1  v
   2 │       2.2        2  w
   3 │ missing          3  x
   4 │ missing    missing  y
   5 │       5.5        5  missing
Impute.DeclareMissings - Type
DeclareMissings(; values)

Declares (or replaces) various missing data values as missing. This is useful for downstream imputation methods that assume missing data is represented by missing.

!!! In-place methods apply only to datasets whose element type already allows missing (for example, Union{Missing, Float64}).
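
The reason for this restriction can be seen in plain Julia, independent of Impute. The sketch below uses only Base functions, and the variable names are illustrative. It shows that a Vector{Float64} cannot store missing, so the element type has to be widened before any in-place replacement. A helper such as allowmissing from Missings.jl performs a similar widening.

```julia
strict = [1.0, NaN, 3.0]   # Vector{Float64}: cannot hold `missing`

# Storing `missing` fails because Missing cannot be converted to Float64.
failed = try
    strict[2] = missing
    false
catch err
    err isa MethodError
end

# Widen the element type first (this makes a new array), then replace in place.
# Base.replace! compares with `isequal`, so NaN is matched.
relaxed = convert(Vector{Union{Missing, Float64}}, strict)
replace!(relaxed, NaN => missing)
```

After this runs, relaxed holds missing in its second position, while strict is unchanged because the conversion produced a separate array.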

Keyword Arguments

  • values::Tuple: A tuple of values that should be considered missing

Example

julia> using Impute: DeclareMissings, apply

julia> M = [1.0 2.0 -9999.0 NaN 5.0; 1.1 2.2 3.3 0.0 5.5]
2×5 Matrix{Float64}:
 1.0  2.0  -9999.0  NaN    5.0
 1.1  2.2      3.3    0.0  5.5

julia> apply(M, DeclareMissings(; values=(NaN, -9999.0, 0.0)))
2×5 Matrix{Union{Missing, Float64}}:
 1.0  2.0   missing  missing  5.0
 1.1  2.2  3.3       missing  5.5