API

Transforms

Abstract Transform Types

FeatureTransforms.Transform — Type

Transform

Abstract supertype for all feature Transforms.

FeatureTransforms.AbstractScaling — Type

AbstractScaling <: Transform

Linearly scale the data according to some statistics.

Implemented Transforms

FeatureTransforms.HoD — Type

HoD <: Transform

Get the hour of day corresponding to the data.
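
As a rough illustration of the idea, the hour of day can be read off timestamps with the standard `Dates` library. This is a sketch in plain Julia rather than a call into the package, and the timestamps are made up for the example.

```julia
using Dates

# Hypothetical timestamps, invented for this example.
timestamps = [DateTime(2021, 1, 1, 9, 30), DateTime(2021, 1, 1, 23, 5)]

# `Dates.hour` extracts the hour-of-day component (0 to 23) of each element.
hours = hour.(timestamps)
```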

FeatureTransforms.Power — Type

Power(exponent) <: Transform

Raise the data to the given exponent.
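
In plain Julia the same element-wise operation can be sketched with broadcasting. The variable names below are illustrative only and the snippet does not use the package.

```julia
exponent = 3                  # illustrative exponent
data = [1.0, 2.0, -3.0]       # illustrative data

# Raise every element to the given exponent.
powered = data .^ exponent
```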

FeatureTransforms.Periodic — Type

Periodic{P, S}(f, period::P, [phase_shift::S]) <: Transform

Applies a periodic function f with provided period and phase_shift to the data.

The period and phase_shift must share the same supertype: Real if the data is Real, or Period if the data is a TimeType.

Note

For TimeType data, the result will change depending on the type of period given, even if the same amount of time is described. Example: Week(1) vs Second(Week(1)); the former starts the period on the most recent Monday, while the latter starts the period on the most recent multiple of 604800 seconds since time 0.
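
The difference described in this note can be seen with the standard `Dates` rounding functions. The sketch below only assumes that the start of a period is found by flooring the timestamp to that period; it does not call the package, and the timestamp is arbitrary.

```julia
using Dates

# An arbitrary timestamp chosen for illustration (a Wednesday).
t = DateTime(2021, 3, 10, 15, 0)

# Flooring to a calendar week lands on the most recent Monday at midnight.
week_start = floor(t, Week(1))

# Flooring to the same span expressed in seconds lands on a multiple of
# 604800 seconds counted from the rounding epoch, which need not be a Monday.
seconds_start = floor(t, Second(Week(1)))

println(dayname(week_start), " vs ", dayname(seconds_start))
```

Both periods describe the same amount of time, yet the two starting points differ, so a periodic feature computed from them will be out of phase.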

Fields

f::Union{typeof(cos), typeof(sin)}: the periodic function
period::Union{Real, Period}: the function period. Must be strictly positive.
phase_shift::Union{Real, Period} (optional): adjusts the phase of the periodic function, measured in the same units as the input. Increasing the value translates the function to the right, toward higher/later input values.
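
For Real data, the effect of period and phase_shift can be sketched as below. The helper `periodic_sketch` is hypothetical and assumes the function is evaluated at 2π(x - phase_shift)/period; it is meant to illustrate the fields, not to reproduce the package's code.

```julia
# Hypothetical helper: evaluate a periodic function `f` with a given period
# and an optional phase shift, both in the same units as `x`.
periodic_sketch(f, x, period, phase_shift=0) = f(2π * (x - phase_shift) / period)

period = 24.0

# Without a shift, sin peaks a quarter of the way through the period.
unshifted_peak = periodic_sketch(sin, 6.0, period)

# With phase_shift = 2, the same peak moves right, from x = 6 to x = 8.
shifted_peak = periodic_sketch(sin, 8.0, period, 2.0)
```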

FeatureTransforms.StandardScaling — Type

StandardScaling <: AbstractScaling

Transforms the data according to

x -> (x - μ) / σ

where μ and σ are the mean and standard deviation of the training data.

Note

fit!(scaling, data) needs to be called before the transform can be applied. By default, all the data is considered when fitting the mean and standard deviation.
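
The fit-then-apply order can be sketched with the standard `Statistics` library. This is a plain-Julia illustration under the assumption that Julia's default `std` is an acceptable stand-in; the names `train` and `standardise` are hypothetical and the package is not used.

```julia
using Statistics

# Hypothetical training data.
train = [2.0, 4.0, 6.0, 8.0]

# "Fitting" amounts to computing the statistics of the training data once.
μ = mean(train)
σ = std(train)

# "Applying" reuses those fitted statistics: x -> (x - μ) / σ.
standardise(x) = (x - μ) / σ
scaled = standardise.(train)
```

Keeping the two steps separate means that new data is scaled with the statistics of the training data rather than its own.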

FeatureTransforms.IdentityScaling — Type

IdentityScaling <: AbstractScaling

Represents the no-op scaling which simply returns the data it is applied on.

FeatureTransforms.InverseHyperbolicSine — Type

InverseHyperbolicSine <: Transform

Logarithmically transform the data through: log(x + √(x² + 1)).

This is the inverse hyperbolic sine.
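
The formula can be checked against Julia's built-in `asinh`. The helper name `ihs_sketch` is hypothetical and the snippet does not call the package.

```julia
# Hypothetical helper implementing the formula from the docstring.
ihs_sketch(x) = log(x + √(x^2 + 1))

# Illustrative inputs, including a negative value and zero.
xs = [-2.0, 0.0, 0.5, 10.0]

# Largest disagreement with the built-in inverse hyperbolic sine.
max_difference = maximum(abs.(ihs_sketch.(xs) .- asinh.(xs)))
```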

FeatureTransforms.LinearCombination — Type

LinearCombination(coefficients) <: Transform

Calculates the linear combination of a collection of terms weighted by some coefficients.

When applied to an N-dimensional array, LinearCombination reduces along the dims provided and returns an (N-1)-dimensional array.

If no inds are specified, then the transform is applied to all elements.

Note

The current default is dims=1, but this behaviour will be deprecated in a future release, after which the dims keyword argument will have to be specified explicitly. See https://github.com/invenia/FeatureTransforms.jl/issues/82
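
The reduction along dims=1 can be sketched in plain Julia as follows. The arrays are illustrative, and the snippet only mirrors the described behaviour without calling the package.

```julia
coefficients = [1, -1]   # one coefficient per slice along dims=1
A = [1 2 3; 4 5 6]       # 2×3 matrix: two terms (rows), three observations

# Weight each row by its coefficient, sum along the first dimension,
# then drop that dimension so an N-dimensional input gives N-1 dimensions.
combined = dropdims(sum(coefficients .* A; dims=1); dims=1)
```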

FeatureTransforms.LogTransform — Type

LogTransform <: Transform

Logarithmically transform the data through: sign(x) * log(|x| + 1).

This allows transformations of all real numbers, not just positive ones.
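
A minimal sketch of the formula in plain Julia shows why negative inputs are handled. The helper name `signed_log` is hypothetical.

```julia
# Hypothetical helper implementing sign(x) * log(|x| + 1).
signed_log(x) = sign(x) * log(abs(x) + 1)

# Zero maps to zero, and negative inputs mirror their positive counterparts.
values = signed_log.([-3.0, 0.0, 3.0])
```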

FeatureTransforms.OneHotEncoding — Type

OneHotEncoding{R<:Real} <: Transform

One-hot encode the categorical value for each target element.

Construct an n-by-p binary matrix, given a Vector of target data x (of length n) and a Vector of all unique possible values in x (of length p).

The element [i, j] is true if the i-th target in x corresponds to the j-th possible value and false otherwise. Note that R can be specified to determine the return type of results. It defaults to a Matrix of Bools.

Note that this Transform does not support specifying dims other than : (all dims) because it is a one-to-many transform (for example a Vector input produces a Matrix output).

Note that a OneHotEncoding must be constructed with the expected categories before it can be used. This is because the data might be missing certain categories, which would lead to an incomplete encoding.
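
The n-by-p layout can be sketched with a comprehension in plain Julia. The data and category names are invented, and the snippet illustrates the encoding rather than the package's implementation.

```julia
# All possible values must be known up front (p = 3).
categories = ["a", "b", "c"]

# Target data of length n = 3; note that "c" never appears in it.
x = ["b", "a", "b"]

# Element [i, j] is true when the i-th target equals the j-th category.
onehot = [x[i] == categories[j] for i in eachindex(x), j in eachindex(categories)]

# A different element type can be obtained by converting the Bool matrix.
onehot_float = Float64.(onehot)
```

Because "c" is absent from x, its column is all false. Deriving the categories from x alone would have produced only two columns, which is why the expected categories are supplied separately.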

Applying Transforms

FeatureTransforms.apply — Function

apply(data::T, ::Transform; kwargs...)

Applies the Transform to the data. New transforms should usually only extend _apply which this method delegates to.

Where necessary, this should be extended for new data types T.

FeatureTransforms.apply! — Function

apply!(data::T, ::Transform; kwargs...) -> T

Applies the Transform, mutating the input data. This method delegates to apply under the hood, so it does not need to be defined separately.

If Transform does not support mutation, this method will error.

FeatureTransforms.apply_append — Function

apply_append(A::AbstractArray, ::Transform; append_dim, kwargs...)

Applies the Transform to A and returns the result in a new array where the output is appended to A along the append_dim dimension. The remaining kwargs correspond to the usual Transform being invoked.
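
The shape of the result can be sketched in plain Julia with `cat`, using an element-wise square as a stand-in for a Transform. This assumes append_dim plays the same role as the `dims` argument of `cat`; the package itself is not called.

```julia
A = [1.0 2.0; 3.0 4.0]   # illustrative 2×2 input

# Stand-in for the output of a Transform applied to A.
squared = A .^ 2

# Append the output to the original along the second dimension.
appended = cat(A, squared; dims=2)
```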

apply_append(table, ::Transform; [header], kwargs...)

Applies the Transform to the table and appends the result in a new table with an optional header. If none is provided the default in Tables.table is used. The remaining kwargs correspond to the Transform being invoked.

Transform Interface

FeatureTransforms.is_transformable — Function

is_transformable(x)

Determine if x is both a valid input and output of any Transform, i.e. that it has an apply method defined and therefore follows the transform interface.

FeatureTransforms.transform! — Function

transform!(::T, data)

Mutating version of transform.

FeatureTransforms.transform — Function

transform(::T, data)

Defines the feature engineering pipeline for some type T, which comprises a collection of Transforms and other steps to be performed on the data.

The idea behind a "transform interface" is to make feature transformations composable, i.e. the output of any one Transform should be valid input to another.

Feature engineering pipelines should obey the same principle and it should be trivial to add/remove Transform steps that compose the pipeline without it breaking.

transform should be overloaded for custom types T that require feature engineering. The only requirement is that the return value of transform is itself "transformable", i.e. calling is_transformable on the output returns true.
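
The composability principle can be sketched in plain Julia. Everything below is hypothetical: `SquareThenLog`, `steps_for`, and `transform_sketch` are stand-ins defined locally for illustration and are not the package's functions.

```julia
# Hypothetical pipeline type standing in for a custom type T.
struct SquareThenLog end

# Each step takes an array and returns an array, so steps can be chained,
# added, or removed without breaking the pipeline.
steps_for(::SquareThenLog) = (x -> x .^ 2, x -> log.(x .+ 1))

# Feed the output of each step into the next one.
transform_sketch(p::SquareThenLog, data) =
    foldl((acc, step) -> step(acc), steps_for(p); init=data)

result = transform_sketch(SquareThenLog(), [0.0, 1.0, 2.0])
```

The result is again a Vector, so it could be passed to a further step, which is the property the interface asks of transform.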

Deprecated functionality

FeatureTransforms.MeanStdScaling — Type

MeanStdScaling(μ, σ) <: AbstractScaling

Linearly scale the data by the statistical mean μ and standard deviation σ. This is also known as standardization, or the Z score transform.

Keyword arguments to apply

inverse=true: inverts the scaling (e.g. to reconstruct the unscaled data).
eps=1e-3: used in place of all 0 values in σ before scaling (if inverse=false).
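
The two keyword arguments can be sketched for a single value as follows. The function `scale_sketch` is hypothetical and only mirrors the behaviour described above; it is not the package's implementation.

```julia
# Hypothetical scalar version of the scaling and its inverse.
# When scaling forward, a zero σ is replaced by `eps` to avoid dividing by zero.
scale_sketch(x, μ, σ; inverse=false, eps=1e-3) =
    inverse ? μ + σ * x : (x - μ) / (σ == 0 ? eps : σ)

scaled = scale_sketch(5.0, 3.0, 2.0)                      # forward scaling
restored = scale_sketch(scaled, 3.0, 2.0; inverse=true)   # reconstruct the input
guarded = scale_sketch(3.001, 3.0, 0.0)                   # σ = 0 falls back to eps
```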
