API

AxisSets.KeyAlignmentError - Type
KeyAlignmentError

Thrown when the constrained dimensions of components in a KeyedDataset have misaligned key values.

Fields

  • constraint::Pattern - Constraint pattern describing all dimensions that must align
  • groups - An iterator of paths and keys for each non-matching group
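
As a minimal sketch of how this error might be handled, the snippet below catches the exception raised by a misaligned assignment, the same situation shown in the setindex! example. The variable names are illustrative, and it assumes KeyAlignmentError can be imported from AxisSets by name.

```julia
using AxisKeys
using AxisSets: KeyedDataset, KeyAlignmentError

ds = KeyedDataset(:a => KeyedArray(zeros(3); time=1:3))

# Assigning a component whose `time` keys differ from the existing ones
# violates the default shared-`time` constraint.
err = try
    ds[:c] = KeyedArray(ones(3); time=2:4)
    nothing
catch e
    e
end

# Inspect the violated constraint only if the documented error was caught.
if err isa KeyAlignmentError
    println(err.constraint)
end
```

Catching the error this way lets calling code report which constraint failed instead of aborting.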

AxisSets.KeyedDataset - Type
KeyedDataset

A KeyedDataset describes an associative collection of component KeyedArrays with constraints on their shared dimensions.

Fields

  • constraints::OrderedSet{Pattern} - Constraint Patterns on shared dimensions.
  • data::LittleDict{Tuple, KeyedArray} - Flattened key paths (as tuples) mapped to the component keyed arrays.

AxisSets.KeyedDataset - Method
(ds::KeyedDataset)(key) -> KeyedDataset

A callable syntax for selecting or filtering a subset of a KeyedDataset.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset, flatten;

julia> ds = KeyedDataset(
           flatten([
               :g1 => [
                   :a => KeyedArray(zeros(3); time=1:3),
                   :b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
                ],
                :g2 => [
                    :a => KeyedArray(ones(3); time=1:3),
                    :b => KeyedArray(zeros(3, 2); time=1:3, loc=[:x, :y]),
                ]
            ])...
       );

julia> collect(keys(ds(:__, :a).data))
2-element Vector{Tuple}:
 (:g1, :a)
 (:g2, :a)

julia> collect(keys(ds(:g1, :__).data))
2-element Vector{Tuple}:
 (:g1, :a)
 (:g1, :b)

AxisSets.Pattern - Type
Pattern

A pattern is just a wrapper around a Tuple which enables searching and filtering for matching components and dimension paths in a KeyedDataset. Special symbols :_ and :__ are used as wildcards, similar to * and ** in glob pattern matching.

Example

julia> using AxisSets: Pattern;

julia> items = [
           (:train, :input, :load, :time),
           (:train, :input, :load, :id),
           (:train, :input, :temperature, :time),
           (:train, :input, :temperature, :id),
           (:train, :output, :load, :time),
           (:train, :output, :load, :id),
       ];

julia> filter(in(Pattern(:__, :time)), items)
3-element Vector{NTuple{4, Symbol}}:
 (:train, :input, :load, :time)
 (:train, :input, :temperature, :time)
 (:train, :output, :load, :time)

julia> filter(in(Pattern(:__, :load, :_)), items)
4-element Vector{NTuple{4, Symbol}}:
 (:train, :input, :load, :time)
 (:train, :input, :load, :id)
 (:train, :output, :load, :time)
 (:train, :output, :load, :id)

AxisKeys.axiskeys - Method
axiskeys(ds)
axiskeys(ds, dimname)
axiskeys(ds, pattern)
axiskeys(ds, dimpath)

Returns a list of unique axis keys within the KeyedDataset. A Tuple will always be returned unless you explicitly specify the dimpath you want.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(rand(4, 3, 2); time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
           :val2 => KeyedArray(rand(4, 3, 2) .+ 1.0; time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
       );

julia> axiskeys(ds)
(1:4, -1:-1:-3, [:a, :b])

julia> axiskeys(ds, :time)
(1:4,)

julia> axiskeys(ds, (:val1, :time))
1:4

AxisSets.constraintmap - Method
constraintmap(ds)

Returns a mapping of constraint patterns to specific dimension paths. The returned dictionary has keys of type Pattern and the values are sets of Tuple.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset, constraintmap;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(rand(4, 3, 2); time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
           :val2 => KeyedArray(rand(4, 3, 2) .+ 1.0; time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
       );

julia> collect(constraintmap(ds))
3-element Vector{Pair{AxisSets.Pattern, Set{Tuple}}}:
 Pattern((:__, :time)) => Set([(:val2, :time), (:val1, :time)])
  Pattern((:__, :loc)) => Set([(:val1, :loc), (:val2, :loc)])
  Pattern((:__, :obj)) => Set([(:val2, :obj), (:val1, :obj)])

AxisSets.dimpaths - Method
dimpaths(ds, [pattern]) -> Vector{<:Tuple}

Return a list of all dimension paths in the KeyedDataset. Optionally, you can filter the results using a Pattern.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset, dimpaths;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(rand(4, 3, 2); time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
           :val2 => KeyedArray(rand(4, 3, 2) .+ 1.0; time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
       );

julia> dimpaths(ds)
6-element Vector{Tuple{Symbol, Symbol}}:
 (:val1, :time)
 (:val1, :loc)
 (:val1, :obj)
 (:val2, :time)
 (:val2, :loc)
 (:val2, :obj)
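
The optional pattern argument is not exercised above. As a rough sketch, assuming the filter accepts a Pattern built the same way as in the Pattern example, the following keeps only the dimension paths that end in :time. The name time_paths is illustrative.

```julia
using AxisKeys
using AxisSets: KeyedDataset, Pattern, dimpaths

ds = KeyedDataset(
    :val1 => KeyedArray(rand(4, 3, 2); time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
    :val2 => KeyedArray(rand(4, 3, 2) .+ 1.0; time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
)

# `:__` matches any leading path segments, so this keeps every path ending in `:time`.
time_paths = dimpaths(ds, Pattern(:__, :time))
```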

AxisSets.flatten - Function
flatten(collection, [delim])

Flatten a collection of nested associative types into a flat collection of pairs.

Example

julia> using AxisSets: flatten

julia> data = (
           val1 = (a1 = 1, a2 = 2),
           val2 = (b1 = 11, b2 = 22),
           val3 = [111, 222],
           val4 = 4.3,
       );

julia> flatten(data, :_)
(val1_a1 = 1, val1_a2 = 2, val2_b1 = 11, val2_b2 = 22, val3 = [111, 222], val4 = 4.3)

flatten(A, dims, [delim])

Flatten a KeyedArray along the specified consecutive dimensions. The dims argument can either be a Tuple of symbols or a Pair{Tuple, Symbol} if you'd like to specify the desired flattened dimension name.

Example

julia> using AxisKeys, Dates, NamedDims; using AxisSets: flatten

julia> A = KeyedArray(
           reshape(1:24, (4, 3, 2));
           time=DateTime(2021, 1, 1, 11):Hour(1):DateTime(2021, 1, 1, 14),
           obj=[:a, :b, :c],
           loc=[1, 2],
       );

julia> dimnames(flatten(A, (:obj, :loc), :_))
(:time, :obj_loc)

AxisSets.rekey - Method
rekey(f, ds, dim)

Apply function f to key values of each matching dim in the KeyedDataset. dim can either be a Symbol or a Pattern for the dimension paths.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset, rekey;

julia> ds = KeyedDataset(
           :a => KeyedArray(zeros(3); time=1:3),
           :b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
       );

julia> r = rekey(k -> k .+ 1, ds, :time);

julia> r.time
3-element ReadOnlyArrays.ReadOnlyArray{Int64, 1, UnitRange{Int64}}:
 2
 3
 4

AxisSets.validate - Method
validate(ds, [constraint])

Validate that all constrained dimension paths within a KeyedDataset have matching key values. Optionally, you can test an explicit constraint Pattern.

Returns

  • true if an error isn't thrown

Throws

  • KeyAlignmentError: If the constraints are not respected
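
No example accompanies this method, so here is a minimal sketch of both call forms on a small dataset whose time keys agree. It assumes validate can be imported from AxisSets by name; the dataset itself is illustrative.

```julia
using AxisKeys
using AxisSets: KeyedDataset, Pattern, validate

ds = KeyedDataset(
    :a => KeyedArray(zeros(3); time=1:3),
    :b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
)

# Check every constraint stored on the dataset.
all_ok = validate(ds)

# Check a single, explicit constraint on the shared `time` dimension.
time_ok = validate(ds, Pattern(:__, :time))
```

Both calls are expected to return true here, since the two components share identical time keys.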

Base.getindex - Method
getindex(ds::KeyedDataset, key)

Look up a KeyedDataset component by its Tuple key, or by a Symbol for keys of depth 1. Shared axis keys for the returned KeyedArray are wrapped in a ReadOnlyArray for safety.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(zeros(3, 2); time=1:3, obj=[:a, :b]),
           :val2 => KeyedArray(ones(3, 2) .+ 1.0; time=1:3, loc=[:x, :y]),
       );

julia> ds[:val1]
2-dimensional KeyedArray(NamedDimsArray(...)) with keys:
↓   time ∈ 3-element ReadOnlyArrays.ReadOnlyArray{Int64,...}
→   obj ∈ 2-element ReadOnlyArrays.ReadOnlyArray{Symbol,...}
And data, 3×2 Array{Float64,2}:
       (:a)  (:b)
 (1)   0.0   0.0
 (2)   0.0   0.0
 (3)   0.0   0.0
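
The Tuple form of the key is not shown above. As a sketch under the assumption that nested components are addressed by their full flattened path, the following builds a two-level dataset (mirroring the construction used in other examples on this page) and looks up one component by its Tuple key.

```julia
using AxisKeys, NamedDims
using AxisSets: KeyedDataset, flatten

ds = KeyedDataset(
    flatten([
        :g1 => [
            :a => KeyedArray(zeros(3); time=1:3),
            :b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
        ],
        :g2 => [
            :a => KeyedArray(ones(3); time=1:3),
            :b => KeyedArray(zeros(3, 2); time=1:3, loc=[:x, :y]),
        ],
    ])...
)

# Components at depth 2 are addressed by their full Tuple path.
b = ds[(:g1, :b)]
names = dimnames(b)
```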

Base.getproperty - Method
getproperty(ds::KeyedDataset, sym::Symbol)

Extract KeyedDataset fields, dimension keys or components in that order. Shared axis keys are wrapped in a ReadOnlyArray for safety.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(zeros(3, 2); time=1:3, obj=[:a, :b]),
           :val2 => KeyedArray(ones(3, 2) .+ 1.0; time=1:3, loc=[:x, :y]),
       );

julia> collect(keys(ds.data))
2-element Vector{Tuple}:
 (:val1,)
 (:val2,)

julia> ds.time
3-element ReadOnlyArrays.ReadOnlyArray{Int64, 1, UnitRange{Int64}}:
 1
 2
 3

julia> dimnames(ds.val1)
(:time, :obj)

Base.map - Method
map(f, ds, [key]) -> KeyedDataset

Apply function f to each component of the KeyedDataset. Returns a new dataset with the same constraints, but new components. The function can be applied to a subselection of components via a Pattern key.

Example

julia> using AxisKeys, Statistics; using AxisSets: KeyedDataset, flatten;

julia> ds = KeyedDataset(
           flatten([
               :g1 => [
                   :a => KeyedArray(zeros(3); time=1:3),
                   :b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
                ],
                :g2 => [
                    :a => KeyedArray(ones(3); time=1:3),
                    :b => KeyedArray(zeros(3, 2); time=1:3, loc=[:x, :y]),
                ]
            ])...
       );

julia> r = map(a -> a .+ 100, ds, (:__, :a, :_));  # The extra `:_` is to clarify that we don't care about the dimnames.

julia> [k => mean(v) for (k, v) in r.data]  # KeyedArray printing isn't consistent in jldoctests
4-element Vector{Pair{Tuple{Symbol, Symbol}, Float64}}:
 (:g1, :a) => 100.0
 (:g1, :b) => 1.0
 (:g2, :a) => 101.0
 (:g2, :b) => 0.0

Base.mapslices - Method
mapslices(f, ds, [key]; dims) -> KeyedDataset

Apply mapslices to each selected component and return a new KeyedDataset. Selection Patterns may be provided via key; by default, components are selected by the requested dims.

Example

julia> using AxisKeys, Statistics; using AxisSets: KeyedDataset;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(zeros(3, 2); time=1:3, obj=[:a, :b]),
           :val2 => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
       );

julia> r = mapslices(sum, ds; dims=:time);  # KeyedArray printing isn't consistent in jldoctests

julia> [k => parent(parent(v)) for (k, v) in r.data]
2-element Vector{Pair{Tuple{Symbol}, Matrix{Float64}}}:
 (:val1,) => [0.0 0.0]
 (:val2,) => [3.0 3.0]

Base.merge - Method
merge(ds::KeyedDataset, others::KeyedDataset...)

Combine the constraints and data from multiple KeyedDatasets.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset;

julia> ds1 = KeyedDataset(
           :a => KeyedArray(zeros(3); time=1:3),
           :b => KeyedArray(ones(3, 2); time=1:3, loc=[:x, :y]),
       );

julia> ds2 = KeyedDataset(
           :c => KeyedArray(ones(3); time=1:3),
           :d => KeyedArray(zeros(3, 2); time=1:3, loc=[:x, :y]),
       );

julia> collect(keys(merge(ds1, ds2).data))
4-element Vector{Tuple}:
 (:a,)
 (:b,)
 (:c,)
 (:d,)

Base.setindex! - Method
setindex!(ds::KeyedDataset{T}, val, key) -> T

Store the new val in the KeyedDataset. If any new dimension names don't match any existing constraints, then Pattern(:__, <dimname>) is used by default. If the axis values of the new val don't meet the existing constraints in the dataset, then an error will be thrown.

Example

julia> using AxisKeys; using AxisSets: KeyedDataset, constraintmap;

julia> ds = KeyedDataset(:a => KeyedArray(zeros(3); time=1:3));

julia> ds[:b] = KeyedArray(ones(3, 2); time=1:3, lag=[-1, -2]);

julia> collect(constraintmap(ds))
2-element Vector{Pair{AxisSets.Pattern, Set{Tuple}}}:
 Pattern((:__, :time)) => Set([(:b, :time), (:a, :time)])
  Pattern((:__, :lag)) => Set([(:b, :lag)])

julia> ds[:c] = KeyedArray(ones(3, 2); time=2:4, lag=[-1, -2])
ERROR: KeyAlignmentError: Misaligned dimension keys on constraint Pattern((:__, :time))
  Tuple[(:b, :time), (:a, :time)] ∈ 3-element UnitRange{Int64}
  Tuple[(:c, :time)] ∈ 3-element UnitRange{Int64}

FeatureTransforms.apply - Function
FeatureTransforms.apply(ds::KeyedDataset, t::Transform, [key]; dims=:, kwargs...)

Apply the Transform to each component of the KeyedDataset. Returns a new dataset with the same constraints, but transformed components.

The transform can be applied to a subselection of components via a Pattern key. Otherwise, components are selected by the desired dims.

Keyword arguments including dims are passed to the appropriate FeatureTransforms method for a component.

Example

julia> using AxisKeys, FeatureTransforms; using AxisSets: KeyedDataset, Pattern, flatten;

julia> ds = KeyedDataset(
           flatten([
               :train => [
                   :load => KeyedArray([7.0 7.7; 8.0 8.2; 9.0 9.9]; time=1:3, loc=[:x, :y]),
                   :price => KeyedArray([-2.0 4.0; 3.0 2.0; -1.0 -1.0]; time=1:3, id=[:a, :b]),
               ],
               :predict => [
                   :load => KeyedArray([7.0 7.7; 8.1 7.9; 9.0 9.9]; time=1:3, loc=[:x, :y]),
                   :price => KeyedArray([0.5 -1.0; -5.0 -2.0; 0.0 1.0]; time=1:3, id=[:a, :b]),
               ]
           ])...
       );

julia> p = Power(2);

julia> r = FeatureTransforms.apply(ds, p, (:_, :price, :_));

julia> [k => parent(parent(v)) for (k, v) in r.data]
4-element Vector{Pair{Tuple{Symbol, Symbol}, Matrix{Float64}}}:
    (:train, :load) => [7.0 7.7; 8.0 8.2; 9.0 9.9]
   (:train, :price) => [4.0 16.0; 9.0 4.0; 1.0 1.0]
  (:predict, :load) => [7.0 7.7; 8.1 7.9; 9.0 9.9]
 (:predict, :price) => [0.25 1.0; 25.0 4.0; 0.0 1.0]

Impute.apply - Method
Impute.apply(ds, filter; dims)

Filter out missing data along the dims for each component in the KeyedDataset with that dimension.

Example

julia> using AxisKeys, Impute; using AxisSets: KeyedDataset, Pattern, flatten;

julia> ds = KeyedDataset(
           flatten([
               :train => [
                   :temp => KeyedArray([1.0 1.1; missing 2.2; 3.0 3.3]; time=1:3, id=[:a, :b]),
                   :load => KeyedArray([7.0 7.7; 8.0 missing; 9.0 9.9]; time=1:3, loc=[:x, :y]),
                ],
                :predict => [
                   :temp => KeyedArray([1.0 missing; 2.0 2.2; 3.0 3.3]; time=1:3, id=[:a, :b]),
                   :load => KeyedArray([7.0 7.7; 8.1 missing; 9.0 9.9]; time=1:3, loc=[:x, :y]),
                ]
            ])...
       );

julia> [k => parent(parent(v)) for (k, v) in Impute.filter(ds; dims=:time).data]  # KeyedArray printing isn't consistent in jldoctests
4-element Vector{Pair{Tuple{Symbol, Symbol}, Matrix{Union{Missing, Float64}}}}:
   (:train, :temp) => [3.0 3.3]
   (:train, :load) => [9.0 9.9]
 (:predict, :temp) => [3.0 3.3]
 (:predict, :load) => [9.0 9.9]

julia> [k => parent(parent(v)) for (k, v) in Impute.filter(ds; dims=Pattern(:train, :__, :time)).data]
4-element Vector{Pair{Tuple{Symbol, Symbol}, Matrix{Union{Missing, Float64}}}}:
   (:train, :temp) => [1.0 1.1; 3.0 3.3]
   (:train, :load) => [7.0 7.7; 9.0 9.9]
 (:predict, :temp) => [1.0 missing; 3.0 3.3]
 (:predict, :load) => [7.0 7.7; 9.0 9.9]

julia> [k => parent(parent(v)) for (k, v) in Impute.filter(ds; dims=:loc).data]
4-element Vector{Pair{Tuple{Symbol, Symbol}, Matrix{Union{Missing, Float64}}}}:
   (:train, :temp) => [1.0 1.1; missing 2.2; 3.0 3.3]
   (:train, :load) => [7.0; 8.0; 9.0]
 (:predict, :temp) => [1.0 missing; 2.0 2.2; 3.0 3.3]
 (:predict, :load) => [7.0; 8.1; 9.0]

Impute.impute - Method
Impute.impute(ds, imp; dims)

Apply the imputation algorithm imp along the dims for all components of the KeyedDataset with that dimension.

Example

julia> using AxisKeys, Impute; using AxisSets: KeyedDataset, flatten;

julia> ds = KeyedDataset(
           flatten([
               :train => [
                   :temp => KeyedArray([1.0 1.1; missing 2.2; 3.0 3.3]; time=1:3, id=[:a, :b]),
                   :load => KeyedArray([7.0 7.7; 8.0 missing; 9.0 9.9]; time=1:3, loc=[:x, :y]),
                ],
                :predict => [
                   :temp => KeyedArray([1.0 missing; 2.0 2.2; 3.0 3.3]; time=1:3, id=[:a, :b]),
                   :load => KeyedArray([7.0 7.7; 8.1 missing; 9.0 9.9]; time=1:3, loc=[:x, :y]),
                ]
            ])...
       );

julia> [k => parent(parent(v)) for (k, v) in Impute.substitute(ds; dims=:time).data]  # KeyedArray printing isn't consistent in jldoctests
4-element Vector{Pair{Tuple{Symbol, Symbol}, Matrix{Union{Missing, Float64}}}}:
   (:train, :temp) => [1.0 1.1; 2.2 2.2; 3.0 3.3]
   (:train, :load) => [7.0 7.7; 8.0 8.0; 9.0 9.9]
 (:predict, :temp) => [1.0 1.0; 2.0 2.2; 3.0 3.3]
 (:predict, :load) => [7.0 7.7; 8.1 8.1; 9.0 9.9]

julia> [k => parent(parent(v)) for (k, v) in Impute.substitute(ds; dims=:loc).data]
4-element Vector{Pair{Tuple{Symbol, Symbol}, Matrix{Union{Missing, Float64}}}}:
   (:train, :temp) => [1.0 1.1; missing 2.2; 3.0 3.3]
   (:train, :load) => [7.0 7.7; 8.0 8.8; 9.0 9.9]
 (:predict, :temp) => [1.0 missing; 2.0 2.2; 3.0 3.3]
 (:predict, :load) => [7.0 7.7; 8.1 8.8; 9.0 9.9]

Impute.validate - Method
Impute.validate(ds::KeyedDataset, validator::Validator; dims=:)

Apply the validator to components in the KeyedDataset with the specified dims.

NamedDims.dimnames - Method
dimnames(ds)

Returns a list of the unique dimension names within the KeyedDataset.

Example

julia> using AxisKeys; using NamedDims; using AxisSets: KeyedDataset;

julia> ds = KeyedDataset(
           :val1 => KeyedArray(rand(4, 3, 2); time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
           :val2 => KeyedArray(rand(4, 3, 2) .+ 1.0; time=1:4, loc=-1:-1:-3, obj=[:a, :b]),
       );

julia> dimnames(ds)
3-element Vector{Symbol}:
 :time
 :loc
 :obj