Metadata
Manually reading JLSO files can be helpful when debugging problems with deserializing objects, or simply to help with reproducibility.
using JLSO
jlso = read("breakfast.jlso", JLSOFile)
JLSOFile([cost, food, time]; version="4.0.0", julia="1.5.4", format=:julia_serialize, compression=:gzip, image="")
Now we can manually access the serialized objects:
jlso.objects
Dict{Symbol,Array{UInt8,1}} with 3 entries:
  :cost => UInt8[0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03 … …
  :food => UInt8[0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03 … …
  :time => UInt8[0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03 … …
Or deserialize individual objects:
jlso[:food]
"☕️🥓🍳"
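The raw entries in jlso.objects shown earlier all begin with the bytes 0x1f, 0x8b, which is the gzip magic number and matches the compression=:gzip setting reported for the file. As a minimal sketch, a check like the following could confirm that a payload is gzip-compressed before attempting to deserialize it. The is_gzip helper is hypothetical (it is not part of JLSO), and the byte vector is a stand-in for one entry of jlso.objects.

```julia
# Hypothetical helper, not part of JLSO: detect a gzip payload by its
# two-byte magic number (0x1f 0x8b).
is_gzip(bytes::AbstractVector{UInt8}) =
    length(bytes) >= 2 && bytes[1] == 0x1f && bytes[2] == 0x8b

# Stand-in for one entry of `jlso.objects`, using the leading bytes shown above.
raw = UInt8[0x1f, 0x8b, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x03]

is_gzip(raw)                      # true
is_gzip(UInt8[0x4b, 0x05, 0x00])  # false: does not start with the gzip magic number
```

With a real file, one would expect all(is_gzip, values(jlso.objects)) to agree with the compression reported in the JLSOFile summary.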
Maybe you need to figure out which packages were installed in the saved environment?
jlso.project
Dict{String,Any} with 2 entries:
  "deps" => Dict{String,Any}("Documenter"=>"e30172f5-a6a5-5a46-863b-614d45cd2…
  "compat" => Dict{String,Any}("Documenter"=>"0.26")
In extreme cases, you may need to inspect the full environment stack, for example when a struct has changed in a dependency.
jlso.manifest
Dict{String,Any} with 41 entries:
"Mocking" => Dict{String,Any}[Dict("deps"=>["ExprTools"],"git-tree-sha1…
"Pkg" => Dict{String,Any}[Dict("deps"=>["Dates", "LibGit2", "Libdl"…
"TimeZones" => Dict{String,Any}[Dict("deps"=>["Dates", "EzXML", "Mocking"…
"Documenter" => Dict{String,Any}[Dict("deps"=>["Base64", "Dates", "DocStri…
"BSON" => Dict{String,Any}[Dict("git-tree-sha1"=>"db18b5ea04686f73d2…
"Test" => Dict{String,Any}[Dict("deps"=>["Distributed", "Interactive…
"Zlib_jll" => Dict{String,Any}[Dict("deps"=>["Artifacts", "JLLWrappers",…
"IOCapture" => Dict{String,Any}[Dict("deps"=>["Logging"],"git-tree-sha1"=…
"Random" => Dict{String,Any}[Dict("deps"=>["Serialization"],"uuid"=>"9…
"Libdl" => Dict{String,Any}[Dict("uuid"=>"8f399da3-3557-5675-b5ff-fb8…
"JLSO" => Dict{String,Any}[Dict("deps"=>["BSON", "CodecZlib", "FileP…
"UUIDs" => Dict{String,Any}[Dict("deps"=>["Random", "SHA"],"uuid"=>"c…
"Distributed" => Dict{String,Any}[Dict("deps"=>["Random", "Serialization", …
"Serialization" => Dict{String,Any}[Dict("uuid"=>"9e88b42a-f829-5b0c-bbe9-9e9…
"SHA" => Dict{String,Any}[Dict("uuid"=>"ea8e919c-243c-51af-8825-aaa…
"REPL" => Dict{String,Any}[Dict("deps"=>["InteractiveUtils", "Markdo…
"Memento" => Dict{String,Any}[Dict("deps"=>["Dates", "Distributed", "JS…
"Syslogs" => Dict{String,Any}[Dict("deps"=>["Printf", "Sockets"],"git-t…
"CodecZlib" => Dict{String,Any}[Dict("deps"=>["TranscodingStreams", "Zlib…
⋮ => ⋮
These project and manifest fields are just the dictionary representations of the Project.toml and Manifest.toml files found in a Julia Pkg environment. As such, we can also use Pkg.activate to construct an environment matching the one used to write the file.
dir = joinpath(dirname(dirname(pathof(JLSO))), "test", "specimens")
jlso = read(joinpath(dir, "v4_bson_none.jlso"), JLSOFile)
jlso[:DataFrame]
1355-element Array{UInt8,1}:
0x4b
0x05
0x00
0x00
0x02
0x74
0x61
0x67
0x00
0x07
⋮
0x00
0x00
0x00
0x00
0x00
0x00
0x00
0x00
0x00
Unfortunately, we can't load some objects in the current environment, so we might try to load the offending package only to find out it isn't part of our current environment.
try using DataFrames catch e @warn e end
┌ Warning: ArgumentError("Package DataFrames not found in current path:\n- Run `import Pkg; Pkg.add(\"DataFrames\")` to install the DataFrames package.\n")
└ @ Main.ex-metadata-example none:1
Okay, so we don't have DataFrames loaded and it isn't part of our current environment. Rather than adding every possible package needed to deserialize the objects in the file, we can use the Pkg.activate do-block syntax to:
- Initialize the exact environment needed to deserialize our objects
- Load our desired dependencies
- Migrate our data to a more appropriate long term format
using Pkg
# Now we can run our conversion logic in an isolated environment
mktempdir(pwd()) do d
    cd(d) do
        # Modify our Manifest to just use the latest release of JLSO
        delete!(jlso.manifest, "JLSO")
        Pkg.activate(jlso, d) do
            @eval Main begin
                using Pkg; Pkg.resolve(); Pkg.instantiate(; verbose=true)
                using DataFrames, JLSO
                describe($(jlso)[:DataFrame])
            end
        end
    end
end

| | variable | mean | min | median | max | nunique | nmissing | eltype |
|---|---|---|---|---|---|---|---|---|
| | Symbol | Union… | Any | Union… | Any | Union… | Union… | DataType |
| 1 | a | 3.0 | 1 | 3.0 | 5 | | | Int64 |
| 2 | b | 0.772432 | 0.512452 | 0.863122 | 0.907903 | | | Float64 |
| 3 | c | | a | | e | 5 | 0 | Any |
| 4 | d | 0.6 | 0 | 1.0 | 1 | | | Bool |
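Because the project and manifest fields are plain dictionaries, they can also be written back out as TOML files for inspection outside of Julia. The following is a minimal sketch, assuming Julia 1.6 or later (where TOML is a standard library, which is newer than the 1.5.4 recorded in the file above). The project dictionary, the package name, and its placeholder UUID are hypothetical stand-ins for jlso.project.

```julia
using TOML  # standard library in Julia 1.6 and later (assumption)

# Hypothetical stand-in for `jlso.project`; the UUID is a placeholder,
# not the UUID of a real package.
project = Dict{String,Any}(
    "deps"   => Dict{String,Any}("Example" => "00000000-0000-0000-0000-000000000000"),
    "compat" => Dict{String,Any}("Example" => "0.5"),
)

# Write the dictionary to a Project.toml in a fresh temporary directory.
path = joinpath(mktempdir(), "Project.toml")
open(path, "w") do io
    TOML.print(io, project)
end

# Reading the file back should give an equal dictionary.
roundtrip = TOML.parsefile(path)
roundtrip == project
```

The same approach should apply to jlso.manifest, although the resulting file is only useful to Pkg if its layout matches the manifest format expected by the Julia version in use.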
NOTE:
- Comparing project and manifest dictionaries isn't ideal, but it's currently unclear if that should live here or in Pkg.jl.
- The Pkg.activate workflow could probably be replaced with a macro.
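To illustrate what comparing manifest dictionaries by hand might look like, here is a rough sketch. The manifest_diff helper is hypothetical and is not provided by JLSO or Pkg.jl. It assumes each manifest value is a non-empty vector of entry dictionaries, as in the jlso.manifest output above, and compares a single key (by default "git-tree-sha1", which appears in that output). The sample dictionaries and their hash strings are made-up placeholders.

```julia
# Hypothetical helper (not part of JLSO or Pkg): report packages whose recorded
# value for `key` differs between two manifest dictionaries.
function manifest_diff(saved::AbstractDict, current::AbstractDict; key="git-tree-sha1")
    # `missing` marks a package absent from a manifest;
    # `nothing` marks an entry that does not record `key` (e.g. a standard library).
    lookup(m, name) = haskey(m, name) ? get(first(m[name]), key, nothing) : missing
    diff = Dict{String,Any}()
    for name in union(keys(saved), keys(current))
        a, b = lookup(saved, name), lookup(current, name)
        isequal(a, b) || (diff[name] = (saved=a, current=b))
    end
    return diff
end

# Placeholder manifests standing in for `jlso.manifest` and a current environment.
saved = Dict(
    "A" => [Dict("git-tree-sha1" => "aaa")],
    "B" => [Dict("uuid" => "u1")],
)
current = Dict(
    "A" => [Dict("git-tree-sha1" => "bbb")],
    "B" => [Dict("uuid" => "u1")],
    "C" => [Dict("git-tree-sha1" => "ccc")],
)

d = manifest_diff(saved, current)
# "A" differs, "C" exists only in `current`, and "B" is unchanged so it is omitted.
```

A comparison like this only flags differences; deciding whether a difference actually affects deserialization still requires inspecting the packages involved.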