【发布时间】:2020-12-04 00:14:12
【问题描述】:
我做了一个小实验,我知道这只是因为列的不同数据类型包含在 CSV 中。请看下面的代码
julia> using DataFrames
julia> df = DataFrame(:a => [1.0, 2, missing, missing, 5.0], :b => [1.1, 2.2, 3, missing, 5],:c => [1,3,5,missing,6])
5×3 DataFrame
│ Row │ a │ b │ c │
│ │ Float64? │ Float64? │ Int64? │
├─────┼──────────┼──────────┼─────────┤
│ 1 │ 1.0 │ 1.1 │ 1 │
│ 2 │ 2.0 │ 2.2 │ 3 │
│ 3 │ missing │ 3.0 │ 5 │
│ 4 │ missing │ missing │ missing │
│ 5 │ 5.0 │ 5.0 │ 6 │
julia> df
5×3 DataFrame
│ Row │ a │ b │ c │
│ │ Float64? │ Float64? │ Int64? │
├─────┼──────────┼──────────┼─────────┤
│ 1 │ 1.0 │ 1.1 │ 1 │
│ 2 │ 2.0 │ 2.2 │ 3 │
│ 3 │ missing │ 3.0 │ 5 │
│ 4 │ missing │ missing │ missing │
│ 5 │ 5.0 │ 5.0 │ 6 │
julia> using Impute
julia> Impute.interp(df)
ERROR: InexactError: Int64(5.5)
Stacktrace:
[1] Int64 at ./float.jl:710 [inlined]
[2] convert at ./number.jl:7 [inlined]
[3] convert at ./missing.jl:69 [inlined]
[4] setindex! at ./array.jl:826 [inlined]
[5] (::Impute.var"#58#59"{Int64,Array{Union{Missing, Int64},1}})(::Impute.Context) at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors/interp.jl:67
[6] (::Impute.Context)(::Impute.var"#58#59"{Int64,Array{Union{Missing, Int64},1}}) at /home/synerzip/.julia/packages/Impute/GmIMg/src/context.jl:227
[7] _impute!(::Array{Union{Missing, Int64},1}, ::Impute.Interpolate) at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors/interp.jl:49
[8] impute!(::Array{Union{Missing, Int64},1}, ::Impute.Interpolate) at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors.jl:84
[9] impute!(::DataFrame, ::Impute.Interpolate) at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors.jl:172
[10] #impute#17 at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors.jl:76 [inlined]
[11] impute at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors.jl:76 [inlined]
[12] _impute(::DataFrame, ::Type{Impute.Interpolate}) at /home/synerzip/.julia/packages/Impute/GmIMg/src/imputors.jl:58
[13] #interp#105 at /home/synerzip/.julia/packages/Impute/GmIMg/src/Impute.jl:84 [inlined]
[14] interp(::DataFrame) at /home/synerzip/.julia/packages/Impute/GmIMg/src/Impute.jl:84
[15] top-level scope at REPL[15]:1
当我运行以下代码时不会出现此错误
julia> df = DataFrame(:a => [1.0, 2, missing, missing, 5.0], :b => [1.1, 2.2, 3, missing, 5])
5×2 DataFrame
│ Row │ a │ b │
│ │ Float64? │ Float64? │
├─────┼──────────┼──────────┤
│ 1 │ 1.0 │ 1.1 │
│ 2 │ 2.0 │ 2.2 │
│ 3 │ missing │ 3.0 │
│ 4 │ missing │ missing │
│ 5 │ 5.0 │ 5.0 │
julia> Impute.interp(df)
5×2 DataFrame
│ Row │ a │ b │
│ │ Float64? │ Float64? │
├─────┼──────────┼──────────┤
│ 1 │ 1.0 │ 1.1 │
│ 2 │ 2.0 │ 2.2 │
│ 3 │ 3.0 │ 3.0 │
│ 4 │ 4.0 │ 4.0 │
│ 5 │ 5.0 │ 5.0 │
现在我知道原因,但对如何解决它感到困惑。我在读取 CSV 时不能使用 eltype,因为在我的数据集中包含 171 列,并且它通常具有 Int 或 Float。卡在如何转换 Float64 中的所有列。
【问题讨论】: