【Question title】: NimbleCSV: Elixir
【Posted】: 2020-07-07 01:50:12
【Question】:

I'm trying to use the NimbleCSV library for a personal project, but I've run into a problem...

NimbleCSV.define(MyParser, separator: ",", escape: "\"")

defmodule Siren do
  def parseCSV do
    IO.puts("Let's parse CSV file!")

    File.stream!("name.csv")
    |> MyParser.parse_stream()
    |> Stream.map(fn [name, team, position, height, weight, age] ->
      %{name: name, team: team, position: position, height: String.to_integer(height), weight: String.to_integer(weight), age: String.to_integer(age)}
    end)
    |> Enum.map(&IO.puts(&1))
  end
end

As you can see above, I'm using a Stream, but when I launch my Mix task it crashes:

➜  siren mix siren
Compiling 1 file (.ex)
Let's parse CSV file!
** (NimbleCSV.ParseError) unexpected escape character " in " \"Team\", \"Position\", \"Height(inches)\", \"Weight(lbs)\", \"Age\"\n"
    deps/nimble_csv/lib/nimble_csv.ex:427: MyParser.separator/5
    deps/nimble_csv/lib/nimble_csv.ex:360: anonymous fn/4 in MyParser.parse_stream/2
    (elixir 1.10.3) lib/stream.ex:902: Stream.do_transform_user/6
    (elixir 1.10.3) lib/stream.ex:1609: Enumerable.Stream.do_each/4
    (elixir 1.10.3) lib/enum.ex:3383: Enum.map/2
    (mix 1.10.3) lib/mix/task.ex:330: Mix.Task.run_task/3
    (mix 1.10.3) lib/mix/cli.ex:82: Mix.CLI.run_task/2

Here is my CSV file:

"Name", "Team", "Position", "Height(inches)", "Weight(lbs)", "Age"
"Adam Donachie", "BAL", "Catcher", 74, 180, 22.99
"Paul Bako", "BAL", "Catcher", 74, 215, 34.69
"Ramon Hernandez", "BAL", "Catcher", 72, 210, 30.78
"Kevin Millar", "BAL", "First Baseman", 72, 210, 35.43
"Chris Gomez", "BAL", "First Baseman", 73, 188, 35.71
"Brian Roberts", "BAL", "Second Baseman", 69, 176, 29.39
"Miguel Tejada", "BAL", "Shortstop", 69, 209, 30.77
"Melvin Mora", "BAL", "Third Baseman", 71, 200, 35.07
"Aubrey Huff", "BAL", "Third Baseman", 76, 231, 30.19
"Adam Stern", "BAL", "Outfielder", 71, 180, 27.05
"Jeff Fiorentino", "BAL", "Outfielder", 73, 188, 23.88
"Freddie Bynum", "BAL", "Outfielder", 73, 180, 26.96
"Nick Markakis", "BAL", "Outfielder", 74, 185, 23.29
"Brandon Fahey", "BAL", "Outfielder", 74, 160, 26.11
"Corey Patterson", "BAL", "Outfielder", 69, 180, 27.55

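The problem must come from the escape character I defined earlier, but I don't understand why. What is the escape character here? As I see it, it is the double quote around each string in a CSV row.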

【Discussion】:

    Tags: csv erlang elixir


    【Solution 1】:

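    CSV stands for comma-separated values, and the format has its own specification, RFC 4180. You can't place spaces wherever you like. Change the input to look like the one below and everything works. The problem is the space after each comma; in other words, the escape character (the opening double quote) does not immediately follow the separator.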

    "Name","Team","Position","Height(inches)","Weight(lbs)","Age"
    "Adam Donachie","BAL","Catcher",74,180,22.99
    "Paul Bako","BAL","Catcher",74,215,34.69
    "Ramon Hernandez","BAL","Catcher",72,210,30.78
    "Kevin Millar","BAL","First Baseman",72,210,35.43
    "Chris Gomez","BAL","First Baseman",73,188,35.71
    "Brian Roberts","BAL","Second Baseman",69,176,29.39
    "Miguel Tejada","BAL","Shortstop",69,209,30.77
    "Melvin Mora","BAL","Third Baseman",71,200,35.07
    "Aubrey Huff","BAL","Third Baseman",76,231,30.19
    "Adam Stern","BAL","Outfielder",71,180,27.05
    "Jeff Fiorentino","BAL","Outfielder",73,188,23.88
    "Freddie Bynum","BAL","Outfielder",73,180,26.96
    "Nick Markakis","BAL","Outfielder",74,185,23.29
    "Brandon Fahey","BAL","Outfielder",74,160,26.11
    "Corey Patterson","BAL","Outfielder",69,180,27.55
    

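    NimbleCSV ships with a default implementation, NimbleCSV.RFC4180, which is exactly what you defined (comma separator, double-quote escape), so there is no need to define your own parser; use the default one.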

    defmodule Siren do
      def parseCSV do
        IO.puts("Let's parse CSV file!")
    
        File.stream!("name.csv")
        |> NimbleCSV.RFC4180.parse_stream()
        |> Stream.map(fn [name, team, position, height, weight, age] ->
          %{name: name, team: team, position: position,
            height: String.to_integer(height),
            weight: String.to_integer(weight),
            age: String.to_float(age) # NOTE float here!
          }
        end)
        |> Enum.to_list()
        |> IO.inspect()
      end
    end
    #⇒ [
    #  %{
    #    age: 22.99,
    #    height: 74,
    #    name: "Adam Donachie",
    #    position: "Catcher",
    #    team: "BAL",
    #    weight: 180
    #  },
    #  ...
    # ]
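
    If editing the file by hand is not an option, the spaces after the separators could be stripped before the lines reach the parser. The following is only a minimal sketch: CsvCleanup and strip_space_after_separator/1 are hypothetical names, not part of NimbleCSV, and the regex is naive. It also rewrites commas inside quoted fields (for example "Doe, John"), which is harmless for the data shown here but not in general.

```elixir
defmodule CsvCleanup do
  # Hypothetical helper, not part of NimbleCSV.
  # Removes spaces and tabs that directly follow a comma, so that
  # `"BAL", "Catcher", 74` becomes `"BAL","Catcher",74`.
  # Caveat: a comma inside a quoted field is rewritten as well.
  def strip_space_after_separator(line) do
    String.replace(line, ~r/,[ \t]+/, ",")
  end
end

line = ~s("Adam Donachie", "BAL", "Catcher", 74, 180, 22.99\n)
IO.inspect(CsvCleanup.strip_space_after_separator(line))
```

    In the pipeline above, this would presumably sit between File.stream!/1 and the parser, e.g. Stream.map(&CsvCleanup.strip_space_after_separator/1) before NimbleCSV.RFC4180.parse_stream(). It also removes the leading space from unquoted fields such as " 74", which String.to_integer/1 would otherwise reject.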
    

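    【Comments】:

    • Hi, and thanks, Aleksei Matiushkin. I have a question about the float comment: when my age is 30, or even 30.00, it doesn't work. Granted, 30 is an integer, but it doesn't work anyway... why?
    • CSV file: "Name","Team","Position","Height(inches)","Weight(lbs)","Age" "RamonHernandez","BAL","Catcher",72,210,30 Error message (just the beginning): ** (ArgumentError) argument error :erlang.binary_to_float("30") lib/siren.ex:13: anonymous fn/1 in Siren.parseCSV/0 (elixir 1.10.3) lib/stream.ex:572: anonymous fn/4 in Stream.map/2 (elixir 1.10.3) lib/enum.ex:3686: Enumerable.List.reduce/3 (elixir 1.10.3) lib/stream.ex:931: Stream.do_list_transform/7 (elixir 1.10.3) lib/stream.ex:1609: Enumerable.Stream.do_each/4
    • To get a float from an integer, multiply it by 1.0; they are different types.
    • But I would have to multiply only if the value is an integer, and I think it has to happen before Stream.map, since that is where the error appears. How can this kind of conditional be implemented easily in Elixir? After all, I'm not storing my data in an Elixir object, am I?
    • I'm not sure I follow. You control the data. If you expect both "30" and "30.0" to occur, use Float.parse/1 as age |> Float.parse() |> elem(0)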
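
    The Float.parse/1 suggestion from the last comment can be sketched as follows. String.to_float/1 only accepts text with a decimal point, so "30" raises, while Float.parse/1 accepts both forms and returns a tuple of the value and the unparsed remainder. AgeParser and to_float/1 are hypothetical names used for illustration, and this variant is a little stricter than the elem(0) one-liner: it returns :error instead of raising when the text is not a number, and it rejects trailing garbage.

```elixir
defmodule AgeParser do
  # Hypothetical helper: turns "30", "30.0" or "22.99" into a float.
  # Returns {:ok, float} when the whole string is numeric, :error otherwise.
  def to_float(text) do
    case Float.parse(String.trim(text)) do
      {value, ""} -> {:ok, value}
      _ -> :error
    end
  end
end

IO.inspect(AgeParser.to_float("30"))     # {:ok, 30.0}
IO.inspect(AgeParser.to_float("30.0"))   # {:ok, 30.0}
IO.inspect(AgeParser.to_float("22.99"))  # {:ok, 22.99}
IO.inspect(AgeParser.to_float("n/a"))    # :error
```

    Inside the Stream.map callback, the age field could then be matched with {:ok, age} = AgeParser.to_float(age), assuming a crash on malformed rows is acceptable.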