【Question title】: Ruby CSV read multiline fields
【Posted】: 2012-10-06 14:14:21
【Question】:

I export tables and queries from SQL, and some of the fields span multiple lines.

The way to read CSV in Ruby (1.9+) appears to be:

require 'csv'

CSV.foreach("exported_mysql_table.csv", headers: true) do |row|
  puts row
end

This works well if my data looks like this:

"id","name","email","potato"
1,"Bob","bob@bob.bob","omnomnom"
2,"Charlie","char@char.com","andcheese"
4,"Doug","diggyd@diglet.com","usemeltattack"

(The first row holds the headers/attribute names.)

But if I have:

"id","name","address","email","potato"
1,"Bob","--- 
- 101 Cottage row
- Lovely Village
- \"\"
","bob@bob.bob","omnomnom"
2,"Charlie","--- 
- 102 Flame Street
- \"\"
- \"\"
","char@char.com","andcheese"
4,"Doug","--- 
- 103 Dark Cave
- Next to some geo dude
- So many bats
","diggyd@diglet.com","usemeltattack"

then I get the error:

.rbenv/versions/1.9.3-p194/lib/ruby/1.9.1/csv.rb:1894:in `block (2 levels) in shift': Missing or stray quote in line 2 (CSV::MalformedCSVError)

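This seems to happen because there is no closing quote at the end of the line, since the field spans several lines.

(I also tried "FasterCSV"; as of Ruby 1.9 that gem became "csv".)

【Discussion】:

  • First you have to rebuild the fields that span multiple lines before passing the data to the CSV parser.

Tags: mysql ruby csv fastercsv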


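【Solution 1】:

Your problem is not the multi-line fields but malformed CSV: quotes inside a field are escaped as \" (CSV expects ""), and there are stray spaces after the closing quotes at the ends of lines.

Replace the \" sequences and the spaces at the line ends like this: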

require 'csv' 

ml = %q{"id","name","address","email","potato" 
1,"Bob","---  
- 101 Cottage row 
- Lovely Village 
- \"\" 
","bob@bob.bob","omnomnom" 
2,"Charlie","---  
- 102 Flame Street 
- \"\" 
- \"\" 
","char@char.com","andcheese" 
4,"Doug","---  
- 103 Dark Cave 
- Next to some geo dude 
- So many bats 
","diggyd@diglet.com","usemeltattack"}

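# gsub! returns nil when nothing matches, so the two calls are not chained
ml.gsub!(/" \n/, "\"\n")   # drop the space after a closing quote at line end
ml.gsub!(/\\"/, "__")      # replace each \" with a placeholder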

CSV.parse(ml, headers: true) do |row|
  puts row
end

This gives:

"id","name","address","email","potato"
1,"Bob","---  
- 101 Cottage row 
- Lovely Village 
- ____
","bob@bob.bob","omnomnom"
etc
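
As a quick check of the diagnosis (malformed CSV rather than multi-line fields), the sketch below parses a small made-up sample twice: once with quotes escaped the CSV way ("") and once with the backslash form (\"). The sample data and variable names are assumptions for illustration only.

```ruby
require 'csv'

# A quoted field may contain newlines; a doubled quote ("") is the CSV escape.
well_formed = <<~CSV_TEXT
  "id","name","address"
  1,"Bob","---
  - 101 Cottage row
  - """"
  "
CSV_TEXT

rows = CSV.parse(well_formed, headers: true)
address = rows[0]["address"]
puts address.inspect

# The same data with backslash-escaped quotes is rejected by the parser.
# The block form of gsub inserts the replacement text literally.
malformed = well_formed.gsub('""""') { '\"\"' }

error = begin
  CSV.parse(malformed, headers: true)
  nil
rescue CSV::MalformedCSVError => e
  e
end
puts error.class
```

The first parse keeps the newlines inside the address field; only the second one raises.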

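If you have no control over the program that produces the CSV, you have to open the file, read its contents, apply the substitutions, and then parse the CSV. I use __ here, but you can use any other character sequence that does not clash with the data.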
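
A minimal sketch of that read, substitute, parse sequence might look like the following. The helper names clean_export and parse_export, the placeholder parameter, and the temporary file standing in for the real export are all assumptions made for illustration, not part of the original answer.

```ruby
require 'csv'
require 'tempfile'

# Hypothetical helper: applies the two substitutions described above.
def clean_export(text, placeholder = "__")
  text
    .gsub(/" +\n/, "\"\n")          # drop spaces after a quote at line end
    .gsub(/\\"/) { placeholder }    # swap each \" for the placeholder
end

# Hypothetical helper: read the whole file, clean it, then parse it.
def parse_export(path, placeholder = "__")
  CSV.parse(clean_export(File.read(path), placeholder), headers: true)
end

# Made-up stand-in for the exported file, with trailing spaces and \" escapes.
sample = [
  '"id","name","address","email" ',
  '1,"Bob","--- ',
  '- 101 Cottage row ',
  '- \"\" ',
  '","bob@bob.bob" '
].join("\n") + "\n"

table = Tempfile.create(["exported_mysql_table", ".csv"]) do |file|
  file.write(sample)
  file.flush
  parse_export(file.path)
end

puts table[0]["address"].inspect
puts table[0]["email"]
```

Passing '""' as the placeholder uses the standard CSV escape instead, so the parsed field keeps its literal quote characters rather than a stand-in.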

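【Discussion】:

  • Thanks, it works! But I think it should be gsub instead of gsub!
  • You can use ml.gsub!(/\" \n/,"\"\n").gsub(/\\\"/,"__"), but the first bang is necessary; otherwise you have to use the less Ruby-like ml = ml.gsub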