【Question title】: Java: How to change only the field delimiter and not the actual data
【Posted】: 2019-01-14 05:36:28
【Question】:

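I have an input CSV whose data is enclosed in double quotes, with a comma (,) as the field delimiter. The sample below has four columns and two data rows: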

"Id","Description","LastModifiedDate","Quantity"
"101","this is a test message - "","" how are you, where are you from","2018-01-13","15.0"
"102","this is line break msg , "2019-01-01","13.0"
 where data goes to next line"

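I only want to change the field delimiter from a comma (,) to a caret (^), so while reading lines from the input CSV I wrote line.replace("\",\"", "\"^\""); which gives the actual result below: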

"Id"^"Description"^"LastModifiedDate"
"101"^"this is a test message - ""^"" how are you, where are you from"^"2018-01-13"^"15.0"
"102"^"this is line break msg ^ "2019-01-01"^"13.0"
 where data goes to next line"

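The problem is that the replace code above also turns the "," sequences inside the data into "^", which I don't want. The expected output is: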

"Id"^"Description"^"LastModifiedDate"
"101"^"this is a test message - "","" how are you, where are you from"^"2018-01-13"^"15.0"
"102"^"this is line break msg ^ "2019-01-01"^"13.0"
 where data goes to next line"

As far as I know this can be handled with a Java regular expression, but unfortunately I am not very good with regex, so any help is much appreciated.

Update

         Regex1  : replaceAll("\",\"(?!\"\")", "\"^\"");

        Example1,
     "Id","Description","LastModifiedDate","Quantity"  -- header
     "101","hello-this,is test data"",""testing","2018-10-01","\"  -- input row1
    "101"^"hello-this,is test data""^""testing"^"2018-10-01"^"\"  -- post Regex1
     "101"^"hello-this,is test data"",""testing"^"2018-10-01"^"\"  -- expected

 In the first row, if the data contains "","", it still gets replaced with ""^"".


     Example2, 
       "Id","Description","LastModifiedDate","Quantity"  -- header 
       "102","""text in double quotes""","13.2" -- input row2
       "102","""text in double quotes"""^"13.2"  -- post with only Regex1
        "102"^""text in double quotes""^"13.2"  --  expected result

 So I tried one more regex after Regex1 for the second-row scenario:
Regex 2:  replaceAll(",\"\"\"(?!\"\")", "^\"\""); 

      Regex 2 together with Regex1 partially worked, but the row 1 issue is still not resolved.

Can all of these scenarios be handled in a single replaceAll? Multiple replaceAll calls would be fine too.

【Comments】:

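  • That is when you use a CSV parser.
  • Do you mean OpenCSV? Can you share an example of how to do that?
  • What about unquoted fields? Line breaks inside fields? Are there any?
  • @WiktorStribiżew All fields will always be enclosed in double quotes. Yes, a few fields contain line breaks.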
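
The parser suggestion in the comments can be approximated without any library. The following is a minimal sketch, not code from the thread: it assumes every field is double-quoted and that a quote inside a field is written as a doubled quote (""), as in the sample rows. The class and method names (`DelimiterSwap`, `swapDelimiter`) are made up for illustration.

```java
final class DelimiterSwap {

    /**
     * Replaces the delimiter only where it appears outside double-quoted fields.
     * Assumption: quotes inside a field are doubled (""), as in the sample rows.
     */
    static String swapDelimiter(String csv, char from, char to) {
        StringBuilder out = new StringBuilder(csv.length());
        boolean inQuotes = false;
        for (int i = 0; i < csv.length(); i++) {
            char c = csv.charAt(i);
            if (c == '"') {
                // Every quote toggles the state. A doubled quote toggles twice,
                // so the state after the pair is the same as before it.
                inQuotes = !inQuotes;
                out.append(c);
            } else if (c == from && !inQuotes) {
                out.append(to);   // delimiter between fields
            } else {
                out.append(c);    // data, including commas and line breaks in fields
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        String row = "\"101\",\"this is a test message - \"\",\"\" how are you, where are you from\",\"2018-01-13\",\"15.0\"";
        System.out.println(swapDelimiter(row, ',', '^'));
    }
}
```

Because fields may contain line breaks, this sketch expects the whole file content (or at least whole records) as one string. When reading line by line, the `inQuotes` flag would have to be carried over from one line to the next.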

Tags: java regex csv


【Solution 1】:

I think this will work for you:

    text = text.replaceAll("\",\"(?!\")", "\"^\"");

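The pattern \",\"(?!\") matches the sequence "," only when it is not followed by another " (the (?!\") part is a negative lookahead), so the "","" sequence inside the data is left alone.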
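
To see what the lookahead does and does not cover, the regex can be run against the two rows from the question. This is a small check written for illustration (the class name `LookaheadDemo` is made up). It suggests that the first row comes out as expected, while a field that begins with a doubled quote, as in Example 2, still keeps its leading comma.

```java
final class LookaheadDemo {

    // The answer's pattern: the three characters "," not followed by another quote.
    static String swap(String text) {
        return text.replaceAll("\",\"(?!\")", "\"^\"");
    }

    public static void main(String[] args) {
        String row1 = "\"101\",\"this is a test message - \"\",\"\" how are you, where are you from\",\"2018-01-13\",\"15.0\"";
        String row2 = "\"102\",\"\"\"text in double quotes\"\"\",\"13.2\"";

        // "101"^"this is a test message - "","" how are you, where are you from"^"2018-01-13"^"15.0"
        System.out.println(swap(row1));

        // "102","""text in double quotes"""^"13.2"   (first delimiter is not replaced)
        System.out.println(swap(row2));
    }
}
```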

【Comments】:

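  • This still doesn't work correctly for all columns. It gives me output like this: "Id"^"Description"^"LastModifiedDate"^"Quantity" "101"^"this is a test message - "","" how are you, where are you from",""^"2018-01-13"^"15.0" Why are there a comma and two double quotes before the caret in front of the date value? I only want the caret. It seems the regex is only partially working.
  • Sorry, your output has "^"15.0" but I don't see that in the question. Maybe I misunderstood you. To verify, here is my sample input and output; tell me if anything is wrong. Input: "Id","Description","LastModifiedDate" "101","this is a test message - "","" how are you, where are you from","2018-01-13" After the regex: "Id"^"Description"^"LastModifiedDate" "101"^"this is a test message - "","" how are you, where are you from"^"2018-01-13" Is that right?
  • I added an extra column, Quantity, whose value is 15.0. The number of columns is dynamic; it will not always be a fixed count of 3. Apart from that, your result after the regex is correct and is what I expect.
  • With the fourth column added, I guess the input becomes: "Id","Description","LastModifiedDate","Quantity" "101","this is a test message - "","" how are you, where are you from","2018-01-13","15.0". After the regex I tried, it is: "Id"^"Description"^"LastModifiedDate"^"Quantity" "101"^"this is a test message - "","" how are you, where are you from"^"2018-01-13"^"15.0". Is that wrong?
  • Yes, that is exactly what I need.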
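
On the question of whether one replaceAll can cover every scenario: a commonly used alternative, not proposed in this thread, is to replace a comma only when an even number of double quotes follows it up to the end of the input, which means the comma sits outside any quoted field. The sketch below assumes well-formed input with balanced quotes and the whole text held in one string; the class name `EvenQuoteSwap` is made up for illustration.

```java
import java.util.regex.Pattern;

final class EvenQuoteSwap {

    // A comma followed by zero or more complete pairs of quotes and then
    // no further quote until the end of the input (\z).
    private static final Pattern DELIMITER =
            Pattern.compile(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*\\z)");

    static String swap(String csv) {
        return DELIMITER.matcher(csv).replaceAll("^");
    }

    public static void main(String[] args) {
        String row1 = "\"101\",\"this is a test message - \"\",\"\" how are you, where are you from\",\"2018-01-13\",\"15.0\"";
        String row2 = "\"102\",\"\"\"text in double quotes\"\"\",\"13.2\"";
        System.out.println(swap(row1));
        System.out.println(swap(row2));
    }
}
```

Each candidate comma rescans the rest of the text, so the cost grows roughly quadratically with input size. For large files a single-pass scan or a CSV parser is likely the better fit.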