Ignoring commas as thousands separators in CSV files

【Question title】: Ignoring commas as thousands separators in CSV files
【Posted】: 2019-01-21 17:24:56
【Question】:

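I have several datasets with many rows like the one in the data.frame df below.

Ultimately, what I really need is the integer at the end of the string, after the comma that sits outside the double quotes. The commas used as thousands separators make this awkward, though.

It would be useful to keep the row label that goes with each count (i.e. $5,000 - $9,999), but I can do without it.

The code below returns the row label and the count in the same column.

Thanks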

library(tidyverse)
text<-'"Text / some other text / some other text / $5,000-$9,999", 10,000.00'
df<-data.frame(text=text)
df %>% 
  separate(., text, into=c('a', 'b', 'c', 'd'), sep='/')

【Comments】:

Tags: r csv tidyverse

【Answer 1】:

How about a second separate()?

df %>% 
  separate(., text, into=c('a', 'b', 'c', 'd'), sep='/') %>%
  separate(d, into = c("d", "e"), sep = "\", ")
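
After the second separate(), column e still holds the count as a character string ("10,000.00"). The following is only a sketch of one way to finish the job, assuming readr (part of the tidyverse) is installed; parse_number() drops the grouping commas under the default locale. The column names label and count are illustrative and not part of the original answer.

```r
library(dplyr)
library(tidyr)
library(readr)

text <- '"Text / some other text / some other text / $5,000-$9,999", 10,000.00'
df <- data.frame(text = text, stringsAsFactors = FALSE)

result <- df %>%
  separate(text, into = c("a", "b", "c", "d"), sep = "/") %>%
  # split the last piece at the closing quote followed by a comma and a space
  separate(d, into = c("d", "e"), sep = "\", ") %>%
  mutate(
    label = trimws(d),        # row label without the leading space
    count = parse_number(e)   # numeric count, grouping commas ignored
  )

result[, c("label", "count")]
```

Keeping label next to count preserves the row label the question asked about, while count can be used directly in arithmetic.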

【Comments】:

Thanks, that's neat.

【Answer 2】:

You can use base R's regular expression functions to do this.

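library(dplyr)  # mutate() comes from dplyr, not tidyr
text <- '"Text / some other text / some other text / $5,000-$9,999", 10,000.00'
df <- data.frame(text = text, stringsAsFactors = FALSE)
df %>%
  # grab the trailing number: a space, a digit, then everything up to the end of the string
  mutate(my_number = unlist(regmatches(text, gregexpr(' [0-9](.*)$', text)))) %>%
  # gsub() removes every thousands separator; sub() would remove only the first one
  mutate(my_number = as.integer(gsub(',', '', my_number))) %>%
  head()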


                                                                   text my_number
1 "Text / some other text / some other text / $5,000-$9,999", 10,000.00     10000
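
If the row label should be kept as well, the same idea can be sketched in base R alone with regexec(), which returns capture groups. The pattern below is an assumption about the line layout (one quoted field, a comma, then a single number with optional grouping commas and decimals); the names pattern, parts, fields, label and count are illustrative.

```r
text <- '"Text / some other text / some other text / $5,000-$9,999", 10,000.00'

# Group 1: everything inside the double quotes.
# Group 2: the number after the comma that follows the closing quote.
pattern <- '^"(.*)", *([0-9][0-9,]*(\\.[0-9]+)?) *$'
parts <- regmatches(text, regexec(pattern, text))[[1]]

# The label is the last " / "-separated piece of the quoted field.
fields <- strsplit(parts[2], " / ", fixed = TRUE)[[1]]
label <- fields[length(fields)]

# Remove every thousands separator before converting to a number.
count <- as.numeric(gsub(",", "", parts[3], fixed = TRUE))

label
count
```

Anchoring the pattern on the closing quote means the commas inside the quoted label never get confused with the field separator.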

【Comments】: