【Question title】: Removing commas from strings and numbers
【Posted】: 2018-02-05 03:06:57
【Question】:

Using R, how can I remove commas when a value is a number, and replace commas with a space when the value is text?

  Company        | Sales   |
  -------------------------
  go, go, llc    |2,550.40 |
  tires & more   |  500    |
  l-m tech       |1,000.67 |

Sample data:

data = matrix(c('go, go,llc', 'tires & more', 'l-m technology',
 formatC(2550.40, format="f", big.mark=",", digits=2), 500, 
 formatC(1000.67, format="f", big.mark=",", digits=2)), 
 nrow=3, 
 ncol=2)

Expected output:

  Company      | Sales  |
  -----------------------
  go go llc    |2550.40 |
  tires & more |  500   |
  l-m tech     |1000.67 |

What I tried:

data <- sapply(data, function(x){
           if (grepl("[[:punct:]]",x)){
              if (grepl("[[:digit:]]",x)){
                 x <- gsub(",","",x)
              }
              else{
                 x <- gsub(","," ",x)
              }
           }
        })

print(nrow(data)) # returns NULL

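【Comments】:

  • Is it the case that the Company column always contains letters and the Sales column always contains numbers?
  • No, some company names contain digits.

Tags: r comma string-substitution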


【Solution 1】:

You can do this easily with nested gsub calls:

gsub(",", "", gsub("([a-zA-Z]),", "\\1 ", input))

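The inner pattern matches a letter followed by a comma and replaces the pair with that letter plus a space. The outer gsub then deletes every remaining comma, which covers the thousands separators in the numbers.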
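
To make the two passes easier to follow, here is a minimal sketch that runs them one at a time. The `input` vector holds illustrative values chosen for this example, and `step1` and `step2` are names introduced here, not part of the original answer.

```r
# Illustrative values (assumed for this sketch)
input <- c("go, go,llc", "2,550.40", "500")

# Pass 1: a comma directly after a letter becomes a space;
# \\1 puts the captured letter back.
step1 <- gsub("([a-zA-Z]),", "\\1 ", input)

# Pass 2: any comma still present (e.g. a thousands separator) is deleted.
step2 <- gsub(",", "", step1)

print(step1)
print(step2)
```

After the first pass the numeric strings are unchanged, because their commas follow digits rather than letters; only the second pass touches them.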

Applying it to your matrix:

    apply(data, 2, function(x) gsub(",", "", gsub("([a-zA-Z]),", "\\1 ", x)))
    #      [,1]             [,2]     
    # [1,] "go  go llc"     "2550.40"
    # [2,] "tires & more"   "500"    
    # [3,] "l-m technology" "1000.67"
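
Note that "go  go llc" keeps a double space, because the original comma was already followed by a space. If single spaces are wanted, as in the expected output, one possible follow-up is to collapse runs of spaces afterwards. This is only a sketch: `clean_commas` is a hypothetical helper name, and the Sales values are written out as literal strings rather than built with formatC.

```r
# clean_commas is a hypothetical helper, not part of the original answer
clean_commas <- function(x) {
  x <- gsub("([a-zA-Z]),", "\\1 ", x)  # comma after a letter -> space
  x <- gsub(",", "", x)                 # remaining commas -> removed
  gsub(" +", " ", x)                    # runs of spaces -> single space
}

# Same shape as the question's matrix, with the formatted numbers as literals
data <- matrix(c("go, go,llc", "tires & more", "l-m technology",
                 "2,550.40", "500", "1,000.67"),
               nrow = 3, ncol = 2)

# apply() over columns returns a character matrix of the same dimensions
cleaned <- apply(data, 2, clean_commas)
print(cleaned)
```

Because the first pattern only looks at letters, a comma that follows a digit inside a company name is removed rather than replaced with a space, so names containing digits may need a different rule.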
