Remove a character from elements in a data frame

[Question title]: Remove a character from elements in a dataframe
[Posted]: 2017-07-26 01:32:44
[Question]:

I have a set of data in which some elements are preceded by "<", for example:

Background: 18 <10 27 22 <3

Site: 30 44 23 <16 13

I built the data frame with x=read.file and then tried gsub("<","",x) to remove the "<". The result was completely unexpected, at least to me. This is what I got:

[1] "1:2"       "c(18, 30)" "1:2"       "c(27, 23)" "c(2, 1)"   "1:2"

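I don't know what this means or why it happens. I would greatly appreciate an explanation of what is going on here and how I should go about achieving my goal.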

[Comments]:

gsub does not work directly on a data.frame. x[] <- lapply(x, gsub, pattern="<", replacement="") is probably what you want.
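To make the comment above concrete, here is a minimal sketch of why gsub gives that odd result and how the column-wise version differs. The data frame and its column names (V1, V2, V3) are made up for illustration; they are not the asker's actual data. gsub coerces its input with as.character, and for a data frame that turns each whole column into one deparsed string. In R versions before 4.0, read.table converted strings to factors by default, and deparsing a factor column shows its integer codes, which is probably where values like "1:2" and "c(2, 1)" come from.

```r
# Hypothetical data laid out like the question: one row per group,
# with "<" marking some values. Strings are kept as character, not factor.
x <- data.frame(V1 = c("Background:", "Site:"),
                V2 = c(18, 30),
                V3 = c("<10", "44"),
                stringsAsFactors = FALSE)

# gsub() calls as.character() on its input. For a data frame this yields
# one deparsed string per *column*, not one string per cell.
collapsed <- gsub("<", "", x)
print(collapsed)

# Applying gsub() column by column keeps the data frame shape.
# Assigning into x[] replaces the contents of the existing columns.
x[] <- lapply(x, gsub, pattern = "<", replacement = "")
print(x)
```

Note that after this step every column is character, so a conversion such as as.numeric is still needed for the value columns.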

Tags: r dataframe gsub

[Solution 1]:

df <- read.table(header = TRUE, text = "Background   Site
                 18   30
                 <10  44
                 27   23
                 22  <16
                 <3   13", stringsAsFactors = FALSE)

You can use mutate_at to apply gsub to the variables from which you want to strip the leading < symbol (here, Background and Site).

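library(dplyr)
# Strip a leading "<" from both columns, then convert them to numeric.
# funs() is deprecated since dplyr 0.8.0, so a formula lambda is used instead.
df %>% mutate_at(vars(Background, Site),
                 ~ as.numeric(gsub("^<", "", .)))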

The output is:

  Background Site
1         18   30
2         10   44
3         27   23
4         22   16
5          3   13

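[Discussion]:

Judging only from the output of their attempt, the data looks more like x <- read.table(text = "18 <10 27 22 <3\n30 44 23 <16 13").

[Solution 2]:

Read the file with readLines, run gsub on the lines, then re-read the result with read.table. No packages are used: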

read.table(text = gsub("<", "", readLines("myfile")), as.is = TRUE)
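The one-liner above can be tried without a real file. The sketch below writes the question's two lines to a temporary file first; the temporary file stands in for "myfile" and is purely an assumption for the demo. It also passes fixed = TRUE so that "<" is matched as a literal character, which is optional here.

```r
# Write sample lines (taken from the question) to a temporary file,
# standing in for the real input file.
tmp <- tempfile()
writeLines(c("Background: 18 <10 27 22 <3",
             "Site: 30 44 23 <16 13"), tmp)

# Read the raw text, drop every "<", then parse the cleaned text as a table.
raw     <- readLines(tmp)
cleaned <- gsub("<", "", raw, fixed = TRUE)
DF      <- read.table(text = cleaned, as.is = TRUE)

str(DF)      # first column is character, the remaining columns are numeric
unlink(tmp)  # remove the temporary file
```

Because the "<" is removed before parsing, read.table sees plain numbers and no separate conversion step is needed.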

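If the data does not come from a file but is already in a data frame DF, define a clean function that cleans one column of DF and apply it to each numeric column: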

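# Strip the "<" prefix and convert the column to numeric.
clean <- function(x) as.numeric(gsub("<", "", x))
# Apply to every column except the first.
DF[-1] <- lapply(DF[-1], clean)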
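As a small usage sketch, assuming the character to strip is the "<" from the question and that the first column holds row labels: the data frame below, including the column name group, is invented for illustration.

```r
# Hypothetical data frame: the first column holds labels, the rest hold
# values that were read as character because of the "<" prefix.
DF <- data.frame(group = c("Background", "Site"),
                 V2 = c("18", "30"),
                 V3 = c("<10", "44"),
                 V4 = c("22", "<16"),
                 stringsAsFactors = FALSE)

# Remove the literal "<" and convert the result to numeric.
clean <- function(x) as.numeric(gsub("<", "", x, fixed = TRUE))

# DF[-1] selects every column except the first, so the labels are untouched.
DF[-1] <- lapply(DF[-1], clean)
str(DF)
```

If the label column is not the first one, the index in DF[-1] would have to change accordingly.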

[Discussion]: