How to filter in and filter out by column names at the same time in R

【Question title】: How to filter in and filter out by column names at the same time in R
【Posted】: 2019-07-23 19:30:41
【Question description】:

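I have a data frame in which some of the columns are named following a similar pattern (for example: first_susceptibility_test_penicillins_ampicillin_bli and first_susceptibility_test_penicillins_ampicillin_bli_s).
Depending on whether the column name ends in "_s", the variable should be a factor with a different set of levels: "Tested/Not tested" for names without the suffix, and "Sensitive/Intermediate/Resistant" for names with it.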

I tried to select the columns with which and grepl, but it does not work.

df[, which(grepl("first_susceptibility_test", names(df), ignore.case = FALSE) &
  !grepl("_s", names(df), ignore.case = FALSE))]

Is there a way to solve this?

【Comments】:

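You should provide more details. What does not work: do you get an error (if so, what is the error) or a wrong result (if so, what result do you get)?...
For the simple example given, I think df[, !grepl("_s", names(df), ignore.case=FALSE)] on its own should work.
That should work. For example, try iris[, which(grepl("Width", names(iris), ignore.case=FALSE) & !grepl("Petal", names(iris), ignore.case=FALSE))]. Are you sure your data is what you think it is?
Sorry. Silly me. Your problem is that "_s" also occurs inside "first_susceptibility_test", so you need a regular expression that anchors "_s" to the end of the string.
!grepl("_s$", names(df), perl = TRUE) where $ is the end-of-string anchor.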
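
The difference between the unanchored and the anchored pattern can be checked on a plain character vector. The following is a minimal sketch; the vector cols is hypothetical and only mimics the column names from the question.

```r
# Hypothetical column names modelled on the ones in the question
cols <- c("id",
          "first_susceptibility_test_penicillins_ampicillin_bli",
          "first_susceptibility_test_penicillins_ampicillin_bli_s")

# Columns that belong to the susceptibility-test group
is_test <- grepl("first_susceptibility_test", cols, fixed = TRUE)

# Unanchored: "_s" also matches the "_s" inside "first_susceptibility",
# so every test column is excluded
unanchored <- is_test & !grepl("_s", cols)

# Anchored: "_s$" matches only names that end in "_s"
anchored <- is_test & !grepl("_s$", cols)

cols[unanchored]  # empty: nothing survives the unanchored filter
cols[anchored]    # only the column without the "_s" suffix
```

In base R, endsWith(cols, "_s") should give the same suffix test without a regular expression, if that reads more clearly.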

Tags: r tidyverse

【Solution 1】:

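Thanks to @Stephen Henderson, I converted the numeric variables to factors and assigned the factor levels across multiple columns (223 in total). Here is the code, in case anyone runs into a similar problem: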

df[, grepl("susceptibility_test", names(df), perl = TRUE) & !grepl("_s$", names(df), perl = TRUE)] <- lapply(
  X = df[,grepl("susceptibility_test", names(df), perl = TRUE) & !grepl("_s$", names(df), perl = TRUE)],
  FUN = factor,
  levels = c(0, 1),
  labels = c("Not tested", "Tested"))

df[, grepl("susceptibility_test", names(df), perl = TRUE) & grepl("_s$", names(df), perl = TRUE)] <- lapply(
  X = df[, grepl("susceptibility_test", names(df), perl = TRUE) & grepl("_s$", names(df), perl = TRUE)],
  FUN = factor,
  levels = 0:2,
  labels = c("Sensitive", "Intermediate", "Resistant"))

I suppose the code could be made more elegant, for example with ifelse.
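
One possible way to tidy the code above is to compute each logical mask once and reuse it. This is only a sketch under assumptions: the toy data frame and the names is_test, has_s, tested and result are made up for illustration, and the real data has many more columns.

```r
# Toy data frame; column names and values are made up for illustration
df <- data.frame(
  id = 1:3,
  first_susceptibility_test_ampicillin   = c(0, 1, 1),
  first_susceptibility_test_ampicillin_s = c(0, 2, 1)
)

# Build each mask once instead of repeating the grepl calls
is_test <- grepl("susceptibility_test", names(df), fixed = TRUE)
has_s   <- grepl("_s$", names(df))
tested  <- is_test & !has_s   # "Tested / Not tested" columns
result  <- is_test & has_s    # "Sensitive / Intermediate / Resistant" columns

# df[mask] (no comma) always returns a data frame, even when only one
# column matches, so lapply iterates over columns rather than values
df[tested] <- lapply(df[tested], factor,
                     levels = c(0, 1),
                     labels = c("Not tested", "Tested"))
df[result] <- lapply(df[result], factor,
                     levels = 0:2,
                     labels = c("Sensitive", "Intermediate", "Resistant"))

str(df)
```

Indexing with df[mask] rather than df[, mask] is a deliberate choice here: with the comma form, a single matching column is dropped to a plain vector, and lapply would then run factor on each value separately.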

【Discussion】:

Could you format your code so that it fits the line width? Unfortunately, the current formatting is hard to follow.