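【Posted】: 2018-03-15 00:28:22
【Problem description】:
I am trying to replace multiple patterns in a character vector with their corresponding replacement strings. After some research I found the gsubfn package, which I believed could do what I want, but when I run the code below I do not get my expected output (see the end of the question for the actual versus expected results).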
library(gsubfn)
# Test data that we want to search through (ignoring case)
test.data <- c("1700 Happy Pl", "155 Sad BLVD", "82 Lolly ln", "4132 Avent aVe")
# A data frame containing the patterns we want to search for (again
# ignoring case) and the replacement string associated with each pattern.
frame <- data.frame(pattern = c(" Pl", " blvd", " LN", " ave"), replace = c(" Place", " Boulevard", " Lane", " Avenue"), stringsAsFactors = FALSE)
# NOTE: each pattern and each replacement starts with a space so that we
# only match terms that are their own word (for instance, if an address
# were "45 Splash Way" we would not want to replace the "pl" inside
# "Splash" with "Place").
# The paste() calls below append "($|[^a-zA-Z])" to every pattern so that a
# match must be followed by the end of the string or a non-letter. This is
# meant to skip cases like the first " Ave" in "4132 Avent aVe" (right after
# "4132"), which we do not want converted to " Avenue".
pat <- paste(paste(frame$pattern, collapse = "($|[^a-zA-Z])|"), "($|[^a-zA-Z])", sep = "")
# Here is the gsubfn call
gsubfn(x = test.data, pattern = pat, replacement = setNames(as.list(frame$replace), frame$pattern), ignore.case = TRUE)
Output received:
[1] "1700 Happy" "155 Sad" "82 Lolly" "4132 Avent"
Expected output:
[1] "1700 Happy Place" "155 Sad Boulevard" "82 Lolly Lane" "4132 Avent Avenue"
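My working theory on why this isn't working is that the matched text does not equal the names in the list I pass as the replacement argument because of case differences (for example, the match found in "155 Sad BLVD" is not == " blvd", even though it counts as a match thanks to the ignore.case argument). Can someone confirm that this is the problem, or point out what else might be going wrong, and perhaps suggest a fix that does not require me to expand my pattern vector to include every upper/lower case permutation (if possible)?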
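For comparison, here is a minimal base R sketch that avoids any name lookup altogether. It does not use gsubfn and does not confirm or refute the theory above; it only assumes that applying the patterns one at a time with `gsub()` is acceptable. The names `result` and `pat_i` are introduced just for this illustration. Each pattern is matched case-insensitively, and the trailing delimiter is captured and written back via `\\1` so it is not swallowed by the replacement.

```r
# Same data as in the question
test.data <- c("1700 Happy Pl", "155 Sad BLVD", "82 Lolly ln", "4132 Avent aVe")
frame <- data.frame(
  pattern = c(" Pl", " blvd", " LN", " ave"),
  replace = c(" Place", " Boulevard", " Lane", " Avenue"),
  stringsAsFactors = FALSE
)

result <- test.data
for (i in seq_len(nrow(frame))) {
  # Require end of string or a non-letter after the abbreviation,
  # and capture it so it can be restored in the replacement.
  pat_i <- paste0(frame$pattern[i], "($|[^a-zA-Z])")
  result <- gsub(pat_i, paste0(frame$replace[i], "\\1"), result,
                 ignore.case = TRUE)
}
result
# [1] "1700 Happy Place"  "155 Sad Boulevard" "82 Lolly Lane"     "4132 Avent Avenue"
```

Because the replacement is chosen by the loop index rather than by the matched text, the case of the match ("BLVD" versus "blvd") never has to line up with a list name. The trade-off is one pass over the vector per pattern, which should be fine for a short abbreviation table.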
【Discussion】:
Tags: r replace gsub case-sensitive