【Posted】: 2018-05-23 03:07:13
【Question】:
I have a list of phrases, and I want to replace certain words in them with similar words in case they are misspelled.
library(stringr)
a4 <- "I would like a cheseburger and friees please"
badwords.corpus <- c("cheseburger", "friees")
goodwords.corpus <- c("cheeseburger", "fries")
vect.corpus <- goodwords.corpus
names(vect.corpus) <- badwords.corpus
str_replace_all(a4, vect.corpus)
# [1] "I would like a cheeseburger and fries please"
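Everything works until the replacement finds a similar string inside another word and rewrites that word into something else.
For example, suppose I have a pattern like this:
"plea", whose correct form is "please". When I run the replacement, the word is removed and replaced with "pleased".
What I am looking for is this: if a string is already correct, it should not be modified again just because it contains a similar pattern.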
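One possible way to get this behavior, sketched here and not verified against the real corpus: wrap each misspelling in `\\b` word-boundary anchors so it only matches as a whole word. This assumes the misspellings contain no regex metacharacters and that whole-word matching is what is wanted. The names `fixed` and `checked` are only illustrative.

```r
library(stringr)

badwords  <- c("tre", "thre", "teeasing", "tesing")
goodwords <- c("tree", "three", "teasing", "testing")

# Names are the regex patterns, values are the replacements.
# \\b on both sides means the misspelling must be a whole word,
# so "tre" no longer matches inside the already-correct "tree".
vect.corpus <- setNames(goodwords, paste0("\\b", badwords, "\\b"))

string <- c("tre", "tree", "teeasing", "tesing")
fixed <- str_replace_all(string, vect.corpus)
fixed
# [1] "tree"    "tree"    "teasing" "testing"

# Same idea for the "plea" case: the correct word is left alone.
checked <- str_replace_all("fries please", c("\\bplea\\b" = "please"))
checked
# [1] "fries please"
```

Without the anchors, the pattern "tre" also matches the first three letters of "tree", which is what produces "treee".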
【Comments】:
-
Could you give an example where it goes wrong? I'm not sure I follow.
-
string <- c("tre", "tree", "teeasing", "tesing")
goodwords <- c("tree", "three", "teasing", "testing")
badwords <- c("tre", "thre", "teeasing", "tesing")
vect.corpus <- goodwords
names(vect.corpus) <- badwords
a <- str_replace_all(string, vect.corpus)
# "tree" **"treee"** "teasing" "testing"
Tags: r regex string text-mining text-processing