【Posted】: 2015-01-13 19:33:46
【Question】:
I'd like to know whether there is a fast way to find the directed intersection between two text strings, for example:
t1 <- "I have achieved my goals over the past 20 years and look forward for my next chalanges"
t2 <- " have achieved goals and look my chalanges some other words bla bla"
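Something like t1 isContainedIn t2 should return 7, because 7 of the distinct words in t1 also appear in t2. In addition, t1 and t2 are two columns in a data frame, so I need to apply the function across the whole data frame and append the result as a new column to my original data frame. This is what my data frame "data.selected" looks like: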
   keywords               title
1  Samsung UN48H6350 48"  Samsung UN48H6350 48" Full 1080p Smart HDTV 120Hz with Wi-Fi +$50 Visa Gift Card
2  Samsung UN48H6350 48"  Samsung UN48H6350 48" Full HD Smart LED TV -Bundle- (See Below for Contents)
3  Samsung UN48H6350 48"  Samsung UN48H6350 48" Class Full HD Smart LED TV -BUNDLE- See below Details
4  Samsung UN48H6350 48"  Samsung UN48H6350 48" Full HD Smart LED TV With BD-H5100 Blu-ray Disc Player
5  Samsung UN48H6350 48"  Samsung UN48H6350 48" Smart 1080p Clear Motion Rate 240 LED HDTV
6  Samsung UN48H6350 48"  Samsung UN48H6350 - 48-Inch Full HD 1080p Smart HDTV 120Hz with Wi-Fi
7  Samsung UN48H6350 48"  Samsung 6350 Series UN48H6350 48" 1080p HD LED LCD Internet TV NEW
8  Samsung UN48H6350 48"  Samsung Un48h6350af 75" 1080p Led-lcd Tv - 16:9 - Hdtv 1080p - (un75h6350afxza)
9  Samsung UN48H6350 48"  Samsung UN48H6350 - 48" HD 1080p Smart HDTV 120Hz Bundle
10 Samsung UN48H6350 48"  Samsung UN48H6350 - 48-Inch Full HD 1080p Smart HDTV 120Hz with Wi-Fi, (R#416)
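
To make the expected behaviour concrete, here is a rough base-R sketch of what such a function might look like. It is only an illustration under assumptions not stated in the question: words are split on whitespace, matching is exact and case-sensitive, and each distinct word of the first string is counted once. The helper name count_contained and the result column overlap are hypothetical, and the two-row data frame is a toy stand-in built from rows 1 and 6 above.

```r
# Hypothetical helper: count the distinct words of `a` that also occur in `b`.
# Assumes whitespace tokenization and exact, case-sensitive matching.
count_contained <- function(a, b) {
  words_a <- unique(strsplit(trimws(a), "\\s+")[[1]])
  words_b <- strsplit(trimws(b), "\\s+")[[1]]
  sum(words_a %in% words_b)
}

t1 <- "I have achieved my goals over the past 20 years and look forward for my next chalanges"
t2 <- " have achieved goals and look my chalanges some other words bla bla"
count_contained(t1, t2)  # 7

# Toy stand-in for data.selected (rows 1 and 6 of the sample above)
data.selected <- data.frame(
  keywords = c('Samsung UN48H6350 48"',
               'Samsung UN48H6350 48"'),
  title    = c('Samsung UN48H6350 48" Full 1080p Smart HDTV 120Hz with Wi-Fi +$50 Visa Gift Card',
               'Samsung UN48H6350 - 48-Inch Full HD 1080p Smart HDTV 120Hz with Wi-Fi'),
  stringsAsFactors = FALSE
)

# Apply the helper row by row and append the result as a new column
data.selected$overlap <- mapply(count_contained,
                                data.selected$keywords,
                                data.selected$title,
                                USE.NAMES = FALSE)
data.selected$overlap  # 3 2
```

The second row scores 2 rather than 3 because 48" and 48-Inch are different tokens under exact matching. Whether that is acceptable depends on how much normalization (lower-casing, stripping punctuation) is wanted before comparing.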
【Discussion】:
Tags: r nlp intersection text-mining