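[Posted]: 2020-01-14 21:37:31
[Question]:
I'm not sure whether this is possible in SAS, although I'm slowly learning that just about anything is possible in SAS...
I have a dataset of 600 patients that includes a comment variable. The comment variable holds a few sentences each patient wrote about his or her care. For example, the dataset looks like this: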
ID Comment
1 Today we have great service. everyone was really nice.
2 The customer service team did not know what they were talking about and was rude.
3 Everyone was very helpful 5 stars.
4 Not very helpful at all.
5 Staff was nice.
6 All the people was really nice.
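Suppose I have identified a few keywords I'm interested in, for example nice, rude, and helpful.
Is there a way to extract the two words that precede each of these keywords and produce a frequency table such as the following?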
WORD Frequency
Was Really Nice 2
And Was Rude 1
Was Very Helpful 1
Not very helpful 1
I have already written code that helps me identify keywords: it produces a frequency count of every word in the comment variable.
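/* Split each comment into one row per word (upper-cased) */
data PG_2 / view=PG_2;
  length word $20;
  set PG_1;
  /* SCAN returns a blank once i exceeds the number of words */
  do i = 1 by 1 until(missing(word));
    word = upcase(scan(COMMENT, i));
    if not missing(word) then output;
  end;
  keep word;
run;

/* Count each word, most frequent first */
proc freq data=PG_2 order=freq;
  table word / out=wordfreq(drop=percent);
run;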
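
One possible direction, as an untested sketch rather than a confirmed solution: the same SCAN loop could be extended so that, whenever the current word is one of the keywords, it is joined with the two words before it. The view name PG_3, the variable phrase, the output dataset phrasefreq, and the hard-coded keyword list are all assumptions made for illustration. It also relies on the default SCAN delimiters, so a phrase may span a sentence boundary, and a keyword with fewer than two words before it is skipped.

```sas
/* Hypothetical sketch: PG_3, phrase, and phrasefreq are illustrative names */
data PG_3 / view=PG_3;
  length word $20 phrase $62;
  set PG_1;
  do i = 1 by 1 until(missing(word));
    word = upcase(scan(COMMENT, i));
    /* Keyword list is an assumption; i > 2 guarantees two preceding words exist */
    if word in ('NICE', 'RUDE', 'HELPFUL') and i > 2 then do;
      phrase = catx(' ', upcase(scan(COMMENT, i - 2)),
                         upcase(scan(COMMENT, i - 1)),
                         word);
      output;
    end;
  end;
  keep phrase;
run;

/* Count each three-word phrase, most frequent first */
proc freq data=PG_3 order=freq;
  table phrase / out=phrasefreq(drop=percent);
run;
```

Because every phrase is upper-cased before counting, variants such as "was really nice" and "Was Really Nice" would fall into the same row of the frequency table.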
[Discussion]:
Tags: sas