【Posted】: 2021-08-06 19:42:10
【Question】:
I am using R and I am trying to compute my standard deviation correctly.
My data looks like this:
Target   category  wordproduced  wordValue
wall     A         home          .003
wall     A         table         .005
window   A         cow           .015
window   B         backyard      .012
friend   B         dog           .018
friend   B         chance        .088
friend   B         spoon         .002
big      C         country       .009
big      C         pen           .015
big      C         pub           .012
money    C         palace        .078
rail     C         wood          .026
rail     C         ferrari       .030
rail     C         car           .062
science  D         phone         .007
science  D         laboratory    .009
science  D         side          .019
water    D         ocean         .013
water    D         river         .020
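So I have four categories (A, B, C, D) and 8 target words in total. Each word belongs to a category.
If I want to compute the mean number of words produced per target word, I write code like this: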
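library(dplyr)

mydata %>%
  group_by(category) %>%
  summarise(TargetN = length(unique(Target)),    # number of unique target words
            wProducedN = length(wordproduced),   # number of words produced (rows)
            meanW = wProducedN / TargetN)        # words produced per unique target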
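If I compute the mean with the mean() function, I get the wrong value, because it counts every row of Target. For example, category A has only 2 unique target words but 3 rows in total, so I need to divide by 2 when computing my mean. The code above solves that problem. But when I compute the SD, I get a lot of wrong answers or NA.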
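As a check on the mean described above, here is a minimal sketch using only the category A rows from the example table. The names `catA`, `perTarget` and `nWords` are made up for illustration. It counts the words produced for each target first and then averages those counts, which should match dividing the 3 rows by the 2 unique targets.

```r
library(dplyr)

# Category A rows from the example table (catA is a hypothetical name)
catA <- data.frame(
  Target       = c("wall", "wall", "window"),
  category     = c("A", "A", "A"),
  wordproduced = c("home", "table", "cow")
)

# One row per target, with the number of words produced for it
perTarget <- catA %>%
  count(category, Target, name = "nWords")

# Mean of the per-target counts
meanPerTarget <- mean(perTarget$nWords)

# Rows divided by unique targets, as in the summarise() call above
ratio <- nrow(catA) / n_distinct(catA$Target)

meanPerTarget  # 1.5
ratio          # 1.5
```

Both routes give the same value, which suggests the per-target counts are the quantity the mean is really taken over.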
For example, I tried this:
mydata %>%
  group_by(category) %>%
  summarise(TargetN = length(unique(Target)),
            wProducedN = length(wordproduced),
            meanW = wProducedN / TargetN,
            SD = sd(length(wordproduced)))   # sd() of a single number, hence NA
Here I get NA, while with other versions of the code I get 0, or the exact number of unique targets, and so on.
How should I compute my SD?
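One possible reading of the question is that the SD should describe how the number of words produced varies across the targets within a category. That is an assumption, not something the post confirms. Under it, a minimal sketch is to count the rows per target first and then summarise those counts. `sd(length(x))` returns NA because `length()` yields a single number, and the SD of one value is undefined. The small `mydata` below reuses the category A and B rows from the example table; `nWords` and `result` are hypothetical names.

```r
library(dplyr)

# Category A and B rows from the example table
mydata <- data.frame(
  Target       = c("wall", "wall", "window", "window",
                   "friend", "friend", "friend"),
  category     = c("A", "A", "A", "B", "B", "B", "B"),
  wordproduced = c("home", "table", "cow", "backyard",
                   "dog", "chance", "spoon")
)

result <- mydata %>%
  # Step 1: one row per (category, Target) with its word count
  count(category, Target, name = "nWords") %>%
  # Step 2: summarise the per-target counts within each category
  group_by(category) %>%
  summarise(TargetN    = n(),            # unique targets in the category
            wProducedN = sum(nWords),    # total words produced
            meanW      = mean(nWords),   # same as wProducedN / TargetN
            SD         = sd(nWords))     # spread of the per-target counts

result
```

With this approach a category that contains only one target would still give NA for SD, since `sd()` needs at least two values.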
Adding reproducible data... **The categories have been changed to numbers: instead of A, B, C, D they are 1, 2, 3 (only three).
newDat <- structure(list(Target = c(
"permit",
"confusion",
"presion",
"transanction",
"sorprise",
"same",
"agony",
"prime",
"suffer",
"affect",
"car",
"neglect",
"intern",
"explore",
"image",
"pension",
"amature",
"terrified",
"importance",
"deal",
"replace",
"euforic",
"optimist",
"return",
"inmerse",
"doll",
"actor",
"singular",
"desctruction",
"dispute",
"tremor",
"profesional",
"redem",
"euforic",
"pen",
"pause",
"cultive",
"center",
"cheer",
"slace",
"recess",
"apple",
"introduction",
"despicable",
"offense",
"inteligent",
"hope",
"contender",
"stress",
"disgust"
), Category = c(
"3",
"1",
"1",
"1",
"1",
"1",
"1",
"2",
"2",
"2",
"2",
"2",
"1",
"1",
"2",
"2",
"1",
"1",
"2",
"1",
"1",
"1",
"1",
"2",
"1",
"1",
"3",
"1",
"1",
"1",
"1",
"1",
"1",
"1",
"2",
"3",
"1",
"3",
"1",
"2",
"2",
"1",
"1",
"1",
"1",
"2",
"1",
"3",
"1",
"1"
), wordproduced = c(
"liberty",
"intense",
"sad",
"serenity",
"afraid",
"sadness",
"hurt",
"freedom",
"depress",
"feeling",
"love",
"positive",
"river",
"palace",
"ilusion",
"stress",
"aliviated",
"violence",
"presion",
"damage",
"hate",
"happy",
"dwindle",
"spoon",
"kitchen",
"dog",
"backyard",
"alone",
"cat",
"confidence",
"fear",
"moving",
"house",
"ocean",
"territory",
"continent",
"sky",
"rainbow",
"approach",
"law",
"good",
"school",
"science",
"land",
"laboratory",
"engage",
"destiny",
"voice",
"arange",
"infertile"
), wordValue = c(
0.10,
0.09,
0.01,
0.1,
0.046,
0.316,
0.12,
0.03,
0.03,
0.02,
0.46,
0.19,
0.26,
0.070,
0.040,
0.01,
0.025,
0.03,
0.05,
0.089,
0.075,
0.03,
0.067,
0.04,
0.04,
0.1,
0.068,
0.055,
0.17,
0.075,
0.535,
0.06,
0.1,
0.12,
0.04,
0.08,
0.036,
0.1,
0.05,
0.050,
0.07,
0.05,
0.8,
0.05,
0.06,
0.08,
0.055,
0.04,
0.12,
0.049
)), row.names = c(NA, -50L), class = c("tbl_df",
"tbl", "data.frame"))
【Discussion】:
-
Can you provide a reproducible example?
-
I just did. Thanks for the suggestion.