【Question title】: How do I compare two datasets element by element in R?
【Posted】: 2016-12-08 06:14:04
【Question description】:

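I need to check the test results of 50 different students against an answer key for a multiple-choice test with options A, B, C, and D.

I have a one-dimensional dataset for the answer key, "answers", which I read in with answers <- read.table("A1_Ans_only.txt", header = FALSE, sep = ",")

View(answers)

I have a dataset "results" that contains all the answers of all 50 students. I read it in with results <- read.csv("Form A1_only.csv", header = FALSE)

View(results)

So when I try something like results==answers or `evaluate(results,answers)`, where evaluate is a function I wrote, defined as 'evaluate

Can someone help me evaluate each element of the results data frame to determine which questions each student answered correctly?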

This is a small sample of results: 


structure(list(V1 = c(1L, 3L, 5L), V2 = c(NA, NA, NA), V3 = structure(c(2L, 
1L, 4L), .Label = c("A", "B", "C", "D"), class = "factor"), V4 = structure(c(1L, 
1L, 1L), .Label = c("A", "B", "C", "D"), class = "factor"), V5 = structure(c(2L, 
2L, 3L), .Label = c("A", "B", "C", "D"), class = "factor"), V6 = structure(c(1L, 
1L, 1L), .Label = c("A", "B", "C"), class = "factor"), V7 = structure(c(1L, 
1L, 1L), .Label = c("A", "C", "D"), class = "factor"), V8 = structure(c(2L, 
1L, 2L), .Label = c("A", "B", "D"), class = "factor"), V9 = structure(c(1L, 
1L, 1L), .Label = c("A", "C", "D"), class = "factor"), V10 = structure(c(2L, 
2L, 1L), .Label = c("A", "B", "C"), class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"), row.names = c(NA, 
3L), class = "data.frame")


This is the sample from answers: 

structure(list(V1 = structure(1L, .Label = "AAAAKEY", class = "factor"), 
V2 = NA, V3 = structure(1L, .Label = "C", class = "factor"), 
V4 = structure(1L, .Label = "A", class = "factor"), V5 = structure(1L, .Label = "C", class = "factor"), 
V6 = structure(1L, .Label = "A", class = "factor"), V7 = structure(1L, .Label = "A", class = "factor"), 
V8 = structure(1L, .Label = "B", class = "factor"), V9 = structure(1L, .Label = "A", class = "factor"), 
V10 = structure(1L, .Label = "B", class = "factor")), .Names = c("V1", 
"V2", "V3", "V4", "V5", "V6", "V7", "V8", "V9", "V10"), class = "data.frame", row.names = c(NA, 
-1L))

【Question discussion】:

    Tags: r dataframe compare elements


    【Solution 1】:

    We can do the comparison after replicating "answers" so that the lengths are equal:

    results==answers[col(results)]
    #     V1 V2    V3   V4    V5   V6   V7    V8   V9   V10
    #1 FALSE NA FALSE TRUE FALSE TRUE TRUE  TRUE TRUE  TRUE
    #2 FALSE NA FALSE TRUE FALSE TRUE TRUE FALSE TRUE  TRUE
    #3 FALSE NA FALSE TRUE  TRUE TRUE TRUE  TRUE TRUE FALSE
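
To make the replication step visible, here is a minimal self-contained sketch. The toy data frames are made up for illustration (they are not the asker's files) and use character columns instead of factors; `idx`, `key`, and `correct` are names introduced only for this example.

```r
# Hypothetical toy data: 3 students, 4 questions (not the original files)
results <- data.frame(V1 = c("B", "A", "D"),
                      V2 = c("A", "A", "A"),
                      V3 = c("B", "B", "C"),
                      V4 = c("A", "C", "A"),
                      stringsAsFactors = FALSE)
answers <- data.frame(V1 = "C", V2 = "A", V3 = "C", V4 = "A",
                      stringsAsFactors = FALSE)

# col(results) is a 3 x 4 matrix whose entries are the column numbers
idx <- col(results)

# Indexing the one-row key with that matrix repeats each key entry
# once per student, in column-major order
key <- answers[idx]

# Element-wise comparison; the result is a logical matrix shaped like results
correct <- results == key
print(correct)
```

Each row of `correct` corresponds to one student and each column to one question, so a TRUE marks a question that student answered correctly.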
    

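    The NA in column V2 of "answers" produces NA in the output, because any equality comparison with NA yields NA. If we need FALSE there instead, either change the NA values to FALSE afterwards, or combine the comparison using & with !is.na(answers)[col(results)]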
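
Both ways of handling the NA column can be sketched as follows. The data are again made up, with an all-NA column V2 like the one in the question's sample. The per-student total via `rowSums` is an addition for illustration and is not part of the original answer.

```r
# Hypothetical toy data: column V2 is entirely NA, as in the question's sample
results <- data.frame(V1 = c("A", "A", "D"),
                      V2 = c(NA, NA, NA),
                      V3 = c("C", "B", "C"),
                      stringsAsFactors = FALSE)
answers <- data.frame(V1 = "A", V2 = NA, V3 = "C",
                      stringsAsFactors = FALSE)

# V2 is compared against NA, so that column of the result is NA
cmp <- results == answers[col(results)]

# Option 1: overwrite the NA entries with FALSE afterwards
fixed1 <- cmp
fixed1[is.na(fixed1)] <- FALSE

# Option 2: AND with a mask that is FALSE wherever the key is missing
fixed2 <- cmp & !is.na(answers)[col(results)]

# Number of correct answers per student (added for illustration)
score <- rowSums(fixed1)
print(score)
```

Option 2 works because `NA & FALSE` evaluates to FALSE in R, so the mask removes the NA entries without a separate assignment step.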

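    【Discussion】:

    • So what I get from that is a very long printout in which every observation comes back as "FALSE". What exactly is happening there?
    • @Mr.T It returns a vector. What exactly is your expected output? Here, each column of "answers" is replicated so that it has the same number of observations as each column of "results", and then the two are compared element by element.
    • I was hoping for a way to show which of each student's answers are "TRUE" or "FALSE", but it shows every entry as "FALSE". That is not accurate. How do I fix this?
    • @Mr.T Could you show a small reproducible example using dput and update your post? I can't copy from your image and test it.
    • I'm afraid I don't know how to do that. I tried `dput(results)`. What should I do next? Sorry, this is my first semester with R.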
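
For readers in the same position, here is a minimal sketch of what the dput request amounts to: `dput()` prints R code that recreates an object, so the output for a few rows can be pasted into the post as text. The data frame is a made-up stand-in for the real `results`, and `txt` and `rebuilt` are names introduced for this example.

```r
# Hypothetical stand-in for the real `results` data frame
results <- data.frame(V1 = c(1L, 3L, 5L),
                      V3 = c("B", "A", "D"),
                      stringsAsFactors = FALSE)

# Print code that rebuilds the first rows; this text goes into the question
dput(head(results, 3))

# Evaluating the printed text gives back an equivalent object
txt <- capture.output(dput(head(results, 3)))
rebuilt <- eval(parse(text = txt))
print(all.equal(rebuilt, head(results, 3)))
```

Posting only a few rows keeps the question short while still letting others run the code against the same structure.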