No pdflatex in Sweave/R/Latex given NAs introduced by coercion

【Question title】: No pdflatex in Sweave/R/Latex given NAs introduced by coercion
【Posted】: 2021-03-14 18:38:35
【Question description】:

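I am having trouble generating a PDF document with RStudio and Sweave. After running the code there is no error message and no warning message. However, when I type warnings() in the console I get a list. Here is an excerpt; the remaining warnings look exactly the same: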

 `Warning messages:
  1: In normality_test(df, i, j) : NAs introduced by coercion
  2: In normality_test(df, i, j) : NAs introduced by coercion
  3: In normality_test(df, i, j) : NAs introduced by coercion
  4: In normality_test(df, i, j) : NAs introduced by coercion
  5: In if (shapiro.test(df[, i]) > 0.05 & shapiro.test(df[,  ... :
  the condition has length > 1 and only the first element will be used
  6: In normality_test(df, i, j) : NAs introduced by coercion`

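Before I became aware of the warnings, missing values (NA) were already being discarded in the code. To get rid of the warnings I tried the command df[is.na(df)] <- 0, but it changed nothing: the same warnings persist. On the other hand, the figures are generated as expected. All of the code that triggers the warnings above works perfectly when run in RStudio, but it does not go through when compiled with Sweave. This seems contradictory and strange. I have tried for hours without success. Do you know how to solve this?

I am using the penguins dataset. Here is the code used: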

df <- read.csv("penguins.csv")
str(df)
#We transform the character variables type into factor ones
i <- sapply(df, is.character)
df[,i] <- lapply(df[,i], as.factor)
df[,8] <- as.factor(df[,8])
str(df)

normality_test <- function(df, i, j) {
    df <- df[!is.na(df[,i]) & !is.na(df[,j]), ]
    plot(c(0, 1), c(0, 1), ann = F, bty = 'n', type = 'n', xaxt = 'n', yaxt = 'n')
    if (shapiro.test(df[,i]) > 0.05 & shapiro.test(df[,j]) > 0.05) {
        res1 <- cor.test(df[,i], df[,j],
                         method = "pearson")
        text(.5, .5, paste("p.value:", round(res1$p.value, 2), "\n r:", round(res1$estimate, 2)))
    } else {
        res2 <- cor.test(df[,i], df[,j],
                         method = "spearman")
        text(.5, .5, paste("p.value:", round(res2$p.value, 2), "rho:", round(res2$estimate, 2)))
    }
}
#We define the density function to include diagonal elements
hist_density <- function(df, i) {
    tmp <- na.omit(df[,i])
    hist(tmp, col = "light blue",
         probability = TRUE, main = NULL)
    lines(density(tmp), col = "red", lwd = 1.5)
}

new_pairs <- function(df, x) {
    par(mar = c(1, 1, 1, 1))
    n_col <- sum(sapply(df, is.numeric))
    par(mfrow = c(n_col, n_col))
    n <- ncol(df)
    for (i in 1:n) {
        for (j in 1:n) {
            if ((class(df[,i]) != "factor") & (class(df[,j]) != "factor") & i < j) {
                plot(df[,i], df[,j], col = df[,x])
            } else if ((class(df[,i]) != "factor") & (class(df[,j]) != "factor") & i == j) {
                hist_density(df, i)
            } else if ((class(df[,i]) != "factor") & (class(df[,j]) != "factor") & i > j) {
                normality_test(df, i, j)
            } else {
                NA
            }
        }
    }
}


  new_pairs(df, 2)

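【Question comments】:

I think the warning arises because you are using if/else on the whole column df[, i], and if/else is not vectorized, i.e. it expects a single TRUE/FALSE as input. You may need ifelse.
Can you show the full code? When you write shapiro.test(df[, i]) > 0.05 you are testing the p-value. In that case you need to extract the p-value, i.e. shapiro.test(df[, i])$p.value > 0.05, because the output is a list.
Yes, in that code you are comparing class, which returns a single value. Here, shapiro.test(df[,i]) returns a list. So if it is the p-value you want, you need to extract the relevant value, as in the logical expression from my previous comment.
I will do that right now. Nice talking to you again. I will let you know the result in a minute.
If it works in a single chunk, the error must be related to some other code. Can you compile it as a separate test file to confirm?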
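
The distinction raised in these comments between if and ifelse can be illustrated with a short base R sketch. The vector p holds made-up p-values and is not taken from the penguins data. if () expects exactly one TRUE/FALSE, whereas ifelse() works element by element. In R 4.2.0 and later a condition of length greater than one is an error rather than a warning.

```r
# Hypothetical p-values, for illustration only
p <- c(0.20, 0.01, 0.70)

# ifelse() is vectorized: it returns one result per element
labels <- ifelse(p > 0.05, "pearson", "spearman")
print(labels)

# if () needs a single TRUE/FALSE, so reduce the condition first
if (all(p > 0.05)) {
  method <- "pearson"
} else {
  method <- "spearman"
}
print(method)
```

Reducing the condition with all() (or any()) is one way to obtain a single logical value; which reduction is appropriate depends on the intent.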

Tags: r latex

【Solution 1】:

Based on the code, if the intent is to check the p.value from shapiro.test, extract that component with $ or [[, because the output of shapiro.test is a list.
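
As a minimal sketch of that extraction, the snippet below runs shapiro.test on simulated data (an assumption for illustration, not the penguins columns) and pulls out the p-value both ways. Comparing the whole result against 0.05 would involve the non-numeric components as well, which is most likely where the "NAs introduced by coercion" warning originates.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
x <- rnorm(50)               # simulated data, not the penguins columns

res <- shapiro.test(x)
print(class(res))            # an "htest" object, which is a list
print(names(res))            # components include "statistic" and "p.value"

p1 <- res$p.value            # extraction with $
p2 <- res[["p.value"]]       # equivalent extraction with [[

# The p-value is a single number, so it is safe to use inside if ()
is_scalar <- is.numeric(p1) && length(p1) == 1
print(is_scalar)
```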

normality_test <- function(df, i, j) {
    df <- df[!is.na(df[,i]) & !is.na(df[,j]), ]
    plot(c(0, 1), c(0, 1), ann = F, bty = 'n', type = 'n',
         xaxt = 'n', yaxt = 'n')
    if (shapiro.test(df[,i])$p.value > 0.05 &
        shapiro.test(df[,j])$p.value > 0.05) {
        res1 <- cor.test(df[,i], df[,j],
                         method = "pearson")
        text(.5, .5, paste("p.value:", round(res1$p.value, 2),
                           "\n r:", round(res1$estimate, 2)))
    } else {
        res2 <- cor.test(df[,i], df[,j],
                         method = "spearman", exact = FALSE)
        text(.5, .5, paste("p.value:", round(res2$p.value, 2),
                           "rho:", round(res2$estimate, 2)))
    }
}

# We define the density function to include diagonal elements
hist_density <- function(df, i) {
    tmp <- na.omit(df[,i])
    hist(tmp, col = "light blue",
         probability = TRUE, main = NULL)
    lines(density(tmp), col = "red", lwd = 1.5)
}
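
The Spearman call above also passes exact = FALSE, which the question's version did not. A plausible reason, sketched below under stated assumptions, is that cor.test with method = "spearman" warns when it cannot compute an exact p-value for data with ties. The vectors x and y are made up, and warned is a hypothetical helper that only records whether a call emitted a warning.

```r
# Made-up vectors that contain ties
x <- c(1, 2, 2, 3, 4, 5, 5, 6)
y <- c(2, 1, 3, 3, 5, 4, 6, 6)

# Hypothetical helper: TRUE if evaluating expr emitted at least one warning
warned <- function(expr) {
  flag <- FALSE
  withCallingHandlers(
    expr,
    warning = function(w) {
      flag <<- TRUE
      invokeRestart("muffleWarning")
    }
  )
  flag
}

default_warns <- warned(cor.test(x, y, method = "spearman"))
approx_warns  <- warned(cor.test(x, y, method = "spearman", exact = FALSE))
print(c(default = default_warns, exact_false = approx_warns))
```

With exact = FALSE the p-value comes from an asymptotic approximation, so the tie-related warning should not appear.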

- Create the new_pairs function that uses the functions above

new_pairs <- function(df, x) {
    par(mar = c(1, 1, 1, 1))
    n_col <- sum(sapply(df, is.numeric))
    par(mfrow = c(n_col, n_col))
    n <- ncol(df)
    for (i in 1:n) {
        for (j in 1:n) {
            if ((class(df[,i]) != "factor") & (class(df[,j]) != "factor") & i < j) {
                plot(df[,i], df[,j], col = df[,x])
            } else if ((class(df[,i]) != "factor") &
                       (class(df[,j]) != "factor") & i == j) {
                hist_density(df, i)
            } else if ((class(df[,i]) != "factor") &
                       (class(df[,j]) != "factor") & i > j) {
                normality_test(df, i, j)
            } else {
                NA
            }
        }
    }
}

- Test

new_pairs(iris, 2)

【Comments】: