print data.frame column and color by type: answers

【Question title】: print data.frame column and color by type
【Posted】: 2016-01-06 17:35:55
【Question】:

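In my knitr document, I am trying to print a column of a data frame. To aid visualization, I would like the output color to change based on the value of another column. A simple example follows.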

date_vector <- rep(NA, 10)
type_vector <- rep(NA, 10)
types <- c("A", "B")
CDate <- Sys.Date()
date_vector[1] <- as.character(CDate)
type_vector[1] <- sample(types, size = 1)
for (i in 2:10) {
  CDate <- as.Date(CDate) + rexp(n = 1, rate = 1/5)
  date_vector[i] <- as.character(CDate)
  type_vector[i] <- sample(types, size = 1)
}

test_df <- data.frame(Date=date_vector, Type=type_vector)

When I print test_df$Date, I see the following:

date_vector
[1] "2016-01-06" "2016-01-07" "2016-01-22" "2016-01-28" "2016-01-29" "2016-02-01" "2016-02-04"
[8] "2016-02-12" "2016-02-13" "2016-02-15"

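Instead, I would like to see the same output with each date colored according to the type of its entry. The types are as follows: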

type_vector
[1] "A" "A" "B" "B" "A" "A" "B" "A" "B" "A"

So blue would mark the dates of type A and green the dates of type B.

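【Question comments】:

This question does not seem to be about LaTeX. Note that questions about R and other tools are on-topic only insofar as they relate to LaTeX, which does not seem to be the case here.
Thanks. The question is also related to knitr, which is why I posted it here. Can I migrate this question to Stack Overflow, or do I need to retype it?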

Tags: r latex knitr

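【Solution 1】:

This answer is more general than the question. The question asks for a way to color one column of a data frame according to another column. This answer addresses the more general case of highlighting elements of a vector, depending on a second, logical vector that indicates which elements to highlight.

In principle, this is simple: print a vector and highlight the elements indicated by another logical vector. Highlighting x can be as simple as wrapping it in \\textcolor{blue}{x} or \\emph{x}.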

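In practice, it is not that simple... print(x) does many useful things: it arranges the data neatly in columns, adds quotes around character data, wraps the output to respect getOption("width"), prefixes each output line with the index of its first element, and so on. The problem is that we cannot use print to print highlighted data, because print escapes the backslash in \\textcolor. The standard solution to this problem is to use cat instead of print. However, cat applies none of the nice formatting listed above.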
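The difference between print and cat described above can be seen in a minimal sketch. The date string is only an illustrative value, not taken from the answer:

```r
# An R string holding the LaTeX markup \textcolor{blue}{2016-01-06}.
x <- "\\textcolor{blue}{2016-01-06}"

# print() shows the string as R source: quoted, with the backslash escaped.
print(x)
# [1] "\\textcolor{blue}{2016-01-06}"

# cat() writes the raw characters, which is what LaTeX needs to see.
cat(x, "\n", sep = "")
# \textcolor{blue}{2016-01-06}
```

Note that cat also drops the index prefix and the quotes, which is why the formatting has to be rebuilt by hand.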

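The challenge is therefore to write a function that reproduces some of the features of print. This is a fairly complex task, so I restrict myself to the following main features:

Total line width <= getOption("width").
Automatically adds quotes around non-numeric and non-logical values (if quote is not set).
Prefixes each output line with the index of its first element (if printIndex = TRUE).
Applies rounding to numeric input (digits).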

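In addition, there are these two highlighting features:

Wraps the elements of x indicated by condition in a "highlighting pattern".
Does not count the highlighting pattern when computing the line width. This assumes that highlighting adds only markup and no visible output.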
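The idea behind the two highlighting features can be sketched separately from the full function. This is a simplified illustration, not the answer's implementation: wrapLines is a hypothetical helper, and the sketch ignores quotes and the index prefix. Line breaks are decided on the unmarked text, and the markup is added only afterwards.

```r
# Hypothetical helper: greedily assign each element of x to an output line
# so that the visible text (elements plus single spaces) fits into `width`.
wrapLines <- function(x, width) {
  line <- integer(length(x))
  current <- 1L
  used <- 0L                      # visible characters used in the current line
  for (i in seq_along(x)) {
    w <- nchar(x[i])
    # A separating space is needed unless this is the first element of a line.
    needed <- if (used == 0L) w else used + 1L + w
    if (used > 0L && needed > width) {
      current <- current + 1L     # element does not fit: start a new line
      needed <- w
    }
    used <- needed
    line[i] <- current
  }
  line
}

x <- c("alpha", "beta", "gamma", "delta")
condition <- c(FALSE, TRUE, FALSE, FALSE)

# Widths are measured before any markup is added.
line <- wrapLines(x, width = 11)

# Add the highlighting pattern only to the selected elements.
marked <- x
marked[condition] <- sprintf("\\emph{%s}", x[condition])

for (k in unique(line)) cat(marked[line == k], "\n")
```

With a width of 11, the first two elements share a line although the marked-up text "alpha \emph{beta}" is longer than 11 characters, because only the visible text is counted.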

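Note that this function lacks important features of print, such as handling missing values. In addition, it converts the input x to character (via as.character). The result may therefore differ from that of print, because the S3 methods corresponding to the class of the input (print.*) are not used at all.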
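One way the missing-value limitation shows up can be sketched as follows. The vector is an illustrative value, and the quoting step mirrors the sprintf('"%s"', ...) approach rather than calling the function itself:

```r
x <- c("a", NA)

# print() distinguishes the missing value: NA is shown without quotes.
print(x)

# Quoting via sprintf() after as.character() treats NA like ordinary text.
quoted <- sprintf('"%s"', as.character(x))
cat(quoted, "\n")
# "a" "NA"
```

A reader of the second output cannot tell a missing value from the literal string "NA".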

printHighlighted <- function(x, condition = rep(FALSE, length(x)), highlight = "\\emph{%s}", printIndex = TRUE, width = getOption("width"), digits = getOption("digits"), quote = NULL) {

  stopifnot(length(x) == length(condition))
  stopifnot(missing(digits) || (!missing(digits) && is.numeric(x))) # Raise error when input is non-numeric but "digits" supplied.

  if (missing(quote)) {
    if (is.numeric(x) || is.logical(x)) {
      quote <- FALSE
    } else {
      quote <- TRUE
    }
  }

  nquotes <- 0

  if (!printIndex) {
    currentLineIndex <- ""
  }

  if (is.numeric(x)) {
    x <- round(x, digits = digits)
  }

  fitsInLine <- function(x, elementsCurrentLine, currentLineIndex, nquotes, width) {
    return(sum(nchar(x[elementsCurrentLine])) + # total width of elements in current line
             nchar(currentLineIndex) + # width of the index of the first element (if shown)
             sum(elementsCurrentLine) - 1 + # width of spaces between elements
             nquotes <= # width of quotes added around elements
             width)
  }

  x <- as.character(x)
  elementsCurrentLine <- rep(FALSE, times = length(x))


  for (i in seq_along(x)) {

    if (!any(elementsCurrentLine) && printIndex) { # this is a new line AND show index
      currentLineIndex <- sprintf("[%s] ", i)
    }

    elementsCurrentLine[i] <- TRUE # Add element i to current line. Each line holds at least one element. Therefore, if i is the first element of this line, add it regardless of line width. If there already are elements in the line, the previous loop iteration checked that this element will fit.

    if (i < length(x)) { # not the last element

      # check whether next element will fit in this line
      elementsCurrentLineTest <- elementsCurrentLine
      elementsCurrentLineTest[i + 1] <- TRUE

      if (quote) {
        nquotes <- sum(elementsCurrentLineTest) * 2
      }

      if (fitsInLine(x, elementsCurrentLineTest, currentLineIndex, nquotes, width)) {
        next # Next element will fit; do not print yet.
      }
    }

    # Next element won't fit in current line. Print and start a new line.

    # print
    toPrint <- x[elementsCurrentLine]
    toMarkup <- condition[elementsCurrentLine]

    toPrint[toMarkup] <- sprintf(fmt = highlight, toPrint[toMarkup]) # add highlighting

    if (quote) {
      toPrint <- sprintf('"%s"', toPrint)
    }

    cat(currentLineIndex)
    cat(toPrint)
    cat("\n")

    # clear line
    elementsCurrentLine <- rep(FALSE, times = length(x))
  }
}

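To use this function with knitr, the chunk option results = "asis" must be set. Otherwise the output is wrapped in a verbatim environment, in which the markup responsible for highlighting is displayed rather than interpreted.

Finally, to reproduce the appearance of a normal chunk, wrap the whole chunk in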

\begin{knitrout}
\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}
\begin{kframe}
\begin{alltt}
<<your-chunk>>=
printHighlighted(...)
@
\end{alltt}
\end{kframe}
\end{knitrout}

Example

To save some space, the example assumes that the function definition of printHighlighted is available in the file printHighlighted.R.

\documentclass{article}
\begin{document}

Some text ....

\begin{knitrout}\definecolor{shadecolor}{rgb}{0.969, 0.969, 0.969}\color{fgcolor}\begin{kframe}\begin{alltt}
<<results = "asis", echo = FALSE>>=
source("printHighlighted.R")
data <- seq(from = as.Date("2015-01-15"), by = "day", length.out = 100)
cond <- rep(FALSE, 100)
cond[c(3, 55)] <- TRUE

printHighlighted(x = data, condition = cond, highlight = "\\textcolor{blue}{%s}", width = 60)
@
\end{alltt}\end{kframe}\end{knitrout}

Some text ....

\end{document}
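The example above uses a single highlighting pattern, whereas the question asks for one color per type. A minimal sketch of that case follows. It is not part of the answer's function, so it has no line wrapping, quotes, or index prefix. typeColors is a hypothetical lookup table, the dates and types are taken from the question's sample output, and \textcolor is assumed to be available through the color or xcolor package.

```r
dates <- c("2016-01-06", "2016-01-07", "2016-01-22", "2016-01-28")
types <- c("A", "A", "B", "B")

# Hypothetical lookup table: one LaTeX color name per type.
typeColors <- c(A = "blue", B = "green")

# Pick the color for each entry by name, then wrap each date in \textcolor.
colored <- sprintf("\\textcolor{%s}{%s}", unname(typeColors[types]), dates)

# cat() writes the backslashes unescaped; the chunk needs results = "asis".
cat(colored, sep = "\n")
```

Inside the alltt wrapper shown earlier, each date would appear on its own line in its type's color.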

This turned out to be quite long... If anyone thinks this is overkill for such a simple problem, I would be happy to see a shorter solution.

【Comments】: