Plotting the probability distribution of different variables: answers

【Question title】: Plotting the probability distribution in different variables
【Posted】: 2016-01-06 12:09:57
【Question】:

My data frame ret has 3 variables with 208 rows each.

dim(ret)
[1] 208   3

I plotted each variable one at a time.

hist(ret$AAPL,breaks=100,
     freq=F,
     main="The probability distribution of AAPL",
     xlab="AAPL")
hist(ret$GOOGL,breaks=100,
     freq=F,
     main="The probability distribution of GOOGL",
     xlab="GOOGL")
hist(ret$WMT,breaks=100,
     freq=F,
     main="The probability distribution of WMT",
     xlab="WMT")

Now I am trying to use the sapply function to plot all of the variables, each in its own plot, in a single call.

sapply(ret, function(x) hist(x,breaks=100,
                             freq=F,
                             main="the probability distribution of x",
                             xlab="x"))

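However, main="the probability distribution of x" and xlab="x" do not work: every plot is labelled with the literal string "x". I am trying to get the column names in place of x.

In addition, I am also trying to draw a density line on each plot. The function I use is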

lines(density(), main="", lty=1, lwd=1)

If I plot each variable separately, I do this

hist(ret$AAPL,breaks=100,
     freq=F,
     main="the probability distribution of AAPL",
     xlab="AAPL")
lines(density(ret$AAPL),main="AAPL",lty=1,lwd=1)

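But how can I do both together with sapply? Can someone tell me how to solve this: using sapply to plot the probability distribution of each variable together with its density line.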

【Comments】:

Tags: r plot histogram lines

【Solution 1】:

Something like this? I am assuming that your data.frame ret holds returns, not prices, but the basic approach should work in either case.

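# Set up example data - you have this already...
library(quantmod)    # for getSymbols, Cl, dailyReturn
symbols <- c("AAPL", "GOOG", "WMT")
# The original answer used src="google"; that source has since been discontinued,
# so src="yahoo" is assumed here instead.
ret     <- do.call(merge, lapply(symbols, function(s) dailyReturn(Cl(getSymbols(s, src="yahoo", auto.assign=FALSE)))))
names(ret) <- symbols
ret     <- data.frame(date=index(ret), ret)
# You start here...
# x: numeric vector of returns, s: ticker symbol used for the labels
plot.hist <- function(x, s) {
  hist(x, breaks=100, freq=FALSE, xlab=s, main=paste("Histogram of", s), xlim=0.1*c(-1,1))
  lines(density(x, na.rm=TRUE), lty=1, lwd=1)    # overlay the kernel density estimate
}
par(mfrow=c(3,1))    # three panels stacked vertically
invisible(mapply(plot.hist, ret[,2:4], symbols))    # invisible() hides mapply's return value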

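There are a few subtleties here.

First, you want the ticker symbol in the title, not the literal string "x". To get that, you need to build the title with paste(...) as above.

Second, when you use sapply(...) on a data.frame, the columns are indeed passed to the function, but the column names are not. So we have to pass both. The simplest way to do that is mapply(...) (read the documentation).

Finally, as pointed out in the other answer, you can indeed use ggplot for this, and I would recommend that as well: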

library(ggplot2)
library(reshape2)    # for melt(...)
gg.df <- melt(ret, id="date", variable.name="Stock", value.name="Return")
ggplot(gg.df, aes(x=Return))+
  geom_histogram(aes(y=after_stat(density), fill=Stock),color="grey80",binwidth=0.005)+  # after_stat() replaces the older ..density.. notation (ggplot2 >= 3.3.0)
  stat_density(geom="line")+
  facet_grid(Stock~.)
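
As a small illustration of the second point above (sapply(...) hands the function the column values but not the column names), here is a minimal sketch on a made-up data.frame. The names df and make_title and the numbers in df are hypothetical and only stand in for ret; no plotting or download is involved.

```r
# Toy data standing in for ret (hypothetical values, not real returns)
df <- data.frame(AAPL = c(0.01, -0.02, 0.03),
                 WMT  = c(0.00,  0.01, -0.01))

# Inside sapply() the function only sees the column values,
# so x itself carries no information about which column it came from
no_names <- sapply(df, function(x) is.null(names(x)))

# mapply() walks the columns and their names in parallel
make_title <- function(x, s) paste("Histogram of", s)
titles <- mapply(make_title, df, names(df))

# Alternative: iterate over the names and look the column up inside the function
titles2 <- sapply(names(df), function(s) make_title(df[[s]], s))

print(titles)
```

Either form gives the plotting function access to both the data and its label; the second one can be convenient when you would rather stay with sapply(...).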

【Comments】:

【Solution 2】:

I would use the quantmod and ggplot2 packages for this task. If your question really is specific to hist and sapply, please ignore this.

library(ggplot2) # for ggplot, geom_density, geom_histogram
library(quantmod)

getSymbols(c("AAPL","GOOGL","WMT"))

A=data.frame('aapl',AAPL$AAPL.Close)
G=data.frame('goog',GOOGL$GOOGL.Close)
M=data.frame('wmt',WMT$WMT.Close)
names(A)=names(G)=names(M)=c('symbol','close')
d=rbind(A,G,M)

m <- ggplot(d, aes(x=close, colour=symbol, group=symbol))
m + geom_density(fill=NA)

Or, if you want a histogram

m <- ggplot(d, aes(x=close, colour=symbol, group=symbol))
m + geom_histogram(fill=NA,position='dodge')

【Comments】: