Plot multiple pandas DataFrames in one graph: answers

【Question title】: plot multiple pandas dataframes in one graph
【Posted】: 2018-01-10 01:29:00
【Question】:

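I created 6 different dataframes, each with the outliers of its own original dataframe removed. Now I am trying to plot all of the outlier-free dataframes on the same graph.

Here is my code for removing the outliers from each dataframe: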

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
plt.style.use("ggplot")

#---Original DataFrame
x = (g[0].time[:27236])
y = (g[0].data.f[:27236])
df = pd.DataFrame({'Time': x, 'Data': y})

#----Removes the outliers in a given DataFrame and plots a graph
newdf = df.copy()
newdf = df[~df.groupby('Data').transform( lambda x: abs(x-x.mean()) > 1.96*x.std()).values]
#newdf.plot('Time', 'Data')

#---Original DataFrame
x = (q[0].time[:47374])
y = (q[0].data.f[:47374])
df = pd.DataFrame({'Time': x, 'Data': y})

#----Removes the outliers in a given DataFrame and plots a graph
newdf = df.copy()
newdf2 = df[~df.groupby('Data').transform( lambda x: abs(x-x.mean()) > 1.96*x.std()).values]
#newdf2.plot('Time', 'Data')

#---Original DataFrame
x = (w[0].time[:25504])
y = (w[0].data.f[:25504])
df = pd.DataFrame({'Time': x, 'Data': y})

#----Removes the outliers in a given DataFrame and plots a graph
newdf = df.copy()
newdf3 = df[~df.groupby('Data').transform( lambda x: abs(x-x.mean()) > 1.96*x.std()).values]
#newdf3.plot('Time', 'Data')

#---Original DataFrame
x = (e[0].time[:47172])
y = (e[0].data.f[:47172])
df = pd.DataFrame({'Time': x, 'Data': y})

#----Removes the outliers in a given DataFrame and plots a graph
newdf = df.copy()
newdf4 = df[~df.groupby('Data').transform( lambda x: abs(x-x.mean()) > 1.96*x.std()).values]
#newdf4.plot('Time', 'Data')

#---Original DataFrame
x = (r[0].time[:21317])
y = (r[0].data.f[:21317])
df = pd.DataFrame({'Time': x, 'Data': y})

#----Removes the outliers in a given DataFrame and plots a graph
newdf = df.copy()
newdf5 = df[~df.groupby('Data').transform( lambda x: abs(x-x.mean()) > 1.96*x.std()).values]
#newdf5.plot('Time', 'Data')

#---Original DataFrame
x = (t[0].time[:47211])
y = (t[0].data.f[:47211])
df = pd.DataFrame({'Time': x, 'Data': y})

#----Removes the outliers in a given DataFrame and plots a graph
newdf = df.copy()
newdf6 = df[~df.groupby('Data').transform( lambda x: abs(x-x.mean()) > 1.96*x.std()).values]
#newdf6.plot('Time', 'Data')

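If I uncomment the newdf.plot() lines, I can plot each graph separately, but I want them all on one graph.

Yes, I have read http://matplotlib.org/examples/pylab_examples/subplots_demo.html, but that page has no example of multiple plots in a single graph.

I have also read http://pandas-docs.github.io/pandas-docs-travis/visualization.html, which has some really great information, but its examples of multiple plots in one graph all use the same dataframe. I have 6 separate dataframes. One solution I thought of is to write all the dataframes to the same Excel file and then plot them from Excel, but that seems excessive, and I don't need this data saved to an Excel file.

My question is: how can I plot multiple pandas dataframes in the same graph?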

[Image: graph after following Scott's suggestion]

[Image: roughly what the graph should look like]

【Question comments】:

Tags: python pandas matplotlib dataframe

【Solution 1】:

You need to use the ax parameter of pandas.DataFrame.plot.

Use the first df.plot call to grab a handle on its axes:

ax = newdf.plot()

Then pass that handle through the ax parameter in the subsequent plots.

newdf2.plot(ax=ax)
...
newdf5.plot(ax=ax)
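
A minimal sketch of the same idea with explicit columns. Calling .plot() with no arguments draws every numeric column against the index, so naming x='Time' and y='Data' is probably closer to what the question wants. The two small frames below are made-up stand-ins for newdf and newdf2, and the labels are only illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for newdf and newdf2 (made-up values).
frame_a = pd.DataFrame({"Time": [0, 1, 2, 3], "Data": [1.0, 2.0, 1.5, 2.5]})
frame_b = pd.DataFrame({"Time": [0, 2, 4], "Data": [3.0, 2.0, 4.0]})

# The first call creates the Axes and returns it.
ax = frame_a.plot(x="Time", y="Data", label="frame_a")

# Later calls draw onto that same Axes instead of opening a new figure.
frame_b.plot(x="Time", y="Data", ax=ax, label="frame_b")

ax.set_ylabel("Data")
# Call plt.show() here when running interactively.
```

The frames do not need the same length or the same Time values; each line is drawn at its own x positions on the shared axes.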

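【Comments】:

That sort of works. It puts all my plots in one graph, but it messes up all my data. I'll post jpgs in the original question showing what it makes my data look like and what it should look like.
Is all the data on the same scale? Maybe it makes sense to use multiple plots, or at least multiple y axes.
Well, in the second jpg I showed what the data should look like when sharing the x/y axes. That is what I want; I just don't need the plots separated as in the example here (link), where three different plots share both x/y axes. Multiple plots won't work for me, because all of this data falls under the same parameter and I want it together.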

【Solution 2】:

Am I missing something? Normally, I just do this for multiple dataframes:

fig = plt.figure()

for frame in [newdf, newdf2, newdf3, newdf4, newdf5]:
    plt.plot(frame['Time'], frame['Data'])

plt.xlim(0,18000)
plt.ylim(0,30)
plt.show()
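
A hedged variant of the loop above using an explicit Axes object, with a legend so the lines can be told apart. The frame names and values are made up for illustration; with the question's data, the dictionary would hold newdf through newdf6 (the list above stops at newdf5).

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-ins for the cleaned frames (made-up values).
frames = {
    "run 1": pd.DataFrame({"Time": [0, 1, 2], "Data": [5.0, 6.0, 5.5]}),
    "run 2": pd.DataFrame({"Time": [0, 2, 4, 6], "Data": [7.0, 6.5, 8.0, 7.5]}),
}

fig, ax = plt.subplots()

# One line per frame, all drawn on the same Axes.
for name, frame in frames.items():
    ax.plot(frame["Time"], frame["Data"], label=name)

ax.set_xlabel("Time")
ax.set_ylabel("Data")
ax.legend()
# Call plt.show() here when running interactively.
```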

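【Comments】:

【Solution 3】:

Answer 26 is a very good solution. I tried it on my own dataframes, and when the x column holds dates it needs very little change. For example: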

              Date    Key  Confirmed   Deaths
14184   2020-02-12  US_TX        1.0      0.0
14596   2020-02-13  US_TX        2.0      0.0
15007   2020-02-14  US_TX        2.0      0.0
15418   2020-02-15  US_TX        2.0      0.0
15823   2020-02-16  US_TX        2.0      0.0
...            ...    ...        ...      ...
270228  2020-11-07  US_TX   950549.0  19002.0
271218  2020-11-08  US_TX   956234.0  19003.0
272208  2020-11-09  US_TX   963019.0  19004.0
273150  2020-11-10  US_TX   973970.0  19004.0
274029  2020-11-11  US_TX   985380.0  19544.0

              Date    Key  Confirmed   Deaths
21969   2020-03-01  US_NY        1.0      0.0
22482   2020-03-02  US_NY        1.0      0.0
23014   2020-03-03  US_NY        2.0      0.0
23555   2020-03-04  US_NY       11.0      0.0
24099   2020-03-05  US_NY       22.0      0.0
...            ...    ...        ...      ...
270218  2020-11-07  US_NY   530354.0  33287.0
271208  2020-11-08  US_NY   533784.0  33314.0
272198  2020-11-09  US_NY   536933.0  33343.0
273140  2020-11-10  US_NY   540897.0  33373.0
274019  2020-11-11  US_NY   545718.0  33398.0

import pandas as pd
from matplotlib import pyplot as plt

firstPlot = firstDataframe.plot(x='Date') # where the 'Date' is the column with date.

secondDataframe.plot(x='Date', ax=firstPlot)

...
plt.show()
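
A self-contained sketch of this date-based case, using the first three rows of each table above. It assumes the Date column may arrive as strings, so it converts it with pd.to_datetime first; otherwise the two frames may not line up on a common time axis. The variable names and the choice of the Confirmed column are illustrative.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere; drop for an interactive window
import matplotlib.pyplot as plt
import pandas as pd

# First three rows of each table shown above.
texas = pd.DataFrame({
    "Date": ["2020-02-12", "2020-02-13", "2020-02-14"],
    "Confirmed": [1.0, 2.0, 2.0],
})
new_york = pd.DataFrame({
    "Date": ["2020-03-01", "2020-03-02", "2020-03-03"],
    "Confirmed": [1.0, 1.0, 2.0],
})

# Make sure both Date columns are real datetimes, not strings.
for frame in (texas, new_york):
    frame["Date"] = pd.to_datetime(frame["Date"])

# The first plot creates the Axes; the second reuses it via ax=.
ax = texas.plot(x="Date", y="Confirmed", label="US_TX")
new_york.plot(x="Date", y="Confirmed", ax=ax, label="US_NY")
# Call plt.show() here when running interactively.
```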

【Comments】: