Aggregate time series with group by and create chart with multiple series

【Question title】: Aggregate time series with group by and create chart with multiple series
【Posted】: 2018-12-21 12:49:50
【Question description】:

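I have time series data, and I want to create a line chart of monthly record counts (months on the x-axis), grouped by sentiment (one line per sentiment).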

The data looks like this:

   created_at                      id                   polarity  sentiment
0  Fri Nov 02 11:22:47 +0000 2018  1058318498663870464  0.000000   neutral   
1  Fri Nov 02 11:20:54 +0000 2018  1058318026758598656  0.011905   neutral   
2  Fri Nov 02 09:41:37 +0000 2018  1058293038739607552  0.800000  positive   
3  Fri Nov 02 09:40:48 +0000 2018  1058292834699231233  0.800000  positive   
4  Thu Nov 01 18:23:17 +0000 2018  1058061933243518976  0.233333   neutral   
5  Thu Nov 01 17:50:39 +0000 2018  1058053723157618690  0.400000  positive   
6  Wed Oct 31 18:57:53 +0000 2018  1057708251758903296  0.566667  positive   
7  Sun Oct 28 17:21:24 +0000 2018  1056596810570100736  0.000000   neutral   
8  Sun Oct 21 13:00:53 +0000 2018  1053994531845296128  0.136364   neutral   
9  Sun Oct 21 12:55:12 +0000 2018  1053993101205868544  0.083333   neutral

So far, I have managed to aggregate to monthly totals using the following code:

import pandas as pd

tweets = process_twitter_json(file_name) 
#print(tweets[:10])

df = pd.DataFrame.from_records(tweets)
print(df.head(10))

#make the string date into a date field    
df['tweet_datetime'] = pd.to_datetime(df['created_at'])
df.index = df['tweet_datetime']

#print('Monthly counts')
monthly_sentiment = df.groupby('sentiment')['tweet_datetime'].resample('M').count()

I am struggling with how to plot the data.

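Can I pivot each distinct value of the sentiment field into its own column?
I tried .unstack(), which turns the sentiment values into rows and is almost there, but the dates become string column headers, which is no good for charting.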

【Question discussion】:

Tags: python-3.x dataframe charts

【Solution 1】:

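OK, I changed the monthly aggregation method and used Grouper instead of resample. This means that when I call unstack(), the resulting DataFrame is vertical (deep and narrow), with the dates as rows, rather than horizontal, with the dates as column headers. As a result, when I plot against the dates, I no longer have the problem of the dates being stored as strings.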
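To see the reshaping in isolation, here is a minimal sketch that uses a handful of the rows from the question rather than the full tweet file. Two things in it are my own assumptions and not part of the answer: the explicit `format` string for parsing `created_at`, and the `'MS'` (month start) frequency, chosen only because the month-end alias differs between pandas versions (`'M'` in older releases, `'ME'` in newer ones).

```python
import pandas as pd

# A few rows copied from the sample data in the question
rows = [
    ("Fri Nov 02 11:22:47 +0000 2018", 1058318498663870464, "neutral"),
    ("Fri Nov 02 09:41:37 +0000 2018", 1058293038739607552, "positive"),
    ("Thu Nov 01 17:50:39 +0000 2018", 1058053723157618690, "positive"),
    ("Wed Oct 31 18:57:53 +0000 2018", 1057708251758903296, "positive"),
    ("Sun Oct 28 17:21:24 +0000 2018", 1056596810570100736, "neutral"),
    ("Sun Oct 21 13:00:53 +0000 2018", 1053994531845296128, "neutral"),
]
df = pd.DataFrame(rows, columns=["created_at", "id", "sentiment"])

# Assumption: created_at always follows this Twitter-style layout
df["tweet_datetime"] = pd.to_datetime(
    df["created_at"], format="%a %b %d %H:%M:%S %z %Y"
)

# Count ids per (sentiment, month), then move sentiment into the columns.
# 'MS' labels each bucket by the first day of the month (assumption, see above).
monthly = (
    df.groupby(["sentiment", pd.Grouper(key="tweet_datetime", freq="MS")])["id"]
    .count()
    .unstack("sentiment")
    .fillna(0)
)

print(monthly)
print(type(monthly.index))
```

The rows of `monthly` are indexed by a `DatetimeIndex` and each sentiment gets its own column, which is the shape a multi-line chart needs.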

Full code:

import pandas as pd

tweets = process_twitter_json(file_name) 

df = pd.DataFrame.from_records(tweets)


df['tweet_datetime'] = pd.to_datetime(df['created_at'])
df.index = df['tweet_datetime']

grouper = df.groupby(['sentiment', pd.Grouper(key='tweet_datetime', freq='M')]).id.count()
result = grouper.unstack('sentiment').fillna(0)

##=================================================
##PLOTLY - charts in Jupyter

from plotly import __version__
from plotly.offline import download_plotlyjs, init_notebook_mode, plot, iplot

print (__version__)# requires version >= 1.9.0

import plotly.graph_objs as go

init_notebook_mode(connected=True)

trace0 = go.Scatter(
    x = result.index,
    y = result['positive'],
    name = 'Positive',
    line = dict(
        color = ('rgb(205, 12, 24)'),
        width = 4)
)

trace1 = go.Scatter(
    x = result.index,
    y = result['negative'],
    name = 'Negative',
    line = dict(
        color = ('rgb(22, 96, 167)'),
        width = 4)
)    
trace2 = go.Scatter(
    x = result.index,
    y = result['neutral'],
    name = 'Neutral',
    line = dict(
        color = ('rgb(12, 205, 24)'),
        width = 4)
)

data = [trace0, trace1, trace2]

iplot(data)
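The three traces above are written out by hand, so `result['negative']` would raise a `KeyError` if a month range happened to contain no negative tweets. As a possible variation (not part of the original answer), the traces can be built in a loop over whatever columns exist. The sketch below describes each trace as a plain dict, which plotly accepts in place of `go.Scatter` objects; the small `result` frame and its counts are placeholder values for illustration only.

```python
import pandas as pd

# Hypothetical stand-in for the `result` frame built earlier:
# months as rows, one column per sentiment (placeholder counts)
result = pd.DataFrame(
    {"neutral": [3, 3], "positive": [1, 3]},
    index=pd.to_datetime(["2018-10-31", "2018-11-30"]),
)

# One trace description per sentiment column that is actually present
traces = [
    dict(
        type="scatter",
        mode="lines",
        x=list(result.index),
        y=result[col].tolist(),
        name=col.capitalize(),
        line=dict(width=4),
    )
    for col in result.columns
]

# In the notebook, this list could be passed to iplot(traces)
print([t["name"] for t in traces])
```

Colours are left to plotly's defaults here; a dict mapping sentiment to colour could be added if the fixed colours from the answer are wanted.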

【Discussion】: