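【Posted】: 2018-12-21 12:49:50
【Question】:
I have time series data and want to create a line chart of monthly record counts (months on the x-axis), grouped by sentiment (one line per sentiment).
The data looks like this: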
                       created_at                   id  polarity sentiment
0  Fri Nov 02 11:22:47 +0000 2018  1058318498663870464  0.000000   neutral
1  Fri Nov 02 11:20:54 +0000 2018  1058318026758598656  0.011905   neutral
2  Fri Nov 02 09:41:37 +0000 2018  1058293038739607552  0.800000  positive
3  Fri Nov 02 09:40:48 +0000 2018  1058292834699231233  0.800000  positive
4  Thu Nov 01 18:23:17 +0000 2018  1058061933243518976  0.233333   neutral
5  Thu Nov 01 17:50:39 +0000 2018  1058053723157618690  0.400000  positive
6  Wed Oct 31 18:57:53 +0000 2018  1057708251758903296  0.566667  positive
7  Sun Oct 28 17:21:24 +0000 2018  1056596810570100736  0.000000   neutral
8  Sun Oct 21 13:00:53 +0000 2018  1053994531845296128  0.136364   neutral
9  Sun Oct 21 12:55:12 +0000 2018  1053993101205868544  0.083333   neutral
So far I have managed to aggregate the data into monthly totals with the following code:
import pandas as pd

# process_twitter_json and file_name are defined elsewhere (not shown here)
tweets = process_twitter_json(file_name)
df = pd.DataFrame.from_records(tweets)
print(df.head(10))

# parse the created_at strings into datetimes and use them as the index
df['tweet_datetime'] = pd.to_datetime(df['created_at'])
df.index = df['tweet_datetime']

# monthly record counts per sentiment
monthly_sentiment = df.groupby('sentiment')['tweet_datetime'].resample('M').count()
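I am struggling with how to plot this data.
- Can I pivot each discrete value of the sentiment field into its own column?
- I tried .unstack() on the result. It is nearly there, but the dates end up as the column headers (with the sentiment values as rows), which is no good for charting.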
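One possible way to get one column per sentiment while keeping the dates as the index is sketched below. With the grouped result from the question, the index levels are (sentiment, date), and a bare .unstack() moves the last level (the dates) into the columns; unstacking the sentiment level instead, for example monthly_sentiment.unstack(level=0), should leave the dates on the index. The sketch takes a slightly different route and counts rows per (month, sentiment) pair directly. The sample rows, the function names, and the explicit date format string are assumptions made for illustration, not part of the original post.

```python
import pandas as pd

# Hypothetical sample rows mirroring the columns shown in the question.
sample = pd.DataFrame({
    "created_at": [
        "Fri Nov 02 11:22:47 +0000 2018",
        "Fri Nov 02 09:41:37 +0000 2018",
        "Thu Nov 01 17:50:39 +0000 2018",
        "Wed Oct 31 18:57:53 +0000 2018",
        "Sun Oct 28 17:21:24 +0000 2018",
        "Sun Oct 21 13:00:53 +0000 2018",
    ],
    "sentiment": ["neutral", "positive", "positive",
                  "positive", "neutral", "neutral"],
})


def monthly_counts_by_sentiment(df):
    """Return a frame indexed by month start with one column per sentiment."""
    # Assumed format of the Twitter-style timestamp strings.
    dates = pd.to_datetime(df["created_at"], format="%a %b %d %H:%M:%S %z %Y")
    # Drop the timezone, then truncate each timestamp to the start of its month.
    month = dates.dt.tz_localize(None).dt.to_period("M").dt.to_timestamp()
    return (
        df.assign(month=month)
          .groupby(["month", "sentiment"])
          .size()                                # rows per (month, sentiment)
          .unstack("sentiment", fill_value=0)    # sentiments become columns
    )


def plot_monthly_counts(counts):
    """Draw one line per sentiment column (requires matplotlib)."""
    import matplotlib.pyplot as plt
    ax = counts.plot(marker="o")
    ax.set_xlabel("month")
    ax.set_ylabel("record count")
    plt.show()


counts = monthly_counts_by_sentiment(sample)
```

Because the months stay on a DatetimeIndex and each sentiment is a column, calling plot_monthly_counts(counts) (or counts.plot() directly) should draw one line per sentiment against a date axis.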
【Discussion】:
Tags: python-3.x dataframe charts