【Question title】: Sankey bar chart diagram with pandas or Python
【Posted】: 2018-02-27 04:10:53
【Question description】:

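I would like to make a bar chart like this one, using any Python module that can interface with matplotlib:

Here is some sample data and an illustration of what I can do so far: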

import pandas
from io import StringIO

text="""
Name                           1980              1982
A                    Administration            Budget
B                    Administration    Administration
C                    Administration    Administration
D                    Administration            Budget
E                    Administration            Budget
F                    Administration    Administration
G                    Administration    Administration
H                    Administration    Administration
"""

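# Strip the leading blank line so the column names are on row 0,
# whether or not the installed pandas skips blank lines in read_fwf
data = pandas.read_fwf(StringIO(text.strip()), header=0).set_index("Name")

# Count how many people are in each department per year
count = pandas.DataFrame(index=["Administration", "Budget"])
for col in data.columns:
    count[col] = data[col].value_counts()

# One stacked bar per year
count.T.plot(kind="bar", stacked=True)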

When I plot count, I get the following stacked bar chart:

I can also get the number of people who moved from Administration to Budget between 1980 and 1982 with:

pandas.crosstab(data["1980"],data["1982"])

which gives:

1982            Administration  Budget
1980                                  
Administration               5       3

But I don't know how to draw the flows between the sections of the bar chart. Does anyone know how to do this?

【Comments】:

Tags: python pandas matplotlib bar-chart sankey-diagram


【Solution 1】:

You can use the pandas functions crosstab and melt to prepare the data for a Sankey diagram:

from io import StringIO
import pandas as pd

text = """
Name                           1980              1982
A                    Administration            Budget
B                    Administration    Administration
C                    Administration    Administration
D                    Administration            Budget
E                    Administration            Budget
F                    Administration    Administration
G                    Administration    Administration
H                    Administration    Administration
"""
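# Strip the leading blank line so the column names are on row 0,
# whether or not the installed pandas skips blank lines in read_fwf
data = pd.read_fwf(StringIO(text.strip()), header=0)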

# Make crosstab
data_cross = pd.crosstab(data['1980'], data['1982'])
print(data_cross)

# Make flat table
data_tidy = data_cross.rename_axis(None, axis=1).reset_index().copy()

# Make tidy table
formatted_data = pd.melt(data_tidy,
                         ['1980'],
                         var_name='1982',
                         value_name='Value')

import plotly.graph_objects as go

fig = go.Figure(data=[go.Sankey(
    node=dict(
        pad=15,
        thickness=20,
        line=dict(color="black", width=0.5),
        # Node 0 is the 1980 department; nodes 1 and 2 are the 1982 departments
        label=["Administration", "Administration", "Budget"],
        color=["blue", "blue", "green"],
    ),
    link=dict(
        # source and target are indices into the label list above;
        # value holds the counts from the crosstab (5 stayed, 3 moved)
        source=[0, 0],
        target=[1, 2],
        value=[5, 3],
        color=["lightblue", "lightgreen"],
    ),
)])

fig.update_layout(title_text="Basic Sankey Diagram", font_size=10)
fig.show()
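
The label, source, target and value lists above are typed in by hand, although formatted_data already holds the same information. A minimal sketch of deriving them instead is given here. The helper name sankey_inputs is hypothetical, and the sketch assumes each row is a (source, target, value) tuple in the column order produced by the melt call above.

```python
def sankey_inputs(rows):
    """Turn (source, target, value) rows into node labels and link lists."""
    # Drop empty flows so no zero-width links are drawn
    rows = [(s, t, v) for s, t, v in rows if v > 0]
    sources = sorted({s for s, _, _ in rows})
    targets = sorted({t for _, t, _ in rows})
    # Source nodes come first and target nodes after them, so a department
    # that appears in both years gets two separate nodes
    labels = sources + targets
    source_idx = [sources.index(s) for s, _, _ in rows]
    target_idx = [len(sources) + targets.index(t) for _, t, _ in rows]
    values = [int(v) for _, _, v in rows]
    return labels, source_idx, target_idx, values


# Stand-in for the rows of formatted_data on the sample data (assumption:
# formatted_data.itertuples(index=False, name=None) yields the same tuples)
rows = [("Administration", "Administration", 5),
        ("Administration", "Budget", 3)]
labels, source, target, value = sankey_inputs(rows)
print(labels)
print(source, target, value)
```

The four results can be passed to go.Sankey as node label and link source, target and value. Node and link colors are not derived here and would still need to be chosen separately.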

Snapshot of figure
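
The question asks for flows drawn between the segments of a matplotlib stacked bar chart, which the plotly figure does not cover directly. The following is a rough sketch of one way to approach that, not a tested recipe: compute where each flow starts and ends on the two bars, then fill a translucent quadrilateral between them. The names flow_bands and draw_bands are hypothetical, and the bands are straight rather than curved as in a true Sankey diagram.

```python
def flow_bands(flows, order):
    """Compute where each flow band starts and ends on the two stacked bars.

    flows maps (source, target) to a count; order lists the categories from
    bottom to top, matching the stacking order of the bar segments.
    Returns tuples (source, target, left_bottom, right_bottom, height).
    """
    # Bottom edge of every segment in the left and the right bar
    left_base, right_base = {}, {}
    left_acc = right_acc = 0
    for cat in order:
        left_base[cat], right_base[cat] = left_acc, right_acc
        left_acc += sum(v for (s, _), v in flows.items() if s == cat)
        right_acc += sum(v for (_, t), v in flows.items() if t == cat)

    bands = []
    left_used = dict.fromkeys(order, 0)
    right_used = dict.fromkeys(order, 0)
    for s in order:
        for t in order:
            v = flows.get((s, t), 0)
            if v == 0:
                continue
            # Stack this band on top of the bands already placed in the
            # same source segment and the same target segment
            bands.append((s, t, left_base[s] + left_used[s],
                          right_base[t] + right_used[t], v))
            left_used[s] += v
            right_used[t] += v
    return bands


def draw_bands(ax, bands, x_left=0.0, x_right=1.0, bar_width=0.5):
    """Draw each band as a translucent quadrilateral on a matplotlib Axes."""
    # Bands run from the right edge of the left bar to the left edge of the right bar
    x0, x1 = x_left + bar_width / 2, x_right - bar_width / 2
    for _, _, y_left, y_right, height in bands:
        ax.fill([x0, x1, x1, x0],
                [y_left, y_right, y_right + height, y_left + height],
                color="gray", alpha=0.3)


flows = {("Administration", "Administration"): 5,
         ("Administration", "Budget"): 3}
bands = flow_bands(flows, ["Administration", "Budget"])
print(bands)
```

With the chart from the question, this could be used as ax = count.T.plot(kind="bar", stacked=True) followed by draw_bands(ax, bands). That assumes pandas places the two bars at x = 0 and x = 1 with a width of 0.5, so the x_left, x_right and bar_width arguments may need adjusting. The flows dictionary could probably be built from the crosstab with pandas.crosstab(data["1980"], data["1982"]).stack().to_dict().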

【Comments】: