【Question title】: How to plot a line plot on a bar plot with seaborn and using twinx
【Posted】: 2021-11-12 18:01:05
【Question】:

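I'm trying to build a plot so that I can examine bird mortality against chemicals in the air. The challenge I'm facing is how to overlay the plots on top of each other.

Below is the code I wrote. Given 6 chemicals, I have six separate axes, and in each axis I plot that chemical against bird mortality, with the week number as the x-axis.

Even though both charts come from one DataFrame (so the x-axis values are no different), Seaborn plots the two charts separately.

I tried using bird_chem_df.plot() and vanilla matplotlib to see whether this was isolated to Seaborn, but no luck.

I would love to know what exactly is causing the x-axes to disagree, even though the data comes from the same source and both plots share the same independent variable.

If you want to play with the data: https://drive.google.com/file/d/1KKkSoy3xQno_vE_-LSoviClmqC3UDcW6/view?usp=sharing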

import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sb

# Import the CSV in the link above
# bird_chem_df = pd.read_csv('...')

# Using `target_chemicals_chart` as our array, we will loop through each
# and plot it against bird mortality
target_chemicals_chart = ['Toluene', 'o-Xylene', 'm,p-Xylene','Ethylbenzene', 'Benzene', 'PM2.5']
target_chemicals_display = {
    'Toluene':[0,2.5],
    'o-Xylene':[0,2.5],
    'm,p-Xylene':[0,2.5],
    'Benzene':[0,2.5],
    'Ethylbenzene':[0,8], 
    'PM2.5':[0,25]
}
chemical_index = 0

# Setup the plot
fig, axs = plt.subplots(2,3,figsize=(12,7))
fig.subplots_adjust(left=7, bottom=7, right=9, top=9, wspace=0.2, hspace=0.5)

# Plot each axis
for row in axs:
    for col in row:
        target_chemical = target_chemicals_chart[chemical_index]
        col.set_title(f"{target_chemical} vs. Bird Mortality")
        sb.barplot(ax=col, x="week_number", y=target_chemical, data=bird_chem_df, ci=None, color='lightsteelblue')
        col.set_ylim(target_chemicals_display[target_chemical])
        bird = col.twinx()
        sb.lineplot(ax=bird, x="week_number", y="total_bird_deaths", data=bird_chem_df, color='red')
        col.set_xlabel("Weeks since Spill")
        col.set_ylabel("Average Result for Chemical - ug/m3")

        # Increase font size due to fig configurations
        for item in ([col.title, col.xaxis.label, col.yaxis.label] +
             col.get_xticklabels() + bird.get_yticklabels() + col.get_yticklabels()):
            item.set_fontsize(20)

        chemical_index += 1

【Comments】:

    Tags: python pandas matplotlib seaborn


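    【Solution 1】:
    • The tick positions of the bar plot are 0-indexed; those of the line plot are not.
    • The multi-dimensional array axs is converted to a 1-D array for easier access and iteration.
    • In this case the x-axis of the bar plot is 'week_number', the same as the line plot, so both plots have the same number of tick positions. Therefore, instead of plotting with x='week_number', plot the line plot onto the same tick positions as the bar plot by specifying x=ax.get_xticks().
    • Another option is to use x=bird_chem_df.index for both plots (since the index is a RangeIndex starting at 0), and then change the xtick labels to bird_chem_df.week_number.
    • Tested in python 3.8.11, pandas 1.3.2, matplotlib 3.4.3, seaborn 0.11.2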
    import numpy as np  # needed for np.nan in the sample data
    import seaborn as sns
    import pandas as pd
    import matplotlib.pyplot as plt
    
    data = {'week_number': [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34], 'Benzene': [0.62, 0.459, 0.542, 0.631, 0.56, 0.61, 0.691, 1.109, 0.524, 0.87, 0.896, 0.665, 0.898, 0.862, 0.611, 0.37], 'Ethylbenzene': [0.364, 0.204, 0.372, 0.36, 0.225, 0.412, 0.332, 0.659, 0.241, 1.7, np.nan, 1.2, 7.35, 0.352, 0.267, 0.154], 'PM2.5': [14.621, 12.561, 11.174, 18.307, 11.285, 20.202, 16.111, 13.057, 11.301, 12.214, 18.173, 21.308, 14.009, 14.111, 9.805, 7.818], 'Toluene': [1.339, 0.999, 1.18, 2.019, 1.217, 1.797, 1.478, 1.896, 1.552, 2.201, 1.101, 1.416, 1.215, 1.598, 1.356, 0.877], 'm,p-Xylene': [1.059, 0.842, 1.186, 1.116, 0.693, 1.372, 0.972, 2.103, 0.568, 1.783, 1.736, 1.486, 1.983, 1.082, 0.909, 0.354], 'o-Xylene': [0.525, 0.328, 0.356, 0.409, 0.265, 0.396, 0.32, 0.634, 0.266, 1.543, 0.74, 2.0, 0.93, 0.374, 0.328, 0.15], 'total_bird_deaths': [13, 14, 224, 87, 120, 165, 224, 252, 303, 416, 580, 537, 592, 713, 442, 798]}
    bird_chem_df = pd.DataFrame(data)
    
    target_chemicals_chart = ['Toluene', 'o-Xylene', 'm,p-Xylene','Ethylbenzene', 'Benzene', 'PM2.5']
    target_chemicals_display = {'Toluene': [0, 2.5], 'o-Xylene': [0, 2.5], 'm,p-Xylene': [0, 2.5], 'Benzene': [0, 2.5], 'Ethylbenzene': [0, 8], 'PM2.5': [0, 25]}
    
    # Setup the plot
    fig, axs = plt.subplots(2, 3, figsize=(15, 7))
    
    # convert axs to a 1-D array
    axs = axs.ravel()
    
    # pair each subplot with its own chemical, in the order of target_chemicals_chart
    for ax, target_chemical in zip(axs, target_chemicals_chart):
        ax.set_title(f"{target_chemical} vs. Bird Mortality")
        p1 = sns.barplot(ax=ax, x="week_number", y=target_chemical, data=bird_chem_df, ci=None, color='lightsteelblue')
    
        ax.set_ylim(target_chemicals_display[target_chemical])
        bird = ax.twinx()
    
        # plot against the same tick values as the bar plot, with x=ax.get_xticks()
        p2 = sns.lineplot(ax=bird, x=ax.get_xticks(), y="total_bird_deaths", data=bird_chem_df, color='red', marker='o')
    
        ax.set_xlabel("Weeks since Spill")
        ax.set_ylabel("Average Result for Chemical - ug/m3")
       
    fig.tight_layout()
    

    • Plotting all of the chemical columns, with x=bird_chem_df.index for both plots
    data = {'week_number': [19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34], 'Benzene': [0.62, 0.459, 0.542, 0.631, 0.56, 0.61, 0.691, 1.109, 0.524, 0.87, 0.896, 0.665, 0.898, 0.862, 0.611, 0.37], 'Ethylbenzene': [0.364, 0.204, 0.372, 0.36, 0.225, 0.412, 0.332, 0.659, 0.241, 1.7, np.nan, 1.2, 7.35, 0.352, 0.267, 0.154], 'PM2.5': [14.621, 12.561, 11.174, 18.307, 11.285, 20.202, 16.111, 13.057, 11.301, 12.214, 18.173, 21.308, 14.009, 14.111, 9.805, 7.818], 'Toluene': [1.339, 0.999, 1.18, 2.019, 1.217, 1.797, 1.478, 1.896, 1.552, 2.201, 1.101, 1.416, 1.215, 1.598, 1.356, 0.877], 'm,p-Xylene': [1.059, 0.842, 1.186, 1.116, 0.693, 1.372, 0.972, 2.103, 0.568, 1.783, 1.736, 1.486, 1.983, 1.082, 0.909, 0.354], 'o-Xylene': [0.525, 0.328, 0.356, 0.409, 0.265, 0.396, 0.32, 0.634, 0.266, 1.543, 0.74, 2.0, 0.93, 0.374, 0.328, 0.15], 'total_bird_deaths': [13, 14, 224, 87, 120, 165, 224, 252, 303, 416, 580, 537, 592, 713, 442, 798]}
    bird_chem_df = pd.DataFrame(data)
    
    target_chemicals_display = {'Toluene': [0, 2.5], 'o-Xylene': [0, 2.5], 'm,p-Xylene': [0, 2.5], 'Benzene': [0, 2.5], 'Ethylbenzene': [0, 8], 'PM2.5': [0, 25]}
    target_chemicals_chart = ['Toluene', 'o-Xylene', 'm,p-Xylene','Ethylbenzene', 'Benzene', 'PM2.5']
    
    # Setup the plot
    fig, axs = plt.subplots(2, 3, figsize=(15, 7))
    axs = axs.ravel()
    
    # one subplot per chemical, in the order of target_chemicals_chart
    for ax, target_chemical in zip(axs, target_chemicals_chart):
        ax.set_title(f"{target_chemical} vs. Bird Mortality")
    
        # use the 0-based RangeIndex as x for both plots so the positions match
        p1 = sns.barplot(ax=ax, x=bird_chem_df.index, y=target_chemical, data=bird_chem_df, ci=None, color='lightsteelblue')
    
        ax.set_ylim(target_chemicals_display[target_chemical])
        bird = ax.twinx()
        p2 = sns.lineplot(ax=bird, x=bird_chem_df.index, y="total_bird_deaths", data=bird_chem_df, color='red', marker='o')
    
        # set the x-axis tick label to be the week numbers
        ax.set_xticks(ax.get_xticks())
        ax.set_xticklabels(bird_chem_df.week_number)
    
        ax.set_xlabel("Weeks since Spill")
        ax.set_ylabel("Average Result for Chemical - ug/m3")
    
    fig.tight_layout()
    plt.show()
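
The question also mentions trying vanilla matplotlib. As a hedged alternative that is not part of the original answer, the sketch below draws both layers with matplotlib's own `Axes.bar` and `Axes.plot`, which place data at its numeric x value, so the bars and the line both sit at the real week numbers. `demo_df` is an illustrative name holding the first four rows of the sample data.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# first four rows of the sample data above; demo_df is an illustrative name
demo_df = pd.DataFrame({
    "week_number": [19, 20, 21, 22],
    "Benzene": [0.62, 0.459, 0.542, 0.631],
    "total_bird_deaths": [13, 14, 224, 87],
})
weeks = demo_df["week_number"].tolist()

fig, ax = plt.subplots()

# Axes.bar centres each bar on its numeric x value (19, 20, ...)
bars = ax.bar(weeks, demo_df["Benzene"].tolist(), color="lightsteelblue")

# the twin axes shares the same numeric x-axis, so the line lands on the bars
bird = ax.twinx()
(line,) = bird.plot(weeks, demo_df["total_bird_deaths"].tolist(), color="red", marker="o")

ax.set_xticks(weeks)
ax.set_xlabel("Weeks since Spill")
ax.set_ylabel("Average Result for Chemical - ug/m3")

bar_centers = [bar.get_x() + bar.get_width() / 2 for bar in bars]
print("bar centres:", bar_centers)
print("line x data:", list(line.get_xdata()))
```

This trades seaborn's categorical axis for a numeric one, so missing weeks would show up as gaps between bars rather than being closed up.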
    

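    【Discussion】:

    • This surprised me. To clarify, I had even created a list of values called week_number, which is really just an np.array of [19 20 21 ... 34], used it as x for the plot, and it still failed. What is special about the get_xticks data, such that an array of numbers with a fixed index and values doesn't work?
    • @Adib The xticks are 0-indexed; your array starts at 19, not 0. 19 to 34 are the labels, not the tick positions.
    • Thanks! I used an array for the chemicals because I wanted to show that my data has a specific order. But using bird_chem_df.index and then setting the xticklabels is definitely a very pythonic way of doing it.
    • @Adib You're welcome, and thanks for the coffee. We're definitely entering pumpkin spice latte season in Portland.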
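
The reply above distinguishes labels (19 to 34) from tick positions (0 to 15). One hedged way to turn an array like the one described in the first comment into 0-based positions is `pd.factorize` with `sort=True`; this is a sketch added for illustration, not something proposed in the thread, and it assumes the bar plot orders its numeric categories in ascending order.

```python
import numpy as np
import pandas as pd

# the array described in the comment: week numbers 19..34
week_number = np.arange(19, 35)

# sort=True numbers the distinct weeks 0, 1, 2, ... in ascending order
positions, weeks = pd.factorize(week_number, sort=True)

print(positions)  # 0-based positions, usable as x for the line plot
print(weeks)      # the distinct week values, usable as tick labels
```

Passing `positions` as x to the line plot on the twin axes would then play the same role as x=ax.get_xticks() in the answer.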