【Question title】: Using seaborn to plot a list of numpy variables
【Posted】: 2019-10-23 23:22:15
【Question】:

I have a list of variables, each of which is a numpy array. I would like to plot them all in a single figure using seaborn.

subscribers=bankData.loc[bankData['deposit']==1] # Only customers who subscribed to a term deposit

occupations=bankData['job'].unique().tolist()

admin=subscribers['age'].loc[subscribers['job']=='admin.'].values
technician=subscribers['age'].loc[subscribers['job']=='technician'].values
services=subscribers['age'].loc[subscribers['job']=='services'].values
management=subscribers['age'].loc[subscribers['job']=='management'].values
retired=subscribers['age'].loc[subscribers['job']=='retired'].values
blue_collar=subscribers['age'].loc[subscribers['job']=='blue-collar'].values
unemployed=subscribers['age'].loc[subscribers['job']=='unemployed'].values
enterpreneur=subscribers['age'].loc[subscribers['job']=='enterpreneur'].values
housemaid=subscribers['age'].loc[subscribers['job']=='housemaid'].values
unknown= subscribers['age'].loc[subscribers['job']=='unknown'].values
self_employed=subscribers['age'].loc[subscribers['job']=='self-employed'].values
student=subscribers['age'].loc[subscribers['job']=='student'].values

occpuation_age=[admin, technician,services, management, retired, blue_collar, unemployed, enterpreneur, housemaid,
                unknown, self_employed, student]

I want one boxplot for each item in occpuation_age.

【Comments on the question】:

Tags: python list numpy plot seaborn


【Solution 1】:

There is no need to split the data frame into separate numpy arrays. Just pass the column names to the seaborn plotting call:

sns.boxplot(x='job', y='age', data=subscribers)

Demonstration with seeded random data:

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

np.random.seed(682019)
occupations = ['admin', 'technician', 'management', 'retired', 'blue_collar',
               'unemployed', 'enterpreneur', 'housemaid',
               'unknown', 'self_employed', 'student']
subscribers = pd.DataFrame({'job': np.random.choice(occupations, 100),
                            'age': np.random.uniform(0, 100, 100)})

print(subscribers.head(10))
#              job        age
# 0     technician   2.188924
# 1    blue_collar  40.868834
# 2     management  44.179859
# 3     technician  72.193644
# 4   enterpreneur  83.680639
# 5   enterpreneur  60.923324
# 6        student  99.163055
# 7     management  80.392648
# 8        unknown  96.985044
# 9  self_employed  92.147679

fig, ax = plt.subplots(figsize=(14,5))
sns.boxplot(y='age', x='job', data=subscribers, ax=ax)

plt.show()
plt.clf()
plt.close()
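
If the ages are only available as separate arrays, as in the question, they can be stacked back into one long-form data frame before plotting. The following is a minimal sketch, not part of the original answer: the helper names arrays_to_long_frame and plot_age_by_job and the sample ages are made up for illustration.

```python
import numpy as np
import pandas as pd


def arrays_to_long_frame(arrays_by_job):
    """Stack per-job age arrays into one long-form DataFrame (hypothetical helper)."""
    frames = [
        pd.DataFrame({"job": job, "age": np.asarray(ages, dtype=float)})
        for job, ages in arrays_by_job.items()
        if len(ages) > 0  # skip jobs that have no subscribers
    ]
    return pd.concat(frames, ignore_index=True)


# Made-up ages standing in for the arrays built in the question.
arrays_by_job = {
    "admin.": np.array([31, 45, 52]),
    "technician": np.array([28, 39]),
    "retired": np.array([61, 67, 72, 80]),
    "unknown": np.array([]),
}

long_df = arrays_to_long_frame(arrays_by_job)
print(long_df)


def plot_age_by_job(df):
    """Draw one box per job (hypothetical helper); plotting libraries are
    imported here so the reshaping above can run without them."""
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, ax = plt.subplots(figsize=(14, 5))
    sns.boxplot(x="job", y="age", data=df, ax=ax)
    return ax
```

Calling plot_age_by_job(long_df) and then plt.show() should give one box per job that has data. Keeping the job label in a column, rather than in the variable name, is what lets seaborn group the values.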

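To order the boxes by descending median age, use groupby().transform() to add the needed aggregate as a column, then sort by that column. For a plain string column, seaborn draws the categories in the order in which they first appear in the data, so sorting the rows changes the order of the boxes: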

# Per-row copy of each job's median age, used only as a sort key
subscribers['job_median'] = subscribers.groupby('job')['age'].transform('median')
subscribers = subscribers.sort_values('job_median', ascending=False)

fig, ax = plt.subplots(figsize=(14,5))
sns.boxplot(y='age', x='job', data=subscribers, ax=ax)

plt.show()
plt.clf()
plt.close()
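
As an alternative to sorting the rows, the box order can be computed separately and handed to seaborn through the order parameter of sns.boxplot. This is a sketch under that assumption, not the answer's method: jobs_by_median_age is a hypothetical helper and the demo data is made up.

```python
import pandas as pd


def jobs_by_median_age(df, ascending=False):
    """Return the job labels ordered by median age (hypothetical helper)."""
    medians = df.groupby("job")["age"].median()
    return medians.sort_values(ascending=ascending).index.tolist()


# Small made-up frame: medians are student 21, admin 38, retired 68.
demo = pd.DataFrame({
    "job": ["student", "student", "retired", "retired", "admin", "admin"],
    "age": [19, 23, 66, 70, 35, 41],
})

job_order = jobs_by_median_age(demo)
print(job_order)  # ['retired', 'admin', 'student']
```

The resulting list can then be passed as sns.boxplot(x='job', y='age', data=demo, order=job_order). This leaves the data frame itself unsorted and avoids adding a helper column.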

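【Comments】:

  • How can I get them in descending order in the plot?
  • If you mean by median, see the edit, which adds a new aggregate column used for sorting.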