【Question title】: Matplotlib graph problems
【Posted】: 2022-01-02 23:20:48
【Question】:

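I'm trying to use matplotlib to display data from a weather station. For some reason I can't quite figure out, the last values on the graph are erratic and go back in time on the x-axis.

The x-axis is the date,

the y-axis is the water level,

and the second y-axis is the discharge flow.

Here is a picture of the result: Graph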

import pandas as pd
import matplotlib.pyplot as plt

url_hourly = "https://dd.weather.gc.ca/hydrometric/csv/BC/hourly/BC_08MG005_hourly_hydrometric.csv"
url_daily  = "https://dd.weather.gc.ca/hydrometric/csv/BC/daily/BC_08MG005_daily_hydrometric.csv"
fields = ["Date","Water Level / Niveau d'eau (m)", "Discharge / Débit (cms)"]

#Read csv files  
hourly_data = pd.read_csv(url_hourly, usecols=fields)
day_data = pd.read_csv(url_daily, usecols=fields)

#Merge csv files
water_data = pd.concat([day_data,hourly_data])

#Convert date to datetime
water_data['Date'] = pd.to_datetime(water_data['Date']).dt.normalize()
water_data['Date'] = water_data['Date'].dt.strftime('%m/%d/%Y')

# The CSV files contain 288 data entries per day (12 per hour * 24 hours). Selecting every 288th element to represent one day
data_24hr = water_data[::288]

# Assign columns to the x, y1, and y2 variables
x = data_24hr[fields[0]]
y1 = data_24hr[fields[1]]
y2= data_24hr[fields[2]]

# Plot the graph
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()
curve1 = ax1.plot(x,y1, label='Water Level', color = 'r', marker="o")
curve2 = ax2.plot(x,y2,label='Discharge Volume', color = 'b',marker="o")
plt.plot()
plt.show()

Any tips would be greatly appreciated, as I'm very new to this.

Thanks

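【Comments】:

  • Why is one dataset called daily and the other hourly when both are sampled every five minutes?
  • In any case, you need to remove the duplicates caused by the overlap between the two datasets, and probably sort by time and make sure there really are 288 entries per day. Then the plot should turn out as expected.
  • Thanks for the tip! I'm not sure why the weather station names its datasets this way... it's confusing. I found the duplicated data; I don't know how I missed it the first time. I'll try to figure out how to remove it. Thanks for your time.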
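
The cleanup suggested in the comments (remove overlapping rows, sort by time, check the number of entries per day) might be sketched as follows. This is only an illustration under assumptions: the helper names `clean_readings` and `readings_per_day`, the `Level` column, and the small synthetic frames are hypothetical stand-ins for the two downloaded CSV files, and the real files may need extra care for time zone offsets.

```python
import pandas as pd


def clean_readings(day_data: pd.DataFrame, hourly_data: pd.DataFrame) -> pd.DataFrame:
    """Merge two overlapping exports into one time-ordered frame with unique timestamps."""
    merged = pd.concat([day_data, hourly_data], ignore_index=True)
    # Keep full timestamps (no normalize/strftime) so rows can be compared and sorted.
    merged["Date"] = pd.to_datetime(merged["Date"])
    merged = merged.drop_duplicates(subset="Date").sort_values("Date")
    return merged.reset_index(drop=True)


def readings_per_day(readings: pd.DataFrame) -> pd.Series:
    """Count how many rows fall on each calendar day."""
    return readings.groupby(readings["Date"].dt.normalize()).size()


# Hypothetical stand-in data: two readings per day, with the second file
# repeating the last day of the first one.
day = pd.DataFrame({
    "Date": ["2022-01-01 00:00", "2022-01-01 12:00", "2022-01-02 00:00",
             "2022-01-02 12:00", "2022-01-03 00:00", "2022-01-03 12:00"],
    "Level": [1.0, 1.1, 1.2, 1.3, 1.4, 1.5],
})
hourly = pd.DataFrame({
    "Date": ["2022-01-03 00:00", "2022-01-03 12:00",
             "2022-01-04 00:00", "2022-01-04 12:00"],
    "Level": [1.4, 1.5, 1.6, 1.7],
})

cleaned = clean_readings(day, hourly)
print(readings_per_day(cleaned))
```

With the real data, any day whose count differs from 288 would indicate a gap or leftover overlap, in which case slicing every 288th row no longer lines up with day boundaries.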

Tags: python graph


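【Solution 1】:

OK, I went through the code and removed the duplicates in the "Date" column (as Arne suggested). Oh, and I made the graph formatting more readable. This graph does not go back in time: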

import pandas as pd
import matplotlib.pyplot as plt
from matplotlib import ticker

url_hourly = "https://dd.weather.gc.ca/hydrometric/csv/BC/hourly/BC_08MG005_hourly_hydrometric.csv"
url_daily  = "https://dd.weather.gc.ca/hydrometric/csv/BC/daily/BC_08MG005_daily_hydrometric.csv"
fields = ["Date","Water Level / Niveau d'eau (m)", "Discharge / Débit (cms)"]

#Read csv files  
hourly_data = pd.read_csv(url_hourly, usecols=fields)
day_data = pd.read_csv(url_daily, usecols=fields)

#Merge csv files
water_data = pd.concat([day_data,hourly_data])

#Convert date to datetime
water_data['Date'] = pd.to_datetime(water_data['Date']).dt.normalize()
water_data['Date'] = water_data['Date'].dt.strftime('%m/%d/%Y')
# The CSV files contain 288 data entries per day (12 per hour * 24 hours). Selecting every 288th element to represent one day
# Drop repeated dates by chaining instead of inplace=True, which avoids modifying a slice (pandas SettingWithCopyWarning)
data_24hr = water_data.iloc[::288].drop_duplicates(subset="Date")
# Assign columns to the x, y1, and y2 variables
x = data_24hr[fields[0]]
y1 = data_24hr[fields[1]]
y2= data_24hr[fields[2]]

print(len(x), len(y1))
# Plot the graph
fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y-axis sharing ax1's x-axis
curve1 = ax1.plot(x, y1, label='Water Level', color = 'r', marker="o")
curve2 = ax2.plot(x, y2, label='Discharge Volume', color = 'b',marker="o")
fig.autofmt_xdate(rotation=90)
plt.show()
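
A possible variation on the code above, not taken from the answer itself: instead of relying on every 288th row landing on a day boundary, keep the timestamps as real datetimes and let pandas pick the first reading of each calendar day. The helper name `one_reading_per_day`, the `Water Level` column, and the synthetic 5-minute data are assumptions made for illustration.

```python
import pandas as pd


def one_reading_per_day(readings: pd.DataFrame) -> pd.DataFrame:
    """Return the first reading of each calendar day, ordered by time."""
    return (
        readings.drop_duplicates(subset="Date")  # rows repeated by overlapping files
        .set_index("Date")
        .sort_index()
        .resample("D")                           # one bin per calendar day
        .first()
        .dropna(how="all")                       # skip days with no readings at all
    )


# Hypothetical stand-in data: three days of 5-minute readings (288 per day) ...
stamps = pd.date_range("2022-01-01", periods=3 * 288, freq="5min")
readings = pd.DataFrame({"Date": stamps, "Water Level": range(len(stamps))})
# ... with the last day appended again, mimicking two files that overlap.
overlapping = pd.concat([readings, readings.tail(288)])

# Slicing every 288th row picks the third day twice because of the overlap.
print(overlapping.iloc[::288]["Date"].tolist())

daily = one_reading_per_day(overlapping)
print(daily)
```

Because the resulting index is a `DatetimeIndex`, passing `daily.index` as the x values gives matplotlib a true time axis, so points are placed by date rather than in row order.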

【Comments】:

  • Thanks! I was just about to figure out how to rotate those dates; it's easier than I thought.
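
For reference, the date rotation mentioned here can be combined with a date formatter when the x values are real datetimes rather than strings. The sketch below uses made-up values for both curves (an assumption for illustration, not station data) and reuses the `%m/%d/%Y` format from the code above.

```python
import matplotlib.dates as mdates
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical stand-in values: ten days of water level and discharge readings.
days = pd.date_range("2022-01-01", periods=10, freq="D")
level = [1.0 + 0.1 * i for i in range(10)]
discharge = [20.0 + 2.0 * i for i in range(10)]

fig, ax1 = plt.subplots()
ax2 = ax1.twinx()  # second y-axis sharing the same x-axis
ax1.plot(days, level, label="Water Level", color="r", marker="o")
ax2.plot(days, discharge, label="Discharge Volume", color="b", marker="o")

# Print tick labels as month/day/year and rotate them so they do not overlap.
ax1.xaxis.set_major_formatter(mdates.DateFormatter("%m/%d/%Y"))
fig.autofmt_xdate(rotation=90)
```

Calling `plt.show()` afterwards displays the figure.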