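【Posted】: 2018-07-02 21:19:12
【Question】:
I'm developing a function that lets the user select a seasonal period. It works, but I'd like to allow some additional functionality.
Right now the user specifies a start month and an end month, and the function returns a new DataFrame containing only the data for those months (see the code and example below). What I'd like is to let the user pick a starting and ending month and day (e.g. January 5 to March 18) and select only the dates within that range. Is this possible?
My code is below: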
import numpy as np
import pandas as pd

def seasonal_period(merged_dataframe, period):
    """Returns the seasonal period specified for the time series"""
    start = period[0]
    end = period[1]
    merged_dataframe = merged_dataframe.loc[(merged_dataframe.index.month >= start) &
                                            (merged_dataframe.index.month <= end)]
    return merged_dataframe

# Testing the seasonal period with random data
df = pd.DataFrame(np.random.rand(10000, 3), index=pd.date_range('1/1/1980', periods=10000, freq='D'))
# Returns data between Jan and May
print(seasonal_period(merged_dataframe=df, period=[1, 5]))
This prints:
0 1 2
1980-01-01 0.788608 0.113614 0.328662
1980-01-02 0.208422 0.974086 0.765795
1980-01-03 0.448420 0.004947 0.184313
1980-01-04 0.400208 0.194078 0.961875
1980-01-05 0.118263 0.406548 0.358848
1980-01-06 0.824994 0.969560 0.892299
1980-01-07 0.140431 0.642784 0.961061
1980-01-08 0.235443 0.236711 0.291453
1980-01-09 0.420899 0.083092 0.277860
1980-01-10 0.185541 0.640260 0.161851
1980-01-11 0.654466 0.742445 0.398733
1980-01-12 0.270931 0.500233 0.121283
1980-01-13 0.590752 0.057112 0.477629
1980-01-14 0.122973 0.997112 0.998513
1980-01-15 0.330342 0.175655 0.240798
1980-01-16 0.559489 0.426027 0.135564
1980-01-17 0.260714 0.493863 0.420336
1980-01-18 0.214587 0.890858 0.097045
1980-01-19 0.243018 0.285315 0.112326
1980-01-20 0.334157 0.630524 0.585468
1980-01-21 0.974340 0.023412 0.349269
1980-01-22 0.435924 0.709390 0.554518
1980-01-23 0.158202 0.288950 0.747733
1980-01-24 0.855350 0.066325 0.796400
1980-01-25 0.482685 0.962369 0.948844
1980-01-26 0.605162 0.185115 0.832465
1980-01-27 0.078977 0.886044 0.823400
1980-01-28 0.062488 0.841581 0.998819
1980-01-29 0.070578 0.836261 0.732075
1980-01-30 0.386692 0.413445 0.524926
... ... ... ...
2007-04-19 0.030180 0.295753 0.696634
2007-04-20 0.246591 0.245117 0.096647
2007-04-21 0.915289 0.264874 0.754863
2007-04-22 0.222286 0.041275 0.922791
2007-04-23 0.389606 0.149993 0.200387
2007-04-24 0.113636 0.923970 0.031243
2007-04-25 0.154459 0.587656 0.508116
2007-04-26 0.525778 0.056525 0.380457
2007-04-27 0.335463 0.343321 0.191828
2007-04-28 0.249183 0.361834 0.327324
2007-04-29 0.994158 0.108749 0.375496
2007-04-30 0.674535 0.527557 0.744897
2007-05-01 0.029355 0.227039 0.418219
2007-05-02 0.946061 0.251699 0.002965
2007-05-03 0.127731 0.479151 0.634638
2007-05-04 0.045522 0.800802 0.170384
2007-05-05 0.514632 0.426107 0.557497
2007-05-06 0.974910 0.757357 0.119415
2007-05-07 0.624626 0.287442 0.211390
2007-05-08 0.408227 0.720328 0.400762
2007-05-09 0.981552 0.399663 0.953638
2007-05-10 0.256625 0.301236 0.832127
2007-05-11 0.513227 0.649790 0.174498
2007-05-12 0.229353 0.089870 0.024055
2007-05-13 0.819985 0.470549 0.388860
2007-05-14 0.640930 0.530929 0.694122
2007-05-15 0.065560 0.084560 0.677467
2007-05-16 0.297165 0.949761 0.483062
2007-05-17 0.405513 0.320957 0.678885
2007-05-18 0.315292 0.773871 0.043010
[4222 rows x 3 columns]
Process finished with exit code 0
Any suggestions?
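One possible way to do this, offered as a sketch rather than a definitive answer: encode every date in the index as a single integer, `month * 100 + day`, so that January 5 becomes 105 and March 18 becomes 318, and then compare that key against the encoded start and end. The `start` and `end` parameters taking `(month, day)` tuples are an assumption introduced here for illustration; they replace the original `period` list. The wrap-around branch (for periods such as November 15 to February 10) is also an addition that the question does not ask for.

```python
import numpy as np
import pandas as pd


def seasonal_period(merged_dataframe, start, end):
    """Return rows whose (month, day) lies between start and end, inclusive.

    start and end are (month, day) tuples, e.g. (1, 5) and (3, 18).
    This signature is hypothetical; it differs from the original `period` list.
    """
    idx = merged_dataframe.index
    # Encode each date as month * 100 + day, e.g. Jan 5 -> 105, Mar 18 -> 318.
    key = idx.month * 100 + idx.day
    start_key = start[0] * 100 + start[1]
    end_key = end[0] * 100 + end[1]

    if start_key <= end_key:
        # Ordinary period that stays inside one calendar year.
        mask = (key >= start_key) & (key <= end_key)
    else:
        # Period that wraps around the new year, e.g. Nov 15 to Feb 10.
        mask = (key >= start_key) | (key <= end_key)
    return merged_dataframe.loc[mask]


# Same random test data as in the question.
df = pd.DataFrame(np.random.rand(10000, 3),
                  index=pd.date_range('1/1/1980', periods=10000, freq='D'))

# Rows from January 5 through March 18 of every year.
subset = seasonal_period(df, start=(1, 5), end=(3, 18))
print(subset.head())
```

Because the key depends only on month and day, the same window is selected in every year of the index, which appears to be what a "seasonal period" is meant to be here.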
【Comments】:
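- But this is where I get stuck: if I also specify a day range (e.g. [1, 10]), I get the months January through March and days 1 to 10 within each of those months, but I don't get the continuous range January 1 to March 10. Does that make sense?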
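The difference described in this comment can be reproduced on a single year of dates. The snippet below is only an illustration: it compares filtering month and day independently with filtering on one combined key (`month * 100 + day`, a device assumed here and not taken from the original post), using 1980 because that is where the question's test data starts.

```python
import pandas as pd

# One year of daily timestamps (1980 is a leap year).
idx = pd.date_range('1980-01-01', '1980-12-31', freq='D')

# Independent filters: month in 1..3 AND day in 1..10.
# This keeps only the first ten days of each of the three months.
independent = (idx.month >= 1) & (idx.month <= 3) & (idx.day >= 1) & (idx.day <= 10)

# Combined key: Jan 1 -> 101, Mar 10 -> 310.
# This keeps every date from January 1 through March 10.
key = idx.month * 100 + idx.day
combined = (key >= 101) & (key <= 310)

print(independent.sum())  # 30 (10 days in each of 3 months)
print(combined.sum())     # 70 (31 + 29 + 10 days)
```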
Tags: python pandas indexing time