Selecting a date range in a pandas DataFrame: answers

【Question title】: selecting date range in dataframe pandas
【Posted】: 2016-05-08 18:21:46
【Question】:

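First off, many thanks to those who help out; learning is fun when people are able to lend a hand.

I am having no luck with slicing and down-selecting. I have a DataFrame like this: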

             Unit   Name          Count   Month Year
2013-01-01   U1     fn ln         2       01    2013
2013-01-01   U1     fn1 ln1       200     01    2013
2013-02-01   U2     fn2 ln2       55      01    2013
...
2016-01-01   U1     fn3 ln3       2       01    2016
2016-01-01   U1     fn1 ln1       200     01    2016
2016-01-01   U2     fn5 ln5       55      01    2016

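I want to create various slices of this data.

First the overall totals for each month, next the totals for each month per unit, and then the individuals for the current month, the last three months, and the last 6 months.

Code so far: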

# this works great, groups by year per month (1 2013, 2014, 2015)...
# (several keys have to be passed as one list)
group1 = df.groupby(['Month', 'Year'])

# works great to select by unit
group2 = df.groupby(['Unit', 'Month', 'Year'])

# now i want the top 10 individuals in each group
# this doesn't work
month_indiv = group2[['Name', 'Count']]

I think the problem is that groupby removes the duplicates, but I don't understand how to create a view that gives me the individuals.
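For reference, the three slices described in the question can be sketched as follows. This is a minimal sketch, not the asker's code: the sample frame is hypothetical data shaped like the one above, `Month` and `Year` are derived from the index as an assumption, and sorting followed by `groupby(...).head(n)` is one common way to keep the top rows of each group.

```python
import pandas as pd

# Hypothetical sample data shaped like the frame in the question.
df = pd.DataFrame(
    {
        "Unit": ["U1", "U1", "U2", "U1", "U1", "U2"],
        "Name": ["fn ln", "fn1 ln1", "fn2 ln2", "fn3 ln3", "fn1 ln1", "fn5 ln5"],
        "Count": [2, 200, 55, 2, 200, 55],
    },
    index=pd.to_datetime(
        ["2013-01-01", "2013-01-01", "2013-02-01",
         "2016-01-01", "2016-01-01", "2016-01-01"]
    ),
)
# Assumption: Month and Year are taken from the DatetimeIndex.
df["Month"] = df.index.month
df["Year"] = df.index.year

# Several grouping keys are passed as a single list.
monthly_total = df.groupby(["Year", "Month"])["Count"].sum()
unit_monthly_total = df.groupby(["Unit", "Year", "Month"])["Count"].sum()

# Top N individuals inside each (Unit, Year, Month) group:
# sort once by Count, then keep the first N rows of every group.
TOP_N = 10
top_individuals = (
    df.sort_values("Count", ascending=False)
      .groupby(["Unit", "Year", "Month"])
      .head(TOP_N)
)
```

groupby itself does not drop rows; the individual rows are still available through methods such as `head`, which return them unaggregated.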

【Comments】:

Tags: python datetime pandas dataframe slice

【Solution 1】:

You can convert the index to a PeriodIndex with to_period and then use unique to find the last 3 months that occur in the data:

print(df)
           Unit     Name  Count  Month  Year
2013-01-01   U1    fn ln      2      1  2013
2013-02-01   U1    fn ln      2      2  2013
2013-02-01   U1  fn1 ln1    200      2  2013
2013-03-01   U2  fn2 ln2     55      3  2013
2013-04-01   U2  fn2 ln2     55      4  2013
2013-05-01   U2  fn2 ln2     55      5  2013
2016-01-01   U1  fn3 ln3      2      1  2016
2016-01-01   U1  fn1 ln1    200      1  2016
2016-01-01   U2  fn5 ln5     55      1  2016

# convert index to PeriodIndex
print(df.index.to_period('M'))
PeriodIndex(['2013-01', '2013-02', '2013-02', '2013-03', '2013-04', '2013-05',
             '2016-01', '2016-01', '2016-01'],
            dtype='int64', freq='M')

# last 3 unique values
print(df.index.to_period('M').unique()[-3:])
PeriodIndex(['2013-04', '2013-05', '2016-01'], dtype='int64', freq='M')

# boolean mask: True where the row's month is one of those 3 values
print(df.index.to_period('M').isin(df.index.to_period('M').unique()[-3:]))
[False False False False  True  True  True  True  True]

# last 3 months
print(df.loc[df.index.to_period('M').isin(df.index.to_period('M').unique()[-3:])])
           Unit     Name  Count  Month  Year
2013-04-01   U2  fn2 ln2     55      4  2013
2013-05-01   U2  fn2 ln2     55      5  2013
2016-01-01   U1  fn3 ln3      2      1  2016
2016-01-01   U1  fn1 ln1    200      1  2016
2016-01-01   U2  fn5 ln5     55      1  2016

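【Comments】:

This gives me the information for every January across the years; I am looking for all of the information for January 2016 only. Then for November 2015, December 2015, and January 2016. Then for the last 6 months. Thanks a lot for your help.
Yes, that's it. Great answer; it explained a lot to me about the Python way of doing things. Quick side question: I will be using df.index.to_period a lot. What is the Pythonic way to store it in a variable that I use as a base, and then take [-x:] for the various slices?
# works, but gives me a PeriodIndex object
month_list = df2.index.to_period('M').unique()
# does not work, because values is not a method of PeriodIndex
month_list = month_list.values()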
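One possible way to handle the side question is to wrap the logic in small helper functions. The sketch below is not from the original answer: both function names and the `df2` sample data are hypothetical. It also shows that `values` is an attribute rather than a method, and it contrasts "the last n months present in the data" with "the n calendar months ending at the latest month", which is what the first comment asks for.

```python
import pandas as pd

def last_n_observed_months(frame, n):
    """Rows in the last n distinct months present in a date-sorted index."""
    periods = frame.index.to_period("M")
    return frame.loc[periods.isin(periods.unique()[-n:])]

def last_n_calendar_months(frame, n):
    """Rows within the n calendar months ending at the latest month in the index."""
    periods = frame.index.to_period("M")
    latest = periods.max()
    # Subtracting an integer from a Period shifts it by that many months.
    return frame.loc[periods >= latest - (n - 1)]

# Hypothetical data with a gap between 2013 and late 2015.
df2 = pd.DataFrame(
    {"Count": [55, 55, 2, 200]},
    index=pd.to_datetime(["2013-04-01", "2013-05-01", "2015-12-01", "2016-01-01"]),
)

month_list = df2.index.to_period("M").unique()
month_array = month_list.values                  # attribute access, no parentheses
month_strings = month_list.astype(str).tolist()  # plain strings such as '2016-01'

observed = last_n_observed_months(df2, 3)   # 2013-05, 2015-12, 2016-01
calendar = last_n_calendar_months(df2, 3)   # 2015-11 through 2016-01 only
```

With gaps in the data the two helpers return different rows, so which one fits depends on whether empty months should count toward the window.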