[Posted]: 2022-01-12 16:34:07
[Question]:
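I have a DataFrame df with n columns of hourly data (date_i X1_i X2_i ... Xn_i).
For each day, I want to get the n smallest values of each column, but I can't find a way to do it without looping over the columns.
With the minimum it is easy, since df.groupby(pd.Grouper(freq='D')).min() seems to do the trick, but when I try the nsmallest method I get the following error message:
"Cannot access callable attribute 'nsmallest' of 'DataFrameGroupBy' objects, try using the 'apply' method".
I tried using nsmallest with the 'apply' method, but then I am asked to specify the columns...
If anyone has an idea, it would be very helpful.
Thanks
PS: sorry for the formatting, this is my first post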
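To make the situation above concrete, here is a small sketch on made-up data (the column names X1 and X2 and all values are assumptions for illustration, not the real data). It shows that min() works on the whole grouped frame, while nsmallest is only available per column: selecting one column gives a SeriesGroupBy, whose nsmallest takes no columns argument, and that is why a column loop seems unavoidable at first.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the hourly data: 2 days x 24 hours, 2 columns.
idx = pd.date_range("2022-01-08", periods=48, freq="60min")
df = pd.DataFrame(
    np.arange(96, dtype=float).reshape(48, 2),
    index=idx,
    columns=["X1", "X2"],
)

daily = df.groupby(pd.Grouper(freq="D"))

# Works on all columns at once: one row per day.
daily_min = daily.min()

# Works for a single column only: SeriesGroupBy.nsmallest needs no `columns`.
two_smallest_x1 = daily["X1"].nsmallest(2)

print(daily_min)
print(two_smallest_x1)
```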
Edit: some illustrations. What my data looks like:
0 1 ... 9678 9679
2022-01-08 00:00:00 18472.232746 28934.878033 ... 20668.503228 22079.457224
2022-01-08 01:00:00 19546.101746 30239.880033 ... 21789.779228 23330.190224
2022-01-08 02:00:00 22031.448746 33016.048033 ... 24278.199228 25990.503224
2022-01-08 03:00:00 24089.368644 36134.608919 ... 26327.332591 28089.134306
2022-01-08 04:00:00 24640.942644 36818.412919 ... 26894.204591 28736.705306
2022-01-08 05:00:00 23329.700644 35639.693919 ... 25555.199591 27379.323306
2022-01-08 06:00:00 20990.043644 33329.805919 ... 23137.500591 24917.126306
2022-01-08 07:00:00 18314.599644 30347.799919 ... 20167.500591 22022.524306
2022-01-08 08:00:00 17628.482226 31301.113041 ... 21665.296600 24202.625832
2022-01-08 09:00:00 15743.339226 29588.354041 ... 19912.297600 22341.947832
2022-01-08 10:00:00 15498.405226 29453.561041 ... 19799.009600 22131.170832
2022-01-08 11:00:00 14950.121226 28767.791041 ... 19328.678600 21507.167832
2022-01-08 12:00:00 13925.869226 27530.472041 ... 18404.139600 20460.316832
2022-01-08 13:00:00 17502.122226 30922.783041 ... 21990.380600 24008.382832
2022-01-08 14:00:00 19159.511385 34275.005187 ... 23961.590286 26460.214883
2022-01-08 15:00:00 20583.356385 35751.662187 ... 25315.380286 27793.800883
2022-01-08 16:00:00 20443.423385 35925.362187 ... 25184.576286 27672.536883
2022-01-08 17:00:00 15825.211385 31604.614187 ... 20646.669286 23145.311883
2022-01-08 18:00:00 11902.354052 28786.559805 ... 16028.363856 19313.677750
2022-01-08 19:00:00 13483.710052 30631.806805 ... 17635.338856 20948.556750
2022-01-08 20:00:00 16084.773323 33944.862396 ... 20627.810852 22763.962851
2022-01-08 21:00:00 18340.833323 36435.799396 ... 22920.037852 25240.320851
2022-01-08 22:00:00 15110.698323 33159.222396 ... 19794.355852 22102.416851
2022-01-08 23:00:00 15663.400323 33741.501396 ... 20180.693852 22605.909851
2022-01-09 00:00:00 19500.930751 39058.431760 ... 24127.257756 26919.289816
2022-01-09 01:00:00 20562.985751 40330.807760 ... 25123.488756 28051.573816
2022-01-09 02:00:00 23408.547751 43253.635760 ... 27840.447756 30960.372816
2022-01-09 03:00:00 25975.071191 45523.722743 ... 30274.316013 32276.174330
2022-01-09 04:00:00 27180.858191 46586.959743 ... 31348.131013 33414.631330
2022-01-09 05:00:00 26383.511191 45793.920743 ... 30598.931013 32605.280330
... ... ... ... ...
What I get with the min function:
2022-01-08 11902.354052 27530.472041 ... 16028.363856 19313.677750
2022-01-09 14491.281907 30293.870235 ... 16766.428013 21386.135041
...
What I want, for example with nsmallest(2):
2022-01-08 11902.354052 27530.472041 ... 16028.363856 19313.677750
13483.710052 28767.791041 ... 17635.338856 20460.316832
2022-01-09 14491.281907 30293.870235 ... 16766.428013 21386.135041
14721.392907 30722.928235 ... 17130.594013 21732.426041
...
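For reference, one possible way to get output of this shape without an explicit Python loop over the columns is to sort each daily block column by column with NumPy and keep the first n rows. This is only a sketch under assumptions: the helper name n_smallest_per_column, the "rank" index level, and the random test data are all made up for illustration, and it has not been tried on the real 9680-column frame.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the hourly data: 2 days x 24 hours, 3 columns.
idx = pd.date_range("2022-01-08", periods=48, freq="60min")
rng = np.random.default_rng(0)
df = pd.DataFrame(rng.random((48, 3)) * 1000, index=idx)


def n_smallest_per_column(frame, n):
    """Return the n smallest values of every column of `frame` (hypothetical helper)."""
    # np.sort with axis=0 sorts each column independently; NaNs end up last.
    values = np.sort(frame.to_numpy(), axis=0)[:n]
    # Label the rows 1..n so the result gets a (day, rank) MultiIndex after apply.
    rank = pd.RangeIndex(1, len(values) + 1, name="rank")
    return pd.DataFrame(values, index=rank, columns=frame.columns)


# One block of n rows per day, all columns handled in a single vectorized sort.
result = df.groupby(pd.Grouper(freq="D")).apply(n_smallest_per_column, n=2)
print(result)
```

The rank-1 rows of result should match what .min() gives for the same grouping, which is a quick sanity check. Note that the original timestamps of the selected values are lost, since each column is sorted on its own.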
[Discussion]:
-
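Could you provide a sample of the data you are working with, along with the expected output for that sample? This will help people understand what your data looks like and check whether their solution is what you are looking for.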
Tags: python pandas dataframe pandas-groupby