Just a small addition to @YaOzl's answer, in case anyone reads this.
If your price data is a panel with several stocks stacked in one table:
>>> import pandas
>>> # "Price" is listed first so the columns print in the order shown in the output
>>> prices = pandas.DataFrame({
...     "Price": [1035.23, 1032.47, 1011.78, 1010.59, 1016.03,
...               1007.95, 1022.75, 1021.52, 1026.11, 1027.04,
...               1030.58, 1030.42, 1036.24, 1015.00, 1015.20],
...     "StkCode": ["StockA"] * 5 + ["StockB"] * 5 + ["StockC"] * 5,
... })
This gives you:
      Price StkCode
0   1035.23  StockA
1   1032.47  StockA
2   1011.78  StockA
3   1010.59  StockA
4   1016.03  StockA
5   1007.95  StockB
6   1022.75  StockB
7   1021.52  StockB
8   1026.11  StockB
9   1027.04  StockB
10  1030.58  StockC
11  1030.42  StockC
12  1036.24  StockC
13  1015.00  StockC
14  1015.20  StockC
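Then you can simply combine .groupby("StkCode") with .pct_change(k), where k is the number of rows to look back. The change is computed within each stock, so the first k rows of every stock come out as NaN. This relies on each stock's rows already being in chronological order.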
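It is also far faster than looping over the rows with an iterator: on my dataset the processing time dropped from 10 hours to 20 seconds, a speedup of more than a thousand times.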
>>> prices["Return"] = prices.groupby("StkCode")["Price"].pct_change(1)
which gives you:
      Price StkCode    Return
0   1035.23  StockA       NaN
1   1032.47  StockA -0.002666
2   1011.78  StockA -0.020039
3   1010.59  StockA -0.001176
4   1016.03  StockA  0.005383
5   1007.95  StockB       NaN
6   1022.75  StockB  0.014683
7   1021.52  StockB -0.001203
8   1026.11  StockB  0.004493
9   1027.04  StockB  0.000906
10  1030.58  StockC       NaN
11  1030.42  StockC -0.000155
12  1036.24  StockC  0.005648
13  1015.00  StockC -0.020497
14  1015.20  StockC  0.000197
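
If you want to convince yourself that the grouped call matches a row-by-row loop, here is a minimal sketch on the same sample data. The helper `returns_by_loop` and the `Return2` column are names I made up for illustration; they are not part of pandas. The loop assumes each stock's rows are contiguous and already in time order.

```python
import numpy as np
import pandas as pd

prices = pd.DataFrame({
    "Price": [1035.23, 1032.47, 1011.78, 1010.59, 1016.03,
              1007.95, 1022.75, 1021.52, 1026.11, 1027.04,
              1030.58, 1030.42, 1036.24, 1015.00, 1015.20],
    "StkCode": ["StockA"] * 5 + ["StockB"] * 5 + ["StockC"] * 5,
})

# Vectorised: k-row percentage change, computed separately for each stock.
grouped = prices.groupby("StkCode")["Price"]
prices["Return"] = grouped.pct_change(1)
prices["Return2"] = grouped.pct_change(2)  # hypothetical column: 2-row return


def returns_by_loop(df, k=1):
    """Hypothetical row-by-row version, shown only for comparison.

    Assumes each stock's rows are contiguous and in time order.
    """
    codes = df["StkCode"].tolist()
    px = df["Price"].tolist()
    out = [float("nan")] * len(df)
    for i in range(k, len(df)):
        # Only compare against a row that belongs to the same stock.
        if codes[i] == codes[i - k]:
            out[i] = px[i] / px[i - k] - 1
    return pd.Series(out, index=df.index)


loop_returns = returns_by_loop(prices, k=1)

# Both approaches should agree, including the NaN at the start of each stock.
same = np.allclose(prices["Return"], loop_returns, equal_nan=True)
print(same)
```

On a table this small the timing difference is invisible; the gap only matters once the loop has to touch many rows in Python.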