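【Posted】: 2017-12-28 22:40:12
【Question】:
I'm new to pandas. How can I forward-fill a DataFrame using the value held in one particular column of the previously seen row?
Self-contained example: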
import pandas as pd
import numpy as np
O = [1, np.nan, 5, np.nan]
H = [5, np.nan, 5, np.nan]
L = [1, np.nan, 2, np.nan]
C = [5, np.nan, 2, np.nan]
timestamps = ["2017-07-23 03:13:00", "2017-07-23 03:14:00", "2017-07-23 03:15:00", "2017-07-23 03:16:00"]
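# Named `data` rather than `dict` to avoid shadowing the built-in
data = {'Open': O, 'High': H, 'Low': L, 'Close': C}
df = pd.DataFrame(index=timestamps, data=data)
# .copy() so that assigning to a column of ohlc later does not raise SettingWithCopyWarning
ohlc = df[['Open', 'High', 'Low', 'Close']].copy()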
This produces the following DataFrame:
print(ohlc)
                     Open  High  Low  Close
2017-07-23 03:13:00   1.0   5.0  1.0    5.0
2017-07-23 03:14:00   NaN   NaN  NaN    NaN
2017-07-23 03:15:00   5.0   5.0  2.0    2.0
2017-07-23 03:16:00   NaN   NaN  NaN    NaN
I want to turn that DataFrame into this:
                     Open  High  Low  Close
2017-07-23 03:13:00   1.0   5.0  1.0    5.0
2017-07-23 03:14:00   5.0   5.0  5.0    5.0
2017-07-23 03:15:00   5.0   5.0  2.0    2.0
2017-07-23 03:16:00   2.0   2.0  2.0    2.0
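That is, the value last seen in 'Close' should be forward-filled across every column of each empty row, until the next populated row appears. Filling just the 'Close' column this way is simple: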
column2fill = 'Close'
ohlc[column2fill] = ohlc[column2fill].ffill()
print(ohlc)
                     Open  High  Low  Close
2017-07-23 03:13:00   1.0   5.0  1.0    5.0
2017-07-23 03:14:00   NaN   NaN  NaN    5.0
2017-07-23 03:15:00   5.0   5.0  2.0    2.0
2017-07-23 03:16:00   NaN   NaN  NaN    2.0
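But is there a way to fill the rest of the 03:14:00 and 03:16:00 rows with those rows' 'Close' values? And can it be done in one step with a single forward fill, instead of filling the 'Close' column first?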
【Discussion】:
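One possible approach, offered as a minimal sketch rather than a definitive answer: forward-fill 'Close' into a separate Series, then use that Series to fill the remaining NaN cells of every column. The names `close_filled` and `filled` are illustrative and not from the question. This still fills 'Close' first conceptually, but it does not modify `ohlc` in place.

```python
import numpy as np
import pandas as pd

timestamps = ["2017-07-23 03:13:00", "2017-07-23 03:14:00",
              "2017-07-23 03:15:00", "2017-07-23 03:16:00"]
ohlc = pd.DataFrame(
    {
        "Open": [1, np.nan, 5, np.nan],
        "High": [5, np.nan, 5, np.nan],
        "Low": [1, np.nan, 2, np.nan],
        "Close": [5, np.nan, 2, np.nan],
    },
    index=timestamps,
)

# Step 1: carry the last seen Close value forward down the column.
close_filled = ohlc["Close"].ffill()

# Step 2: apply() hands each column over as a Series; fillna() with a Series
# aligns on the index, so every NaN cell receives that row's filled Close.
filled = ohlc.apply(lambda col: col.fillna(close_filled))

print(filled)
```

Rows that already contain data are left untouched, because `fillna` only writes into cells that are NaN.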
Tags: python pandas dataframe data-science data-cleaning