【Question title】: How to remove the DATE TIME rows that have NaN values
【Posted】: 2016-08-25 06:45:11
【Question】:

How do I remove the DATE and TIME rows that have NaN values in the 'oo' column?

Here is my csv

DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
02/03/1997,09:06:00,3047.50,3048.00,3047.00,3047.50,98
02/03/1997,09:07:00,3047.50,3047.50,3047.00,3047.50,228
02/03/1997,09:08:00,3048.00,3048.00,3047.50,3048.00,136
02/03/1997,09:09:00,3048.00,3048.00,3046.50,3046.50,174
02/03/1997,09:10:00,3046.50,3046.50,3045.00,3045.00,134
02/03/1997,09:11:00,3045.50,3046.00,3044.00,3045.00,43
02/03/1997,09:12:00,3045.00,3045.50,3045.00,3045.00,214
02/03/1997,09:13:00,3045.50,3045.50,3045.50,3045.50,8
02/03/1997,09:14:00,3045.50,3046.00,3044.50,3044.50,152
02/03/1997,09:15:00,3044.00,3044.00,3042.50,3042.50,126
02/03/1997,09:16:00,3043.50,3043.50,3043.00,3043.00,128
02/03/1997,09:17:00,3042.50,3043.50,3042.50,3043.50,23
02/03/1997,09:18:00,3043.50,3044.50,3043.00,3044.00,51
02/03/1997,09:19:00,3044.50,3044.50,3043.00,3043.00,18
02/03/1997,09:20:00,3043.00,3045.00,3043.00,3045.00,23
02/03/1997,09:21:00,3045.00,3045.00,3044.50,3045.00,51
02/03/1997,09:22:00,3045.00,3045.00,3045.00,3045.00,47
02/03/1997,09:23:00,3045.50,3046.00,3045.00,3045.00,77
02/03/1997,09:24:00,3045.00,3045.00,3045.00,3045.00,131
02/03/1997,09:25:00,3044.50,3044.50,3043.50,3043.50,138
02/03/1997,09:26:00,3043.50,3043.50,3043.50,3043.50,6
02/03/1997,09:27:00,3043.50,3043.50,3043.00,3043.00,56
02/03/1997,09:28:00,3043.00,3044.00,3043.00,3044.00,32
02/03/1997,09:29:00,3044.50,3044.50,3044.50,3044.50,63
02/03/1997,09:30:00,3045.00,3045.00,3045.00,3045.00,28

Here is my code.

exp = pd.read_csv('example.txt', parse_dates = [["DATE", "TIME"]], index_col=0)

exp['oo'] = opcl.OPEN.resample("5Min").first() 
print exp['oo']

I get

 DATE_TIME
 1997-02-03 09:04:00       NaN
 1997-02-03 09:05:00    3047.0
 1997-02-03 09:06:00       NaN
 1997-02-03 09:07:00       NaN
 1997-02-03 09:08:00       NaN
 1997-02-03 09:09:00       NaN
 1997-02-03 09:10:00    3046.5

I want to delete all DATE_TIME rows that contain NaN values in the 'oo' column. I tried this.

  exp['oo'] = exp['oo'].dropna()

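But I get the same result. I have looked through http://pandas.pydata.org/pandas-docs/stable/missing_data.html

and looked all over this site.

I would like to keep my csv reader the same, but idk.

If anyone could help, it would be greatly appreciated. Thank you very much for your time.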

【Comments】:

  • opcl is not defined above.

Tags: python-2.7 date datetime pandas


【Solution 1】:

I think you want this:

>>> exp.OPEN.resample("5Min", how='first')

DATE_TIME
1997-02-03 09:00:00    3046.0
1997-02-03 09:05:00    3047.0
1997-02-03 09:10:00    3046.5
1997-02-03 09:15:00    3044.0
1997-02-03 09:20:00    3043.0
1997-02-03 09:25:00    3044.5
1997-02-03 09:30:00    3045.0
Freq: 5T, Name: OPEN, dtype: float64
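
The resampled series is indexed by the start of each 5-minute bin (09:00, 09:05, ...), so assigning it back into the 1-minute frame aligns on the index and leaves NaN on every row that is not a bin start. The same alignment is probably why `exp['oo'] = exp['oo'].dropna()` seems to do nothing: on assignment the shorter series is stretched back over the frame's index. Below is a minimal sketch of that behaviour, assuming Python 3 with a recent pandas, an in-memory copy of the first seven rows, and the header `DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME`. The names `csv_text`, `raw`, `stamp`, `oo` and `cleaned` are illustrative only and not from the original post.

```python
import io
import pandas as pd

# First seven rows of the sample data; the header names are an assumption.
csv_text = """DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
02/03/1997,09:06:00,3047.50,3048.00,3047.00,3047.50,98
02/03/1997,09:07:00,3047.50,3047.50,3047.00,3047.50,228
02/03/1997,09:08:00,3048.00,3048.00,3047.50,3048.00,136
02/03/1997,09:09:00,3048.00,3048.00,3046.50,3046.50,174
02/03/1997,09:10:00,3046.50,3046.50,3045.00,3045.00,134
"""

raw = pd.read_csv(io.StringIO(csv_text))

# Build the DATE_TIME index explicitly (month/day/year, as in the output above).
stamp = pd.to_datetime(raw["DATE"] + " " + raw["TIME"], format="%m/%d/%Y %H:%M:%S")
exp = raw.drop(columns=["DATE", "TIME"]).set_index(pd.DatetimeIndex(stamp, name="DATE_TIME"))

# First OPEN of each 5-minute bin; the result is indexed by bin start.
oo = exp["OPEN"].resample("5min").first()

# Assignment aligns on the index: only 09:05 and 09:10 exist in both indexes.
exp["oo"] = oo

# Dropping NaN and assigning back re-aligns, so the NaN rows reappear.
exp["oo"] = exp["oo"].dropna()

# Dropping rows of the frame itself is what removes them.
cleaned = exp.dropna(subset=["oo"])
print(cleaned[["OPEN", "oo"]])
```

If only the 5-minute opens are needed, working with `oo` on its own avoids the NaN rows altogether; `dropna(subset=['oo'])` is for keeping the other columns on the rows that matched.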

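【Discussion】:

  • For some reason, when I do this, I still get all NaNs
  • Which version of pandas are you using? pd.__version__
  • Using 0.18.0 on the sample data above works fine for me. Can you reproduce it? What is exp.shape? It should be (27, 5).
  • 74, (just extra typing to fill out the comment)
  • Are you parsing the data correctly? What is pd.read_csv(...).shape?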
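
The checks asked for in these comments can be gathered in one place. This is only a rough sketch, assuming Python 3 with a recent pandas. The helper name `describe_parse` is hypothetical, and the two-row sample stands in for the real file.

```python
import io
import pandas as pd

def describe_parse(df):
    """Collect quick facts that show whether the file was parsed as intended."""
    return {
        "shape": df.shape,
        "columns": list(df.columns),
        # resample() needs a DatetimeIndex, not plain strings
        "datetime_index": isinstance(df.index, pd.DatetimeIndex),
        # OPEN should be numeric; an object dtype suggests a header or delimiter problem
        "open_is_numeric": "OPEN" in df.columns
        and pd.api.types.is_numeric_dtype(df["OPEN"]),
    }

# Two sample rows; the header names are an assumption.
sample = """DATE,TIME,OPEN,HIGH,LOW,CLOSE,VOLUME
02/03/1997,09:04:00,3046.00,3048.50,3046.00,3047.50,505
02/03/1997,09:05:00,3047.00,3048.00,3046.00,3047.00,162
"""

raw = pd.read_csv(io.StringIO(sample))
stamp = pd.to_datetime(raw["DATE"] + " " + raw["TIME"], format="%m/%d/%Y %H:%M:%S")
exp = raw.drop(columns=["DATE", "TIME"]).set_index(pd.DatetimeIndex(stamp, name="DATE_TIME"))

info = describe_parse(exp)
print(pd.__version__)
print(info)
```

With the full 27-row sample the shape should come out as (27, 5), matching the comment above. A single-column frame or a non-datetime index would point at the parsing step rather than at `resample`.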