pandas DataFrame KeyError in an OOP class

【Question title】: pandas data frame KeyError oop
【Posted】: 2020-09-16 01:20:00
【Question description】:

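The purpose of this script is to read a csv file.

The file contains forex data.

The file has 7 columns (Date, Time, Open, High, Low, Close and Volume) and about 600k rows.

After reading the date and time, the script has to do some datetime calculations, such as extracting the month and the day.

It then uses the TA-Lib library for some technical analysis.

Here is the code: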

import pandas as pd
import talib


class Data:
    def __init__(self):
        self.df = pd.DataFrame()
        self.names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
        self.open = self.df['Open'].astype(float)
        self.high = self.df['High'].astype(float)
        self.low = self.df['Low'].astype(float)
        self.close = self.df['Close'].astype(float)

    def file(self, file):
        self.df = pd.read_csv(file, names=self.names,
                              parse_dates={'Release Date': ['Date', 'Time']})
        return self.df

    def date(self):
        self.df['Release Date'] = pd.to_datetime(self.df['Release Date'])

    def year(self):
        self.df['year'] = pd.to_datetime(self.df['Release Date']).dt.year

    def month(self):
        self.df['month'] = pd.to_datetime(self.df['Release Date']).dt.month

    def day(self):
        self.df['day'] = pd.to_datetime(self.df['Release Date']).dt.day

    def dema(self):
        # DEMA - Double Exponential Moving Average
        self.df['DEMA'] = talib.DEMA(self.close, timeperiod=30)

    def ema(self):
        # EMA - Exponential Moving Average
        self.df['EMA'] = talib.EMA(self.close, timeperiod=30)

    def HT_TRENDLINE(self):
        # HT_TRENDLINE - Hilbert Transform - Instantaneous Trendline
        self.df['HT_TRENDLINE'] = talib.HT_TRENDLINE(self.close)

    def KAMA(self):
        # KAMA - Kaufman Adaptive Moving Average
        self.df['KAMA'] = talib.KAMA(self.close, timeperiod=30)

    def ma(self):
        # MA - Moving average
        self.df['MA'] = talib.MA(self.close, timeperiod=30, matype=0)

    def print(self):
        return print(self.df.head())


x = Data()
x.file(r"D:\Projects\Project Forex\USDJPY.csv")
x.print()

This is the error:

Traceback (most recent call last):

  File "C:\Users\Sayed\miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 2646, in get_loc
    return self._engine.get_loc(key)

  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item
KeyError: 'Open'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):

  File "C:/Users/Sayed/PycharmProjects/project/Technical Analysis.py", line 55, in <module>
    x = Data()

  File "C:/Users/Sayed/PycharmProjects/project/Technical Analysis.py", line 9, in __init__
    self.open = self.df['Open'].astype(float)

  File "C:\Users\Sayed\miniconda3\lib\site-packages\pandas\core\frame.py", line 2800, in __getitem__
    indexer = self.columns.get_loc(key)

  File "C:\Users\Sayed\miniconda3\lib\site-packages\pandas\core\indexes\base.py", line 2648, in get_loc
    return self._engine.get_loc(self._maybe_cast_indexer(key))

  File "pandas\_libs\index.pyx", line 111, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\index.pyx", line 138, in pandas._libs.index.IndexEngine.get_loc

  File "pandas\_libs\hashtable_class_helper.pxi", line 1619, in pandas._libs.hashtable.PyObjectHashTable.get_item

  File "pandas\_libs\hashtable_class_helper.pxi", line 1627, in pandas._libs.hashtable.PyObjectHashTable.get_item

KeyError: 'Open'

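【Comments】:

The error says there is no Open column; check your csv.
The Open column is there.
@SayedGouda Edit your question to include the output of df.columns. Your error indicates that the Open column does not exist. Maybe there is a difference in capitalization?
What does pd.read_csv(r"D:\Projects\Project Forex\USDJPY.csv").columns return?
Same error.

Tags: python pandas oop ta-lib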

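【Solution 1】:

In the __init__ function you initialize an empty DataFrame that has no columns. Right after that, you try to convert the DataFrame's Open column to float, and since that column does not exist yet, pandas raises KeyError: 'Open'.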

def __init__(self):
    self.df = pd.DataFrame() # No columns
    self.names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
    self.open = self.df['Open'].astype(float) # ERROR: 'Open' column does not exist
    self.high = self.df['High'].astype(float)
    self.low = self.df['Low'].astype(float)
    self.close = self.df['Close'].astype(float)
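
The failure can be reproduced without the class or the csv file. The snippet below is a minimal sketch that only assumes pandas is installed; it shows that an empty DataFrame has no columns and that accessing 'Open' on it raises the same KeyError as in the traceback.

```python
import pandas as pd

df = pd.DataFrame()        # empty frame: no rows and no columns
print(list(df.columns))    # []

try:
    df['Open'].astype(float)   # same access as in __init__
except KeyError as exc:
    # Keep the message so it can be inspected after the except block.
    message = str(exc)
    print("KeyError:", message)
```

Nothing has been read from disk at this point, which is why checking the csv file itself does not help.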

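Change your __init__ function to the version below and the KeyError goes away, because the columns exist before they are accessed. Note that self.open, self.high, self.low and self.close are still taken from the empty frame, so they will not pick up the data that file() loads later.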

def __init__(self):
    self.names = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']
    self.df = pd.DataFrame(columns=self.names) # Empty dataframe with columns
    self.open = self.df['Open'].astype(float) # Now 'Open' column exists
    self.high = self.df['High'].astype(float)
    self.low = self.df['Low'].astype(float)
    self.close = self.df['Close'].astype(float)
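
As an alternative, the price series can be derived from self.df at the moment they are used instead of in __init__. The following is only a sketch under stated assumptions, not the asker's final code: the add_date_parts method, the NAMES class attribute, the use of a property, and the three sample rows are all made up for illustration, and the TA-Lib calls are left out so the example runs with pandas alone.

```python
import io

import pandas as pd


class Data:
    # Column names taken from the question.
    NAMES = ['Date', 'Time', 'Open', 'High', 'Low', 'Close', 'Volume']

    def __init__(self):
        # Create the empty frame only; no column is read here.
        self.df = pd.DataFrame(columns=self.NAMES)

    def file(self, file):
        # Passing names= without header= makes pandas treat the first row as data.
        self.df = pd.read_csv(file, names=self.NAMES,
                              dtype={'Date': str, 'Time': str})
        # Combine the two text columns into one datetime column.
        self.df['Release Date'] = pd.to_datetime(
            self.df['Date'] + ' ' + self.df['Time'])
        return self.df

    @property
    def close(self):
        # Evaluated on every access, so it reflects what file() loaded.
        return self.df['Close'].astype(float)

    def add_date_parts(self):
        # Hypothetical helper: one column per date part.
        stamps = self.df['Release Date']
        self.df['year'] = stamps.dt.year
        self.df['month'] = stamps.dt.month
        self.df['day'] = stamps.dt.day


# Made-up sample rows in the question's column order.
SAMPLE = """2020-09-15,01:20,105.40,105.48,105.38,105.45,120
2020-09-15,01:21,105.45,105.50,105.41,105.47,98
2020-09-15,01:22,105.47,105.52,105.44,105.50,143
"""

x = Data()
x.file(io.StringIO(SAMPLE))
x.add_date_parts()
print(x.df[['Release Date', 'year', 'month', 'day', 'Close']])
```

With this layout, an indicator method can pass self.close to TA-Lib the same way the question does, for example talib.EMA(self.close, timeperiod=30), and it will see the loaded data rather than an empty series.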

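【Discussion】:

The error in the question is raised on the line x = Data(). Do you still get the same error with the suggested change? I tried running it and it seems to work.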