Python - Pandas - KeyError: "None of [Index(['questions'], dtype='object')] are in the [columns]" (answers)

【Question title】: Python - Pandas - KeyError: "None of [Index(['questions'], dtype='object')] are in the [columns]"
【Posted】: 2021-10-29 03:00:33
【Question description】:

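I am new to Python and am currently trying to write code that automatically suggests answers to specific questions. I ran into this problem when running the following code: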

import pandas as pd
df=pd.read_csv("Book11.csv", encoding= 'cp1252');
df.columns=["question","answers"]

df

print(df)

import re
import gensim
from gensim.parsing.preprocessing import remove_stopwords

def clean_sentence(sentence,stopwords=False):
    sentence = sentence.lower().strip()
    sentence = re.sub(r'[^a-z0-9\s]','',sentence)

    if stopwords:
        sentence = remove_stopwords(sentence)

    return sentence

def get_cleaned_sentences (df, stopwords=False):
    sents=df[["questions"]];
    cleaned_sentences=[]

    for index,row in df.iterrows():
        #print(index.row)
        cleaned=clean_sentence(row["questions"], stopwords);
        cleaned_sentences.append(cleaned);
    return cleaned_sentences;

cleaned_sentences=get_cleaned_sentences(df, stopwords=True)
print(cleaned_sentences);

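On Colab it runs fine. On local Python 3.9.1 under Windows it works fine. On an Ubuntu VM, the same code only gives me the following error: KeyError: "None of [Index(['questions'], dtype='object')] are in the [columns]"

I have tried every workaround I could find by searching for this error, with no success.

I don't understand why this runs seamlessly in the other two environments but not on the VM.

Thank you very much.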

【Question comments】:

Tags: python pandas dataframe

【Solution 1】:

On the Windows machine, try reading the file and re-saving it with UTF-8 encoding:

import pandas as pd
df=pd.read_csv("Book11.csv", encoding= 'cp1252')

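# index=False keeps the row index out of the file; index_col is a read_csv parameter, not a to_csv one
df.to_csv("Book11-utf8.csv", encoding='utf-8', index=False)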

Copy the UTF-8 CSV file to the VM.

Then, on the VM, try reading Book11-utf8.csv as UTF-8:

df=pd.read_csv("Book11-utf8.csv", encoding= 'utf-8')
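The round trip described in this answer can be sketched end to end as follows. This is only an illustration under stated assumptions: the sample row and the temporary file names are made up and stand in for Book11.csv, which is not shown in the thread. It also shows why writing the file without index=False yields a third column when the file is read back, which would make a later two-name assignment to df.columns fail with a length mismatch.

```python
import os
import tempfile

import pandas as pd

with tempfile.TemporaryDirectory() as tmp:
    # Hypothetical stand-in for Book11.csv, saved as cp1252 like the original.
    src = os.path.join(tmp, "Book11.csv")
    with open(src, "w", encoding="cp1252", newline="") as f:
        f.write("question,answers\nWhere is the caf\u00e9?,Next door\n")

    df = pd.read_csv(src, encoding="cp1252")

    # Default to_csv also writes the row index as an unnamed first column.
    with_index = os.path.join(tmp, "with_index.csv")
    df.to_csv(with_index, encoding="utf-8")

    # index=False writes only the data columns.
    without_index = os.path.join(tmp, "Book11-utf8.csv")
    df.to_csv(without_index, encoding="utf-8", index=False)

    cols_with_index = pd.read_csv(with_index, encoding="utf-8").columns.tolist()
    cols_without_index = pd.read_csv(without_index, encoding="utf-8").columns.tolist()

print(cols_with_index)     # index column plus the two data columns
print(cols_without_index)  # only the two data columns
```

Comparing the two printed lists before assigning new column names is a cheap way to confirm that the re-encoded file has the shape the rest of the script expects.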

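【Comments】:

Did that, and the error has now changed to "ValueError: Length mismatch: Expected axis has 3 elements, new values have 2 elements". Thank you very much!
Ah, try the updated solution, which drops the index when writing the UTF-8 file.
Did that, dropped the index, and now I'm back to the original error.
Is the file being read? Have you checked whether df's columns are the same? Try df.keys() before running anything else.
Hi, yes, the file is read. It shows the output of print(df), and the output of df.keys() is: Index(['question', 'answers'], dtype='object'). I don't understand why this only happens on the VM; in the other two environments the same code runs without any problems.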
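One detail in the thread is worth checking: the posted code assigns the column name "question" (singular), and df.keys() reports that same name, while get_cleaned_sentences selects "questions" (plural). The sketch below uses a made-up one-row frame with the reported column names (an assumption, since the real data is not shown) to show how to compare the names the code asks for against the names the frame actually has. It does not explain why the same script would succeed on Colab and Windows; it only shows how to confirm whether the names match.

```python
import pandas as pd

# Hypothetical stand-in for the asker's frame, using the column names
# reported by df.keys() in the thread.
df = pd.DataFrame({
    "question": ["How do I reset my password?"],
    "answers": ["Use the reset link."],
})

# repr() makes stray whitespace or a byte-order mark in a name visible.
print([repr(c) for c in df.columns])

wanted = ["questions"]  # the name used inside get_cleaned_sentences
missing = [c for c in wanted if c not in df.columns]
print("missing columns:", missing)

# Selecting a list of names that are all absent raises KeyError.
error_message = None
try:
    df[wanted]
except KeyError as exc:
    error_message = str(exc)
print(error_message)
```

If the list of missing names is non-empty, making the name in df.columns and the name used in the lookup agree should remove the KeyError, independent of the file's encoding.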