Can't save dataframe to csv in Python: answers

【Question title】: Can't save dataframe to csv in Python
【Posted】: 2019-03-06 18:52:52
【Question description】:

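I'm trying to save a dataframe that I have manipulated to compute the mean and median, as well as the total, for duplicated rows. The script appears to run without problems, but it doesn't actually output the file I asked for. Can anyone give me any advice on what is going on?

Here is the code I'm using: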

"""Separate and combine frequencies of like relations, 
then produce extra columns with mean and median of these to
get a better overall picture of each relation"""

import numpy as np
import pandas as pd
from numpy.random.mtrand import pareto

def sort_table(fname):
    #read in file
    parent_child_rel = pd.read_csv(fname)
    print(parent_child_rel)

    #drop first column
    parent_child_rel = parent_child_rel.iloc[:,1:]
    print(parent_child_rel)


    #put all upper case
    parent_child_rel = parent_child_rel.apply(lambda x:x.astype(str).str.upper())

    print(parent_child_rel.dtypes) 

    #change datatype to float for numbers
    parent_child_rel['Hits'] = parent_child_rel['Hits'].astype('float') 
    parent_child_rel['Score'] = parent_child_rel['Score'].astype('float')

    #group and provide totals and means for hits and score
    aggregated = parent_child_rel.groupby(['parent', 'child'], as_index=False).aggregate({'Hits': np.sum, 'Score': [np.mean, np.median]})


    print(aggregated.dtypes)

    print(aggregated)

    with open('./Sketch_grammar/aggregated_relations_SkG_1.csv', 'a') as outfile:
        aggregated.to_csv(outfile)


def main():
    sort_table('./Sketch_grammar/parent_child_SkG_relations.csv')


if __name__ == '__main__':
    main()

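【Comments on the question】:

You don't need to use the file handler open when using to_csv in pandas. It handles that for you. Try using aggregated.to_csv('./Sketch_grammar/aggregated_relations_SkG_1.csv', mode='a') instead. I'm not clear on why this would behave differently from what you're doing, but that's the part of the code that strikes me as odd, so it's worth a shot.

Tags: python csv dataframe pandas-groupby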

【Solution 1】:

You don't need to open the file yourself to save it as CSV. Just pass the path to the to_csv function.
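
To illustrate the point, here is a minimal sketch comparing the two approaches. The small frame, the temporary directory, and the file names via_path.csv and via_handle.csv are made up for the example; they are not from the question.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical stand-in for the aggregated frame in the question.
df = pd.DataFrame({"parent": ["A", "B"], "child": ["X", "Y"], "Hits": [3.0, 5.0]})

out_dir = Path(tempfile.mkdtemp())
via_path = out_dir / "via_path.csv"
via_handle = out_dir / "via_handle.csv"

# Passing a path: pandas opens, writes, and closes the file itself.
df.to_csv(via_path)

# Passing an open handle, as in the question.
# newline="" avoids extra blank lines on Windows.
with open(via_handle, "w", newline="") as outfile:
    df.to_csv(outfile)

# Both files should hold the same rows.
same_content = via_path.read_text().splitlines() == via_handle.read_text().splitlines()
print(same_content)

# Show the absolute location of the file that was written.
print(via_path.resolve())
```

If the output file seems to be missing, printing the resolved path as above can help: a relative path such as './Sketch_grammar/...' is resolved against the current working directory, which is not necessarily the folder the script lives in.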

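Also, the file name is already available in the fname parameter, so there is no need to type it out again. Be aware that fname is the input path here, so writing to it puts the result into the source file; pass a separate output path if the input should stay intact.

Your code would be: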

"""Separate and combine frequencies of like relations, 
then produce extra columns with mean and median of these to
get a better overall picture of each relation"""

import numpy as np
import pandas as pd
from numpy.random.mtrand import pareto

def sort_table(fname):
    #read in file
    parent_child_rel = pd.read_csv(fname)
    print(parent_child_rel)

    #drop first column
    parent_child_rel = parent_child_rel.iloc[:,1:]
    print(parent_child_rel)


    #put all upper case
    parent_child_rel = parent_child_rel.apply(lambda x:x.astype(str).str.upper())

    print(parent_child_rel.dtypes) 

    #change datatype to float for numbers
    parent_child_rel['Hits'] = parent_child_rel['Hits'].astype('float') 
    parent_child_rel['Score'] = parent_child_rel['Score'].astype('float')

    #group and provide totals and means for hits and score
    aggregated = parent_child_rel.groupby(['parent', 'child'], as_index=False).aggregate({'Hits': np.sum, 'Score': [np.mean, np.median]})


    print(aggregated.dtypes)

    print(aggregated)

    aggregated.to_csv(fname)


def main():
    sort_table('./Sketch_grammar/parent_child_SkG_relations.csv')


if __name__ == '__main__':
    main()

If you don't want an extra column containing the index (and you probably don't), specify that explicitly:

aggregated.to_csv(fname, index = False)
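
As a quick illustration of what index=False changes, the sketch below writes a tiny made-up frame to a string (to_csv returns the CSV text when no path is given) and prints the header line in both cases.

```python
import pandas as pd

# Hypothetical two-row frame standing in for the aggregated result.
df = pd.DataFrame({"parent": ["A", "B"], "Hits": [3.0, 5.0]})

# With no path argument, to_csv returns the CSV text as a string.
with_index = df.to_csv()
without_index = df.to_csv(index=False)

# With the index, the header starts with an empty field and every row
# starts with the row number.
print(with_index.splitlines()[0])

# Without the index, only the real columns are written.
print(without_index.splitlines()[0])
```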

As @brittenb suggested, you want to append the data to the file, so you should use mode = "a":

aggregated.to_csv(fname, mode = "a")
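
One thing to watch when appending: by default to_csv writes the header row on every call, so repeated runs leave header lines in the middle of the file. A possible way around this, sketched below, is to write the header only when the file is new or empty. The helper name append_csv, the sample frame, and the temporary path are assumptions made for the example.

```python
import tempfile
from pathlib import Path

import pandas as pd


def append_csv(df, path):
    """Append df to path, writing the header only if the file is new or empty."""
    path = Path(path)
    write_header = not path.exists() or path.stat().st_size == 0
    df.to_csv(path, mode="a", index=False, header=write_header)


# Hypothetical output location and data, for illustration only.
out_path = Path(tempfile.mkdtemp()) / "aggregated.csv"
batch = pd.DataFrame({"parent": ["A"], "child": ["X"], "Hits": [3.0]})

append_csv(batch, out_path)  # first call creates the file with a header
append_csv(batch, out_path)  # second call adds rows only

lines = out_path.read_text().splitlines()
print(lines)
```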

【Comments】:

Don't forget mode='a', since the OP wants to append data, per the open(..., 'a') as outfile in the original code.
Thanks everyone - it worked!