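【Posted】: 2019-03-06 18:52:52
【Question】:
I am trying to save a DataFrame that I have manipulated to compute the mean, median, and total for duplicate rows. The script appears to run without any problems, but it does not actually write the file I asked for. Can anyone give me any advice on what is going on?
Here is the code I am using: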
"""Separate and combine frequencies of like relations,
then produce extra columns with mean and median of these to
get a better overall picture of each relation"""
import numpy as np
import pandas as pd
from numpy.random.mtrand import pareto
def sort_table(fname):
    # read in file
    parent_child_rel = pd.read_csv(fname)
    print(parent_child_rel)
    # drop first column
    parent_child_rel = parent_child_rel.iloc[:, 1:]
    print(parent_child_rel)
    # put all upper case
    parent_child_rel = parent_child_rel.apply(lambda x: x.astype(str).str.upper())
    print(parent_child_rel.dtypes)
    # change datatype to float for numbers
    parent_child_rel['Hits'] = parent_child_rel['Hits'].astype('float')
    parent_child_rel['Score'] = parent_child_rel['Score'].astype('float')
    # group and provide totals and means for hits and score
    aggregated = parent_child_rel.groupby(['parent', 'child'], as_index=False).aggregate({'Hits': np.sum, 'Score': [np.mean, np.median]})
    print(aggregated.dtypes)
    print(aggregated)
    # append the aggregated table to the output file
    with open('./Sketch_grammar/aggregated_relations_SkG_1.csv', 'a') as outfile:
        aggregated.to_csv(outfile)


def main():
    sort_table('./Sketch_grammar/parent_child_SkG_relations.csv')


if __name__ == '__main__':
    main()
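For reference, the aggregation step can be tried on its own with a small table. This is only a sketch: the toy data and the output column names (`hits_total`, `score_mean`, `score_median`) are made up for illustration, and it swaps the dict-of-lists form used above for pandas named aggregation, which yields flat column names rather than MultiIndex columns.

```python
import pandas as pd

# Hypothetical toy data with one duplicated parent/child pair.
df = pd.DataFrame(
    {
        "parent": ["A", "A", "B"],
        "child": ["X", "X", "Y"],
        "Hits": [1.0, 3.0, 5.0],
        "Score": [2.0, 4.0, 6.0],
    }
)

# Named aggregation: each keyword becomes one flat output column.
summary = df.groupby(["parent", "child"], as_index=False).agg(
    hits_total=("Hits", "sum"),
    score_mean=("Score", "mean"),
    score_median=("Score", "median"),
)
print(summary)
```

Flat column names also produce a single header row when the result is written to CSV.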
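【Discussion】:
- You don't need to use a file handle from open when calling to_csv in pandas; it handles that for you. Try aggregated.to_csv('./Sketch_grammar/aggregated_relations_SkG_1.csv', mode='a') instead. I'm not sure why this would behave differently from what you're doing, but that's the part of the code that looks odd to me, so it's worth a try.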
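A minimal sketch of this suggestion, with a check of where the file ends up. The small DataFrame and the temporary directory are stand-ins so the snippet runs anywhere; they are assumptions, not part of the original script. Note that a relative path such as ./Sketch_grammar/... is resolved against the current working directory, which is not necessarily the folder containing the script, so printing the absolute path may help locate a file that seems to be missing.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the aggregated frame from the question.
aggregated = pd.DataFrame(
    {"parent": ["A", "B"], "child": ["X", "Y"], "Hits": [3.0, 5.0]}
)

# Assumption: a temporary directory replaces ./Sketch_grammar for this sketch.
out_dir = os.path.join(tempfile.mkdtemp(), "Sketch_grammar")
os.makedirs(out_dir, exist_ok=True)  # to_csv does not create missing directories
out_path = os.path.join(out_dir, "aggregated_relations_SkG_1.csv")

# Pass the path straight to to_csv; mode="a" appends, as in the question.
aggregated.to_csv(out_path, mode="a", index=False)

# Show the absolute location and confirm the file exists.
print(os.path.abspath(out_path), os.path.exists(out_path))

# Read the file back to confirm its contents.
written = pd.read_csv(out_path)
print(written)
```

Because mode='a' appends, running the script repeatedly against the same path adds another copy of the header and rows each time.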
Tags: python csv dataframe pandas-groupby