How to divide some csv files into two parts, and then merge them as a csv file in Python: answers

【Question title】: How to divide some csv files into two parts, and then merge them as a csv file in Python
【Posted】: 2015-07-25 00:15:08
【Question】:

I have already looked at This, but it is not the same: that one is column-based.

There are 40 csv files (file1.csv, file2.csv, ..., file40.csv) in a folder named pathImage. They have been merged correctly with the following code, from here:

import pandas as pd
df = pd.concat([pd.read_csv('file%d.csv' % x) for x in range(1,41)])
df.to_csv('output.csv')

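What I would like to know is how to divide the files into two parts and then, with the code above, merge each part into its own csv file, part1 and part2.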

Update: I want two csv files, e.g. train.csv and test.csv. The training file is 80% of the original csv files, and the rest goes into the test csv file.

Thanks for your help.

【Comments】:

Tags: python csv merge

【Solution 1】:

Rather than np.array_split, use np.split(), which lets you choose the position of the cut.

# Perfunctory imports.
import pandas as pd
import numpy as np

# df is the merged dataframe built with pd.concat in the question.
# Confirm the length of your initial dataframe.
len(df)

# Split df at 80% of its length. np.split returns a list of pieces,
# so tuple unpacking assigns both data frames in one go.
# The first piece (train_df) is the larger one.
train_df, test_df = np.split(df, [int(len(df) * .80)])

# Output the two dataframes to separate files
train_df.to_csv('train.csv')
test_df.to_csv('test.csv')

If you are using IPython, you should be able to check what the function does with:

help(np.split)
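
The same 80/20 cut can also be written with plain positional slicing, which avoids passing a DataFrame through NumPy. The following is only a sketch under stated assumptions: `split_train_test` is a hypothetical helper name, and the small in-memory frames stand in for the merged file1.csv ... file40.csv data from the question.

```python
import pandas as pd


def split_train_test(df, train_fraction=0.8):
    """Return (train, test); train holds the first train_fraction of the rows."""
    cut = int(len(df) * train_fraction)  # row position where the test part starts
    return df.iloc[:cut], df.iloc[cut:]


# Stand-in for the merged data (assumption): four small frames of 5 rows each.
# With the real files this would be the pd.concat(...) result from the question.
parts = [pd.DataFrame({"value": range(i * 5, i * 5 + 5)}) for i in range(4)]
df = pd.concat(parts, ignore_index=True)  # 20 rows, index renumbered from 0

train_df, test_df = split_train_test(df)
print(len(train_df), len(test_df))

# Writing the results; index=False keeps the row index out of the files.
# train_df.to_csv('train.csv', index=False)
# test_df.to_csv('test.csv', index=False)
```

The rows are split in their existing order. If the files are sorted in some meaningful way, shuffling first (for example with `df.sample(frac=1, random_state=0)`) may be preferable before cutting.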

Hope this helps!

【Discussion】:

【Solution 2】:

The numpy function array_split should do the trick:

import numpy as np
np.array_split(df,2)

It returns a list containing 2 pandas objects, each holding about half of the rows.
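
To make the difference between array_split and np.split concrete, here is a small sketch on a plain NumPy array. The array `rows` is only an illustrative stand-in for row positions, not the asker's data.

```python
import numpy as np

rows = np.arange(10)  # stand-in for 10 row positions (assumption)

# array_split takes a number of sections and makes them as equal as possible.
halves = np.array_split(rows, 2)
thirds = np.array_split(rows, 3)  # 10 does not divide by 3; sizes may differ by one

# np.split with a list of indices cuts at explicit positions instead.
train, test = np.split(rows, [8])

print([len(p) for p in halves], [len(p) for p in thirds], len(train), len(test))
```

With a section count of 2, array_split produces a near 50/50 split, so an 80/20 split needs an explicit cut position instead.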

【Discussion】:

I want two csv files, e.g. train.csv and test.csv. The training file is 80% of the original csv files, and the rest goes into the test csv file.