【发布时间】:2018-08-17 12:59:03
【问题描述】:
在下面的示例 df 中,我试图找到一种基于 ';' 拆分列标题('1;2'、'4'、'5;6')的方法存在并复制这些拆分列中的行值。 (我的实际 df 来自导入的 csv 文件,所以通常我有大约 50-80 个需要拆分的列标题)
下面是我的输出代码
import pandas as pd
import numpy as np
#
data = np.array([['Market','Product Code','1;2','4','5;6'],
['Total Customers',123,1,500,400],
['Total Customers',123,2,400,320],
['Major Customer 1',123,1,100,220],
['Major Customer 1',123,2,230,230],
['Major Customer 2',123,1,130,30],
['Major Customer 2',123,2,20,10],
['Total Customers',456,1,500,400],
['Total Customers',456,2,400,320],
['Major Customer 1',456,1,100,220],
['Major Customer 1',456,2,230,230],
['Major Customer 2',456,1,130,30],
['Major Customer 2',456,2,20,10]])
df =pd.DataFrame(data)
df.columns = df.iloc[0]
df = df.reindex(df.index.drop(0))
print (df)
0 Market Product Code 1;2 4 5;6
1 Total Customers 123 1 500 400
2 Total Customers 123 2 400 320
3 Major Customer 1 123 1 100 220
4 Major Customer 1 123 2 230 230
5 Major Customer 2 123 1 130 30
6 Major Customer 2 123 2 20 10
7 Total Customers 456 1 500 400
8 Total Customers 456 2 400 320
9 Major Customer 1 456 1 100 220
10 Major Customer 1 456 2 230 230
11 Major Customer 2 456 1 130 30
12 Major Customer 2 456 2 20 10
下面是我想要的输出
0 Market Product Code 1 2 4 5 6
1 Total Customers 123 1 1 500 400 400
2 Total Customers 123 2 2 400 320 320
3 Major Customer 1 123 1 1 100 220 220
4 Major Customer 1 123 2 2 230 230 230
5 Major Customer 2 123 1 1 130 30 30
6 Major Customer 2 123 2 2 20 10 10
7 Total Customers 456 1 1 500 400 400
8 Total Customers 456 2 2 400 320 320
9 Major Customer 1 456 1 1 100 220 220
10 Major Customer 1 456 2 2 230 230 230
11 Major Customer 2 456 1 1 130 30 30
12 Major Customer 2 456 2 2 20 10 10
理想情况下,我想在“read_csv”级别执行这样的任务。有什么想法吗?
【问题讨论】: