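【Posted】: 2019-12-12 02:55:03
【Question】:
I have data in a CSV file with 2 columns: the first contains the member id, and the second contains features as key-value pairs (some nested inside others).
I have seen code online that converts simple key-value pairs, but nothing that handles nested data like this.
【Question comments】:
I am very new to Python.
Tags: python-3.x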
I did this with the XlsxWriter package, so you first have to install it by running pip install XlsxWriter.
import csv  # to read the csv file
import xlsxwriter  # to write the xlsx file
import ast
# you can change these names to match your local files
csv_file = 'data.csv'
xlsx_file = 'data.xlsx'
# read the csv file and collect all the JSON-like values in the data list
data = []
with open(csv_file, 'r', newline='') as csvFile:
    # read the csv file line by line
    reader = csv.reader(csvFile)
    # convert the reader to a list, skip the header row
    # and pick out the JSON-like values
    for row in list(reader)[1:]:
        # csv files are comma separated, so the value was split on its
        # own commas; join the pieces back together with commas
        json_to_str = ','.join(row[1:])
        # convert the string to a python dictionary
        str_to_dict = ast.literal_eval(json_to_str)
        # append the parsed dictionary to the data list
        data.append(str_to_dict)
# define the excel file
workbook = xlsxwriter.Workbook(xlsx_file)
# create a sheet for our work
worksheet = workbook.add_worksheet()
# cell format for merged fields: bold, centered
# text and a border
merge_format = workbook.add_format({
    'bold': 1,
    'border': 1,
    'align': 'center',
    'valign': 'vcenter'})
# another cell format that only draws the border
cell_format = workbook.add_format({
    'border': 1,
})
# create the header section dynamically
first_col = 0
for key, value in data[0].items():
    if isinstance(value, dict):
        # the key holds a nested dictionary, so it gets a merged
        # top-level header spanning one column per sub-key
        last_col = first_col + len(value) - 1
        worksheet.merge_range(0, first_col, 0, last_col, key, merge_format)
        for sub_key in value:
            # write each sub-key as its own column header
            worksheet.write(1, first_col, sub_key, merge_format)
            first_col += 1
    else:
        # 'age' has only one value, so this branch
        # creates a normal header like 'age'
        worksheet.write(1, first_col, key, merge_format)
        first_col += 1
# first_col now equals the number of columns in the
# sheet, so set the width of all of them to 20
worksheet.set_column(0, first_col - 1, 20)
# fill the values into the excel file
for index, value in enumerate(data):
    last_col = 0
    for k, v in value.items():
        if isinstance(v, dict):
            # handle values that are dictionaries
            for k1, v1 in v.items():
                if isinstance(v1, list):
                    # this captures the final 'type' list (['Grass', 'Hardball'])
                    # in 'conditions'
                    worksheet.write(index + 2, last_col, ', '.join(map(str, v1)), cell_format)
                else:
                    # fill in any value that is not a list
                    worksheet.write(index + 2, last_col, v1, cell_format)
                last_col += 1
        else:
            # handle a single value that is not a dict or list
            worksheet.write(index + 2, last_col, v, cell_format)
            last_col += 1
# finally, close the workbook to create the excel file
workbook.close()
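The reading step at the top of the script can be tried on its own, without XlsxWriter. The sketch below is only an illustration: the column names and values in `sample` are made up, since the real file is not reproduced here. It shows why the pieces of each row are joined back together with commas, and why `ast.literal_eval` is used for values written with single quotes.

```python
import ast
import csv
import io

# Hypothetical sample data: the real column names and values may differ.
sample = (
    "member_id,features\n"
    "1,{'age': 35, 'conditions': {'type': ['Grass', 'Hardball']}}\n"
)

records = []
reader = csv.reader(io.StringIO(sample))
next(reader)  # skip the header row
for row in reader:
    # csv.reader split the unquoted dictionary on its own commas,
    # so glue the pieces after the first column back together
    literal = ','.join(row[1:])
    # literal_eval parses Python literals such as dicts, lists and numbers
    records.append(ast.literal_eval(literal))

print(records[0]['conditions']['type'])
```

If the second column is wrapped in double quotes in your file, `row[1:]` will already hold a single item and the join simply returns it unchanged.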
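I added comments to most of the lines to make the code easier to follow, since you are very new to Python. If anything is unclear, let me know and I will explain as much as I can. I also used the enumerate() Python built-in function. Have a look at this small example, which I took directly from the official documentation. enumerate() is useful when you need to number the items of a list.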
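Return an enumerate object. iterable must be a sequence, an iterator, or some other object which supports iteration.
The __next__() method of the iterator returned by enumerate() returns a tuple containing a count (from start, which defaults to 0) and the values obtained from iterating over iterable.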
>>> seasons = ['Spring', 'Summer', 'Fall', 'Winter']
>>> list(enumerate(seasons))
[(0, 'Spring'), (1, 'Summer'), (2, 'Fall'), (3, 'Winter')]
>>> list(enumerate(seasons, start=1))
[(1, 'Spring'), (2, 'Summer'), (3, 'Fall'), (4, 'Winter')]
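In the script, the filling loop uses the count from enumerate() to choose the worksheet row: rows 0 and 1 hold the headers, so record number index goes to row index + 2. Here is a small sketch of that mapping. The `records` list and the `sheet_rows` helper are hypothetical names used only for this illustration.

```python
HEADER_ROWS = 2  # row 0: merged group headers, row 1: column headers

# Hypothetical records standing in for the parsed csv rows.
records = [{'age': 35}, {'age': 28}, {'age': 41}]


def sheet_rows(items, header_rows=HEADER_ROWS):
    """Pair each record with the zero-based worksheet row it is written to."""
    return [(index + header_rows, item) for index, item in enumerate(items)]


for row, record in sheet_rows(records):
    print(row, record)
```

Passing start=2 to enumerate() gives the same row numbers without the addition.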
This is my CSV file:
This is the final output in the Excel file. I just merged the repeated header values (matchruns and conditions).
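To see which columns each merged header covers, the layout can be worked out in plain Python before writing anything. This is only a sketch of the idea behind the header loop, not part of the script itself: `header_spans` is a hypothetical helper, the sub-keys in `sample` are invented placeholders, and nested dictionaries are assumed to be non-empty.

```python
def header_spans(record):
    """Return (key, first_col, last_col) for each top-level key of a record."""
    spans = []
    col = 0
    for key, value in record.items():
        # a nested dict takes one column per sub-key, anything else takes one
        width = len(value) if isinstance(value, dict) else 1
        spans.append((key, col, col + width - 1))
        col += width
    return spans


# Hypothetical record: the sub-key names are placeholders.
sample = {
    'age': 35,
    'matchruns': {'a': 1, 'b': 2, 'c': 3},
    'conditions': {'x': 0, 'type': ['Grass', 'Hardball']},
}

for key, first_col, last_col in header_spans(sample):
    print(key, first_col, last_col)
```

Every span whose first and last column differ corresponds to one merged header cell in row 0, such as matchruns and conditions.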
【Discussion】: