【Posted】: 2018-02-23 04:59:04
【Question】:
I want to write to multiple worksheets of the same workbook concurrently. The code is as follows:
import os
import threading
import time

import xlsxwriter


# Each task writes one column of values into its own worksheet.
# row_format is a module-level global defined further down.
def write_to_w1(w1, data):
    print('task1 executing....')
    for row, item in enumerate(data):
        w1.write(row, 0, item, row_format)


def write_to_w2(w2, data):
    print('task2 executing....')
    for row, item in enumerate(data):
        w2.write(row, 0, item, row_format)


def write_to_w3(w3, data):
    print('task3 executing....')
    for row, item in enumerate(data):
        w3.write(row, 0, item, row_format)


start = time.time()
data1 = list(range(500000))
data2 = list(range(500000))
data3 = list(range(500000))
# '~' is not expanded automatically, so resolve the home directory explicitly
workbook = xlsxwriter.Workbook(os.path.expanduser('~/Desktop/threading.xlsx'))
row_format = workbook.add_format({'bold': False, 'align': 'left', 'text_wrap': True, 'valign': 'vcenter'})
w1 = workbook.add_worksheet('w1')
w2 = workbook.add_worksheet('w2')
w3 = workbook.add_worksheet('w3')
t1 = threading.Thread(target=write_to_w1, args=(w1, data1), name='t1')
t2 = threading.Thread(target=write_to_w2, args=(w2, data2), name='t2')
t3 = threading.Thread(target=write_to_w3, args=(w3, data3), name='t3')
# start the three threads
t1.start()
t2.start()
t3.start()
# wait until each thread has finished
t1.join()
t2.join()
t3.join()
# all three threads have finished
print("Done!")
# the file is actually written to disk when the workbook is closed
workbook.close()
end = time.time()
print('total time ==>', end - start)
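When I benchmark this against sequential execution, the threaded version takes about 52 seconds, while the sequential version takes about 50 seconds.
What causes this performance degradation? Is synchronization the problem, or is writing to a single workbook the problem?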
【Comments】:
-
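I'm not sure this counts as a degradation. The run is long enough that other processes could interfere, and I'm not sure a difference of about 4% (52 s vs. 50 s) is significant.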
-
By the way, what exactly is your benchmark? How many runs did you do in total, and what was the fastest run in each category?
-
The sequential version simply calls the functions one after another. There was only one run.
-
I did run it several times to double-check, though. The result was the same every time.
-
Write a smaller amount of data, say something that takes about 1 second, run it 100 times, and take the minimum.
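The "run it many times and take the minimum" advice in the last comment could be sketched with the standard-library `timeit` module. This is only an illustration, not the asker's benchmark: `write_rows` is a hypothetical stand-in for the per-worksheet write loop (it fills a dict rather than an xlsxwriter worksheet), and `best_time` and the workload size of 10,000 rows are assumptions chosen so the example runs quickly.

```python
import timeit


def write_rows(n_rows):
    """Hypothetical stand-in for the worksheet write loop (not xlsxwriter)."""
    cells = {}
    for row in range(n_rows):
        cells[(row, 0)] = row
    return cells


def best_time(func, repeat=100, number=1):
    """Call func `number` times per measurement, take `repeat` measurements,
    and return the smallest one (the run least disturbed by other processes)."""
    timings = timeit.repeat(func, repeat=repeat, number=number)
    return min(timings)


if __name__ == "__main__":
    # The workload size is an assumption; scale it to the real task.
    fastest = best_time(lambda: write_rows(10_000), repeat=100)
    print(f"fastest of 100 runs: {fastest:.6f} s")
```

Taking the minimum rather than the mean filters out runs that were slowed down by unrelated system activity. To compare the two variants from the question, the same helper could be applied once to the threaded version and once to the sequential version.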
Tags: multithreading python-3.x parallel-processing xlsxwriter