【Posted】: 2018-02-28 00:34:11
【Question】:
I want to parallelize a very simple list comprehension:
nlp = spacy.load(model)
texts = sorted(X['text'])
# TODO: Parallelize
docs = [nlp(text) for text in texts]
However, when I try to use Pool from the multiprocessing module like this:
docs = Pool().map(nlp, texts)
I get the following error:
Traceback (most recent call last):
  File "main.py", line 117, in <module>
    main()
  File "main.py", line 99, in main
    docs = parse_docs(X)
  File "main.py", line 81, in parse_docs
    docs = Pool().map(nlp, texts)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\pool.py", line 260, in map
    return self._map_async(func, iterable, mapstar, chunksize).get()
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\pool.py", line 608, in get
    raise self._value
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\pool.py", line 385, in _handle_tasks
    put(task)
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\connection.py", line 206, in send
    self._send_bytes(_ForkingPickler.dumps(obj))
  File "C:\Users\james\AppData\Local\Programs\Python\Python36-32\lib\multiprocessing\reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
AttributeError: Can't pickle local object 'FeatureExtracter.<locals>.feature_extracter_fwd'
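The last line of the traceback names a function defined inside another function (`FeatureExtracter.<locals>.feature_extracter_fwd`). The standard `pickle` module serializes functions by their importable name, so a nested function cannot be pickled, and neither can any object that holds one. A minimal reproduction of that behaviour, using hypothetical stand-ins (`make_feature_extracter`, `FakePipeline`, `try_pickle`) instead of spaCy itself, might look like this:

```python
import pickle


def make_feature_extracter():
    """Hypothetical factory that returns a function defined in a local scope."""
    def feature_extracter_fwd(text):
        return text.split()
    return feature_extracter_fwd


class FakePipeline:
    """Hypothetical stand-in for an object that stores such a local function."""
    def __init__(self):
        self.extract = make_feature_extracter()

    def __call__(self, text):
        return self.extract(text)


def try_pickle(obj):
    """Return None if obj pickles cleanly, otherwise the error message."""
    try:
        pickle.dumps(obj)
        return None
    except (AttributeError, TypeError, pickle.PicklingError) as exc:
        return str(exc)


pipeline = FakePipeline()
print(pipeline("a b c"))      # calling the object in-process works
print(try_pickle("a b c"))    # a plain string pickles without trouble
print(try_pickle(pipeline))   # the stored local function makes pickling fail
```

Because `Pool().map(nlp, texts)` has to send the callable to the worker processes, the same failure shows up there; the exact wording of the message may differ between Python versions.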
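Is it possible to do this kind of parallel computation without having to make the object picklable? I'm open to examples that use third-party libraries such as joblib.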
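One direction that avoids pickling the object at all is to build it separately inside each worker process, so that only the input strings and the results cross the process boundary. The sketch below uses the `initializer` argument of `multiprocessing.Pool` for this. The names `build_pipeline`, `init_worker`, `parse`, and `parse_all` are hypothetical, and `build_pipeline` is only a stand-in for the `spacy.load(model)` call in the question:

```python
from multiprocessing import Pool

_pipeline = None  # per-process global, filled in by init_worker()


def build_pipeline():
    """Hypothetical stand-in for an expensive loader that returns an
    unpicklable callable (the role spacy.load(model) plays in the question)."""
    def tokenize(text):  # local function: cannot be pickled
        return text.split()
    return tokenize


def init_worker():
    """Runs once in every worker process and builds the object there."""
    global _pipeline
    _pipeline = build_pipeline()


def parse(text):
    """Module-level function, so it is pickled by name only."""
    return _pipeline(text)


def parse_all(texts, processes=2):
    """Map parse() over texts using workers that each own a pipeline."""
    with Pool(processes=processes, initializer=init_worker) as pool:
        return pool.map(parse, texts)


if __name__ == "__main__":  # needed on Windows, where workers re-import this module
    print(parse_all(["a b", "c d e"]))
```

The values returned by `parse` still travel back through pickle, so this only helps if the results themselves can be pickled; whether that holds for the real parsed documents is an assumption that would need checking.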
Edit: I also tried
docs = Pool().map(nlp.__call__, texts)
That didn't work either.
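Passing the bound method instead of the object is unlikely to change the outcome, because pickling a bound method also pickles the instance it is bound to. A small illustration with hypothetical classes (`Picklable`, `NotPicklable`) and a hypothetical helper `can_pickle`, none of which come from spaCy:

```python
import pickle


class Picklable:
    """Hypothetical object whose state is plain data."""
    def __init__(self):
        self.sep = " "

    def __call__(self, text):
        return text.split(self.sep)


class NotPicklable(Picklable):
    """Same behaviour, but it also stores a lambda, which pickle rejects."""
    def __init__(self):
        super().__init__()
        self.hook = lambda tokens: tokens


def can_pickle(obj):
    """Return True if pickle.dumps(obj) succeeds, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, TypeError, pickle.PicklingError):
        return False


good, bad = Picklable(), NotPicklable()

# A bound method carries its instance along as __self__.
print(good.__call__.__self__ is good)

# So the method is exactly as picklable as the instance behind it.
print(can_pickle(good), can_pickle(good.__call__))
print(can_pickle(bad), can_pickle(bad.__call__))
```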
【Comments】:
Tags: python multithreading parallel-processing python-multiprocessing joblib