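【Posted】: 2013-12-18 15:27:15
【Question】:
In my code I have two hypothetical tasks: one takes URLs from a generator and downloads them in batches using Twisted's Cooperator, and the other takes a downloaded source and parses it asynchronously. I am trying to encapsulate all of the fetch and parse tasks in a single Deferred object that calls back once all pages have been downloaded and all sources have been parsed.
I came up with the following solution: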
from twisted.internet import defer, task, reactor, threads
from twisted.web.client import getPage

BATCH_SIZE = 5

def main_task():
    result = defer.Deferred()
    state = {'count': 0, 'done': False}

    def on_parse_finish(r):
        state['count'] -= 1
        if state['done'] and state['count'] == 0:
            result.callback(True)

    def process(source):
        deferred = parse(source)
        state['count'] += 1
        deferred.addCallback(on_parse_finish)

    def fetch_urls():
        for url in get_urls():
            deferred = getPage(url)
            deferred.addCallback(process)
            yield deferred

    def on_finish(r):
        state['done'] = True

    deferreds = []

    coop = task.Cooperator()
    urls = fetch_urls()
    for _ in xrange(BATCH_SIZE):
        deferreds.append(coop.coiterate(urls))

    main_tasks = defer.DeferredList(deferreds)
    main_tasks.addCallback(on_finish)

    return defer.DeferredList([main_tasks, result])

# `main_task` is meant to be used with `blockingCallFromThread`
# The following should block until all fetch/parse tasks are completed:
# threads.blockingCallFromThread(reactor, main_task)
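The code works, but I feel as if I am either missing something obvious or unaware of a simple Twisted pattern that would make this much simpler. Is there a better way to return a single Deferred that calls back when all fetching and parsing has finished?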
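【Comments】:
- Undefined name 'parse', undefined name 'get_urls', undefined name 'task_finished'. It would be much easier to make sure an answer to the question is correct if the example code in the question actually ran :).
Tags: python asynchronous callback twisted deferred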