【Question title】: ipython 0.13 zmq errors
【Posted】: 2013-03-27 21:07:46
【Question】:

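I am seeing strange behavior from an ipython cluster. The computations complete, but many of the results never reach the client (and the engines sit idle after finishing their first computation).

I suspect something is wrong with zmq because 1) I see the following error from time to time: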

  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 118, in get
    if not self.ready():
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 132, in ready
    self.wait(0)
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/asyncresult.py", line 142, in wait
    self._ready = self._client.wait(self.msg_ids, timeout)
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/client.py", line 1058, in wait
    self.spin()
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/client.py", line 1015, in spin
    self._flush_results(self._task_socket)
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/parallel/client/client.py", line 814, in _flush_results
    idents,msg = self.session.recv(sock, mode=zmq.NOBLOCK)
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/zmq/session.py", line 642, in recv
    idents, msg_list = self.feed_identities(msg_list, copy)
  File "/data/misc/nano/python/env_stable/lib/python2.7/site-packages/IPython/zmq/session.py", line 673, in feed_identities
    idx = msg_list.index(DELIM)
ValueError: '<IDS|MSG>' is not in list
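
The last frame of the traceback fails while searching the received multipart message for the `<IDS|MSG>` delimiter, which separates the routing identities from the message frames. The following is a minimal sketch of that step, not IPython's actual code: `split_identities` and the sample frames are hypothetical and only illustrate why a message that arrives without the delimiter raises `ValueError`.

```python
# Delimiter string taken from the traceback above.
DELIM = b"<IDS|MSG>"


def split_identities(msg_list):
    """Split a multipart message into (routing identities, message frames).

    Hypothetical helper: a simplified stand-in for the lookup shown in the
    traceback, not the real IPython implementation.
    """
    # list.index raises ValueError when the delimiter frame is absent.
    idx = msg_list.index(DELIM)
    return msg_list[:idx], msg_list[idx + 1:]


# A well-formed message: identities, then the delimiter, then the payload.
well_formed = [b"engine-1", DELIM, b"signature", b"header", b"content"]
idents, frames = split_identities(well_formed)

# A message with no delimiter frame reproduces the kind of error reported.
truncated = [b"engine-1", b"header"]
try:
    split_identities(truncated)
    missing_delim = False
except ValueError:
    missing_delim = True
```
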

Additionally IPython.zmq has two test failures:

======================================================================
ERROR: test_send (IPython.zmq.tests.test_session.TestSession)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/clusterdata/python/env_stable/lib/python2.7/site-packages/IPython/zmq/tests/test_session.py", line 76, in test_send
    socket = MockSocket(zmq.Context.instance(),zmq.PAIR)
  File "/clusterdata/python/env_stable/lib/python2.7/site-packages/IPython/zmq/tests/test_session.py", line 34, in __init__
    self.data = []
  File "/clusterdata/python/env_stable/lib/python2.7/site-packages/zmq/sugar/attrsettr.py", line 38, in __setattr__
    self.__class__.__name__, upper_key)
AttributeError: MockSocket has no such option: DATA

======================================================================
ERROR: test_send (IPython.zmq.tests.test_session.TestSession)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/clusterdata/python/env_stable/lib/python2.7/site-packages/zmq/tests/__init__.py", line 108, in tearDown
    raise RuntimeError("context could not terminate, open sockets likely remain in test")
RuntimeError: context could not terminate, open sockets likely remain in test

----------------------------------------------------------------------

I am using pyzmq 13.0.0 (installed with pip) and zeromq 3.2.2, compiled by pyzmq's installer. I am using ipython 0.13.1 and python 2.7.3.

Any suggestions as to what this might be, or, failing that, how I can find out more about why these errors occur?

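Update: it turns out the slowdown was caused by a very long task queue in ipcontroller, which then used 100% of the CPU and lagged badly. That is a separate issue, but I would still appreciate feedback on the errors above.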

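【Comments】:

  • The MockSocket error only affects the tests themselves, and it is fixed in the 0.13.2 release candidate.
  • Any idea what the other error might be? Also, per the update, should ipcontroller really lag this badly with 4000 jobs in the queue (it does not lag with a few hundred jobs)?
  • It certainly shouldn't, but that does not mean something is wrong with your system. If you have a lot of jobs, I strongly recommend setting TaskScheduler.hwm to a larger value.
  • Actually, no large value helped, but 0 worked.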

Tags: ipython pyzmq ipython-parallel


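【Solution 1】:

Answered by @minrk in the comments. The ZMQ errors are not important; the performance problem was caused by the scheduler and was resolved by setting TaskScheduler.hwm=0.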
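
As a rough sketch of how that setting might be applied, assuming the controller is started from an IPython profile whose `ipcontroller_config.py` was generated with `ipython profile create --parallel` (the profile name and file location are assumptions, not taken from the thread):

```python
# ipcontroller_config.py, inside the profile directory (location assumed).
# get_config() is provided by IPython when it loads this file; the file is
# not meant to be run as a standalone script.
c = get_config()

# hwm is the scheduler's high-water mark: the maximum number of outstanding
# tasks per engine. As far as I know, 0 removes the limit entirely, which is
# the value reported to work in the comments above.
c.TaskScheduler.hwm = 0
```

The same option should also be settable on the command line as `ipcontroller --TaskScheduler.hwm=0`. Either way, the controller has to be restarted for the change to take effect.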
