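【Posted】:2015-12-04 11:04:47
【Question】:
I am using ghost.py version 0.2.3 and want to read the value of a JavaScript variable in a web page. When I run this simple script, I get the error "Unable to load requested page":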
from ghost import Ghost
ghost = Ghost()
with ghost.start() as session:
    page, extra_resources = session.open("http://www.offi.fr/concerts/les-3-arts-3305/belle-epoque-944532.html")
    js_variable, _ = session.evaluate('map.mapUrl', expect_loading=True)
    print js_variable
Here is the result in IPython:
---------------------------------------------------------------------------
TimeoutError                              Traceback (most recent call last)
<ipython-input-19-3c24eef8745a> in <module>()
      1 with ghost.start() as session:
      2     page, extra_resources = session.open("http://www.offi.fr/concerts/les-3-arts-3305/belle-epoque-944532.html")
----> 3     js_variable, _ = session.evaluate('map.mapUrl', expect_loading=True)
      4     print js_variable
      5

/usr/local/lib/python2.7/dist-packages/ghost/ghost.pyc in wrapper(self, *args, **kwargs)
    179                 func(self, *args, **kwargs)
    180                 return self.wait_for_page_loaded(
--> 181                     timeout=kwargs.pop('timeout', None))
    182             return func(self, *args, **kwargs)
    183         return wrapper

/usr/local/lib/python2.7/dist-packages/ghost/ghost.pyc in wait_for_page_loaded(self, timeout)
   1194         """
   1195         self.wait_for(lambda: self.loaded,
-> 1196                       'Unable to load requested page', timeout)
   1197         resources = self._release_last_resources()
   1198         page = None

/usr/local/lib/python2.7/dist-packages/ghost/ghost.pyc in wait_for(self, condition, timeout_message, timeout)
   1172         while not condition():
   1173             if time.time() > (started_at + timeout):
-> 1174                 raise TimeoutError(timeout_message)
   1175             self.sleep()
   1176             if self.wait_callback is not None:
TimeoutError: Unable to load requested page
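Reading the frames above, the error seems to come from a polling loop that waits for a "page loaded" condition and gives up after a timeout. The following is a minimal standalone sketch of that pattern, reconstructed only from the lines visible in the traceback. It is not ghost.py's actual code: `WaitTimeoutError`, `sleep_interval`, `evaluate_expecting_load` and the `state` dictionary are hypothetical names introduced for illustration, and the idea that evaluating a plain variable never triggers a page load is an assumption.

```python
import time


class WaitTimeoutError(Exception):
    """Hypothetical stand-in for the TimeoutError named in the traceback."""


def wait_for(condition, timeout_message, timeout, sleep_interval=0.01):
    """Poll condition() until it is truthy or `timeout` seconds elapse.

    Mirrors the loop shown in the last traceback frame; `sleep_interval`
    is an assumed parameter, not something taken from ghost.py.
    """
    started_at = time.time()
    while not condition():
        if time.time() > (started_at + timeout):
            raise WaitTimeoutError(timeout_message)
        time.sleep(sleep_interval)


def evaluate_expecting_load(timeout=0.1):
    """Hypothetical model of a call made with expect_loading=True.

    Assumption: reading a variable starts no navigation, so the flag
    that the wait depends on stays False and the wait times out.
    """
    state = {"loaded": False}
    wait_for(lambda: state["loaded"], "Unable to load requested page", timeout)
    return "value"


def demo():
    """Return the timeout message, or None if no timeout occurred."""
    try:
        evaluate_expecting_load()
    except WaitTimeoutError as exc:
        return str(exc)
    return None
```

If this model matches what the library does, a wait of this kind only makes sense for calls that actually cause a new page to load, which would explain why the error appears on the `evaluate` line rather than on `open`.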
Could you help me figure out what I am doing wrong? Is there an alternative way to get the value of a JavaScript variable?
Thanks a lot.
【Discussion】:
Tags: javascript python html dom web-scraping