【Question title】: Cannot take screenshot with 0 width
【Posted】: 2019-07-16 15:12:24
【Question】:

I am trying to take a screenshot of an element inside a Bootstrap modal. After some struggle, I finally came up with this code:

driver.get('https://enlinea.sunedu.gob.pe/')
driver.find_element_by_xpath('//div[contains(@class, "img_publica")]').click()

WebDriverWait(driver, 10).until(EC.visibility_of_element_located((By.ID, 'modalConstancia')))
driver.find_element_by_xpath('//div[contains(@id, "modalConstancia")]').click()
active_element = driver.switch_to.active_element
active_element.find_elements_by_id('doc')[0].send_keys(graduate.id)

# Can't take this screenshot
active_element.find_elements_by_id('captchaImg')[0].screenshot_as_png('test.png')

The error is:

Traceback (most recent call last):
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/worker.py", line 812, in perform_job
    rv = job.perform()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/job.py", line 588, in perform
    self._result = self._execute()
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/rq/job.py", line 594, in _execute
    return self.func(*self.args, **self.kwargs)
  File "./jobs/sunedu.py", line 82, in scrap_document_number
    record = scrap_and_recognize(driver, graduate)
  File "./jobs/sunedu.py", line 33, in scrap_and_recognize
    active_element.find_elements_by_id('captchaImg')[0].screenshot_as_png('test.png')
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 567, in screenshot_as_png
    return base64.b64decode(self.screenshot_as_base64.encode('ascii'))
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 557, in screenshot_as_base64
    return self._execute(Command.ELEMENT_SCREENSHOT)['value']
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webelement.py", line 633, in _execute
    return self._parent.execute(command, params)
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/webdriver.py", line 321, in execute
    self.error_handler.check_response(response)
  File "/home/cesar/Development/manar/venv/lib/python3.7/site-packages/selenium/webdriver/remote/errorhandler.py", line 242, in check_response
    raise exception_class(message, screen, stacktrace)
selenium.common.exceptions.WebDriverException: Message: unknown error: unhandled inspector error: {"code":-32000,"message":"Cannot take screenshot with 0 width."}
  (Session info: chrome=75.0.3770.100)
  (Driver info: chromedriver=74.0.3729.6 (255758eccf3d244491b8a1317aa76e1ce10d57e9-refs/branch-heads/3729@{#29}),platform=Linux 4.4.0-154-generic x86_64)

After some debugging, I realized that the element has no width or height:

(Pdb) active_element.find_elements_by_id('captchaImg')[0].rect
{'height': 0, 'width': 0, 'x': 0, 'y': 0}
(Pdb) active_element.find_elements_by_id('captchaImg')[0].size
{'height': 0, 'width': 0}

I think this is why it fails. Is there a way to work around it?


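These are the steps:

  1. Click the link.

  2. Wait for the modal and fill in the first input.

  3. Try to take a screenshot of the captcha image.

If I inspect the element in the browser (the span that holds the captcha image), I can see that it is actually 100x50.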

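【Comments】:

  • You probably won't beat the captcha. You can't expect a script to see the same thing you see when inspecting the element in a browser, even if it visits the same page. Captchas are clever: it will know you are trying to scrape the page, and it won't work. Try taking a screenshot of the whole page instead of just that element.
  • With the Firefox webdriver the size is reported correctly, but screenshot() always saves the full page.
  • @c0lon A captcha is not a sentient being; it is just another element on the page that can be interacted with and overcome using selenium.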
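
One way to act on the full-page suggestion is to crop the captcha out of a larger screenshot using the element's rect. The following is only a minimal sketch: crop_box is a hypothetical helper, and it assumes the screenshot starts at the top-left corner of the page (no scrolling) so that rect coordinates line up with image pixels. The scale argument stands in for the device pixel ratio, which is also an assumption about the environment.

```python
from typing import Dict, Tuple


def crop_box(rect: Dict[str, float], scale: float = 1.0) -> Tuple[int, int, int, int]:
    """Convert a Selenium-style rect into a (left, upper, right, lower) pixel box.

    Hypothetical helper: `rect` is the dict returned by WebElement.rect and
    `scale` is the device pixel ratio of the screenshot (1.0 on most desktops).
    """
    if rect['width'] <= 0 or rect['height'] <= 0:
        # A zero-sized rect usually means the wrong (hidden) element was matched
        raise ValueError(f'Element has no drawable area: {rect}')
    left = int(rect['x'] * scale)
    upper = int(rect['y'] * scale)
    right = int((rect['x'] + rect['width']) * scale)
    lower = int((rect['y'] + rect['height']) * scale)
    return left, upper, right, lower
```

The resulting tuple has the shape that Pillow's Image.crop() expects, so if Pillow is available the full-page image can be cut down to the captcha. The ValueError guard also surfaces the zero-size problem from the question before any image work is attempted.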

Tags: python python-3.x selenium web-scraping


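【Solution 1】:

OK, I figured out why you keep getting the "Cannot take screenshot with 0 width." error. The page contains more than one captcha, and a non-specific selector gives you a hidden captcha image (probably one that belongs to another modal window). Making the selector more specific should give you the right image.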
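
A quick way to confirm this kind of diagnosis is to look at every match for the id instead of only the first one. The sketch below is an illustration rather than part of the original answer: describe_matches is a hypothetical helper that relies only on the size attribute and the is_displayed() method of Selenium's WebElement.

```python
from typing import Any, Dict, Iterable, List


def describe_matches(elements: Iterable[Any]) -> List[Dict[str, Any]]:
    """Summarize visibility and size for each matched element.

    Hypothetical helper: `elements` is any iterable of objects that behave
    like Selenium WebElements (a `size` dict and an `is_displayed()` method).
    """
    report = []
    for index, el in enumerate(elements):
        size = el.size  # Selenium returns {'height': ..., 'width': ...}
        report.append({
            'index': index,
            'displayed': el.is_displayed(),
            'width': size['width'],
            'height': size['height'],
        })
    return report
```

Calling describe_matches(driver.find_elements_by_id('captchaImg')) in the debugger should list one entry per captcha on the page; a first entry with zero width and height would explain why indexing with [0] picked an element that cannot be captured.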

The full code is as follows:

from contextlib import contextmanager
from logging import getLogger

from selenium.common.exceptions import TimeoutException
from selenium.webdriver import Chrome, ChromeOptions
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

logger = getLogger(__name__)


@contextmanager
def get_chrome() -> Chrome:
    opts = ChromeOptions()
    # opts.headless = True
    logger.debug('Running Chrome')
    driver = Chrome(options=opts)
    driver.set_window_size(1000, 600)
    logger.debug('Chrome started')
    try:
        yield driver
    finally:
        # Shut down the browser and the chromedriver process even if the body raised
        driver.quit()


def wait_selector_present(driver: Chrome, selector: str, timeout: int = 5):
    cond = EC.presence_of_element_located((By.CSS_SELECTOR, selector))
    try:
        WebDriverWait(driver, timeout).until(cond)
    except TimeoutException as e:
        raise ValueError(f'Cannot find {selector} after {timeout}s') from e


def wait_selector_visible(driver: Chrome, selector: str, timeout: int = 5):
    cond = EC.visibility_of_any_elements_located((By.CSS_SELECTOR, selector))
    try:
        WebDriverWait(driver, timeout).until(cond)
    except TimeoutException as e:
        raise ValueError(f'Cannot find any visible {selector} after {timeout}s') from e


if __name__ == '__main__':
    with get_chrome() as c:
        captcha_sel = '#consultaForm #captchaImg img'
        modal_sel = '[data-target="#modalConstancia"]'

        url = 'https://enlinea.sunedu.gob.pe/'
        c.get(url)

        wait_selector_present(c, modal_sel)
        modal = c.find_element_by_css_selector(modal_sel)
        modal.click()

        wait_selector_visible(c, captcha_sel)
        captcha_img = c.find_element_by_css_selector(captcha_sel)
        captcha_img.screenshot('captcha.png')

Result:
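
If tightening the selector is not an option, the wait itself can hand back the element that is actually drawable. This is a sketch under assumptions, not the approach used above: drawable_element is a hypothetical condition factory, and it relies on WebDriverWait.until() accepting any callable that takes the driver and polling it until the result is truthy.

```python
from typing import Any, Callable


def drawable_element(selector: str) -> Callable[[Any], Any]:
    """Build a wait condition that yields the first displayed, non-empty match.

    Hypothetical helper meant to be passed to WebDriverWait(...).until().
    """

    def condition(driver: Any) -> Any:
        # 'css selector' is the string value behind By.CSS_SELECTOR
        for el in driver.find_elements('css selector', selector):
            size = el.size
            if el.is_displayed() and size['width'] > 0 and size['height'] > 0:
                return el
        return False  # falsy result tells WebDriverWait to keep polling

    return condition
```

Usage would look like captcha_img = WebDriverWait(c, 5).until(drawable_element('#captchaImg img')), followed by captcha_img.screenshot('captcha.png'). The design choice is to filter on size as well as visibility, since a zero-width element is what triggers the inspector error.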

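【Comments】:

  • Hi. This makes a lot of sense. I'll try it soon and get back to you. Thanks!!
  • Dude, you're awesome. I changed the code a bit, but the problem was indeed that I needed to be more specific in the selector.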