【Question title】: Why does Selenium with Docker crash?
【Posted】: 2022-01-24 16:40:36
【Question】:

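This is my first time using Docker. I need it to run Selenium on a server, so that I can drop this script onto any server and have it work painlessly. But it simply does not work. I tried googling and found nothing, so I am asking for help here. Maybe I missed something, or maybe I messed up somewhere. Update: I have since found a solution, which was to download chromedriver and give up on the idea of running selenium-standalone.

Starting Selenium: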

import random

from selenium import webdriver
from selenium.webdriver.common.desired_capabilities import DesiredCapabilities

import headers  # the asker's auto-generated module holding the user-agent list

options = webdriver.ChromeOptions()
options.add_argument('headless')
options.add_argument('--ignore-certificate-errors')
options.add_argument('--ignore-ssl-errors')
# options.add_argument('--verbose')
options.add_argument('--ignore-gpu-blacklist')
options.add_argument('--use-gl')
options.add_argument("--no-sandbox")
options.add_argument('--disable-web-security')
options.add_experimental_option("excludeSwitches", ['enable-logging'])
options.add_argument('--user-agent={}'.format(random.choice(headers.headers)))
# "selenium" is the Compose service name, resolved on the Compose network
driver = webdriver.Remote("http://selenium:4444/wd/hub", options=options, desired_capabilities=DesiredCapabilities.CHROME)

Dockerfile:

FROM python:3.8.12

ENV PYTHONUNBUFFERED=1
COPY ./requirements.txt /requirements.txt
RUN pip install -r /requirements.txt

RUN mkdir /app
COPY ./app /app
WORKDIR /app

Docker Compose:

version: '3'

services:
  selenium:
    image: selenium/standalone-chrome
    ports:
      - 4444:4444
    restart: always

  app:
    build:
      context: .
    volumes:
      - ./app:/app
    command: sh -c "python3 main.py"
    depends_on:
      - selenium

Now I get an error. Selenium logs this strange SIGTERM:

Starting Selenium Grid Standalone...

18:13:41.962 INFO [LoggingOptions.configureLogEncoding] - Using the system default encoding

18:13:41.969 INFO [OpenTelemetryTracer.createTracer] - Using OpenTelemetry for tracing

18:13:42.973 INFO [NodeOptions.getSessionFactories] - Detected 8 available processors

18:13:43.035 INFO [NodeOptions.report] - Adding chrome for {"browserVersion": "96.0","browserName": "chrome","platformName": "Linux","se:vncEnabled": true} 1 times

18:13:43.055 INFO [Node.<init>] - Binding additional locator mechanisms: relative, name, id

18:13:43.090 INFO [LocalDistributor.add] - Added node 2100e52d-e04f-44e7-84e9-71a0e936cd32 at http://172.18.0.2:4444. Health check every 120s

18:13:43.092 INFO [GridModel.setAvailability] - Switching node 2100e52d-e04f-44e7-84e9-71a0e936cd32 (uri: http://172.18.0.2:4444) from DOWN to UP

18:13:43.286 INFO [Standalone.execute] - Started Selenium Standalone 4.1.1 (revision e8fcc2cecf): http://172.18.0.2:4444

18:14:02.359 INFO [LocalDistributor.newSession] - Session request received by the distributor: 

 [Capabilities {browserName: chrome, goog:chromeOptions: {args: [headless, --ignore-certificate-errors, --ignore-ssl-errors, --ignore-gpu-blacklist, --use-gl, --no-sandbox, --disable-web-security, --user-agent={'user-agent':...], excludeSwitches: [enable-logging], extensions: []}, pageLoadStrategy: normal}]

Starting ChromeDriver 96.0.4664.45 (76e4c1bb2ab4671b8beba3444e61c0f17584b2fc-refs/branch-heads/4664@{#947}) on port 39843

Only local connections are allowed.

Please see https://chromedriver.chromium.org/security-considerations for suggestions on keeping ChromeDriver safe.

[1640283242.435][SEVERE]: bind() failed: Cannot assign requested address (99)

ChromeDriver was started successfully.

18:14:03.044 INFO [ProtocolHandshake.createSession] - Detected dialect: W3C

18:14:03.080 INFO [LocalDistributor.newSession] - Session created by the distributor. Id: 9f36e0e8d2e2417b85b035b546082c87, Caps: Capabilities {acceptInsecureCerts: false, browserName: chrome, browserVersion: 96.0.4664.110, chrome: {chromedriverVersion: 96.0.4664.45 (76e4c1bb2ab46..., userDataDir: /tmp/.com.google.Chrome.k2Wjlf}, goog:chromeOptions: {debuggerAddress: localhost:40301}, networkConnectionEnabled: false, pageLoadStrategy: normal, platformName: linux, proxy: Proxy(), se:cdp: ws://172.18.0.2:4444/sessio..., se:cdpVersion: 96.0.4664.110, se:vnc: ws://172.18.0.2:4444/sessio..., se:vncEnabled: true, se:vncLocalAddress: ws://172.18.0.2:7900, setWindowRect: true, strictFileInteractability: false, timeouts: {implicit: 0, pageLoad: 300000, script: 30000}, unhandledPromptBehavior: dismiss and notify, webauthn:extension:credBlob: true, webauthn:extension:largeBlob: true, webauthn:virtualAuthenticators: true}

Trapped SIGTERM/SIGINT/x so shutting down supervisord...

2021-12-23 18:14:54,146 WARN received SIGTERM indicating exit request

2021-12-23 18:14:54,147 INFO waiting for xvfb, vnc, novnc, selenium-standalone to die

2021-12-23 18:14:55,148 INFO stopped: selenium-standalone (terminated by SIGTERM)

2021-12-23 18:14:56,151 INFO stopped: novnc (terminated by SIGTERM)

2021-12-23 18:14:57,154 INFO stopped: vnc (terminated by SIGTERM)

2021-12-23 18:14:57,154 INFO waiting for xvfb to die

2021-12-23 18:14:58,156 INFO stopped: xvfb (terminated by SIGTERM)

Shutdown complete

And finally the script error:

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='selenium', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6ece62a100>: Failed to establish a new connection: [Errno 111] Connection refused'))

I just don't understand why it crashes, or why it cannot establish a connection.
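Editor's sketch: in a Compose setup like the one above, `depends_on` only orders container start-up; it does not wait for the Selenium server inside the container to accept connections, so an early `webdriver.Remote(...)` call can hit `Connection refused`. Whether that is what happened here cannot be confirmed from the logs, and it would not explain the SIGTERM. The following is a minimal sketch, assuming the Grid 4 `/status` endpoint is reachable at `http://selenium:4444/status`; the helper names `is_ready` and `wait_for_grid` and the `opener` parameter are hypothetical, introduced only for illustration.

```python
import json
import time
import urllib.error
import urllib.request


def is_ready(payload):
    """Return True if a decoded /status payload reports the server as ready."""
    value = payload.get("value") if isinstance(payload, dict) else None
    return isinstance(value, dict) and value.get("ready") is True


def wait_for_grid(status_url="http://selenium:4444/status", timeout=60.0,
                  interval=1.0, opener=urllib.request.urlopen):
    """Poll status_url until the server reports ready or the timeout elapses.

    Returns True when ready and False on timeout. `opener` is injectable
    (hypothetical parameter) so the loop can be exercised without a network.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with opener(status_url, timeout=5) as response:
                body = response.read().decode("utf-8")
                if is_ready(json.loads(body)):
                    return True
        except (urllib.error.URLError, OSError, ValueError):
            # Connection refused, name not yet resolvable, or a non-JSON body:
            # treat all of these as "not ready yet" and try again.
            pass
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

In `main.py` this could be called once before creating the driver, for example `if not wait_for_grid(): raise SystemExit("Selenium did not become ready")`, followed by the existing `webdriver.Remote(...)` call.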

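【Question comments】:

  • What does headers.headers resolve to?
  • Where are you running your test code? Inside the container or outside Docker?
  • headers.headers is an auto-generated file containing 5k user agents
  • I do run my tests inside the container

Tags: python python-3.x docker selenium selenium-webdriver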


【Solution 1】:

This error message...

[Capabilities {browserName: chrome, goog:chromeOptions: {args: [headless, --ignore-certificate-errors, --ignore-ssl-errors, --ignore-gpu-blacklist, --use-gl, --no-sandbox, --disable-web-security, --user-agent={'user-agent':...], excludeSwitches: [enable-logging], extensions: []}, pageLoadStrategy: normal}]

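...implies that all the Capabilities are fine except the --user-agent argument, which was passed as {'user-agent':...] rather than as a plain user-agent string. As a result, you see the MaxRetryError: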

urllib3.exceptions.MaxRetryError: HTTPConnectionPool(host='selenium', port=4444): Max retries exceeded with url: /wd/hub/session (Caused by NewConnectionError('<urllib3.connection.HTTPConnection object at 0x7f6ece62a100>: Failed to establish a new connection: [Errno 111] Connection refused'))

Solution

You need to pass a valid UserAgent through add_argument(), as follows:

from fake_useragent import UserAgent
from selenium import webdriver

options = webdriver.ChromeOptions()
ua = UserAgent()
userAgent = ua.random  # a random, real-looking user-agent string
print(userAgent)
options.add_argument(f'user-agent={userAgent}')
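Editor's sketch: the session request in the question's log shows `--user-agent={'user-agent':...`, which suggests that each entry of `headers.headers` is a dict rather than a string. That is an assumption; the asker only says the file holds 5k user agents. Under that assumption, the existing list could be kept and a plain string extracted from it instead of switching to fake_useragent. The helper names `pick_user_agent` and `user_agent_argument`, and the sample entry, are hypothetical.

```python
import random


def pick_user_agent(entries, rng=random):
    """Return one user-agent string from a list of strings or header dicts."""
    entry = rng.choice(entries)
    if isinstance(entry, dict):
        # Header names are case-insensitive, so match the key loosely.
        for key, value in entry.items():
            if key.lower() == "user-agent":
                return value
        raise KeyError("entry has no user-agent key")
    return entry


def user_agent_argument(entries, rng=random):
    """Build the Chrome switch for a randomly chosen user agent."""
    return "--user-agent={}".format(pick_user_agent(entries, rng))


# Hypothetical sample shaped like the entries the log hints at.
sample = [{"user-agent": "Mozilla/5.0 (X11; Linux x86_64)"}]

# Formatting the dict directly reproduces the malformed switch from the log.
broken = "--user-agent={}".format(sample[0])
fixed = user_agent_argument(sample)
```

With the asker's module this would be used as `options.add_argument(user_agent_argument(headers.headers))`.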


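【Comments】:

  • @Ilya Where did you show us, or mention in your question, choice(headers.headers), or comment about it before I posted this answer? You also did not add an @ mention, which is how I would have been notified.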