
【Question title】: How to scrape a canvas tag and why is it not visible in my browser?
【Posted】: 2025-12-18 19:55:01
【Question description】:

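The screenshot shows the highlighted HTML content, and the red circle marks the portion that needs to be scraped. The phone number is inside a canvas tag. I tried to scrape the tag, but it returns "Your browser does not support the HTML5 canvas tag."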

https://www.mudah.my/malaysia/cars-for-sale/audi?o=1

This is a link to a list of cars whose contact details have to be scraped. Any suggestions on how to solve this are appreciated.

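from selenium.common.exceptions import NoSuchElementException, WebDriverException
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

# driver and car_links are created earlier in the script (not shown)
NAME_XPATH = '/html/body/div[1]/div[6]/div/div[2]/div[1]/div[4]/div/div[1]/div[2]/div[1]/a'
BUTTON_XPATH = '/html/body/div[1]/div[6]/div/div[2]/div[1]/div[5]/button[2]'

for link in car_links:
    print('link: ', link)
    driver.get(link)

    try:
        dealer_name = driver.find_element(By.XPATH, NAME_XPATH).text
        print(dealer_name)
    except NoSuchElementException:
        print('No name')
        continue

    try:
        wait = WebDriverWait(driver, 20)
        # Scroll the button into view, wait until it is clickable, then click it
        button1 = wait.until(EC.presence_of_element_located((By.XPATH, BUTTON_XPATH)))
        driver.execute_script("arguments[0].scrollIntoView(true);", button1)
        wait.until(EC.element_to_be_clickable((By.XPATH, BUTTON_XPATH))).click()
        # Problem step: .text yields the canvas fallback message, not the number
        phone = driver.find_element(By.ID, 'phone-image').text
        print(phone)
    except WebDriverException:
        print('No phone no')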

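【Question discussion】:

Tags: python html selenium web-scraping

【Solution 1】:

You are trying to read the element's text immediately after clicking the previous element. The phone number takes some time to appear there, so you just need to add a wait or delay before reading it. Like this: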

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

wait = WebDriverWait(driver, 20)

button1.click()
phone = wait.until(EC.visibility_of_element_located((By.ID, 'phone-image'))).text
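
If `phone-image` really is a canvas, waiting may not be enough on its own: the digits are drawn as pixels, so `.text` can only return the fallback message quoted in the question. One possible workaround, sketched here under the assumption that the canvas is not cross-origin tainted (in which case `toDataURL` raises a security error), is to ask the browser for the canvas contents as a PNG. The helper names `data_url_to_bytes` and `canvas_png_bytes` are hypothetical, and turning the image into digits would still need an OCR step that is not shown.

```python
import base64


def data_url_to_bytes(data_url: str) -> bytes:
    """Decode a base64 data URL such as 'data:image/png;base64,...' into raw bytes."""
    header, _, payload = data_url.partition(",")
    if not header.startswith("data:") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded data URL")
    return base64.b64decode(payload)


def canvas_png_bytes(driver, canvas) -> bytes:
    """Ask the browser to serialize a canvas WebElement as PNG and return its bytes."""
    # toDataURL is the standard HTMLCanvasElement method; it runs inside the page
    data_url = driver.execute_script(
        "return arguments[0].toDataURL('image/png');", canvas
    )
    return data_url_to_bytes(data_url)
```

With the `wait` object above, this might be used as `png = canvas_png_bytes(driver, wait.until(EC.visibility_of_element_located((By.ID, 'phone-image'))))` and the bytes written to a `.png` file for inspection. Selenium's own `screenshot_as_png` property on the element is a simpler alternative if a screenshot of the rendered element is good enough.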

【Discussion】:

【Solution 2】:

The phone number is stored in the page as JSON. To get it, you can:

import json
import requests
from bs4 import BeautifulSoup

url = "https://www.mudah.my/Audi+RS6+4+0+AVANT+TFSI+QUATTRO+Unreg+2016-87091288.htm"

soup = BeautifulSoup(requests.get(url).content, "html.parser")
data = json.loads(soup.select_one("#__NEXT_DATA__").contents[0])
# uncomment this to print all data:
# print(json.dumps(data, indent=4))
ad_id = soup.select_one("[gravity-itemid]")["gravity-itemid"]

ad_data = data["props"]["initialState"]["adDetails"]["byID"][ad_id]

print("Phone:", ad_data["attributes"]["phone"])

Prints:

Phone: 0183888798

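【Discussion】:

Any idea how to do this in selenium?
@trialaccount You can load the page with selenium and then feed the source to beautifulsoup. The script will be the same.
I am finding it hard to implement.
I made the following change to soup => soup = BeautifulSoup(driver, "html.parser")
But I get an error: TypeError: object of type 'WebDriver' has no len()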
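
The TypeError in the last comment most likely comes from passing the WebDriver object itself to BeautifulSoup, which expects markup (a string or bytes). A minimal sketch of the suggested approach follows, assuming the page rendered by Selenium still contains the `#__NEXT_DATA__` script and the `gravity-itemid` attribute that the answer relies on. The function names `phone_from_html` and `phone_from_driver` are hypothetical.

```python
import json

from bs4 import BeautifulSoup


def phone_from_html(html: str) -> str:
    """Extract the phone number from listing HTML, using the same lookups as the answer."""
    soup = BeautifulSoup(html, "html.parser")
    data = json.loads(soup.select_one("#__NEXT_DATA__").string)
    ad_id = soup.select_one("[gravity-itemid]")["gravity-itemid"]
    ad_data = data["props"]["initialState"]["adDetails"]["byID"][ad_id]
    return ad_data["attributes"]["phone"]


def phone_from_driver(driver, url: str) -> str:
    """Load url in an existing Selenium WebDriver and parse the rendered HTML."""
    driver.get(url)
    # Pass the HTML string (driver.page_source), not the WebDriver object
    return phone_from_html(driver.page_source)
```

Inside the original loop this could be called as `print(phone_from_driver(driver, link))`. Keeping the parsing in `phone_from_html` means the same function also works on `requests.get(url).text`.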