【Question title】: How can I scrape text from a tooltip using Selenium? The page does not contain the tooltip HTML
【Posted】: 2021-03-07 10:48:56
【Question description】:

I am trying to scrape items from a page using Selenium. I am currently stuck on scraping the text from an item's tooltip.

This is the HTML for one item on the page as shown in the developer tools (F12):

<div class="offer-page-sale-item offer-page-sale-item__actual" data-real-id="12345" data-id="12345-4">
    <!-- Render product -->
    <div class="item ">
        <div class="discount price bare">
            <div class="t1">
                <span class="value">1</span>
                <span class="cents">69</span>
                <span class="eur">€</span>
            </div>
        </div>

        <div class="tags tags_primary">
            <div class="tag i tooltipstered"></div>
        </div>

        <div class="img"></div>

        <div class="text">
            <div class="title">[name of the product]</div>
            <div class="pack">
                <span class="ptitle"></span>
                <span class="price"> </span>
            </div>
        </div>
        <div class="infopop">
            <div class="in"><span></span>
                <div class="arrow_top"></div>
                <div class="arrow_bottom"></div>
            </div>
        </div>
    </div>                        
</div>

If I look at the same item in the page source (Ctrl+U), the HTML is:

<div class="offer-page-sale-item offer-page-sale-item__actual" data-real-id="12345" data-id="12345-4">
    <!-- Render product -->
    <div class="item ">
        <div class="discount price bare">
            <div class="t1">
                <span class="value">1</span>
                <span class="cents">69</span>
                <span class="eur">€</span>
            </div>
        </div>

        <div class="tags tags_primary">
            <div title="[product price valid from]-[product price valid until]" class="tag i"></div>
        </div>

        <div class="img"></div>

        <div class="text">
            <div class="title">[name of the product]</div>
            <div class="pack">
                <span class="ptitle"></span>
                <span class="price"> </span>
            </div>
        </div>
        <div class="infopop">
            <div class="in"><span></span>
                <div class="arrow_top"></div>
                <div class="arrow_bottom"></div>
            </div>
        </div>
    </div>                        
</div>

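So the only difference is inside the <div class="tags tags_primary"> tag. Since I can see the text when viewing the page source, I assumed I should be able to capture it. However, the Selenium driver only gives me the class="tag i tooltipstered" element, not the class="tag i" element that carries the title attribute I need.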
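The difference between the two views suggests that the title attribute exists in the HTML the server sends and is removed once the page's scripts run (the tooltipstered class presumably marks elements a tooltip script has already processed). The following is a minimal sketch of reading the title directly from the raw HTML instead of the live DOM. It assumes the raw response can be fetched separately, for example with an HTTP client, and only covers the parsing step; the names `TagTitleCollector` and `extract_tooltip_titles` are hypothetical, and the sample markup is copied from the page-source snippet above with its placeholder text kept as-is.

```python
from html.parser import HTMLParser


class TagTitleCollector(HTMLParser):
    """Collects the title attribute of every <div> whose classes include 'tag' and 'i'."""

    def __init__(self):
        super().__init__()
        self.titles = []

    def handle_starttag(self, tag, attrs):
        if tag != "div":
            return
        attr_map = dict(attrs)
        classes = (attr_map.get("class") or "").split()
        title = attr_map.get("title")
        # Only keep elements that still carry a non-empty title attribute.
        if "tag" in classes and "i" in classes and title:
            self.titles.append(title)


def extract_tooltip_titles(html):
    """Returns the tooltip titles found in a raw HTML string (hypothetical helper)."""
    collector = TagTitleCollector()
    collector.feed(html)
    return collector.titles


# Sample taken from the page-source markup in the question.
SAMPLE = """
<div class="tags tags_primary">
    <div title="[product price valid from]-[product price valid until]" class="tag i"></div>
</div>
"""

print(extract_tooltip_titles(SAMPLE))
```

Whether this works on the real site depends on the server returning the same markup to a plain HTTP request as it does to a browser, which is not confirmed here.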

What I have tried:

  1. Using the Actions class with MoveToElement(), but I still could not find the tooltip title.
  2. Using IJavaScriptExecutor to get the innerHTML of <div class="tags tags_primary">, with no luck.

Any ideas would be much appreciated.

Update: I rewrote @PDHide's Python code in C#. First, I created an extension method for IWebDriver:

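using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Support.UI;

// Extension methods must live in a static class; the class name here is arbitrary.
public static class WebDriverExtensions
{
    // Polls until the element matched by itemSpecifier exists and is displayed.
    // WebDriverWait throws WebDriverTimeoutException if that does not happen in time.
    public static IWebElement WaitUntilVisible(this IWebDriver driver, By itemSpecifier, int secondsTimeout = 10)
    {
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(secondsTimeout));
        // The lambda parameter is named d so that it does not shadow the outer driver parameter.
        return wait.Until<IWebElement>(d =>
        {
            try
            {
                var element = d.FindElement(itemSpecifier);
                // Returning null tells WebDriverWait to keep polling.
                return element.Displayed ? element : null;
            }
            catch (StaleElementReferenceException)
            {
                return null;
            }
            catch (NoSuchElementException)
            {
                return null;
            }
        });
    }
}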
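The extension method above is essentially a polling loop: try to find the element, treat "not found" or "stale" as "not ready yet", and give up after a timeout. The following is a minimal, Selenium-free Python sketch of that pattern. The function `wait_until` and its parameters are hypothetical names for illustration, not part of any library, and `LookupError` merely stands in for the Selenium exceptions.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5, ignored=()):
    """Calls condition() until it returns a truthy value or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            result = condition()
            if result:
                return result
        except ignored:
            # Treated as "not ready yet", like the catch blocks in the C# version.
            pass
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


# Usage: a condition that fails twice before succeeding.
attempts = {"count": 0}


def ready_on_third_call():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise LookupError("element not present yet")
    return "ready"


print(wait_until(ready_on_third_call, timeout=2.0, poll_interval=0.01, ignored=(LookupError,)))
```

Listing the ignored exception types explicitly, rather than swallowing everything, keeps genuine errors visible while still tolerating the expected "element is not there yet" cases.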

I then used it to collect the text from the tooltip:

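// Requires: using System; using OpenQA.Selenium; using OpenQA.Selenium.Chrome; using OpenQA.Selenium.Interactions;
using (IWebDriver _driver = new ChromeDriver())
{
    _driver.Navigate().GoToUrl("https://www.maxima.lt/akcijos");
    // Accept the cookie dialog.
    _driver.WaitUntilVisible(By.CssSelector("#CybotCookiebotDialogBodyLevelButtonLevelOptinAllowallSelectionWrapper #CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll")).Click();
    // Close the ad pop-up.
    _driver.WaitUntilVisible(By.ClassName("close")).Click();
    // Find the information icon.
    var tool = _driver.WaitUntilVisible(By.CssSelector("[class='tag i tooltipstered']"));
    // Hover over the icon so that the tooltip is rendered.
    Actions actions = new Actions(_driver);
    actions.MoveToElement(tool).Build().Perform();
    // The tooltip text appears in a separate element with the class tooltipster-content.
    var tooltip = _driver.WaitUntilVisible(By.ClassName("tooltipster-content"));
    Console.WriteLine(tooltip.Text);
}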
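One detail of the code above is worth illustrating: the selector [class='tag i tooltipstered'] is an attribute selector, so it matches only when the class attribute is exactly that string, whereas a class selector such as .tag.i.tooltipstered matches on individual class tokens regardless of order or extra classes. The following Python sketch mimics the two matching rules without a browser; both function names are hypothetical and exist only for this illustration.

```python
def matches_exact_class_attribute(class_attr, expected):
    """Mimics [class='...']: the whole attribute string must be identical."""
    return class_attr == expected


def matches_class_tokens(class_attr, required):
    """Mimics .a.b.c: every required class must be among the whitespace-separated tokens."""
    return set(required) <= set(class_attr.split())


EXPECTED = "tag i tooltipstered"
REQUIRED = ("tag", "i", "tooltipstered")

# Compare both rules on a few variations of the class attribute.
for class_attr in ("tag i tooltipstered", "tooltipstered tag i", "tag i tooltipstered new", "tag i"):
    print(
        repr(class_attr),
        matches_exact_class_attribute(class_attr, EXPECTED),
        matches_class_tokens(class_attr, REQUIRED),
    )
```

If the site ever reorders the classes or adds another one, the exact-match selector would stop finding the icon, so the token-based form may be the more robust choice.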

【Comments】:

  • You can see my example at maxima.lt/akcijos
  • Do you have a link to that website?
  • @PDHide Please see my previous comment; it contains the link to the website.

Tags: c# selenium selenium-webdriver web-scraping


【Solution 1】:
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.support.ui import WebDriverWait

driver = webdriver.Chrome()
driver.get("https://www.maxima.lt/akcijos")

# Accept the cookie dialog
WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CSS_SELECTOR, '#CybotCookiebotDialogBodyLevelButtonLevelOptinAllowallSelectionWrapper #CybotCookiebotDialogBodyLevelButtonLevelOptinAllowAll'))
).click()

# Close the ad pop-up
WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CLASS_NAME, 'close'))
).click()

# Find the information icon
tool = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CSS_SELECTOR, '[class="tag i tooltipstered"]'))
)

driver.maximize_window()

# Hover over the information icon
webdriver.ActionChains(driver).move_to_element(tool).perform()

# Wait for the tooltip to become visible
tooltip = WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located(
        (By.CLASS_NAME, 'tooltipster-content'))
)

# Print the tooltip text
print(tooltip.text)

This is Python code; you can follow the same sequence of steps in C#.
