How to get the text of an element excluding its sub-elements' text: answers

【Question title】: How to get text of element but excluding the sub-elements text
【Posted】: 2015-02-16 21:33:16
【Question description】:

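I want to get the text of an element without the text of its child elements. I tried getText(), but the text it returns includes the text of all child elements.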

In the following example, when I retrieve the text from the first div, it returns text that includes all of its sub-elements.

<div class="row">
    <div class="col-lg-4 section">
        <div class="col-md-12">
            inseam 28 30 32
        </div>
    </div>
    <div class="col-lg-5 section">
        <div class="col-md-13">
            inseam 28 34 36
        </div>
    </div>
</div>

Please tell me how to do this with WebDriver in Java.

Thanks, Sean

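【Question discussion】:

Please share the code you have written; it will help us understand your problem better.
Shoaib, I am just trying to read an element's text without the text of its child elements. List<WebElement> el = driver.findElements(By.xpath("*")); for (WebElement e : el) { e.getText(); }

Tags: selenium-webdriver jquery-selectors webdriver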

【Solution 1】:

When I retrieved text from the first div with class 'row', it returns text that includes all its subelements.

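This happens because you retrieved the text from the parent div, so the innerHTML/text of all its child divs is retrieved along with it.

Here is how to retrieve only the innerHTML/text you need:

1- For 'inseam 28 30 32':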

String text = driver.findElement(By.xpath("//div[@class='col-md-12']")).getText();

or

String text = driver.findElement(By.className("col-md-12")).getText();

2- For 'inseam 28 34 36':

String text = driver.findElement(By.xpath("//div[@class='col-md-13']")).getText();

or

String text = driver.findElement(By.className("col-md-13")).getText();

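【Discussion】:

【Solution 2】:

I have not tried this with Selenium specifically, but with jQuery you can use contents() to get all child nodes, including raw text nodes, filter them by nodeType 3 (text nodes), and then take the first one with first(). In your example: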

JSFiddle: http://jsfiddle.net/TrueBlueAussie/p33gcfk2/1/

var text = $('.row').contents().filter(function () {
    return this.nodeType == 3;
}).first();
alert(text.text());
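
The same text-node filter can be run from Selenium without jQuery by passing plain JavaScript to the driver's script executor. The following is a minimal sketch using the Python bindings, where execute_script plays the role of Java's JavascriptExecutor.executeScript. The names OWN_TEXT_SCRIPT and own_text are hypothetical helpers introduced here, not part of Selenium, and the whitespace collapsing at the end is an added assumption about what the caller wants.

```python
# JavaScript that concatenates only the direct text-node children of the
# element passed in as arguments[0] (nodeType 3 is a text node).
OWN_TEXT_SCRIPT = """
var parts = [];
var nodes = arguments[0].childNodes;
for (var i = 0; i < nodes.length; i++) {
    if (nodes[i].nodeType === 3) {
        parts.push(nodes[i].nodeValue);
    }
}
return parts.join('');
"""


def own_text(driver, element):
    """Return the element's own text, excluding text of its child elements.

    driver  -- a Selenium WebDriver (anything exposing execute_script)
    element -- the WebElement to inspect
    """
    raw = driver.execute_script(OWN_TEXT_SCRIPT, element)
    # Collapse the whitespace runs left behind by indentation in the markup.
    return " ".join((raw or "").split())
```

Unlike the jQuery snippet, this joins all direct text nodes rather than taking only the first one, so text that appears after a child element is kept as well. For the sample markup, own_text(driver, row) on div.row would be expected to yield an empty string, since that div contains only whitespace and child divs.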

【Discussion】:

Thanks. I will try to figure out how to run this jQuery code with Selenium.

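【Solution 3】:

This happens because you are trying to get the text of the parent tag. If you want the text of a specific child tag, you have to go all the way down to it. You can use "nth-child" or "nth-of-type". For example, suppose you want to return the text "inseam 28 34 36".

The CSS selector would be "div.row > div:nth-of-type(2) > div", or you can target the div's class directly with "div.col-md-13".

For more on selectors, see this article: https://saucelabs.com/resources/selenium/css-selectors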

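【Discussion】:

Shesh, I want to read the text of an element without the text of its child elements.
Sean, could you tell me exactly which text you want to get from the HTML you posted above? In that code, div.row returns the text of all the elements, as I mentioned in my earlier comment. Also, div.row has no text of its own. Only two elements in that code return text: div.col-md-12 = "inseam 28 30 32" and div.col-md-13 = "inseam 28 34 36".
Shesh, I am trying to read the font of every element on the page and compare it against certain fonts. In many cases an element with text has several parent elements that have no text of their own and whose font differs from that of the displayed text. So I want to exclude the elements without text from the comparison. I cannot do that with getText(), though, because it returns the text of all child elements.
Sean, correct me if I am wrong: you want to test whether the font sizes of two texts are equal. If so, this is the wrong approach; you cannot get font values through getText(). To test fonts, you have to use getCssValue() or a JavaScript executor.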
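
The font check described in the comments above can be sketched roughly as follows, using the Python bindings, where value_of_css_property is the counterpart of Java's getCssValue(). Both function names and the has_own_text predicate are hypothetical: the predicate stands for whatever test the caller uses to decide that an element carries text of its own rather than only text inherited from descendants.

```python
def fonts_of_text_elements(elements, has_own_text, css_property="font-family"):
    """Pair each element that carries its own text with a computed CSS value.

    elements     -- iterable of Selenium WebElement objects
    has_own_text -- hypothetical predicate supplied by the caller; returns True
                    when the element itself (not only its descendants) has text
    css_property -- CSS property to read, e.g. "font-family" or "font-size"
    """
    return [
        (element, element.value_of_css_property(css_property))
        for element in elements
        if has_own_text(element)
    ]


def mismatches(element_values, expected):
    """Return the (element, value) pairs whose value differs from expected."""
    return [(el, value) for el, value in element_values if value != expected]
```

Filtering before reading the CSS value keeps text-less wrapper elements out of the comparison, which is the situation the asker describes. Note that browsers may report computed values in a normalized form, so the expected string may need to match that form.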

【Solution 4】:

I was looking for the same thing. For anyone who can specify a WebElement or a list of WebElements, here is my solution:

from selenium.webdriver.common.by import By

def remove_child_text_from_webelement(webelement):
    # Start from the full rendered text, which includes all descendants' text
    current_text = webelement.text
    # Collect the rendered text of each direct child element
    children = webelement.find_elements(By.XPATH, './*')
    child_texts = [child.text for child in children]
    # Cut the first occurrence of each child's text out of the parent's text
    for child_text in child_texts:
        match_index = current_text.find(child_text)
        if match_index != -1:
            current_text = current_text[:match_index] + current_text[match_index + len(child_text):]
    return current_text

Now you can do the following:

[remove_child_text_from_webelement(e) for e in browser.find_elements(By.XPATH, '//div[contains(@class,"person")]')]
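
The string manipulation at the core of this answer can be tried without a browser. The sketch below isolates it in a hypothetical helper, strip_child_text, which takes plain strings instead of WebElements; the final strip() and the skipping of empty child texts are additions, not part of the answer above. The second call illustrates a limitation of the approach: only the first occurrence of each child's text is removed, so the result can be wrong when the parent's own text contains the same string earlier.

```python
def strip_child_text(parent_text, child_texts):
    """Remove the first occurrence of each child's text from parent_text."""
    remaining = parent_text
    for child_text in child_texts:
        if child_text:  # children that render no text have nothing to remove
            remaining = remaining.replace(child_text, "", 1)
    return remaining.strip()


# Text as it might be reported for a hypothetical <p>Total: <b>42</b> items</p>
print(strip_child_text("Total: 42 items", ["42"]))

# Hypothetical <p>42 of <b>42</b></p>: the parent's own text is "42 of",
# but the first "42" is the one that gets removed.
print(strip_child_text("42 of 42", ["42"]))
```

When that ambiguity matters, reading the element's direct text nodes through JavaScript avoids guessing from the concatenated text.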

【Discussion】: