How to get the elements inside of the anchor tag? Answers

【Question title】: How to get the elements inside of the anchor tag?
【Posted】: 2018-07-19 08:55:11
【Question】:

Sorry, I am very new to Selenium and web scraping in Python. I am trying to scrape the contents of a supermarket website, which has the following section in its HTML:

<div class="itemDescription">
            <meta itemprop="priceCurrency" content="INR">
            <meta itemprop="price" content="23.00">
        <h4 class=""><strong class="price js-effective-mrp" data-currency="₹">₹ 23.00 </strong>
                                    <s class="js-actual-mrp" style="display:none;"></s>
                                <br><a href="/fresh-onion-red-v-1-kg-p.php" class="">Fresh Onion Red <span class="item-quantity">1 Kg</span></a></h4>
                    </div>

I need the price, quantity, and name of the product.

Below is the code I wrote, but it does not parse the elements correctly.

div = driver.find_element_by_class_name('itemDescription')
sname =div.find_element_by_css_selector('a').get_attribute('href')
squantity =driver.find_elements_by_class_name('item-quantity')
sprice = driver.find_elements_by_xpath('//*[contains(concat( " ", @class, " " ), concat( " ", "js-effective-mrp", " " ))]')

Please help.

【Comments】:

Tags: python selenium selenium-webdriver web-scraping web-crawler

【Solution 1】:

Try this XPath to get the price:

//strong[@class='price js-effective-mrp' and @data-currency='₹']

Or, if you want to match every currency:

//strong[@class='price js-effective-mrp']

This one for the link:

//div[@class='itemDescription']//a

And this one for the quantity:

//span[@class = 'item-quantity']

Example:

sname = driver.find_element_by_xpath("//div[@class='itemDescription']//a")
squantity = driver.find_element_by_xpath("//span[@class = 'item-quantity']")
sprice = driver.find_element_by_xpath("//strong[@class='price js-effective-mrp' and @data-currency='₹']")

print(squantity.text) # prints quantity
print(sname.text) # prints name
print(sprice.text) # prints price
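One thing to watch with the XPaths above: `@class='price js-effective-mrp'` compares the whole attribute string, so it stops matching if the site adds or reorders a class. The expression in the question avoids that by matching a single class token. Here is a minimal sketch of a helper that builds that kind of XPath; `class_token_xpath` is a name made up for this example, not part of Selenium.

```python
def class_token_xpath(tag, class_name):
    """Build an XPath for `tag` elements whose class list contains `class_name`.

    Padding the normalized class attribute with spaces means only a whole
    class token can match, never a substring of a longer class name.
    """
    return (
        f"//{tag}[contains(concat(' ', normalize-space(@class), ' '), "
        f"' {class_name} ')]"
    )


# XPath for the price element from the question's HTML
price_xpath = class_token_xpath("strong", "js-effective-mrp")
print(price_xpath)
```

The resulting string can be passed to `find_element_by_xpath` in the same way as the hand-written XPaths.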

Following up on your feedback: you cannot read the text from the list itself, but you can read it from each element in the list, like this:

sname_list = driver.find_elements_by_xpath("//div[@class='itemDescription']//a")
for sname in sname_list:
    print(sname.text) # print the text of every element in the list
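The loop above prints names only. If the price, quantity, and name need to stay together for each product, one option is to find every `div.itemDescription` first and then search inside each one. This is a minimal sketch, assuming every product block contains all three elements. `scrape_products` and the `CSS` constant are names made up here, and `find_elements(by, value)` is used because it exists in both Selenium 3 and Selenium 4.

```python
# Locator strategy string; this is the value behind Selenium's
# By.CSS_SELECTOR constant, so it can be passed wherever By.CSS_SELECTOR is.
CSS = "css selector"


def scrape_products(driver):
    """Return one dict per div.itemDescription block on the current page."""
    products = []
    for item in driver.find_elements(CSS, "div.itemDescription"):
        # Searching from `item` (not `driver`) keeps each lookup
        # inside a single product block.
        link = item.find_element(CSS, "a")
        quantity = item.find_element(CSS, "span.item-quantity").text.strip()
        price = item.find_element(CSS, "strong.js-effective-mrp").text.strip()

        # The anchor's text includes the nested span's text,
        # so drop the trailing quantity to leave only the name.
        name = link.text.strip()
        if quantity and name.endswith(quantity):
            name = name[: -len(quantity)].strip()

        products.append({"name": name, "quantity": quantity, "price": price})
    return products
```

Usage would be something like `for product in scrape_products(driver): print(product)`. If a block is missing one of the elements, `find_element` raises `NoSuchElementException`, so wrap the lookups if that can happen on your page.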

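【Comments】:

Hi Andrei, thanks for the quick response. I don't know if I sound stupid, but is this the right way to parse it? I am getting an invalid syntax error. sname = driver.find_element_by_xpath(//a[@href = '/fresh-onion-red-v-1-kg-p.php']) squantity =driver.find_elements_by_xpath(//span[@class= 'item-quantity '])
I have added the info to the answer, please take a look.
My link comes back blank with //a[@href = '/fresh-onion-red-v-1-kg-p.php']. Is there another way to get the description (sname) dynamically (in this case, Fresh Onion Red)? I need it for all the elements on the web page.
I have changed the XPath for the link, please try it.
Thank you, Andrei. All of the above works fine. :)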

【Solution 2】:

Why not try this logic (excuse the syntax):

# Read the whole block's visible text, then split it on a delimiter you choose
product_description = driver.find_element_by_class_name('itemDescription').text
parts = product_description.split("SomeDelimiter")
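To make the splitting idea a bit more concrete, here is a rough sketch. It assumes the block's visible text puts the price and the link text on separate lines (because of the `<br>` in the HTML), which is worth checking by printing the text first. `split_description` is a made-up helper name.

```python
def split_description(block_text):
    """Split a product block's visible text into non-empty, trimmed lines."""
    return [line.strip() for line in block_text.splitlines() if line.strip()]


# Assumed shape of the text for the block in the question
# (an assumption, not verified against the real page).
sample_text = "₹ 23.00 \nFresh Onion Red 1 Kg"
parts = split_description(sample_text)
print(parts)
```

Note that the second part would still contain the quantity, so this alone does not separate the name from `1 Kg`.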

【Comments】:

【Solution 3】:

For the MRP (price), you can try this:

sprice = driver.find_element_by_css_selector('strong.price.js-effective-mrp')

Then you can extract the text with the .text property:

print(sprice.text)
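The text you get back is a string such as `₹ 23.00 `, with the currency symbol and a trailing space. If you need a number, something like the sketch below could work. `parse_price` is a made-up helper, and treating commas as thousands separators is an assumption about how the site formats larger prices.

```python
import re
from decimal import Decimal


def parse_price(text):
    """Extract the first number from a price string such as '₹ 23.00 '."""
    # Assumption: a comma, if present, is a thousands separator.
    match = re.search(r"\d+(?:\.\d+)?", text.replace(",", ""))
    if match is None:
        raise ValueError(f"no number found in {text!r}")
    # Decimal avoids binary floating-point rounding on money values.
    return Decimal(match.group())


print(parse_price("₹ 23.00 "))
```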

Quantity:

squantity = driver.find_element_by_css_selector('span.item-quantity')
print(squantity.text)

Name:

sname = driver.find_element_by_xpath("//div[@class='itemDescription']//a")
print(sname.text)

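Note that with this XPath, //div[@class='itemDescription']//a, you will get Fresh Onion Red 1 Kg as the output.

Basically, Fresh Onion Red is a text node that sits directly inside the <a>, not an element of its own, so a locator cannot select it separately, and the anchor's .text also includes the text of the nested <span>.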
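If you want only that text node, one way is to take the anchor's inner HTML (for example with `sname.get_attribute('innerHTML')`) and keep just the text that is not inside a child tag. This is a sketch using Python's standard `html.parser`; `DirectTextParser` and `direct_text` are names invented for this example.

```python
from html.parser import HTMLParser

# Tags that never have a closing tag, so they must not change the depth.
VOID_TAGS = {"br", "hr", "img", "input", "meta", "link"}


class DirectTextParser(HTMLParser):
    """Collect text that is not nested inside any child element."""

    def __init__(self):
        super().__init__()
        self.depth = 0
        self.parts = []

    def handle_starttag(self, tag, attrs):
        if tag not in VOID_TAGS:
            self.depth += 1

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self.depth > 0:
            self.depth -= 1

    def handle_data(self, data):
        # Depth 0 means the text belongs to the anchor itself.
        if self.depth == 0:
            self.parts.append(data)


def direct_text(inner_html):
    """Return the top-level text of an HTML fragment, whitespace-normalized."""
    parser = DirectTextParser()
    parser.feed(inner_html)
    parser.close()
    return " ".join("".join(parser.parts).split())


# Inner HTML of the anchor from the question
print(direct_text('Fresh Onion Red <span class="item-quantity">1 Kg</span>'))
```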

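【Comments】:

Thanks a lot @cruisepandey, the price part worked. :) Could you help me with the quantity and the description? I also wanted to let you know that this will be a Selenium automation process for all the elements on the page.
Is there a way to iterate over all the elements on the web page? I tried, but I got an error.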