Python 3.6 - Find a word in a soup.findAll() string [closed]

【Question title】: Python 3.6 - Find a word in a soup.findAll() string [closed]
【Posted】: 2019-01-06 03:29:28
【Question description】:

Here is a basic code example that I hope helps:

from bs4 import BeautifulSoup
import requests
import csv

with open('URLs.csv', newline='') as f_urls:
    csv_urls = csv.reader(f_urls)

    for line in csv_urls:
        page = requests.get(line[0])
        soup = BeautifulSoup(page.text, 'html.parser')
        for results in soup.findAll('a', {'data-tn-element':'jobTitle'}):
            if "Scientist" in results:
                continue # Won't this continue just loop back to the for results in...loop, not the for line in csv_urls loop?
            else:
                print(results.text)

...where the URLs in the CSV file are:

https://www.indeed.ca/jobs?q=data+scientist%2C+data+analyst%2C+python&l=Canada&jt=fulltime&start=20
https://www.indeed.ca/jobs?q=data+scientist,+data+analyst,+python&l=Canada&jt=fulltime

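...So in the context of this code, it first reads the first URL and finds all the job titles on that page. If any one of the job titles in the scraped table contains the word "Scientist", it should jump back to the "for line in csv_urls:" line and start over with the next URL in the list. If none of them contains the word, it should print the results.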

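This is a basic example, not the one I use in my actual code, but the application is the same. I think the problem may be where the continue is placed, since I need it to jump back to the "for line in csv_urls:" loop.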
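To make the concern about continue concrete, here is a minimal sketch that uses plain lists in place of scraped pages. The titles and the two function names are made up for illustration; only the loop structure mirrors the code above.

```python
# Made-up job titles standing in for two scraped pages.
pages = [
    ["Data Analyst", "Data Scientist", "Python Developer"],
    ["Data Engineer", "BI Analyst"],
]


def titles_with_continue(pages, word):
    """continue skips only the current title; the inner loop keeps going."""
    printed = []
    for titles in pages:
        for title in titles:
            if word in title:
                continue  # next title on the SAME page
            printed.append(title)
    return printed


def titles_with_break(pages, word):
    """break leaves the inner loop, so the outer loop moves to the next page."""
    printed = []
    for titles in pages:
        for title in titles:
            if word in title:
                break  # abandon the rest of this page
            printed.append(title)
    return printed


print(titles_with_continue(pages, "Scientist"))
print(titles_with_break(pages, "Scientist"))
```

With continue, "Python Developer" from the first page is still collected; with break, the rest of that page is dropped and the loop moves on to the second page.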

Hopefully this is more "on topic" for those who are invested. Thanks?

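【Question discussion】:

How about giving us a syntactically valid, executable example?
Not enough information to reproduce.
Just wondering if anyone knows how to find a specific word in a table pulled from a website using BS4.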
stackoverflow.com/help/how-to-ask
@DYZ Thanks, but my entire code isn't needed to answer my question. However, let me create a script using the help info provided. Will update soon.

Tags: python python-3.x beautifulsoup

【Solution 1】:

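You have to use .text, otherwise it will not match. Applied to the tag itself, in compares against the tag's child nodes rather than searching its text.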

from bs4 import BeautifulSoup
import requests
import csv

with open('URLs.csv', newline='') as f_urls:
    csv_urls = csv.reader(f_urls)

    for line in csv_urls:
        page = requests.get(line[0])
        soup = BeautifulSoup(page.text, 'html.parser')
        for results in soup.findAll('a', {'data-tn-element':'jobTitle'}):
            if "Scientist" in results.text:
                # stop this loop and continue with the "csv_urls" loop,
                # even if the rest has no "Scientist"
                break
            else:
                print(results.text)
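The difference between testing the tag and testing its .text can be checked without any network access. The sketch below parses a small made-up HTML snippet (not real Indeed markup) and uses find_all, the current spelling of findAll; the variable names are only for illustration.

```python
from bs4 import BeautifulSoup

# Made-up markup standing in for a scraped results page.
html = """
<a data-tn-element="jobTitle" href="#">Senior Data Scientist</a>
<a data-tn-element="jobTitle" href="#">Data Analyst</a>
"""

soup = BeautifulSoup(html, 'html.parser')
links = soup.find_all('a', {'data-tn-element': 'jobTitle'})
first = links[0]

# Membership on the Tag compares the word against its child nodes as whole items.
on_tag = "Scientist" in first

# Membership on the text is an ordinary substring test.
on_text = "Scientist" in first.text

print(on_tag, on_text)
```

Here the only child of the first link is the full string "Senior Data Scientist", which is not equal to "Scientist", so the test on the tag fails while the substring test on .text succeeds.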

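【Discussion】:

Thanks, it kind of works, but not quite the way I was hoping. I will update my question with a sample script that should help.
See my update above. Thanks!
If Scientist is found, do you want to stop the loop for results in soup.... and continue the loop for line in csv_urls:?
Answer edited, take a look.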
Perfect! I knew I was missing something (i.e. break). Thanks so much, I can use this with my original code.
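One caveat with break: titles that come before the first match on a page have already been printed by the time the loop stops. If the goal is to print nothing at all from a page where any title contains the word, one possible variant is to check the whole page first. This is only a sketch; titles_to_print is a hypothetical helper and the title lists are made up.

```python
def titles_to_print(titles, word="Scientist"):
    """Return no titles if any title contains word, otherwise all of them."""
    if any(word in title for title in titles):
        return []
    return list(titles)


# Made-up titles; in the scraper these would come from something like
# [a.text for a in soup.findAll('a', {'data-tn-element': 'jobTitle'})].
page_one = ["Data Analyst", "Data Scientist", "Python Developer"]
page_two = ["Data Engineer", "BI Analyst"]

for titles in (page_one, page_two):
    for title in titles_to_print(titles):
        print(title)
```

Only the titles from page_two are printed, because page_one contains a match and is skipped as a whole.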