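[Posted]: 2021-09-27 08:37:33
[Question]:
I'm new to Python and I'm trying to scrape Indeed. For some reason, each job posting is held under an 'a' tag instead of a div, and that 'a' tag also carries the href. This is the output of print(item):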
<a class="tapItem fs-unmask result job_e0fb3e5f520856c0 resultWithShelf sponTapItem tapItem-noPadding desktop" data-hide-spinner="true" data-jk="e0fb3e5f520856c0" data-mobtk="1favs1gn0t5v1800" href="/company/Acentury/jobs/New-Graduate-Software-Developer-e0fb3e5f520856c0?fccid=5c6453896b020232&vjs=3" id="job_e0fb3e5f520856c0" rel="nofollow" target="_blank"><div class="slider_container"><div class="slider_list"><div class="slider_item"><div class="job_seen_beacon"><table cellpadding="0" cellspacing="0" class="jobCard_mainContent" role="presentation"><tbody><tr><td class="resultContent"><div class="heading4 color-text-primary singleLineTitle tapItem-gutter"><h2 class="jobTitle jobTitle-color-purple jobTitle-newJob"><div class="new topLeft holisticNewBlue desktop"><span class="label">new</span></div><span title="New Graduate Software Developer">New Graduate Software Developer</span></h2></div><div class="heading6 company_location tapItem-gutter"><pre><span class="companyName">Acentury</span><div class="companyLocation">Richmond Hill, ON<span class="remote-bullet">•</span><span>Temporarily Remote</span></div></pre></div><div class="heading6 tapItem-gutter metadataContainer"><div class="metadata salary-snippet-container"><span class="salary-snippet">$44,182 - $126,699 a year</span></div></div><div class="heading6 error-text tapItem-gutter"></div></td></tr></tbody></table><table class="jobCardShelfContainer" role="presentation"><tbody><tr class="jobCardShelf"><td class="shelfItem indeedApply"><span class="iaIcon"></span><span class="ialbl iaTextBlack">Easily apply</span></td></tr><tr class="underShelfFooter"><td><div class="heading6 tapItem-gutter result-footer"><div class="job-snippet"><ul style="list-style-type:circle;margin-top: 0px;margin-bottom: 0px;padding-left:20px;">
<li>Work with senior <b>developers</b> to develop front-end features on our current platform through entire R&D cycle from design to implementation and official release.</li>
</ul></div><span class="date">Today</span><span class="result-link-bar-separator">·</span><button aria-expanded="false" class="sl resultLink more_links_button" type="button">More...</button></div><div class="tab-container"><div class="more-links-container result-tab" role="presentation"><div class="more_links"><button class="close-button" title="Close" type="button"></button><ul><li><span class="mat">View all <a href="/Acentury-jobs">Acentury jobs</a> - <a href="/jobs-in-Richmond-Hill,-ON">Richmond Hill jobs</a></span></li><li><span class="mat">Salary Search: <a href="/career/software-engineer/salaries/Richmond-Hill--ON?campaignid=serp-more&fromjk=e0fb3e5f520856c0&from=serp-more">New Graduate Software Developer salaries in Richmond Hill, ON</a></span></li></ul></div></div></div></td></tr></tbody></table><div aria-live="polite"></div></div></div><div class="slider_sub_item"></div></div></div><div class="kebabMenu"><button aria-expanded="false" aria-haspopup="true" aria-label="Job actions" class="kebabMenu-button"><svg fill="none" height="24" viewbox="0 0 24 24" width="24" xmlns="http://www.w3.org/2000/svg"><path d="M12 7C13.1 7 14 6.1 14 5C14 3.9 13.1 3 12 3C10.9 3 10 3.9 10 5C10 6.1 10.9 7 12 7ZM12 10C10.9 10 10 10.9 10 12C10 13.1 10.9 14 12 14C13.1 14 14 13.1 14 12C14 10.9 13.1 10 12 10ZM12 17C10.9 17 10 17.9 10 19C10 20.1 10.9 21 12 21C13.1 21 14 20.1 14 19C14 17.9 13.1 17 12 17Z" fill="#2d2d2d"></path></svg></button></div></a>
My code is:
divs = soup.find_all('a', class_='tapItem')
for item in divs:
    for people in item.find_all('a'):
        print(people)
        for ok in people.find_all('a', class_='tapItem'):
            linkJob1 = ok.get('href')
            print(linkJob1)
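`people` never contains the first (outer) 'a' tag, only the other 'a' tags nested inside it. How can I fix this? Thanks.
URL: https://ca.indeed.com/jobs?q=software+developer&l=Toronto%2C+ON&start=0
The expected result is the href of each job posting/card.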
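One likely explanation is that `find_all` called on a tag searches only that tag's descendants, so `item.find_all('a')` can never return `item` itself. A minimal sketch of reading the href straight from each matched card follows. The `SAMPLE_HTML` snippet is a trimmed stand-in for the output above, and `job_links` and `BASE_URL` are hypothetical names introduced for illustration; it also assumes the page still puts the `tapItem` class on the outer 'a' tag.

```python
from urllib.parse import urljoin

from bs4 import BeautifulSoup

# Assumption: site root used to turn relative hrefs into absolute URLs.
BASE_URL = "https://ca.indeed.com"

# Trimmed stand-in for one job card from the printed output above.
SAMPLE_HTML = """
<a class="tapItem fs-unmask result" id="job_e0fb3e5f520856c0"
   href="/company/Acentury/jobs/New-Graduate-Software-Developer-e0fb3e5f520856c0">
  <span title="New Graduate Software Developer">New Graduate Software Developer</span>
  <ul><li><a href="/Acentury-jobs">Acentury jobs</a></li></ul>
</a>
"""


def job_links(html, base_url=BASE_URL):
    """Return the absolute href of every job card in the given HTML."""
    soup = BeautifulSoup(html, "html.parser")
    links = []
    # Each match is already the outer <a>, so read its href directly
    # instead of searching inside it for more <a> tags.
    for card in soup.find_all("a", class_="tapItem"):
        href = card.get("href")
        if href:
            links.append(urljoin(base_url, href))
    return links


print(job_links(SAMPLE_HTML))
```

The nested "Acentury jobs" link is skipped because it has no `tapItem` class. Against the live page, the same function would be called with the downloaded HTML in place of `SAMPLE_HTML`.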
[Comments]:
-
What is the URL, and can you give an example of the expected result?
-
ca.indeed.com/… The expected result is the href of each job posting/card.
Tags: python web-scraping beautifulsoup href