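【Posted】: 2024-04-24 17:45:01
【Question】:
How can I use Python to get the career path job titles from this JavaScript-rendered page?
Here is my code snippet; the soup it returns contains none of the text data I need!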
import requests
from bs4 import BeautifulSoup
import json
import re
from selenium import webdriver
from selenium.webdriver.firefox.firefox_binary import FirefoxBinary
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.by import By
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
# get BeautifulSoup object
def get_soup(url):
    """
    This function returns the BeautifulSoup object.

    Parameters:
        url: the link to get soup object for
    Returns:
        soup: BeautifulSoup object
    """
    req = requests.get(url)
    soup = BeautifulSoup(req.text, 'html.parser')
    return soup
# get selenium driver object
def get_selenium_driver():
    """
    This function returns the selenium driver object.

    Parameters:
        None
    Returns:
        driver: selenium driver object
    """
    options = webdriver.FirefoxOptions()
    options.add_argument('-headless')
    # Recent Selenium 4 releases no longer accept executable_path or
    # firefox_options; pass options= and keep geckodriver on PATH.
    driver = webdriver.Firefox(options=options)
    return driver
# get soup obj using selenium
def get_soup_using_selenium(url):
    """
    Given the url of a page, this function returns the soup object.

    Parameters:
        url: the link to get soup object for
    Returns:
        soup: soup object
    """
    options = webdriver.FirefoxOptions()
    options.add_argument('-headless')
    # Recent Selenium 4 releases no longer accept executable_path or
    # firefox_options; pass options= and keep geckodriver on PATH.
    driver = webdriver.Firefox(options=options)
    try:
        driver.get(url)
        driver.implicitly_wait(3)
        html = driver.page_source
    finally:
        # quit() ends the whole browser session; close() only closes the window
        driver.quit()
    soup = BeautifulSoup(html, 'html.parser')
    return soup
title = "PHP%2BDeveloper"
location = "San%2BDiego,%2BCalifornia,%2BUs,%2BCA"
years_of_experience = "0"
sort_by_filter = "mostProbableTransition"
url = "https://www.dice.com/career-paths?title={}&location={}&experience={}&sortBy={}".format(title, location, years_of_experience, sort_by_filter)
career_paths_page_soup = get_soup(url)
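【Comments】:
-
Post your code. What research have you done so far?
-
Keep in mind that we don't work for you; we help when you get stuck. So you should at least study and try, and if you fail, we will help.
-
Thanks, everyone, and sorry about that! Please check the code snippet!
-
The page is rendered by JavaScript, so requests will not help you in this case. However, since you have already written a Selenium function, you can call it instead:
career_paths_page_soup = get_soup_using_selenium(url)
Also, please mention which values you expect to get back from the page.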
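To make the suggestion above concrete, here is a minimal sketch of one way to do it, not a confirmed solution for this particular site. It assumes geckodriver is on PATH and a Selenium version that accepts `options=`. The helper names `get_rendered_html` and `extract_texts` are made up for illustration, and the CSS selector you pass in is something you would have to find yourself by inspecting the live page. Instead of a fixed `implicitly_wait`, it uses an explicit wait so the HTML is only read after the JavaScript has inserted at least one matching element.

```python
from bs4 import BeautifulSoup


def extract_texts(html, css_selector):
    """Return the stripped text of every element matching css_selector."""
    soup = BeautifulSoup(html, "html.parser")
    return [el.get_text(strip=True) for el in soup.select(css_selector)]


def get_rendered_html(url, css_selector, timeout=15):
    """Load url in headless Firefox and return the HTML once css_selector appears.

    css_selector is an assumption supplied by the caller; it must match an
    element that only exists after the page's JavaScript has run.
    """
    # Imported here so extract_texts() stays usable without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support import expected_conditions as EC
    from selenium.webdriver.support.wait import WebDriverWait

    options = webdriver.FirefoxOptions()
    options.add_argument("-headless")
    driver = webdriver.Firefox(options=options)  # assumes geckodriver is on PATH
    try:
        driver.get(url)
        # Block until at least one matching element is present in the DOM;
        # raises TimeoutException if nothing shows up within `timeout` seconds.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, css_selector))
        )
        return driver.page_source
    finally:
        # Always end the browser session, even if the wait times out.
        driver.quit()
```

With the `url` from the question and a selector taken from the real page, usage would be `html = get_rendered_html(url, selector)` followed by `extract_texts(html, selector)`. If the wait times out, the selector probably does not match what the page actually renders.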
Tags: python selenium web-scraping beautifulsoup python-requests