【Question title】: Submitting a form and getting results with Python Requests POST
【Posted】: 2020-09-22 15:08:03
【Question】:

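I am trying to write a script that submits a form and returns the results to me. I can pull the form information from the URL, but I cannot update the form's fields or get a response.

This is what I have so far: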

import requests
from bs4 import BeautifulSoup as bs

url = 'https://dos.elections.myflorida.com/campaign-finance/contributions/'
response = requests.get(url)
soup = bs(response.text, 'html.parser')  # name the parser explicitly
form_info = soup.find_all('form')        # 'action' is an attribute of <form>, not a tag
print(form_info[0]['action'])

This works and returns:

'/cgi-bin/contrib.exe'
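
The returned action is a path relative to the site root, so it has to be combined with the page URL before it can be posted to. A minimal sketch using only the standard library; the variable names `page_url`, `action` and `post_url` are illustrative, not part of the original script:

```python
from urllib.parse import urljoin

# URL of the page that contains the form (taken from the snippet above).
page_url = 'https://dos.elections.myflorida.com/campaign-finance/contributions/'
# Value of the form's action attribute, as printed above.
action = '/cgi-bin/contrib.exe'

# urljoin resolves the action the same way a browser would.
post_url = urljoin(page_url, action)
print(post_url)  # https://dos.elections.myflorida.com/cgi-bin/contrib.exe
```

This avoids hard-coding the host and keeps working if the action is ever given as a relative or an absolute URL.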

The form should be submittable with its default values, so I then try:

session = requests.Session()
BASE_URL = 'https://dos.elections.myflorida.com'
headers = {
    'User-Agent': 'Mozilla/5.0',
    'referer': '{}/campaign-finance/contributions/'.format(BASE_URL),
}
data = {'Submit': 'Submit'}
res = session.post('{}/cgi-bin/contrib.exe'.format(BASE_URL), data=data, headers=headers)

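I get a 502 response. I set up the session and headers this way because of this post.

The form is at:

https://dos.elections.myflorida.com/campaign-finance/contributions/

Submitting it in the browser redirects me to: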

https://dos.elections.myflorida.com/cgi-bin/contrib.exe
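
For context, a dict passed as `data` is sent form-encoded, so the body of the failing request can be approximated with the standard library. This is only a sketch of what goes over the wire; whether the missing fields are what triggers the 502 is an assumption, not something the server confirms:

```python
from urllib.parse import urlencode

# The payload from the failing request above.
data = {'Submit': 'Submit'}

# Form-encode it the way an application/x-www-form-urlencoded body is built.
body = urlencode(data)
print(body)  # Submit=Submit
```

None of the form's other fields (election, office, party and so on) are present in that body, whereas a browser would send all of them with their default values.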

SIM's solution worked, thanks!!

【Comments】:

    Tags: python html web-scraping python-requests


    【Solution 1】:

    Try the following to fetch the required content using the default search:

    import requests
    from bs4 import BeautifulSoup
    
    link = 'https://dos.elections.myflorida.com/campaign-finance/contributions/'
    post_url = 'https://dos.elections.myflorida.com/cgi-bin/contrib.exe'
    
    with requests.Session() as s:
        # Send a browser-like User-Agent on every request in this session.
        s.headers['User-Agent'] = 'Mozilla/5.0 (Windows NT 6.1; ) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/83.0.4103.61 Safari/537.36'
        # Load the form page first so the session picks up any cookies.
        r = s.get(link)
        soup = BeautifulSoup(r.text,"lxml")
        # Start from every named <input> and its default value.
        payload = {i['name']:i.get('value','') for i in soup.select('input[name]')}
        # Set the remaining search options by hand.
        payload['election'] = '20201103-GEN'
        payload['search_on'] = '1'
        payload['CanNameSrch'] = '2'
        payload['office'] = 'All'
        payload['party'] = 'All'
        payload['ComNameSrch'] = '2'
        payload['committee'] = 'All'
        payload['namesearch'] = '2'
        payload['csort1'] = 'NAM'
        payload['csort2'] = 'CAN'
        payload['queryformat'] = '2'
        # Submit the completed payload to the form's action URL.
        r = s.post(post_url,data=payload)
        print(r.text)
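
The comprehension above only picks up `<input>` elements. If fields such as `election` or `office` are `<select>` dropdowns or radio buttons on the real page (an assumption; the answer does not say), that would explain why they are set by hand. Below is a rough, standard-library-only sketch of collecting the defaults for both kinds of element. The `SAMPLE_HTML` markup, the `FormDefaults` class and the `form_defaults` helper are hypothetical stand-ins, not the site's actual HTML or any library API:

```python
from html.parser import HTMLParser

# Made-up markup for illustration; field names are borrowed from the payload
# above, but the element types and option values are assumptions.
SAMPLE_HTML = """
<form action="/cgi-bin/contrib.exe" method="post">
  <select name="election">
    <option value="All">All</option>
    <option value="20201103-GEN" selected>2020 General</option>
  </select>
  <select name="office">
    <option value="All">All</option>
    <option value="GOV">Governor</option>
  </select>
  <input type="radio" name="search_on" value="1" checked>
  <input type="radio" name="search_on" value="2">
  <input type="submit" name="Submit" value="Submit">
</form>
"""


class FormDefaults(HTMLParser):
    """Collect roughly what a browser would submit for an untouched form."""

    def __init__(self):
        super().__init__()
        self.payload = {}
        self._select = None        # name of the <select> currently open
        self._first_option = None  # fallback when no option is marked selected

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        name = attrs.get("name")
        if tag == "input" and name:
            kind = (attrs.get("type") or "text").lower()
            if kind in ("radio", "checkbox"):
                # Only checked radios and checkboxes are submitted.
                if "checked" in attrs:
                    self.payload[name] = attrs.get("value") or "on"
            else:
                self.payload[name] = attrs.get("value") or ""
        elif tag == "select" and name:
            self._select = name
            self._first_option = None
        elif tag == "option" and self._select:
            # Simplification: options without a value attribute are treated as "".
            value = attrs.get("value") or ""
            if self._first_option is None:
                self._first_option = value
            if "selected" in attrs:
                self.payload[self._select] = value

    def handle_endtag(self, tag):
        if tag == "select" and self._select:
            # With no selected option, fall back to the first one.
            self.payload.setdefault(self._select, self._first_option or "")
            self._select = None


def form_defaults(html):
    parser = FormDefaults()
    parser.feed(html)
    return parser.payload


payload = form_defaults(SAMPLE_HTML)
print(payload)
```

A dict built this way could be passed as `data=` to `s.post(...)` in place of the hand-written assignments, provided the real page's defaults match the search you want.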
    

    【Comments】:
