【Question title】: Error 403: Request disallowed by robots.txt on Python
【Posted】: 2017-04-01 06:49:22
【Question description】:

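I am trying to fill in a form with mechanize in Python. When I run the code, I get this error:

Error 403: request disallowed by robots.txt.

I looked through earlier answers to similar questions and found that adding br.set_handle_robots(False) should fix it, but I still get the same error. What am I missing here?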

import re
import mechanize
from mechanize import Browser
br = mechanize.Browser()
br.set_handle_equiv(False)
br.set_handle_robots(False)
br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64; rv:18.0)Gecko/20100101 Firefox/18.0 (compatible;)'),('Accept', '*/*')]
text = "1500103233"
browser = Browser()
browser.open("http://kuhs.ac.in/results.htm")
browser.select_form(nr=0)
browser['Stream']=['Medical']
browser['Level']=['UG']
browser['Course']=['MBBS']
browser['Scheme']=['MBBS 2015 Admissions']
browser['Year']=['Ist Year MBBS']
browser['Examination']=['First Professional MBBS Degree Regular(2015 Admissions) Examinations,August2016']
browser['Reg No']=text
response = browser.submit()

【Comments】:

    Tags: python mechanize


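    【Solution 1】:
    1. You configure br = mechanize.Browser(), but then you create a second object with browser = Browser() and send every request through it. The settings applied to br, including set_handle_robots(False), never reach browser.
    2. The page at http://kuhs.ac.in/results.htm only embeds the form. Its page source shows where the form really lives: src="http://14.139.185.148/kms/index.php/results/create"
    3. The page source also shows the names of the form controls. In your case the control labelled Stream is name="Results[streamId]", not Stream.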
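
Point 1 matters because a new mechanize Browser honours robots.txt by default, so the unconfigured browser object can still refuse the request with the 403. As a rough illustration of that kind of check, here is a minimal sketch that uses the standard library's urllib.robotparser rather than mechanize itself. The robots.txt content and the is_allowed helper are hypothetical, made up for this example; the file actually served by the site may differ.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content, for illustration only.
# The real file served by the site in the question may differ.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
"""


def is_allowed(robots_txt, user_agent, url):
    """Return True if the given robots.txt text lets user_agent fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# A client that honours these rules would refuse to send this request.
print(is_allowed(ROBOTS_TXT, "Mozilla/5.0", "http://kuhs.ac.in/results.htm"))
```

Calling set_handle_robots(False) on the Browser object that actually sends the request skips this kind of check.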

    So, you can try this:

    import mechanize
    br = mechanize.Browser()
    br.set_handle_equiv(False)
    br.set_handle_robots(False)
    br.addheaders = [('User-agent','Mozilla/5.0 (X11; Linux x86_64; rv:18.0)Gecko/20100101 Firefox/18.0 (compatible;)'),('Accept', '*/*')]
    text = "1500103233"
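    # Open the URL that actually serves the form, not the page that embeds it.
    br.open("http://14.139.185.148/kms/index.php/results/create")
    # Print every form to check the real control names.
    for form in br.forms():
        print(form)
    br.select_form(nr=0)
    br['Results[streamId]'] = ['1']  # Medical; list controls take a sequence
    # etc.: set the remaining controls the same way
    response = br.submit()
    print(response.read())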
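
To find the embedded URL and the control names without reading the page source by eye, something like the following sketch may help. It uses only the standard library's html.parser. The two HTML snippets are hypothetical, simplified stand-ins for the real pages (the "Dental" option in particular is invented), and it assumes the src attribute quoted above belongs to an iframe or frame element. PageInspector and inspect are names made up for this example.

```python
from html.parser import HTMLParser

# Hypothetical, simplified markup for illustration; the real pages will differ.
OUTER_PAGE = (
    '<html><body>'
    '<iframe src="http://14.139.185.148/kms/index.php/results/create"></iframe>'
    '</body></html>'
)
FORM_PAGE = """
<form method="post">
  <label>Stream</label>
  <select name="Results[streamId]">
    <option value="1">Medical</option>
    <option value="2">Dental</option>
  </select>
</form>
"""


class PageInspector(HTMLParser):
    """Collect frame/iframe sources and the names of form controls."""

    def __init__(self):
        super().__init__()
        self.frame_sources = []
        self.control_names = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag in ("iframe", "frame") and attributes.get("src"):
            self.frame_sources.append(attributes["src"])
        elif tag in ("input", "select", "textarea") and attributes.get("name"):
            self.control_names.append(attributes["name"])


def inspect(html):
    """Return (frame sources, control names) found in the HTML text."""
    inspector = PageInspector()
    inspector.feed(html)
    inspector.close()
    return inspector.frame_sources, inspector.control_names


print(inspect(OUTER_PAGE)[0])  # URL(s) the outer page embeds
print(inspect(FORM_PAGE)[1])   # names to use as keys, e.g. br['Results[streamId]']
```

The names it reports are the keys to use with mechanize, and the option values (such as '1' for Medical) are what a list control expects inside the sequence.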
    

    See also: Submitting a form with mechanize (TypeError: ListControl, must set a sequence)

    Hope this helps; it worked for me!

    【Comments】:
