【Question title】: Parameter error in Python script & TOR proxy server
【Posted】: 2015-06-16 05:12:30
【Question description】:

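I am a newbie in Python. My boss asked me to run this Python script while a TOR proxy server is running. He told me to pass the arguments like this: python DownloadYP.py /Users/myfolder/ japan http://www.jpyellow.com/company 1 222299

He configured it on a Mac. I am using Windows, so my arguments look like this: python DownloadYP.py C:\rrrb japan http://www.jpyellow.com/company 1 222299

But I get this error: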

    Traceback (most recent call last):
      File "C:\Users\USER\yp1\code\DownloadYP.py", line 92, in <module>
        WebPage(path, country, url, lowerlimit,upperlimit)
      File "C:\Users\USER\yp1\code\DownloadYP.py", line 23, in __init__
        fout = open(self.dir+"/limit.txt",'wb')
    IOError: [Errno 2] No such file or directory: 'C:\\rrr/japan/limit.txt'

My code is:

  • DownloadYP.py

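    # Imports and class header were omitted in the original post;
    # the class name comes from the traceback.
    import os
    import sys
    import urllib2

    class WebPage:
        def __init__(self, path, country, url, lower=0,upper=9999):
            self.dir = str(path)+"/"+ str(country)
            self.url = url
            try:
                # Resume from previously saved limits, if any
                fin = open(self.dir+"/limit.txt",'r')
                limit = fin.readline()
                limits = str(limit).split(",")
                lower = int(limits[0])
                upper = int(limits[1])
                fin.close()
            except:
                # This is line 23 (the line that raises the IOError)
                fout = open(self.dir+"/limit.txt",'wb')
                limits = str(lower)+","+str(upper)
                fout.write(limits)
                fout.close()
            self.process_instances(lower,upper)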
    
    
        def process_instances(self,lower,upper):
            # Create the output directory if it does not exist yet
            try:
                os.stat(self.dir)
            except:
                os.mkdir(self.dir)
            for count in range(lower,upper+1):
                if count == upper:
                    print "all downloaded, quitting the app!!"
                    break
                targetURL = self.url+"/"+str(count)
                print "Downloading :" + targetURL
                req = urllib2.Request(targetURL)
                try:
                    response = urllib2.urlopen(req)
                    the_page = response.read()
                    if the_page.find("Your IP suspended")>=0:
                        print "The IP is suspended"
                        # Save progress so the next run resumes from this id
                        fout = open(self.dir+"/limit.txt",'wb')
                        limits = str(count)+","+str(upper)
                        fout.write(limits)
                        fout.close()
                        break
                    if the_page.find("Too many requests")>=0:
                        print "Too many requests"
                        print "Renew IP...."
                        fout = open(self.dir+"/limit.txt",'wb')
                        limits = str(count)+","+str(upper)
                        fout.write(limits)
                        fout.close()
                        break
                    #subprocess.Popen("sudo  /Users/myfolder/Documents/workspace/DataMining/ip_renew.py", shell=True)
                    if the_page.find("404 error")>=0:
                        print "the page not exist"
                        continue
                    self.saveHTML(count, the_page)
                except:
                    print "The URL cannot be fetched"
                    pass
                    #raise
    
        def saveHTML(self,count, content):
            fout = open(self.dir+"/"+str(count)+".html",'wb')
            fout.write(content)
            fout.close()
    
    if __name__ == '__main__':
        if len(sys.argv) !=6:
            print "cannot process!!! Five Parameters are required to run the process."
            print "Parameter 1 should be the path where to save the data, eg, /Users/myfolder/data/"
            print "Parameter 2 should be the name of the country for which data is collected, eg, japan"
            print "Parameter 3 should be the URL from which the data to collect, eg, http://www.jpyellow.com/company/"
            print "Parameter 4 should be the lower limit of the company id, eg, 11 "
            print "Parameter 5 should be the upper limit of the company id, eg, 1000 "
            print "The output will be saved as the HTML file for each company in the target folder's country"
            exit()
        else:
            path = str(sys.argv[1])
            country = str(sys.argv[2])
            url = str(sys.argv[3])
            lowerlimit = int(sys.argv[4])
            upperlimit = int(sys.argv[5])
            # This is line 92
            WebPage(path, country, url, lowerlimit,upperlimit)
    

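I have downloaded TorVPN to run the proxy server, and then ran this script. So why does the error occur? The script is meant to download pages from a website.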

【Discussion】:

    Tags: python sockets subprocess tor proxy-server


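    【Solution 1】:

    The problem is in DownloadYP.py -

    The file C:\rrr\japan\limit.txt does not exist. Since opening a file in 'wb' mode creates it when it is missing, this error also means the directory C:\rrr\japan does not exist yet; the script only creates it later, in process_instances.

    I suggest creating that directory and a dummy file with that name in it, then running the script again.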
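
Instead of creating the directory by hand, the script could create it before the first write. The following is only a minimal sketch under that assumption: `ensure_dir` and `demo` are hypothetical names that do not appear in DownloadYP.py, and a temporary directory stands in for C:\rrr. It is written to run on both Python 2 and Python 3.

```python
import errno
import os
import tempfile


def ensure_dir(directory):
    """Create `directory` (and any missing parents) if it does not exist yet."""
    try:
        os.makedirs(directory)
    except OSError as exc:
        # Ignore "already exists"; re-raise anything else (e.g. permissions).
        if exc.errno != errno.EEXIST or not os.path.isdir(directory):
            raise


def demo():
    """Reproduce the failure, then show that creating the directory fixes it."""
    base = tempfile.mkdtemp()  # hypothetical stand-in for the path argument
    target = os.path.join(base, "japan")
    limit_file = os.path.join(target, "limit.txt")

    try:
        # The directory is missing, so this fails like line 23 in the traceback.
        open(limit_file, "wb").close()
        failed_before = False
    except IOError:
        failed_before = True

    ensure_dir(target)
    with open(limit_file, "wb") as fout:
        fout.write(b"1,222299")
    return failed_before, os.path.isfile(limit_file)
```

In the original script, calling something like `ensure_dir(self.dir)` at the top of `__init__` would have the same effect as moving the `os.mkdir` step ahead of the first `open`.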

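    Also, as a side note: you are mixing the Unix path separator into Windows paths. Instead, use os.path.join() so that Python handles the path separator correctly on every platform. The code would look like this -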

    import os
    self.dir = os.path.join(str(path),str(country))
    

    Also, when opening files, use os.path.join rather than hard-coding the path separator -

    fin = open(os.path.join(self.dir,"limit.txt"),'r')
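
Putting the two suggestions together, the reading and writing of limit.txt could be sketched as below. This is an illustration rather than the script's actual code: `limit_path`, `save_limits` and `load_limits` are hypothetical helper names, and the default limits (0, 9999) are taken from the defaults of `__init__` in the question. The sketch runs on both Python 2 and Python 3.

```python
import os


def limit_path(path, country):
    """Build <path>/<country>/limit.txt with the platform's own separator."""
    return os.path.join(str(path), str(country), "limit.txt")


def save_limits(path, country, lower, upper):
    """Write "lower,upper" to limit.txt, creating the directory if needed."""
    target = limit_path(path, country)
    directory = os.path.dirname(target)
    if not os.path.isdir(directory):
        os.makedirs(directory)
    with open(target, "w") as fout:
        fout.write("%d,%d" % (lower, upper))
    return target


def load_limits(path, country, default=(0, 9999)):
    """Read the saved limits, falling back to `default` if none are usable."""
    try:
        with open(limit_path(path, country), "r") as fin:
            lower, upper = fin.readline().split(",")
        return int(lower), int(upper)
    except (IOError, ValueError):
        # Missing file or malformed content: start from the defaults.
        return default
```

Catching only `IOError` and `ValueError` here, instead of a bare `except:`, keeps unrelated errors visible, which would have made the original traceback easier to interpret.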
    

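    【Comments】:

    • Glad it worked for you. It would also be appreciated if you accepted the answers that helped you, as this keeps the community motivated to provide good answers. (Not just this one, but for any of your questions.)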