【Posted】: 2016-12-25 09:52:12
【Problem Description】:
I have been trying to make my script loop so that it loads its output into file 1, then, when it has finished loading everything, moves the values into output file 2, erases the values in output file 1 and starts reloading them, then moves the values into output 2 again (overwriting the old ones), and repeats.
So far I have been fairly successful and don't know what else to add to my script. Hopefully someone here knows why I randomly get the error "UnboundLocalError: local variable 'val' referenced before assignment" partway through the loading process; with a very small input file the script does what I want.
Does anyone know how I can change my script to fix that error? I have tried to understand why it happens but can't.
I have tried to research it thoroughly, but none of the suggestions I found worked (or I implemented them incorrectly). I have attached my script. Thanks!
import urllib2, re, urllib, urlparse, csv, sys, time, threading, codecs, shutil
from bs4 import BeautifulSoup

def extract(url):
    try:
        sys.stdout.write('0')
        # global file
        page = urllib2.urlopen(url).read()
        soup = BeautifulSoup(page, 'html.parser')
        product = soup.find("div", {"class": "js-product-price"})
        price = product.findNext('div', {'class': 'js-price-display'}).getText().strip()
        oos = product.findNext('p', attrs={'class': "price-oos"})
        if oos is None:
            oos = 'In Stock'
        else:
            oos = oos.getText()
        val = url + "," + price + "," + oos + "," + time.ctime() + '\n'
        # ifile.write(val)
        sys.stdout.write('1')
    except Exception as e:
        print e
    return val

while True:
    ifile = open('output.csv', "w", 0)
    inputs = csv.reader(open('input.csv'))
    # inputs = csv.reader(codecs.open('input.csv', 'rU', 'utf-16'))
    ifile.write('URL' + "," + 'Price' + "," + 'Stock' + "," + "Time" + '\n')
    for i in inputs:
        ifile.write(extract(i[0]))
    ifile.close()
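The cause of the UnboundLocalError is visible in `extract`: `val` is only assigned near the end of the `try` body, so if any earlier line raises (a timeout, a missing element), the `except` branch swallows the exception and `return val` then references a name that was never bound. A minimal sketch of the bug and one fix (pre-assigning the name), shown in Python 3 syntax but with identical behaviour in Python 2.7; `fetch` and the simulated failure are illustrative stand-ins, not the actual scraping code:

```python
def fetch(url):
    try:
        if "bad" in url:
            raise ValueError("simulated network failure")
        val = url + ",parsed\n"  # only bound if the try body gets this far
    except Exception as e:
        print(e)
    return val  # UnboundLocalError whenever the except branch ran

def fetch_fixed(url):
    val = None  # pre-assign so the name always exists
    try:
        if "bad" in url:
            raise ValueError("simulated network failure")
        val = url + ",parsed\n"
    except Exception as e:
        print(e)
    return val  # None on failure; the caller can skip falsy results
```

Returning `None` on failure means the caller must guard the write with `if val_to_write:`, which is exactly what the updated script below does.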
Update:
Thanks for the help, everyone! Here is my new script:
import urllib2, re, urllib, urlparse, csv, sys, time, threading, codecs, shutil
from bs4 import BeautifulSoup

def extract(url):
    try:
        sys.stdout.write('0')
        # global file
        page = urllib2.urlopen(url).read()
        soup = BeautifulSoup(page, 'html.parser')
        product = soup.find("div", {"class": "js-product-price"})
        price = product.findNext('div', {'class': 'js-price-display'}).getText().strip()
        oos = product.findNext('p', attrs={'class': "price-oos"})
        if oos is None:
            oos = 'In Stock'
        else:
            oos = oos.getText()
        val = url + "," + price + "," + oos + "," + time.ctime() + '\n'
        # ifile.write(val)
        sys.stdout.write('1')
    except Exception as e:
        print e
    else:
        return val

while True:
    ifile = open('output.csv', "w", 0)
    inputs = csv.reader(open('input.csv'))
    # inputs = csv.reader(codecs.open('input.csv', 'rU', 'utf-16'))
    ifile.write('URL' + "," + 'Price' + "," + 'Stock' + "," + "Time" + '\n')
    for i in inputs:
        val_to_write = extract(i[0])
        if val_to_write:
            ifile.write(val_to_write)
        ifile.close()
    shutil.copy('output.csv', 'output2.csv')
    print("finished")
With the script above I now get the error: "ValueError: I/O operation on closed file". Thanks.
【Discussion】:
Tags: python-2.7 loops web-scraping beautifulsoup
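That ValueError points at the indentation of `ifile.close()`: it sits inside the `for` loop, so the file is closed after the first row and every later `ifile.write` hits a closed handle. A short sketch of the mistake and the fix, using an illustrative file name and rows rather than the real scraped data; a `with` block is the idiomatic repair because it closes the file exactly once, after the loop, even if an error occurs:

```python
rows = ["a,1\n", "b,2\n"]

# The mistake: close() indented inside the loop kills the handle early.
out = open("demo_output.csv", "w")
try:
    for row in rows:
        out.write(row)
        out.close()  # file is closed after the first row...
except ValueError as e:
    print(e)  # ...so the second write raises "I/O operation on closed file"

# The fix: a with-block closes the file once, after the loop finishes.
with open("demo_output.csv", "w") as out:
    for row in rows:
        out.write(row)
```

Equivalently, simply dedent `ifile.close()` (and `shutil.copy`) so they run after the `for` loop, as in the first version of the script.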