【Posted】: 2018-06-05 17:25:42
【Problem description】:
This is my scrap.py code:
from bs4 import BeautifulSoup as soup
from urllib.request import urlopen as uReq

website = "https://houston.craigslist.org/search/cta"
uClient = uReq(website)
page_html = uClient.read()
uClient.close()

soup_html = soup(page_html, "html.parser")
result_html = soup_html.findAll("p", {"class":"result-info"})

filename = "products.csv"
f = open(filename, "w", encoding='utf8')
headers = "car_name, price\n"
f.write(headers)

for container in result_html:
    carname = container.a.text
    price_container = container.findAll('span', {'class':'result-price'})
    price = price_container[0].text
    f.write(carname + "," + price + "\n")

f.close()
It works fine in the terminal, but when I run it in a loop it gives the following error:
Traceback (most recent call last):
  File "scrap.py", line 23, in <module>
    price = price_container[0].text.splitlines()
IndexError: list index out of range
Please help. Thanks.
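The IndexError on `price_container[0]` means `findAll` returned an empty list for at least one container, most likely a listing that has no `result-price` span. Below is a minimal sketch of one way to guard against that, not a definitive fix. It assumes the page structure shown in the question, runs on a hypothetical inline `SAMPLE_HTML` string instead of the live site, and uses the illustrative helper names `extract_rows` and `rows_to_csv`, which are not part of the original script. It swaps `findAll(...)[0]` for `find(...)`, which returns `None` when nothing matches, and writes rows with the standard `csv` module so that commas inside a car name do not break the columns.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical sample markup mimicking the listing structure in the question;
# the second listing deliberately has no price span.
SAMPLE_HTML = """
<p class="result-info"><a href="#">2010 Honda Civic</a>
  <span class="result-price">$4500</span></p>
<p class="result-info"><a href="#">1998 Ford F-150, runs great</a></p>
"""


def extract_rows(page_html):
    """Return (car_name, price) pairs; price is "" when a listing has none."""
    page = BeautifulSoup(page_html, "html.parser")
    rows = []
    for container in page.find_all("p", {"class": "result-info"}):
        carname = container.a.text.strip()
        # find() returns None instead of raising when nothing matches.
        price_tag = container.find("span", {"class": "result-price"})
        price = price_tag.text.strip() if price_tag is not None else ""
        rows.append((carname, price))
    return rows


def rows_to_csv(rows):
    """Serialize rows with csv.writer so commas inside names are quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["car_name", "price"])
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(extract_rows(SAMPLE_HTML)))
```

To apply this to the original script, the `page_html` read from `urlopen` could be passed to `extract_rows`, and the rows written to `products.csv` with `csv.writer` on a file opened with `newline=""`.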
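【Comments】:
- I'm voting to close this question as off-topic because the OP wants someone to write his task for him and asks a follow-up question on every answer.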
Tags: python python-3.x web-scraping beautifulsoup