【Posted】: 2015-03-10 23:24:22
【Question】:
I'm new to web scraping, so I apologize in advance if I've misunderstood something.
I'm trying to get data from ESPN. Here is my Python code:
import pandas as pd
import requests
from bs4 import BeautifulSoup
url = 'http://espn.go.com/nba/teams'
r = requests.get(url)
soup = BeautifulSoup(r.text)
tables = soup.find_all('dl')
teams = []
prefix_1 = []
prefix_2 = []
teams_urls = []
for table in tables:
    lis = table.find_all('dt', text=False)
    print lis
    for li in lis:
        info = dt
        teams.append(info.text)
        url = info['href']
        teams_urls.append(url)
        prefix_1.append(url.split('/')[-2])
        prefix_2.append(url.split('/')[-1])
print (teams)
When I print at various points, all I get back is an empty list, []. Please help. Thanks.
【Discussion】:
- It's unclear exactly what you are trying to get.
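- One possible reading is that the goal is each team's name, its URL, and the last two path segments of that URL. An empty list from `find_all` means nothing matched, so a first check is to count which tags occur in the HTML that was actually downloaded. The sketch below does that on a small hypothetical snippet: `SAMPLE_HTML`, the URLs inside it, and the `LinkCollector` class are assumptions for illustration, not the real ESPN markup. It is written for Python 3 and uses the standard library's `html.parser` instead of BeautifulSoup so that it runs without third-party packages.

```python
from collections import Counter
from html.parser import HTMLParser

# Hypothetical markup standing in for the downloaded page.
# The real ESPN HTML may be structured differently.
SAMPLE_HTML = """
<ul class="medium-logos">
  <li><h5><a href="http://espn.go.com/nba/team/_/name/bos/boston-celtics">Boston Celtics</a></h5></li>
  <li><h5><a href="http://espn.go.com/nba/team/_/name/bkn/brooklyn-nets">Brooklyn Nets</a></h5></li>
</ul>
"""


class LinkCollector(HTMLParser):
    """Counts every start tag and records (text, href) for each <a> with an href."""

    def __init__(self):
        super().__init__()
        self.tag_counts = Counter()
        self.links = []       # (link text, href) pairs, in document order
        self._href = None     # href of the <a> currently open, if any
        self._text = []       # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        self.tag_counts[tag] += 1
        if tag == "a":
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append(("".join(self._text).strip(), self._href))
            self._href = None


collector = LinkCollector()
collector.feed(SAMPLE_HTML)

# Tags that never occur in the page cannot be matched by any search for them.
print("dl:", collector.tag_counts["dl"], "dt:", collector.tag_counts["dt"])
print("tags present:", sorted(collector.tag_counts))

teams = [text for text, _ in collector.links]
teams_urls = [href for _, href in collector.links]
prefix_1 = [u.split('/')[-2] for u in teams_urls]
prefix_2 = [u.split('/')[-1] for u in teams_urls]
print(teams)
```

Separately, in the loop from the question, `info = dt` refers to a name that is never defined, so it would raise a `NameError` as soon as `lis` is non-empty. Presumably the loop variable `li`, or a link inside it, was intended.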
Tags: python python-2.7 web-scraping beautifulsoup html-parsing