[Posted]: 2016-12-05 02:01:19
[Question]:
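Hi guys, I'm scraping a website where each movie page has 3 movie links. I have code that gets all 3 links, but I want to select one and print only that one, in this case the openload one. The code also prints the whole iframe tag, and I'd like to print just the clean link, like this: 'https://openload.co/embed/cosxf9mWZlg/'. I'm also including the printed output below so you can see what I get right now.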
import requests
from bs4 import BeautifulSoup

# Movie pages to scrape; each one embeds several player iframes.
urls = ('http://goldfilmesonline.com/goldstone-legendado-online/','http://goldfilmesonline.com/sob-a-sombra-legendado-online/','http://goldfilmesonline.com/fora-do-rumo-dublado-online/')
# Send a browser-like User-Agent with every request.
headers = {'user-agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36'}

for url in urls:
    response = requests.get(url, headers=headers)
    soup = BeautifulSoup(response.text, 'html.parser')
    # Collect every <iframe> tag on the page; this prints the whole tags, not just their src.
    x = soup.findAll('iframe')
    print x
Here is the printed output:
[<iframe allowfullscreen="" frameborder="0" src="https://www.youtube.com/embed/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="https://openload.co/embed/noK42_ITHiU/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="http://thevid.net/e/zqlcx3byxh/"></iframe>]
[<iframe allowfullscreen="" frameborder="0" src="https://www.youtube.com/embed/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="https://openload.co/embed/oMzqATsLLsw/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="http://thevid.net/e/rgt2kyrmzdqdbeocwjmspd6/"></iframe>]
[<iframe allowfullscreen="" frameborder="0" src="https://www.youtube.com/embed/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="https://openload.co/embed/cosxf9mWZlg/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="https://openload.co/embed/b85sRhsjJ3Q/"></iframe>, <iframe allowfullscreen="" frameborder="0" src="http://thevid.net/e/4mvpjkef43pqyhnmg/"></iframe>]
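For reference, here is a rough sketch of the result I'm after: keep only the iframes whose src points at openload.co and return the bare URL instead of the whole tag. It assumes the players stay embedded as iframe tags with a src attribute, as in the output above. The helper name openload_links and the sample string are made up for illustration; the sample just reuses the iframes from the third page.

```python
from bs4 import BeautifulSoup


def openload_links(page_html):
    """Return the src URL of every iframe that points at openload.co."""
    soup = BeautifulSoup(page_html, 'html.parser')
    links = []
    # src=True skips iframes that have no src attribute at all.
    for frame in soup.find_all('iframe', src=True):
        if 'openload.co' in frame['src']:
            links.append(frame['src'])
    return links


# Hypothetical sample: the iframes printed for the third page above.
sample = (
    '<iframe allowfullscreen="" frameborder="0" src="https://www.youtube.com/embed/"></iframe>'
    '<iframe allowfullscreen="" frameborder="0" src="https://openload.co/embed/cosxf9mWZlg/"></iframe>'
    '<iframe allowfullscreen="" frameborder="0" src="https://openload.co/embed/b85sRhsjJ3Q/"></iframe>'
    '<iframe allowfullscreen="" frameborder="0" src="http://thevid.net/e/4mvpjkef43pqyhnmg/"></iframe>'
)

links = openload_links(sample)
if links:
    # Print only the first openload link as a plain string.
    print(links[0])
```

In the scraping loop, the same function could presumably be called on the downloaded page text in place of printing the full list of iframes. A page can carry more than one openload iframe (the third one does), which is why the sketch returns a list and leaves the choice of which link to print to the caller.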
[Comments]:
Tags: python python-2.7 beautifulsoup httprequest