[Posted]: 2019-10-05 16:48:13
[Question]:
I am trying to scrape the odds from the link below, but I get an error. Do you know what I am doing wrong?
Thanks
import requests
from bs4 import BeautifulSoup as bs
url = 'https://www.oddsportal.com/soccer/spain/laliga'
r = requests.get(url, headers = {'User-Agent' : 'Mozilla/5.0'})
soup = bs(r.content, 'lxml')
##print([a.text for a in soup.select('#tournamentTable tr[xeid] [href*=soccer]')])
print([b.text for b in soup.select('#tournamentTable td[xodd]')])
I expect to get 10 rows and 3 columns, one for each of the odds. However, I get the following error:
Traceback (most recent call last):
File "/Users/.py", line 14, in <module>
print([b.text for b in soup.select('#tournamentTable td[xodd]')])
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/bs4/element.py", line 1376, in select
return soupsieve.select(selector, self, namespaces, limit, **kwargs)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/soupsieve/__init__.py", line 114, in select
return compile(select, namespaces, flags, **kwargs).select(tag, limit)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/soupsieve/__init__.py", line 63, in compile
return cp._cached_css_compile(pattern, namespaces, custom, flags)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/soupsieve/css_parser.py", line 214, in _cached_css_compile
CSSParser(pattern, custom=custom_selectors, flags=flags).process_selectors(),
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/soupsieve/css_parser.py", line 1113, in process_selectors
return self.parse_selectors(self.selector_iter(self.pattern), index, flags)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/soupsieve/css_parser.py", line 946, in parse_selectors
key, m = next(iselector)
File "/Library/Frameworks/Python.framework/Versions/3.7/lib/python3.7/site-packages/soupsieve/css_parser.py", line 1100, in selector_iter
raise SelectorSyntaxError(msg, self.pattern, index)
File "<string>", line None
soupsieve.util.SelectorSyntaxError: Invalid character '\x1b' position 17
line 1:
#tournamentTable td[xodd]
^
...
[Discussion]:
-
Done...........
-
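The character between #tournamentTable and td[xodd] seems to be wrong. It may look like a space, but its code is \x1b (the ESC control character). Try deleting it and typing a normal space instead.
-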
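To illustrate the comment above, here is a minimal sketch of how an invisible character in a selector string can be found and replaced. The string bad_selector is a hypothetical reconstruction of the selector from the question (the exact position of the stray character in the real file may differ), and find_control_chars and clean_selector are helper names made up for this example.

```python
import unicodedata

# Hypothetical reconstruction: an ESC (\x1b) sits where a space should be.
bad_selector = '#tournamentTable\x1btd[xodd]'


def find_control_chars(text):
    """Return (0-based index, repr) pairs for control/format characters."""
    return [(i, repr(ch)) for i, ch in enumerate(text)
            if unicodedata.category(ch) in ('Cc', 'Cf')]


def clean_selector(text):
    """Replace every control/format character with a plain space."""
    return ''.join(' ' if unicodedata.category(ch) in ('Cc', 'Cf') else ch
                   for ch in text)


print(find_control_chars(bad_selector))  # shows where the hidden character is
print(clean_selector(bad_selector))      # selector that is safe to pass to soup.select()
```

Printing repr() of the selector line in the original script is an even quicker check: a real space shows up as ' ', while the stray character shows up as \x1b.
-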
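I can run the code without any error, but the page uses JavaScript to fill in the td[xodd] cells, and BS cannot run JavaScript, so the code cannot get this data. You will need Selenium to control a web browser that can run JavaScript.
-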
How can I do this with Selenium?
-
See the Selenium-Python documentation.
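-
As a rough starting point for the Selenium approach suggested above, something like the following might work. It is only a sketch under several assumptions: Chrome and a matching driver are installed, the selector from the question still matches the page, and each match has exactly three odds cells. fetch_odds and chunk_rows are names invented for this example.

```python
def chunk_rows(values, width=3):
    """Group a flat list into rows of `width` items (assumed: 3 odds per match)."""
    return [values[i:i + width] for i in range(0, len(values), width)]


def fetch_odds(url, timeout=10):
    """Load the page in a real browser so its JavaScript runs, then read td[xodd]."""
    # Imported here so chunk_rows can be used without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.common.by import By
    from selenium.webdriver.support.ui import WebDriverWait
    from selenium.webdriver.support import expected_conditions as EC

    # Selector taken from the question; it may not match the current page.
    selector = '#tournamentTable td[xodd]'
    driver = webdriver.Chrome()  # assumption: Chrome and its driver are available
    try:
        driver.get(url)
        # Wait until JavaScript has inserted at least one odds cell.
        WebDriverWait(driver, timeout).until(
            EC.presence_of_element_located((By.CSS_SELECTOR, selector)))
        cells = driver.find_elements(By.CSS_SELECTOR, selector)
        return chunk_rows([cell.text for cell in cells])
    finally:
        driver.quit()  # always close the browser, even if the wait times out
```

Calling fetch_odds('https://www.oddsportal.com/soccer/spain/laliga') should then return one list per match. The explicit wait matters because the cells do not exist until the page's scripts have run; if the wait times out, the selector or the page structure has probably changed.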
Tags: python select html-table beautifulsoup