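[Posted]: 2021-08-02 20:03:13
[Question]:
I want to scrape data from a link, but I'm running into difficulties: either I can't find the element, or I don't know how to select certain lists and certain text from within a single link. I'm doing this with BeautifulSoup: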
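import bs4
import requests

response = requests.get(LINK)  # LINK is the page URL
response.raise_for_status()
soup = bs4.BeautifulSoup(response.text, 'html.parser')
# select() takes a CSS selector string, so the type filter goes in the selector
for select in soup.select('script[type="text/javascript"]'):
    print(select)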
where LINK is an https URL. As output I get:
OTHER <script type="text/javascript"> WRITINGS
<script type="text/javascript">
$(function () {
    $('#chart_t_2021').highcharts({
        chart: {
            ...
        },
        title: {
            text: 'I WANT TO PRINT THIS TEXT'
        },
        ...
    })
});
</script>
<script type="text/javascript">
$(function () {
    $('#chart_2021').highcharts({
        title: {
            text: '...'
        },
        yAxis: {
            ...
        },
        xAxis: {tickPositions: [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30] <!--I WOULD LIKE TO TAKE THIS LIST AND PUT IT IN A VARIABLE-->
        },
        legend: {
            layout: 'vertical',
            align: 'center',
            verticalAlign: 'bottom'
        },
        plotOptions: {
            series: {
                pointStart: 15
            }
        },
        series: [{
            name: 'I WOULD LIKE TO TAKE THIS TEXT AND PUT IT IN A VARIABLE',
            data: [0,0,0,0,0,0,0,0,0,3,1,8,12,21,22,13]<!--I WOULD LIKE TO TAKE THIS LIST AND PUT IT IN A VARIABLE-->
        }, {
            name: 'I WOULD LIKE TO TAKE THIS TEXT AND PUT IT IN A VARIABLE',
            data: [0,0,0,0,0,0,0,0,0,3,1,7,12,21,19,13]<!--I WOULD LIKE TO TAKE THIS LIST AND PUT IT IN A VARIABLE-->
        }]
    })
});</script>
OTHER <script type="text/javascript"> WRITINGS
I tried this:
for select1 in soup.select('script[type="text/javascript"]'):
    for select2 in select1.select("title"):
        print(select2)
But it doesn't print anything. Can someone help me print at least the first title from the output above?
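One likely reason the nested `select("title")` finds nothing: BeautifulSoup keeps everything inside a `<script>` tag as one text string, so `title` there is a JavaScript object key, not an HTML element that a CSS selector can match. A minimal sketch of one workaround, assuming the title always appears in the form `title: { text: '...' }`, is to run a regular expression over each script's text. The inline `html` sample stands in for `response.text`, and `TITLE_RE` and `titles` are names invented for this illustration.

```python
import re

import bs4

# Inline sample standing in for response.text (the real page is not available here).
html = """
<script type="text/javascript">
$(function () {
    $('#chart_t_2021').highcharts({
        title: {
            text: 'I WANT TO PRINT THIS TEXT'
        }
    })
});
</script>
"""

soup = bs4.BeautifulSoup(html, "html.parser")

# Assumed layout: title: { text: '...' } with single or double quotes.
TITLE_RE = re.compile(r"title\s*:\s*\{\s*text\s*:\s*(['\"])(.*?)\1", re.DOTALL)

titles = []
for script in soup.select('script[type="text/javascript"]'):
    js = script.string or ""  # raw JavaScript source as plain text
    titles.extend(m.group(2) for m in TITLE_RE.finditer(js))

# Print the first title found, if any.
print(titles[0] if titles else "no title found")
```

A regex like this is brittle: it will miss titles where other keys come before `text`, or where the string contains escaped quotes, so a JavaScript-aware parser may be the safer choice if the page layout varies.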
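[Comments]:
-
Does this earlier answer address your question? -- stackoverflow.com/a/35956388/13261176
-
No, because that one asks about the HTML title in general, whereas I'm asking for the title inside <script>, or more precisely the text located at: script > function () > $('#chart_t_2021').highcharts > title > text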
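Following the path described in the comment above, the same idea could be extended to pick out the chart id and the lists annotated in the question. This is only a sketch under stated assumptions: `js` stands in for the text of one `<script>` tag (for example `script.string` from BeautifulSoup), the arrays contain plain numbers that are valid JSON, and each series entry has `name` immediately followed by `data`. The helper `find_number_list`, the pattern names, and the series names `Series A` / `Series B` are placeholders, not taken from the page.

```python
import json
import re

# Stand-in for the text of one <script> tag; series names are placeholders.
js = """
$(function () {
    $('#chart_2021').highcharts({
        xAxis: {tickPositions: [15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30]
        },
        series: [{
            name: 'Series A',
            data: [0,0,0,0,0,0,0,0,0,3,1,8,12,21,22,13]
        }, {
            name: 'Series B',
            data: [0,0,0,0,0,0,0,0,0,3,1,7,12,21,19,13]
        }]
    })
});
"""


def find_number_list(key, text):
    """Return the first `key: [ ... ]` array in text as a Python list, or None."""
    m = re.search(re.escape(key) + r"\s*:\s*(\[[^\]]*\])", text)
    return json.loads(m.group(1)) if m else None


# Which chart this script configures, taken from $('#...').highcharts
CHART_RE = re.compile(r"\$\('#([^']+)'\)\.highcharts")
m = CHART_RE.search(js)
chart_id = m.group(1) if m else None

tick_positions = find_number_list("tickPositions", js)

# Assumed layout of each series entry: name: '...', data: [ ... ]
SERIES_RE = re.compile(
    r"name\s*:\s*(['\"])(.*?)\1\s*,\s*data\s*:\s*(\[[^\]]*\])", re.DOTALL
)
series = [(s.group(2), json.loads(s.group(3))) for s in SERIES_RE.finditer(js)]

print(chart_id)
print(tick_positions)
for name, data in series:
    print(name, data)
```

Using `json.loads` on the captured array keeps the values as real Python integers. It would fail on JavaScript-only syntax such as trailing commas or `.5`, in which case the array text would need cleaning first.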
Tags: python web-scraping beautifulsoup