Scrape a table from a website with Python (no table tag): answers

【Question title】: Scrape a table in a website with Python (no table tag)
【Posted】: 2019-12-31 15:37:11
【Question】:

I am trying to scrape the stock value of a product every day. The page is https://funds.ddns.net/f.php?isin=ES0110407097. This is the code I am trying:

import pandas as pd
from bs4 import BeautifulSoup

html_string = 'https://funds.ddns.net/f.php?isin=ES0110407097'
soup = BeautifulSoup(html_string, 'lxml')

new_table = pd.DataFrame(columns=range(0,2), index = [0])

row_marker = 0
column_marker = 0
for row in soup.find_all('tr'):
    columns = soup.find_all('td')
    for column in columns:
        new_table.iat[row_marker,column_marker] = column.get_text()
        column_marker += 1

print(new_table)

I would like to get, in Python, the same format I see on the web page, including the data and the numbers. How can I do that?

【Comments on the question】:

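Hi, I am not posting a solution, but I have used this approach before and it worked well for scraping information from the web. I used BeautifulSoup; the documentation is here: crummy.com/software/BeautifulSoup/bs4/doc, and an example tutorial is here: towardsdatascience.com/… Hope it helps!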
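
If the goal is to stay with BeautifulSoup, a minimal sketch could look like the following. It assumes the page's rows are `<tr>` elements holding `<td>` cells, as the code in the question expects. `rows_from_html` and `SAMPLE_HTML` are hypothetical names, and the sample markup is only a stand-in so the example runs offline. Two things differ from the attempt in the question: BeautifulSoup is given HTML text rather than a URL string, and the cells are looked up inside each row rather than across the whole document.

```python
import pandas as pd
from bs4 import BeautifulSoup


def rows_from_html(html):
    """Return one list of cell texts for every <tr> that contains <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.find_all("tr"):
        # Search inside the current row, not the whole document.
        cells = [td.get_text(strip=True) for td in tr.find_all("td")]
        if cells:
            rows.append(cells)
    return rows


# Hypothetical stand-in for the downloaded page (not the real markup).
SAMPLE_HTML = """
<table>
  <tr><td>FECHA</td><td>VL:EUR</td></tr>
  <tr><td>2019-12-20</td><td>120170000</td></tr>
  <tr><td>2019-12-19</td><td>119600000</td></tr>
</table>
"""

rows = rows_from_html(SAMPLE_HTML)
# Treat the first row as the header and the remaining rows as data.
table = pd.DataFrame(rows[1:], columns=rows[0])
print(table)
```

For the live page, the HTML text would first have to be downloaded, for example with `requests.get(url).text`, and then passed to `rows_from_html`.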

Tags: python-3.x web-scraping

【Solution 1】:

For this particular page there is a simpler way:

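from io import StringIO

import pandas as pd
import requests

url = 'https://funds.ddns.net/f.php?isin=ES0110407097'
resp = requests.get(url)
resp.raise_for_status()  # stop early on an HTTP error

# read_html returns a list of DataFrames; the first one is the table we want.
# StringIO avoids the warning newer pandas versions emit for literal HTML strings.
new_table = pd.read_html(StringIO(resp.text))[0]
print(new_table.head(5))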

Output:

            0          1
0       FECHA     VL:EUR
1  2019-12-20  120170000
2  2019-12-19  119600000
3  2019-12-18  119420000
4  2019-12-17  119390000
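
In the output above the header ends up as data row 0, and values such as 120170000 may have lost a decimal separator. One possible cause, which is only a guess about this page, is that the site writes numbers with a decimal comma, which `pd.read_html` strips by default because it treats `,` as the thousands separator. A sketch of how `header`, `decimal` and `thousands` could be used in that case, with `SAMPLE_HTML` and `funds` as hypothetical names and made-up markup standing in for the page:

```python
from io import StringIO

import pandas as pd

# Hypothetical markup: assumes the page writes values with a decimal comma.
SAMPLE_HTML = """
<table>
  <tr><td>FECHA</td><td>VL:EUR</td></tr>
  <tr><td>2019-12-20</td><td>120,170000</td></tr>
  <tr><td>2019-12-19</td><td>119,600000</td></tr>
</table>
"""

funds = pd.read_html(
    StringIO(SAMPLE_HTML),
    header=0,       # use the first row as column names
    decimal=",",    # treat the comma as the decimal separator
    thousands=".",  # and the dot as the thousands separator
)[0]

# Turn the date strings into real datetime values.
funds["FECHA"] = pd.to_datetime(funds["FECHA"])
print(funds.dtypes)
```

If the page turns out to use a different number format, the `decimal` and `thousands` arguments would need to be adjusted accordingly.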

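【Comments】:

You can read the table directly with pd.read_html(url), no need for requests!
@αԋɱҽԃαмєяιcαη - Correct, but if you do that, your new_table is a list, not a dataframe, so at that point you need to convert new_table[0] to a dataframe, so I don't think it saves that much time... But, in principle, you are right.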
stackoverflow.com/questions/38486477/…
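
On the list-versus-DataFrame point raised in the comments: as far as the pandas documentation describes it, `pd.read_html` returns a list of DataFrames whether it is given a URL or HTML text, so indexing with `[0]` is needed in both cases and the selected element is already a DataFrame. A small offline sketch, where `SAMPLE_HTML` is a hypothetical stand-in for the page:

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the page content.
SAMPLE_HTML = """
<table>
  <tr><td>FECHA</td><td>VL:EUR</td></tr>
  <tr><td>2019-12-20</td><td>120170000</td></tr>
</table>
"""

tables = pd.read_html(StringIO(SAMPLE_HTML))
print(type(tables))     # the return value is a plain Python list
print(type(tables[0]))  # each element is already a DataFrame

new_table = tables[0]   # pick the first table found in the document
print(new_table)
```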