【Posted】: 2021-11-13 05:58:53
【Question】:
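I am trying to scrape the table "List of chemical elements" from this page: https://en.wikipedia.org/wiki/List_of_chemical_elements
I then want to store the table data in a pandas DataFrame so that I can convert it to a CSV file. So far I have scraped the table's headers and stored them in a DataFrame. I have also managed to retrieve each row of data from the table. However, I can't get the row data into the DataFrame. Here is what I have so far: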
from bs4 import BeautifulSoup
import requests as r
import pandas as pd

response = r.get('https://en.wikipedia.org/wiki/List_of_chemical_elements')
wiki_text = response.text
soup = BeautifulSoup(wiki_text, 'html.parser')

# Take the first table with class "wikitable" and its body
table = soup.select_one('table.wikitable')
table_body = table.find('tbody')
#print(table_body)
rows = table_body.find_all('tr')

# The second <tr> holds the column headers
cols = [c.text.replace('\n', '') for c in rows[1].find_all('th')]
df2a = pd.DataFrame(columns=cols)
df2a

# Each row is printed but never added to df2a
for row in rows:
    records = row.find_all('td')
    if records != []:
        # "cell" avoids reusing the name r, which is the requests alias
        records = [cell.text.strip() for cell in records]
        print(records)
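One way to get the rows into the DataFrame is to collect every row's cell text in a list and build the DataFrame once at the end, rather than starting from an empty one. The sketch below is only an illustration under stated assumptions. `SAMPLE_HTML` is a made-up miniature table standing in for the Wikipedia page. `table_to_dataframe` and its `header_row` parameter are hypothetical names. It also assumes every data row has exactly as many `td` cells as the header row has `th` cells. If the real table does not satisfy that (for example because of merged cells or row-header `th` cells), pandas will raise a `ValueError` and the rows would need to be padded or the header list adjusted first.

```python
from bs4 import BeautifulSoup
import pandas as pd

# Hypothetical miniature table standing in for the real Wikipedia page
SAMPLE_HTML = """
<table class="wikitable">
  <tbody>
    <tr><th colspan="3">List of chemical elements</th></tr>
    <tr><th>Z</th><th>Symbol</th><th>Element</th></tr>
    <tr><td>1</td><td>H</td><td>Hydrogen</td></tr>
    <tr><td>2</td><td>He</td><td>Helium</td></tr>
  </tbody>
</table>
"""


def table_to_dataframe(html, header_row=1):
    """Parse the first table.wikitable in `html` into a DataFrame.

    Assumes the column names are the <th> cells of rows[header_row] and
    that every data row has the same number of <td> cells as that header.
    """
    soup = BeautifulSoup(html, 'html.parser')
    table = soup.select_one('table.wikitable')
    rows = table.find_all('tr')

    # Column names come from the chosen header row
    cols = [th.get_text(strip=True) for th in rows[header_row].find_all('th')]

    # Collect one list of cell strings per data row
    records = []
    for row in rows:
        cells = row.find_all('td')
        if cells:  # header-only rows have no <td> and are skipped
            records.append([cell.get_text(strip=True) for cell in cells])

    # Build the DataFrame in one step from the collected rows
    return pd.DataFrame(records, columns=cols)


df = table_to_dataframe(SAMPLE_HTML)

# With no path argument, to_csv returns the CSV text as a string
csv_text = df.to_csv(index=False)
print(df)
```

For the real page, `response.text` from the request above could be passed in place of `SAMPLE_HTML`, and `df.to_csv('elements.csv', index=False)` (the filename is just an example) would write the file to disk.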
【Discussion】:
Tags: python pandas beautifulsoup