【Posted】: 2021-05-15 04:12:45
【Question】:
I am scraping stock information from onvista.de like this:
import pandas as pd
import requests

hdr = {'User-Agent': 'Chrome/70.0.3538.110'}
table_dfs = {}
for page_number in range(3):
    # Advance the offset by 50 for each results page
    http = "https://www.onvista.de/aktien/finder/?continent[0]=Europa&continent[1]=Nordamerika&continent[2]=Asien%20-%20Pazifik&PROFIT_PER_SHARE[enabled]=1&PROFIT_KGV[enabled]=1&MARKET_CAPITALIZATION[enabled]=1&PERFORMANCE_6_MONTHS[enabled]=1&PERFORMANCE_4_WEEKS[enabled]=1&SCREENER_INTEREST[enabled]=1&SCREENER_RISK_ZONE[enabled]=1&PROFIT_PER_SHARE[year]=2020&PROFIT_KGV[year]=2020&MARKET_CAPITALIZATION[year]=2020&offset={}".format(page_number * 50)
    url = requests.get(http, headers=hdr)
    # pd.read_html returns a list of DataFrames, one per table found in the HTML
    table_dfs[page_number] = pd.read_html(url.text)
I want to concatenate the results into a single DataFrame that keeps the columns. I tried this:
df = pd.concat(table_dfs)
But this raises an error:
TypeError: cannot concatenate object of type "<class 'list'>";
only pd.Series, pd.DataFrame, and pd.Panel (deprecated) objs are valid
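The message points at the cause: pd.read_html returns a list of DataFrames, so every value in table_dfs is a list rather than a DataFrame. The following is a minimal sketch that reproduces this without any network access. The small hand-built frames are stand-ins (an assumption) for the scraped tables, and only the WKN values are taken from the output in the question.

```python
import pandas as pd

# Stand-in for the scraped result: each page maps to a *list* of DataFrames,
# which is the shape pd.read_html produces. The frames are made up.
table_dfs = {
    0: [pd.DataFrame({"WKN": ["A2PSR2", "A1JA81"]})],
    1: [pd.DataFrame({"WKN": ["A0B733"]})],
}

# Every dict value is a list, not a DataFrame.
value_types = {page: type(tables).__name__ for page, tables in table_dfs.items()}
print(value_types)

# pd.concat accepts a mapping, but its values must be Series or DataFrames.
try:
    pd.concat(table_dfs)
    concat_error = None
except TypeError as exc:
    concat_error = str(exc)
print(concat_error)

# The DataFrame itself sits one level deeper, inside the list.
first_table = table_dfs[0][0]
print(type(first_table).__name__)
```

The exact wording of the error text varies between pandas versions, so the sketch only prints it.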
The output of table_dfs[0] looks like this:
[ WKN Wert Branche \
0 A2PSR2 BIONTECH SE SP.ADRS Biotechnologie
1 A1JA81 PLUG POWER INC. Elektrotechnologie
2 A0B733 Nel Sonstige Energie / R...
Land Gewinn pro Aktie (€) KGV \
0 Deutschland Deutschland NaN
1 USA USA -26.0
2 Norwegen Norwegen 0.0
Marktkapitalisierung (Mio. €) Performance - 6M (%) Performance - 4W (%) \
0 NaN 000 6139.00000
1 NaN 12.76665 43430.00000
2 NaN 3.97097 8434.00000
Chance-Rating (the Screener) Risiko-Rating (the Screener) Unnamed: 11
0 1888 NaN NaN
1 1962 4.0 0.0
2 -705 1.0 0.0 ]
My goal is to put this data into a CSV file (all rows combined).
Thanks for your help.
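One possible way to reach that goal, sketched under assumptions: flatten the per-page lists into one list of DataFrames, concatenate once, and write the result with to_csv. The helper name combine_pages, the stand-in data, and the output path are all hypothetical. The sketch also assumes every table on every page has the same columns.

```python
import tempfile
from pathlib import Path

import pandas as pd


def combine_pages(table_dfs):
    """Flatten {page_number: [DataFrame, ...]} into a single DataFrame."""
    frames = [table for tables in table_dfs.values() for table in tables]
    # ignore_index=True renumbers the rows 0..n-1 across all pages
    return pd.concat(frames, ignore_index=True)


# Stand-in for the scraped dict; the values here are illustrative only.
table_dfs = {
    0: [pd.DataFrame({"WKN": ["A2PSR2", "A1JA81"], "KGV": [float("nan"), -26.0]})],
    1: [pd.DataFrame({"WKN": ["A0B733"], "KGV": [0.0]})],
}

combined = combine_pages(table_dfs)

# Hypothetical output location; any writable path would do.
csv_path = Path(tempfile.gettempdir()) / "onvista_stocks.csv"
combined.to_csv(csv_path, index=False)  # index=False omits the row numbers
print(combined)
```

If a page contains more than one table, the nested comprehension would stack all of them. Selecting tables[0] instead may be the better choice when only the first table is wanted.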
【Comments】:
-
Maybe you should use * to unpack it, concat(*table_dfs), or use a for-loop to add each item separately: for table in table_dfs: df = df.concat(table). You should check the documentation for concat().
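Taken literally, neither suggestion in the comment would run. pd.concat expects a single iterable (or mapping) of DataFrames as its first argument rather than unpacked arguments, and DataFrame has no concat method. The loop idea can still be sketched by collecting the frames in a list and concatenating once at the end. The stand-in data and the extra page column are assumptions added for illustration.

```python
import pandas as pd

# Stand-in for the scraped dict of {page_number: [DataFrame, ...]}.
table_dfs = {
    0: [pd.DataFrame({"WKN": ["A2PSR2", "A1JA81"]})],
    1: [pd.DataFrame({"WKN": ["A0B733"]})],
}

frames = []
for page_number, tables in table_dfs.items():
    for table in tables:
        # "page" is a hypothetical extra column recording the source page
        frames.append(table.assign(page=page_number))

# One concat call on the collected list, instead of growing a frame per step
df = pd.concat(frames, ignore_index=True)
print(df)
```

Concatenating once at the end avoids copying the growing DataFrame on every iteration.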
Tags: python dataframe web-scraping concatenation