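【Question Title】: Python - Creating a for loop to build a single csv file with multiple dataframes
【Posted】: 2020-09-27 00:02:40
【Question】:

I'm new to Python and have been trying various things to learn the basics. One thing I'm currently stuck on is for loops. I have the following code, and I'm fairly sure it could be written more efficiently with a loop, but I'm not sure exactly how.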

import pandas as pd
import numpy as np
url1 = 'https://www.cbssports.com/nfl/stats/player/receiving/nfl/regular/qualifiers/?page=1'
url2 = 'https://www.cbssports.com/nfl/stats/player/receiving/nfl/regular/qualifiers/?page=2'
url3 = 'https://www.cbssports.com/nfl/stats/player/receiving/nfl/regular/qualifiers/?page=3'

# index=False drops the index column that would otherwise appear first in the csv
df1 = pd.read_html(url1)
df1[0].to_csv('NFL_Receiving_Page1.csv', index=False)

df2 = pd.read_html(url2)
df2[0].to_csv('NFL_Receiving_Page2.csv', index=False)

df3 = pd.read_html(url3)
df3[0].to_csv('NFL_Receiving_Page3.csv', index=False)

df_receiving_agg = pd.concat([df1[0], df2[0], df3[0]])
df_receiving_agg.to_csv('NFL_Receiving_Combined.csv', index=False)

Ultimately, I'm trying to combine the data from the URLs above into a single table in one csv file.

【Comments】:

    Tags: python-3.x pandas dataframe for-loop


    【Solution 1】:

    You can try this:

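    urls = [url1, url2, url3]  # the URL variables defined in the question
    df_receiving_agg = pd.DataFrame()
    for url in urls:
        df = pd.read_html(url)[0]  # read_html returns a list of tables; take the first
        df_receiving_agg = pd.concat([df_receiving_agg, df])
    df_receiving_agg.to_csv('filepath.csv', index=False)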
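
One detail worth spelling out: `pd.read_html` returns a list of DataFrames (one per table found on the page), not a single DataFrame, which is why the question's code indexes with `[0]`. The sketch below avoids the network by using a hypothetical `fake_read_html` helper with made-up data in place of the real pages. It shows that handing the list itself to `pd.concat` raises a `TypeError`, while selecting the first table from each page works.

```python
import pandas as pd


def fake_read_html(page):
    """Hypothetical stand-in for pd.read_html: returns a LIST of DataFrames."""
    table = pd.DataFrame({"Player": [f"Player {page}"], "Yds": [100 * page]})
    return [table]


# Passing the list returned by the reader straight into concat fails,
# because concat only accepts Series and DataFrame objects.
try:
    pd.concat([fake_read_html(1)[0], fake_read_html(2)])
    concat_error = None
except TypeError as exc:
    concat_error = exc

# Taking the first table from each page gives concat what it expects.
frames = [fake_read_html(page)[0] for page in (1, 2, 3)]
combined = pd.concat(frames, ignore_index=True)
```

Collecting the frames in a list and calling `pd.concat` once also avoids re-copying the accumulated data on every loop iteration.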
    

    【Comments】:

      【Solution 2】:

      You can do it like this:

      import pandas as pd

      base_url = 'https://www.cbssports.com/nfl/stats/player/receiving/nfl/regular/qualifiers/?page='
      dfs = []
      for page in range(1, 4):
          url = f'{base_url}{page}'
          df = pd.read_html(url)[0]  # read_html returns a list of tables; take the first
          df.to_csv(f'NFL_Receiving_Page{page}.csv', index=False)  # one csv per page
          dfs.append(df)

      # Stack the per-page tables into one and write the combined csv
      df_receiving_agg = pd.concat(dfs)
      df_receiving_agg.to_csv('NFL_Receiving_Combined.csv', index=False)
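
If the same pattern is needed for other stat pages, the loop can be wrapped in a small function. This is only a sketch under stated assumptions: `combine_pages` and its `read_tables` parameter are hypothetical names introduced here, and the function assumes the table of interest is always the first one on each page. Making the reader a parameter lets the logic be tried out without hitting the site.

```python
import pandas as pd


def combine_pages(base_url, pages, read_tables=pd.read_html):
    """Read the first table from each page and stack them into one DataFrame.

    read_tables (hypothetical parameter) is any callable that takes a URL
    and returns a list of DataFrames; it defaults to pd.read_html.
    """
    frames = []
    for page in pages:
        tables = read_tables(f"{base_url}{page}")
        frames.append(tables[0])  # assumes the first table is the one wanted
    # ignore_index=True renumbers the rows 0..n-1 across all pages
    return pd.concat(frames, ignore_index=True)
```

With the `base_url` defined above, a call might look like `combine_pages(base_url, range(1, 4)).to_csv('NFL_Receiving_Combined.csv', index=False)`.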
      

      【Comments】:
