【Question title】: How to combine the dataframes in a for loop
【Posted】: 2017-12-06 09:33:33
【Question】:

I have done some web scraping with the following code:

Number = soup.find('th',text = "Number of samples").find_next_sibling("td").text


for x in range(1,int(Number)+1):            #loop of function to parse the data format I want
    item = item_text.split('tooltip')[x].split("class")[0].replace('"','').replace(',','').replace(':','').replace("<br>"," ").replace("/","").replace("\\","")
    #print(item) 

    TESTDATA=StringIO(item)

    df = pd.read_csv(TESTDATA, sep=" ",header=None) 
    print(df)

Each iteration prints its own one-row dataframe, so the result currently looks like this:

                0     1   2      3    4         5   6      7     8    9   \
0  TCGA-KK-A7B3-01A  Male NaN  Stage  not  reported NaN  Alive  FPKM  5.5  
       10    11   12    13      14
0  Living  days  899  (2.5  years)
               0     1    2      3    4         5   6      7     8     9   \
0  TCGA-G9-6347-01A  Male NaN  Stage  not  reported NaN  Alive  FPKM  14.2 
       10    11    12    13      14
0  Living  days  2089  (5.7  years) 
...

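My question is: how can I combine these separate dataframes into a single dataframe, so that it is easier to save everything to one CSV file?

Thanks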

【Comments】:

    Tags: python csv dataframe web-scraping


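    【Solution 1】:

    Use pd.concat: append each dataframe to a list inside the loop, then concatenate the list once after the loop finishes.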

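    all_dataframes = []  # collects one dataframe per sample
    
    for x in range(1, int(Number) + 1):
        # .... (build TESTDATA exactly as in the question)
    
        df = pd.read_csv(TESTDATA, sep=" ", header=None)
        all_dataframes.append(df)
    
    # concatenate once, after the loop
    concat_df = pd.concat(all_dataframes)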
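
For anyone who wants to try this without the scraping part, here is a minimal, self-contained sketch of the same pattern. The `items` list is a hypothetical stand-in for the cleaned `item` strings built in the question's loop (the two strings are reconstructed from the printed output above, not taken from the real page). Passing `ignore_index=True` is an optional extra that is not part of the answer above; it renumbers the rows 0, 1, 2, ... instead of leaving every row labelled 0.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-ins for the cleaned `item` strings from the question's loop.
# The doubled spaces produce the empty (NaN) columns seen in the printed output.
items = [
    "TCGA-KK-A7B3-01A Male  Stage not reported  Alive FPKM 5.5 Living days 899 (2.5 years)",
    "TCGA-G9-6347-01A Male  Stage not reported  Alive FPKM 14.2 Living days 2089 (5.7 years)",
]

all_dataframes = []
for item in items:
    # Same parsing call as in the question: one single-row dataframe per sample
    df = pd.read_csv(StringIO(item), sep=" ", header=None)
    all_dataframes.append(df)

# One concat after the loop; ignore_index=True gives a fresh 0..n-1 row index
concat_df = pd.concat(all_dataframes, ignore_index=True)

# With no path, to_csv returns the CSV text; pass a file path to write to disk
csv_text = concat_df.to_csv(index=False, header=False)
print(concat_df.shape)
```

Building the list first and calling pd.concat once is generally cheaper than concatenating inside the loop, because each concat call copies the data it combines. To save the result to a file, pass a path to to_csv, for example a file name of your choice ending in .csv.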
    

    【Comments】:
