【Question title】: Iterate over a dataframe to print the index and column and value
【Posted】: 2017-07-09 05:55:37
【Question】:

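First of all, I'm still new to Python, and I've searched without finding anywhere how to do this (from a newcomer's point of view)...

I have a Python pandas DataFrame.

I need to print out the index, column name and value.

Say I have the following dataframe: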

EAT     DAILY  WEEKLY  YEARLY
Fruit                        
APPLE       2       5     200
ORANGE      1       3     100
BANANA      1       4     150
PEAR        0       1      40

I need to print it out so that I get something like the following, iterating over every row in the dataframe.

Eat Apple Daily at least 2
Eat Apple Weekly at least 5
Eat Apple Yearly at least 200
Eat Orange Daily at least 1
Eat Orange Weekly at least 3
Eat Orange Yearly at least 100
..
...
....

I've tried various combinations but I'm still learning, so any help would be appreciated.

So far I have tried:

for index, data in test.iterrows():
    print(index, data['column1'])
    print(index, data['column2'])
    print(index, data['column3'])

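This gives me the index and the value but not the column, and I'd like it to iterate no matter how many columns or rows there are. I also still need to be able to insert text that has to be dynamic...

【Comments】:

    Tags: python loops pandas dataframe printing


    【Solution 1】: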

    Consider a non-loop solution using pandas.DataFrame.to_string:

    sdf = df.stack().reset_index(name='VALUE')
    sdf['Output'] = sdf.apply(
        lambda row: "EAT {} {} at least {}".format(row['Fruit'], row['EAT'], row['VALUE']),
        axis=1)
    
    # PRINT TO CONSOLE
    print(sdf[['Output']].to_string(header=False, index=False, justify='left'))
    
    # WRITE TO TEXT
    with open('Output.txt', 'w') as f:
        f.write(sdf[['Output']].to_string(header=False, index=False, justify='left'))
    
    # EAT APPLE DAILY at least 2
    #    EAT APPLE WEEKLY at least 5
    #  EAT APPLE YEARLY at least 200
    #    EAT ORANGE DAILY at least 1
    #   EAT ORANGE WEEKLY at least 3
    # EAT ORANGE YEARLY at least 100
    #    EAT BANANA DAILY at least 1
    #   EAT BANANA WEEKLY at least 4
    # EAT BANANA YEARLY at least 150
    #      EAT PEAR DAILY at least 0
    #     EAT PEAR WEEKLY at least 1
    #    EAT PEAR YEARLY at least 40
    

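    You will notice that the method currently has a reported bug with its justify argument, which is why the output above is not left-aligned. You can usually work around it with basic Python string handling (strip(), replace()).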
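
    One way to sidestep the justification issue is to join the formatted strings yourself instead of going through to_string. The following is only a sketch: it rebuilds the question's frame so the snippet runs on its own, and the names df, lines and output are chosen here for illustration.

```python
import pandas as pd

# Rebuild the question's frame (assumed layout: index named "Fruit",
# column axis named "EAT", as the printed frame suggests).
df = pd.DataFrame(
    {"DAILY": [2, 1, 1, 0], "WEEKLY": [5, 3, 4, 1], "YEARLY": [200, 100, 150, 40]},
    index=pd.Index(["APPLE", "ORANGE", "BANANA", "PEAR"], name="Fruit"),
)
df.columns.name = "EAT"

# Long format: one row per (Fruit, EAT) pair.
sdf = df.stack().reset_index(name="VALUE")

# Format each row into a plain string; no column padding is involved.
lines = [
    "EAT {} {} at least {}".format(fruit, period, value)
    for fruit, period, value in zip(sdf["Fruit"], sdf["EAT"], sdf["VALUE"])
]
output = "\n".join(lines)
print(output)
```

    The same output string can be passed to f.write() when writing to a text file.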

    【Discussion】:

      【Solution 2】:

      A Series of strings

      f = 'Eat {Fruit} {EAT} at least {value}'.format
      df.stack().reset_index(name='value').apply(lambda x: f(**x), 1)
      
      0         Eat APPLE DAILY at least 2
      1        Eat APPLE WEEKLY at least 5
      2      Eat APPLE YEARLY at least 200
      3        Eat ORANGE DAILY at least 1
      4       Eat ORANGE WEEKLY at least 3
      5     Eat ORANGE YEARLY at least 100
      6        Eat BANANA DAILY at least 1
      7       Eat BANANA WEEKLY at least 4
      8     Eat BANANA YEARLY at least 150
      9          Eat PEAR DAILY at least 0
      10        Eat PEAR WEEKLY at least 1
      11       Eat PEAR YEARLY at least 40
      dtype: object
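
      The f(**x) call works because a row Series behaves like a mapping, so its labels become keyword names for str.format. A small sketch of that step in isolation, with one row written out by hand for illustration:

```python
import pandas as pd

f = "Eat {Fruit} {EAT} at least {value}".format

# One row of the stacked, reset frame, typed in manually for this example.
row = pd.Series({"Fruit": "APPLE", "EAT": "DAILY", "value": 2})

# ** unpacks the Series labels as keyword names, so this call is
# equivalent to f(Fruit="APPLE", EAT="DAILY", value=2).
line = f(**row)
print(line)
```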
      

      Printing it out

      for idx, value in df.stack().items():  # Series.iteritems was removed in pandas 2.0
          print('Eat {0[0]} {0[1]} at least {1}'.format(idx, value))
      
      Eat APPLE DAILY at least 2
      Eat APPLE WEEKLY at least 5
      Eat APPLE YEARLY at least 200
      Eat ORANGE DAILY at least 1
      Eat ORANGE WEEKLY at least 3
      Eat ORANGE YEARLY at least 100
      Eat BANANA DAILY at least 1
      Eat BANANA WEEKLY at least 4
      Eat BANANA YEARLY at least 150
      Eat PEAR DAILY at least 0
      Eat PEAR WEEKLY at least 1
      Eat PEAR YEARLY at least 40
      

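      【Discussion】:

      • I had come across stack and iteritems, but wasn't sure of the syntax needed to get me where I needed to be. This works exactly as I need.
      【Solution 3】:

      You can use stack to get a Series with a MultiIndex, then iterate over it with Series.items (Series.iteritems in older pandas) and format: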

      test = test.stack()
      print (test)
      Fruit   EAT   
      APPLE   DAILY       2
              WEEKLY      5
              YEARLY    200
      ORANGE  DAILY       1
              WEEKLY      3
              YEARLY    100
      BANANA  DAILY       1
              WEEKLY      4
              YEARLY    150
      PEAR    DAILY       0
              WEEKLY      1
              YEARLY     40
      dtype: int64
      
      for index, data in test.items():  # Series.iteritems was removed in pandas 2.0
          print (('Eat {} {} at least {}').format(index[0], index[1], data))
      
      Eat APPLE DAILY at least 2
      Eat APPLE WEEKLY at least 5
      Eat APPLE YEARLY at least 200
      Eat ORANGE DAILY at least 1
      Eat ORANGE WEEKLY at least 3
      Eat ORANGE YEARLY at least 100
      Eat BANANA DAILY at least 1
      Eat BANANA WEEKLY at least 4
      Eat BANANA YEARLY at least 150
      Eat PEAR DAILY at least 0
      Eat PEAR WEEKLY at least 1
      Eat PEAR YEARLY at least 40
      

      But if you really need a DataFrame, add reset_index to the original frame and then loop with DataFrame.iterrows:

      test = test.stack().reset_index(name='VAL')
      print (test)
           Fruit     EAT  VAL
      0    APPLE   DAILY    2
      1    APPLE  WEEKLY    5
      2    APPLE  YEARLY  200
      3   ORANGE   DAILY    1
      4   ORANGE  WEEKLY    3
      5   ORANGE  YEARLY  100
      6   BANANA   DAILY    1
      7   BANANA  WEEKLY    4
      8   BANANA  YEARLY  150
      9     PEAR   DAILY    0
      10    PEAR  WEEKLY    1
      11    PEAR  YEARLY   40
      
      for index, data in test.iterrows():
          print (('Eat {} {} at least {}').format(data['Fruit'], data['EAT'], data['VAL']))
      
      Eat APPLE DAILY at least 2
      Eat APPLE WEEKLY at least 5
      Eat APPLE YEARLY at least 200
      Eat ORANGE DAILY at least 1
      Eat ORANGE WEEKLY at least 3
      Eat ORANGE YEARLY at least 100
      Eat BANANA DAILY at least 1
      Eat BANANA WEEKLY at least 4
      Eat BANANA YEARLY at least 150
      Eat PEAR DAILY at least 0
      Eat PEAR WEEKLY at least 1
      Eat PEAR YEARLY at least 40
      

      【Discussion】:
