【Question title】: Filtering data out and printing the column header in Python
【Posted】: 2021-06-11 11:59:50
【Question】:

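I have a CSV file with a bunch of different rows; for example, the row labels are width, length, height, and so on, with about 50 integer cells holding the values for each column. The column labels are shapes such as rectangle, square, etc.

For this example, say rectangle is missing its width but has a height and a length, while square is missing its length and height. I want to write a Python script that prints
square, length, height
rectangle, width
and likewise for the 40+ other shapes that are missing some data. In the CSV file the missing cells are blank; there is no NULL. I believe it would be something like the following: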

import pandas as pd
data = pd.read_csv('shapes.csv')
# Filter the data accordingly.
data = data[data['width'] > 0]
data = data[data['row'] == 'width']

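I believe this would only loop over width? I want it to check width and, if there is a width integer, great, move on to the next column and look for length, and so on. Thanks in advance!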

【Comments】:

    Tags: python pandas csv


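    【Solution 1】:
    • Synthesized more data to demonstrate the approach better
    • First select the rows that have missing values: df.loc[df.isna().any(axis=1)]
    • Then iterate over the columns and pick out the ones whose value is missing
    • Finally print the resulting series, missing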
    import pandas as pd
    import numpy as np
    
    df = pd.DataFrame({"shape": ["Triangle", "Acute triangle", "Equilateral triangle", "Heptagonal triangle", "Isosceles triangle", "Golden Triangle", "Obtuse triangle", "Rational triangle", "Right triangle", "Isosceles right triangle", "Kepler triangle", "Scalene triangle", "Quadrilateral", "Cyclic quadrilateral", "Kite", "Parallelogram", "Rhombus", "Lozenge", "Rhomboid", "Rectangle", "Square", "Tangential quadrilateral", "Trapezoid", "Isosceles trapezoid", "Pentagon", "Hexagon", "Lemoine hexagon", "Heptagon", "Octagon", "Nonagon", "Decagon", "Hendecagon", "Dodecagon", "Tridecagon", "Tetradecagon", "Pentadecagon", "Hexadecagon", "Heptadecagon", "Octadecagon", "Enneadecagon"]})
    df = df.assign(**{c:np.random.choice([np.nan]+list(range(3,10)), len(df)) for c in ["width","height","length"]})
    
    
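    # For each row with at least one NaN, join the shape name with the
    # names of the columns whose value is missing.
    missing = df.loc[df.isna().any(axis=1)].apply(
        lambda r: ",".join(
            [r["shape"]] + [c for c in r.drop("shape").index.values
                            if pd.isna(r[c])]
        ),
        axis=1,
    )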
    
    print("\n".join(missing.tolist()))
    

    Output (the values are random, so every run differs)

    Equilateral triangle,height
    Isosceles right triangle,width,height
    Quadrilateral,width,length
    Tangential quadrilateral,height,length
    Isosceles trapezoid,width,height
    Heptagon,width,height
    Tridecagon,width,length
    Tetradecagon,width,height
    Heptadecagon,height,length
    Octadecagon,width,length
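
The same result can also be reached without a row-wise apply. The following is only a minimal sketch under stated assumptions: the helper name missing_columns and the three-row demo table are made up for illustration and are not part of the answer above. It reshapes the table to long form with melt, keeps the pairs whose value is NaN, and joins the column names per shape.

```python
import numpy as np
import pandas as pd


def missing_columns(df, id_col="shape"):
    """Return a Series mapping each id to a comma-joined list of its NaN columns."""
    # Reshape to one row per (id, column) pair.
    long = df.melt(id_vars=id_col, var_name="column", value_name="value")
    # Keep only the pairs whose value is missing, then collect the names per id.
    gaps = long[long["value"].isna()]
    return gaps.groupby(id_col, sort=False)["column"].agg(",".join)


# Small hand-made table (illustrative data, not taken from the answer).
demo = pd.DataFrame(
    {
        "shape": ["square", "rectangle", "circle"],
        "width": [4, np.nan, 1],
        "height": [np.nan, 4, 2],
        "length": [np.nan, 7, 3],
    }
)

result = missing_columns(demo)
for shape, cols in result.items():
    print(f"{shape},{cols}")
```

Shapes with no gaps (circle here) simply do not appear in the result. Note that the groups come out in the order they first appear in the melted table, which is not necessarily the original row order.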
    
    

    【Comments】:

      【Solution 2】:

      Consider a sample DataFrame:

      >>> import pandas as pd
      >>> x = pd.DataFrame({'a':[None, 1,2], 'b':[None, None, 3], 'c': [1,2,3]})
      >>> x
           a    b  c
      0  NaN  NaN  1
      1  1.0  NaN  2
      2  2.0  3.0  3
      

      Solution:

      >>> x.apply(lambda r: list(r[r.isnull()].index), axis=1)
      0    [a, b]
      1       [b]
      2        []
      dtype: object
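
Applied to the layout described in the question (measurements as rows, shapes as columns), the same idea might look like the sketch below. The inline CSV text and the label column name row are assumptions made for illustration; the real file may be laid out differently. By default read_csv turns blank cells into NaN, so isna() finds them.

```python
import io

import pandas as pd

# Hypothetical stand-in for shapes.csv: one row per measurement,
# one column per shape, blank cells where data is missing.
csv_text = """row,rectangle,square
width,,4
length,7,
height,4,
"""

# Blank cells are parsed as NaN by default.
table = pd.read_csv(io.StringIO(csv_text), index_col="row")
mask = table.isna()

# For every shape column with at least one gap, list the row labels that are NaN.
missing = {
    shape: list(col[col].index)
    for shape, col in mask.items()
    if col.any()
}

for shape, labels in missing.items():
    print(", ".join([shape] + labels))
```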
      

      【Comments】:

        【Solution 3】:

        Something like this should do it:

        
        import pandas as pd
        from itertools import compress
        
        data = {
            'Shape': ['square', 'rectangle'],
            'width': [4, ''],
            'height': ['', 4],
            'length': ['', 7],
        }
        
        df = pd.DataFrame(data)
        
        params = ['width', 'height', 'length']
        
        def f(row):
            # Blank cells are empty strings here, so a str value marks a missing entry.
            missing = [isinstance(row[p], str) for p in params]
            if any(missing):
                return row['Shape'], list(compress(params, missing))
        
        for idx, row in df.iterrows():
            print(f(row))
        
        # Result
        ('square', ['height', 'length'])
        ('rectangle', ['width'])
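
If the blanks arrive as empty strings, as in this answer's sample data, one option is to normalize them to NaN first so that the isna()-based approaches apply. This is a rough sketch, assuming every measurement column is meant to be numeric; errors="coerce" also turns any other non-numeric text into NaN, which may or may not be what you want.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "Shape": ["square", "rectangle"],
        "width": [4, ""],
        "height": ["", 4],
        "length": ["", 7],
    }
)
params = ["width", "height", "length"]

# Convert each measurement column to numbers; empty strings become NaN.
numeric = df[params].apply(pd.to_numeric, errors="coerce")
mask = numeric.isna()

# Map each shape with at least one gap to the names of its missing columns.
missing = {
    shape: [p for p in params if mask.at[i, p]]
    for i, shape in df["Shape"].items()
    if mask.loc[i].any()
}

for shape, cols in missing.items():
    print(shape, cols)
```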
        

        【Comments】:
