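【Question Title】: Map multiple columns from a pandas DataFrame and perform operations on them
【Posted】: 2019-11-03 17:04:03
【Question】:

I have two DataFrames, df and df1, and I want to convert the currency values in df. df has 6 columns: the first is the date, and the rest hold currency values for each date. I want to convert these currencies into the correct format. df1 has 2 columns: the first is the currency and the second is the operator.

I want to apply the corresponding operator to the currency values in df. For example, the second column of df is "AUD"; I want to convert all "AUD" values into the correct format, meaning multiply or divide according to the matching "operator" entry in df1. "AUD" has the "multiply" operator, so all of its values are multiplied by 1. For "CAD" the operator is "divide", so each value in the "CAD" column should be replaced by 1 / value.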

import pandas as pd    
data = {'Date':['01-01-2019', '02-01-2019', '03-01-2019', '04-01-2019','05-01-2019'],
        'AUD':[98, 98.5, 99, 99.5, 97],
        'BWP':[30,31,33,32,31],
        'CAD':[50,52,51,51,52],
        'BND':[1.01,1.05,1.03,1.02,1.03],
        'COP':[20,21,23,21,22]}    
df = pd.DataFrame(data)

data1 = {'currency':['DZD', 'AUD', 'CNY', 'BND','BRL','BWP','CAD','COP'],
        'operator':['divide', 'multiply', 'divide', 'divide','divide','multiply','divide','divide'],
        }    
df1 = pd.DataFrame(data1)
df

         Date   AUD  BWP  CAD   BND  COP
0  01-01-2019  98.0   30   50  1.01   20
1  02-01-2019  98.5   31   52  1.05   21
2  03-01-2019  99.0   33   51  1.03   23
3  04-01-2019  99.5   32   51  1.02   21
4  05-01-2019  97.0   31   52  1.03   22

df1

  currency  operator
0      DZD    divide
1      AUD  multiply
2      CNY    divide
3      BND    divide
4      BRL    divide
5      BWP  multiply
6      CAD    divide
7      COP    divide

Expected output:

         Date   AUD  BWP     CAD    BND     COP
0  01-01-2019  98.0   30  0.0200  0.990   0.050
1  02-01-2019  98.5   31  0.0192  0.952   0.047
2  03-01-2019  99.0   33  0.0196  0.970   0.043
3  04-01-2019  99.5   32  0.0196  0.980   0.047
4  05-01-2019  97.0   31  0.0192  0.970   0.045

【Comments】:

    Tags: python pandas


    【Solution 1】:

    You can use:

    n = 1
    # Move Date into the index so the arithmetic below only touches currency columns
    df = df.set_index('Date')
    # Currencies whose operator is 'divide'
    div_code = df1.loc[df1['operator'] == 'divide', 'currency']
    
    # Boolean mask over df's columns: True where the column must be divided
    col_mask = df.columns.isin(div_code)
    
    # Apply the operations to the selected columns
    df[df.columns[col_mask]] = n / df[df.columns[col_mask]]
    df[df.columns[~col_mask]] = n * df[df.columns[~col_mask]]
    
    # Restore Date as a regular column
    df.reset_index(inplace=True)
    print(df)
    

             Date   AUD  BWP       CAD       BND       COP
    0  01-01-2019  98.0   30  0.020000  0.990099  0.050000
    1  02-01-2019  98.5   31  0.019231  0.952381  0.047619
    2  03-01-2019  99.0   33  0.019608  0.970874  0.043478
    3  04-01-2019  99.5   32  0.019608  0.980392  0.047619
    4  05-01-2019  97.0   31  0.019231  0.970874  0.045455
    

    【Comments】:

      【Solution 2】:

      If the data in df1 is stored as a dictionary, this becomes easier:

      operators = df1.set_index('currency')['operator'].to_dict()
      df.apply(lambda col: col if operators.get(col.name, 'multiply') == 'multiply' else 1 / col)
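
Note that `apply` returns a new frame rather than changing `df` in place, so the result has to be kept. A minimal runnable sketch on a reduced copy of the question's data (the two-row frames and the name `result` are only for illustration):

```python
import pandas as pd

# Reduced copy of the question's data, so the snippet runs on its own.
df = pd.DataFrame({
    'Date': ['01-01-2019', '02-01-2019'],
    'AUD': [98, 98.5],
    'CAD': [50, 52],
})
df1 = pd.DataFrame({
    'currency': ['AUD', 'CAD'],
    'operator': ['multiply', 'divide'],
})

operators = df1.set_index('currency')['operator'].to_dict()

# 'Date' is not a key of the dict, so it falls back to 'multiply'
# and is passed through unchanged.
result = df.apply(
    lambda col: col if operators.get(col.name, 'multiply') == 'multiply' else 1 / col
)
print(result)
```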
      

      【Comments】:

        【Solution 3】:

        Here is code that produces the expected output:

        import pandas as pd    
        pd.set_option('display.max_colwidth', 100)
        data = {'Date':['01-01-2019', '01-01-2019', '01-01-2019', '01-01-2019','01-01-2019'],
            'AUD':[98, 98.5, 99, 99.5, 97],
            'BWP':[30,31,33,32,31],
            'CAD':[50.00,52.00,51.00,51.00,52.00],
            'BND':[1.01,1.05,1.03,1.02,1.03],
            'COP':[20.00,21.00,23.00,21.00,22.00]}    
        df = pd.DataFrame(data)
        
        data1 = {'currency':['DZD', 'AUD', 'CNY', 'BND','BRL','BWP','CAD','COP'],
            'operator':['divide', 'multiply', 'divide', 'divide','divide','multiply','divide','divide'],
            }    
        df1 = pd.DataFrame(data1)
        
        for dfcurrency in df.columns:
            for df1currency in df1['currency']:
                if dfcurrency == df1currency:
                    operator = df1[df1['currency'] == df1currency]['operator']
        
                    for j in operator:
                        if j == 'multiply':
                            for k in range(df.shape[0]):
                                # .loc avoids chained assignment (df[col][k] = ...)
                                df.loc[k, df1currency] = df.loc[k, df1currency] * 1
                        elif j == 'divide':
                            for l in range(df.shape[0]):
                                df.loc[l, df1currency] = round(1 / df.loc[l, df1currency], 4)
        print(df)
        
        
                 Date   AUD  BWP     CAD     BND     COP
        0  01-01-2019  98.0   30  0.0200  0.9901  0.0500
        1  01-01-2019  98.5   31  0.0192  0.9524  0.0476
        2  01-01-2019  99.0   33  0.0196  0.9709  0.0435
        3  01-01-2019  99.5   32  0.0196  0.9804  0.0476
        4  01-01-2019  97.0   31  0.0192  0.9709  0.0455
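
The row-by-row loops can also be written one column at a time. The following is a sketch of that alternative, not the answer's own code, run on a reduced copy of the data; it keeps the rounding to 4 decimal places via `Series.round`.

```python
import pandas as pd

# Reduced copy of the data, so the snippet runs on its own.
df = pd.DataFrame({
    'Date': ['01-01-2019', '01-01-2019'],
    'BWP': [30, 31],
    'COP': [20.0, 21.0],
})
df1 = pd.DataFrame({
    'currency': ['DZD', 'BWP', 'COP'],
    'operator': ['divide', 'multiply', 'divide'],
})

for currency, operation in zip(df1['currency'], df1['operator']):
    if currency not in df.columns:
        continue  # listed in df1 but absent from df (DZD here)
    if operation == 'divide':
        # Whole-column arithmetic replaces the inner loop over rows.
        df[currency] = (1 / df[currency]).round(4)
    # 'multiply' means multiplying by 1, so the column is left as it is.

print(df)
```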
        

        【Comments】:

          【Solution 4】:

          You can use the operator module to build a dictionary that replaces the text "multiply" and "divide" with operator functions:

          import operator as op
          
          operators = { "multiply": op.mul, "divide": op.itruediv }
          

          Take just the column of operator names that we want to map:

          new_op = df1.iloc[1:,1]
          new_set = new_op.map(operators)
          new_set = pd.Series(new_set)
          new_set.index -= 1  # iloc[1:] keeps labels 1..7; shift them to start at 0
          

          This gives a new set of operators from your list:

          new_set
          

          0         <built-in function mul>
          1    <built-in function itruediv>
          2    <built-in function itruediv>
          3    <built-in function itruediv>
          4         <built-in function mul>
          5    <built-in function itruediv>
          6    <built-in function itruediv>
          Name: operator, dtype: object
          

          So, to apply the converted text as operators to your data, here is an example for the "AUD" column:

          for i in range(0, len(df)):
              df.loc[i,'AUD'] = new_set[i](1,df.loc[i,'AUD'])
          

          which produces:

                   Date        AUD  BWP  CAD   BND  COP
          0  01-01-2019  98.000000   30   50  1.01   20
          1  01-01-2019   0.010152   31   52  1.05   21
          2  01-01-2019   0.010101   33   51  1.03   23
          3  01-01-2019   0.010050   32   51  1.02   21
          4  01-01-2019  97.000000   31   52  1.03   22
          

          You should be able to generalize this to all columns, or add a new line for each currency code, for example:

          for i in range(0, len(df)):
              df.loc[i,'AUD'] = new_set[i](1,df.loc[i,'AUD'])
              df.loc[i,'BWP'] = new_set[i](1,df.loc[i,'BWP'])
              ....
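
One possible way to generalize is to look the operator function up by currency code, so that each column gets the operator listed for it in df1. The sketch below is an assumption about how that could look, not part of the answer above: it uses `op.truediv` instead of `op.itruediv`, the name `func_by_currency` is made up for illustration, and the data is a reduced copy of the question's.

```python
import operator as op

import pandas as pd

ops = {'multiply': op.mul, 'divide': op.truediv}

# Reduced copy of the question's data, so the snippet runs on its own.
df = pd.DataFrame({
    'Date': ['01-01-2019', '02-01-2019'],
    'AUD': [98, 98.5],
    'CAD': [50, 52],
})
df1 = pd.DataFrame({
    'currency': ['DZD', 'AUD', 'CAD'],
    'operator': ['divide', 'multiply', 'divide'],
})

# Series indexed by currency code whose values are the operator functions.
func_by_currency = df1.set_index('currency')['operator'].map(ops)

# Only currencies present in both df and df1 are processed; 'Date' is skipped.
for currency in df.columns.intersection(func_by_currency.index):
    # op.mul(1, col) leaves the column unchanged; op.truediv(1, col) gives 1 / col.
    df[currency] = func_by_currency[currency](1, df[currency])

print(df)
```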
          

          【Comments】:
