【Question title】: Use values in dictionary to replace values in column
【Posted】: 2020-01-06 19:29:20
【Question】:
import pandas as pd

df = pd.DataFrame({
    'Data': ['Hey this is 123456 Jonny B Good',
             'This is Jonny B Good at 511-233-1137',
             'Wow that is Alice N Wonderland A999b',
             'Yes hi: Mick E Mouse 1A25629Q88 or ',
             'Bye Mick E Mouse A13B ok was seen on '],
    'E_ID': ['E11', 'E11', 'E22', 'E33', 'E33'],
    'N_ID': ['111', '112', '211', '311', '312'],
    'Name': ['JONNY B GOOD', 'JONNY B GOOD',
             'ALICE N WONDERLAND',
             'MICK E MOUSE', 'MICK E MOUSE'],
})

df
                      Data                 E_ID N_ID    Name
0   Hey this is 123456 Jonny B Good         E11 111 JONNY B GOOD
1   This is Jonny B Good at 511-233-1137    E11 112 JONNY B GOOD
2   Wow that is Alice N Wonderland A999b    E22 211 ALICE N WONDERLAND
3   Yes hi: Mick E Mouse 1A25629Q88 or      E33 311 MICK E MOUSE
4   Bye Mick E Mouse A13B ok was seen on    E33 312 MICK E MOUSE

As shown above, I have a sample df. I also have a sample dictionary d, shown below:

d = {'E11': ['Jonny', 'B', 'Good',
             'Jonny', 'B', 'Good',
             '123456', '511-233-1137'],
     'E22': ['Alice', 'N', 'Wonderland', 'A999b'],
     'E33': ['Mick', 'E', 'Mouse',
             'Mick', 'E', 'Mouse',
             '1A25629Q88', 'A13B']}

I want to use the values from d, e.g. Jonny, to change the corresponding values in Data. For example, Jonny in row 0 would become @@@.

To do this, I looked at "Remap values in pandas column with a dict" and "how to replace column values with dictionary keys in pandas", but they did not help much. I think I need to use something like this:

 df['New'] = df['Data'].str.replace(d[value], '@@@')

I would like my output to look like this:

     Data   E_ID N_ID Name  New
0                           Hey this is @@@ @@@ @@@ @@@             
1                           This is @@@  @@@  @@@  at @@@   
2                           Wow that is @@@  @@@  @@@  @@@  
3                           Yes hi: @@@  @@@  @@@  @@@  or      
4                           Bye @@@  @@@  @@@  @@@  ok was seen on

What do I need to do to get this output?

【Comments】:

    Tags: python-3.x string pandas dictionary replace


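    【Solution 1】:

    Convert d into a dictionary of dictionaries, d1. pivot df with E_ID as the columns, replace using d1, bfill along the rows, and select the first column. Finally, assign the result back to df.Data.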

    d1 = {k: {x: '@@@' for x in v} for k, v in d.items()}
    df['Data'] = (df.pivot(columns='E_ID', values='Data')
                    .replace(d1, regex=True).bfill(axis=1).iloc[:, 0])
    
    Out[619]:
                                      Data E_ID N_ID                Name
    0  Hey this is @@@ @@@ @@@ @@@          E11  111  JONNY B GOOD
    1  This is @@@ @@@ @@@ at @@@           E11  112  JONNY B GOOD
    2  Wow that is @@@ @@@ @@@ @@@          E22  211  ALICE N WONDERLAND
    3  Yes hi: @@@ @@@ @@@ @@@ or           E33  311  MICK E MOUSE
    4  Bye @@@ @@@ @@@ @@@ ok was seen on   E33  312  MICK E MOUSE
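The one-liner is easier to follow when the intermediate frames are named. The sketch below uses a shortened copy of the question's data (two rows, two IDs); the names `small`, `d_small`, `wide`, and `masked` are made up here for illustration, and it assumes a pandas version where nested-dict `replace` with `regex=True` behaves as in the answer above.

```python
import pandas as pd

# Shortened, illustrative copy of the question's data.
small = pd.DataFrame({
    'Data': ['Hey this is 123456 Jonny B Good',
             'Wow that is Alice N Wonderland A999b'],
    'E_ID': ['E11', 'E22'],
})
d_small = {'E11': ['Jonny', 'B', 'Good', '123456'],
           'E22': ['Alice', 'N', 'Wonderland', 'A999b']}
d1_small = {k: {x: '@@@' for x in v} for k, v in d_small.items()}

# Step 1: one column per E_ID; each row holds text only in its own column.
wide = small.pivot(columns='E_ID', values='Data')

# Step 2: the outer keys of d1_small pick the column, the inner dict
# maps regex pattern -> replacement inside that column.
masked = wide.replace(d1_small, regex=True)

# Step 3: back-fill along each row so the single non-NaN value lands
# in the first column, then keep only that column.
result = masked.bfill(axis=1).iloc[:, 0]
print(result.tolist())
```

Because each row has exactly one non-NaN cell after the pivot, the back-fill simply moves that cell to the front, so the result lines up with the original index.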
    

    Extra: as you asked, here is the for-loop equivalent of the dict comprehension above:

    d1 = {}
    for k, v in d.items():
        y = {}
        for x in v:
            y[x] = '@@@'
        d1[k] = y
    
    
    In [805]: d1
    Out[805]:
    {'E11': {'Jonny': '@@@',
      'B': '@@@',
      'Good': '@@@',
      '123456': '@@@',
      '511-233-1137': '@@@'},
     'E22': {'Alice': '@@@', 'N': '@@@', 'Wonderland': '@@@', 'A999b': '@@@'},
     'E33': {'Mick': '@@@',
      'E': '@@@',
      'Mouse': '@@@',
      '1A25629Q88': '@@@',
      'A13B': '@@@'}}
    

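    【Comments】:

    • @Andy_L What would {k: {x: '@@@' for x in v} for k, v in d.items()} look like without a comprehension?
    • I have something similar, but I am missing something: d1 = {} for k,v in d.items(): for x in v: d1[x] = '@@@'
    • @510: the for-loop version needs proper indentation, so I edited the answer to add it at the end. Please check my edit.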
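The flat loop from the comment drops the E_ID level, since every token ends up in one shared dict. A small pure-Python sketch of the difference, using a shortened `d` for brevity (the names `flat`, `nested`, and `inner` are illustrative):

```python
d = {'E11': ['Jonny', 'B', 'Good', 'Jonny'],
     'E22': ['Alice', 'N', 'Wonderland']}

# Flat version from the comment: one dict shared by all IDs.
flat = {}
for k, v in d.items():
    for x in v:
        flat[x] = '@@@'

# Nested version: a fresh inner dict per key, stored under that key.
nested = {}
for k, v in d.items():
    inner = {}
    for x in v:
        inner[x] = '@@@'
    nested[k] = inner

print(sorted(flat))    # tokens only, no E_ID grouping
print(sorted(nested))  # ['E11', 'E22']
```

Only the nested form tells `replace` which column each token belongs to; the repeated 'Jonny' collapses into a single key in either version.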
    【Solution 2】:

    You can build and apply a regular expression per E_ID, as follows:

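    df['New'] = df['Data']
    for key, value in d.items():
        # One alternation pattern per E_ID, e.g. '(Alice|N|Wonderland|A999b)'
        regex = '({alternatives})'.format(alternatives='|'.join(value))
        mask = df['E_ID'] == key
        # regex=True is needed on pandas >= 2.0, where str.replace matches literally by default
        df.loc[mask, 'New'] = df.loc[mask, 'New'].str.replace(regex, '@@@', regex=True)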
    

    The result looks like this:

    Out[115]: 
                                        Data E_ID N_ID                Name                                  New
    0        Hey this is 123456 Jonny B Good  E11  111        JONNY B GOOD          Hey this is @@@ @@@ @@@ @@@
    1   This is Jonny B Good at 511-233-1137  E11  112        JONNY B GOOD           This is @@@ @@@ @@@ at @@@
    2   Wow that is Alice N Wonderland A999b  E22  211  ALICE N WONDERLAND          Wow that is @@@ @@@ @@@ @@@
    3    Yes hi: Mick E Mouse 1A25629Q88 or   E33  311        MICK E MOUSE          Yes hi: @@@ @@@ @@@ @@@ or 
    4  Bye Mick E Mouse A13B ok was seen on   E33  312        MICK E MOUSE  Bye @@@ @@@ @@@ @@@ ok was seen on 
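One caveat with a plain alternation: a single-letter token such as 'E' or 'B' matches anywhere, including inside longer words, and tokens containing regex metacharacters would be read as pattern syntax. A possible hardening, not part of the original answer, is to escape each token and wrap the alternation in word boundaries. `build_pattern` below is a hypothetical helper, shown with the standard `re` module; the resulting string could also be passed to `str.replace(..., regex=True)`.

```python
import re

def build_pattern(tokens):
    # Hypothetical helper: de-duplicate, escape regex metacharacters,
    # try longer tokens first, and require a word boundary on both sides.
    unique = sorted(set(tokens), key=len, reverse=True)
    return r'\b(?:' + '|'.join(re.escape(t) for t in unique) + r')\b'

tokens = ['Mick', 'E', 'Mouse', 'Mick', 'E', 'Mouse', '1A25629Q88', 'A13B']
pattern = build_pattern(tokens)

text = 'Bye Mick E Mouse A13B ok, said Eve'
print(re.sub(pattern, '@@@', text))      # Bye @@@ @@@ @@@ @@@ ok, said Eve
print(re.sub('|'.join(tokens), '@@@', text))  # the unbounded version also hits the E in "Eve"
```

The `\b` anchors assume every token starts and ends with a letter or digit, which holds for the sample data; tokens with leading or trailing punctuation would need a different boundary rule.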
    

    【Comments】:

    • Could you explain what regex='({alternatives})'.format(alternatives='|'.join(value)) is doing?
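As a short illustration of that expression (not from the original thread): `'|'.join(value)` glues the tokens together with the regex "or" operator, and `.format` places the result inside a pair of parentheses, giving a single group that matches any one of the tokens. Using the E22 list from the question:

```python
import re

value = ['Alice', 'N', 'Wonderland', 'A999b']   # d['E22'] from the question

alternatives = '|'.join(value)
regex = '({alternatives})'.format(alternatives=alternatives)
print(regex)   # (Alice|N|Wonderland|A999b)

# Every match of any alternative is replaced by the placeholder.
print(re.sub(regex, '@@@', 'Wow that is Alice N Wonderland A999b'))
# Wow that is @@@ @@@ @@@ @@@
```

The parentheses are not strictly needed for a plain replacement; they only make the alternation a capture group.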