【Question title】:CSV File-Pandas dataFrame column separation
【Posted】:2020-11-19 11:10:15
【Question】:

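I have this dataset.

As you can see, column 3 (Age at start of presidency) and column 4 (Age at end of presidency) have been merged. How can I separate them? Thanks in advance.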

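【Comments】:

  • Do these items have a delimiter between them in the original CSV file? If so, they should already have been separated when you created the Pandas DataFrame.

Tags: python pandas dataframe jupyter-notebook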


【Solution 1】:

If possible, split at the uppercase letter that starts the date:

import pandas as pd

data = [{'Age atend of presidency': '65 years, 10 daysMar 4, 1797'},
        {'Age atend of presidency': '65 years, 10 daysMar 4, 1797'},
        {'Age atend of presidency': '65 years, 10 daysMar 4, 1797'}]

df = pd.DataFrame(data)

# Splitting on a capture group keeps the matched text (the date) as its own piece;
# the trailing empty piece is dropped with .iloc[:, :-1].
df[['age1','end2']] = df['Age atend of presidency'].str.split("([A-Z][^A-Z]*)", expand=True).iloc[:, :-1]
print(df)
            Age atend of presidency               age1         end2
0  65 years, 10 daysMar 4, 1797  65 years, 10 days  Mar 4, 1797
1  65 years, 10 daysMar 4, 1797  65 years, 10 days  Mar 4, 1797
2  65 years, 10 daysMar 4, 1797  65 years, 10 days  Mar 4, 1797

Or you can split on "days":

# Split on the text "days", then add it back to the age column.
df[['age1','end2']] = df['Age atend of presidency'].str.split("days", expand=True)
df['age1'] += 'days'
print(df)
        Age atend of presidency               age1         end2
0  65 years, 10 daysMar 4, 1797  65 years, 10 days  Mar 4, 1797
1  65 years, 10 daysMar 4, 1797  65 years, 10 days  Mar 4, 1797
2  65 years, 10 daysMar 4, 1797  65 years, 10 days  Mar 4, 1797

Or:

# The capture group keeps "days" as its own column 'a', which is then appended back to 'age1'.
df[['age1','a', 'end2']] = df['Age atend of presidency'].str.split("(days)", expand=True)
df['age1'] += df.pop('a')

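As an alternative to the splits above, here is a minimal sketch that pulls both parts out in one step with `str.extract`. The pattern, the group names `age` and `end`, the `end_date` column, and the second sample row are assumptions made for illustration; they are not from the original answer. It assumes every value ends the age part with "day" or "days" and starts the date with an uppercase letter.

```python
import pandas as pd

# First row is the sample value from the answer; the second row is made up
# to show that a singular "day" is handled as well.
df = pd.DataFrame({
    "Age atend of presidency": [
        "65 years, 10 daysMar 4, 1797",
        "57 years, 1 dayApr 30, 1789",
    ]
})

# Lazy match up to "day"/"days", then the date starting at an uppercase letter.
pattern = r"^(?P<age>.*?days?)(?P<end>[A-Z].*)$"
parts = df["Age atend of presidency"].str.extract(pattern)

# Optionally parse the date text into real timestamps.
parts["end_date"] = pd.to_datetime(parts["end"], format="%b %d, %Y")
print(parts)
```

Rows that do not match the pattern come back as NaN, so they are easy to spot and handle separately.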

    【Solution 2】:

    If the values are strings, you can use x.split(','). For example:

     df['Age atend of presidency'].apply(lambda x: x.split(','))
    

    If they are not strings:

    df['Age atend of presidency'].apply(lambda x: str(x).split(','))
    

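    # Note: for '65 years, 10 daysMar 4, 1797' the comma split gives
    # '65 years', ' 10 daysMar 4' and ' 1797', so the days and the date stay fused.
    df['Age'] = df['Age atend of presidency'].apply(lambda x: x.split(',')[0])
    df['days'] = df['Age atend of presidency'].apply(lambda x: x.split(',')[1])
    df['date'] = df['Age atend of presidency'].apply(lambda x: x.split(',')[2])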
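    To check what the comma split produces on the sample value, a small sketch like the following may help. It uses the vectorized `str.split` with `expand=True` instead of `apply`, and the name `pieces` is just an illustrative choice.

```python
import pandas as pd

df = pd.DataFrame({"Age atend of presidency": ["65 years, 10 daysMar 4, 1797"]})

# astype(str) covers the non-string case; expand=True returns one column per piece.
pieces = df["Age atend of presidency"].astype(str).str.split(",", expand=True)

# Remove the leading spaces left behind by the commas.
pieces = pieces.apply(lambda col: col.str.strip())
print(pieces)
```

    The middle piece still contains both the day count and the start of the date, so a comma split alone does not separate the age from the date for data shaped like this.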
    
