Split one column into two columns with python pandas: answers

【Question title】: Split one column into two columns with python pandas
【Posted】: 2021-12-07 22:19:16
【Question description】:

I have a df of cities that looks like this:

| id | location         |
|----|------------------|
| 1  | New York (NY)    |
| 2  | Los Angeles (CA) |
| 3  | Houston (TX)     |

I would like to use some kind of split/strip to get something like this:

| id | city             | state |
|----|------------------|-------|
| 1  | New York         |   NY  |
| 2  | Los Angeles      |   CA  |
| 3  | Houston          |   TX  |

Or even three columns: the original one plus the two produced by the code. I have already tried something like this:

df[['city', 'state']] = df['location'].str.split("(", expand=True)
df['state'] = df['state'].str.strip(")")

This works, but not quite, because every city name ends with a trailing space that should not be there. If I search for a city, for example:

df[df['city'] == 'Houston']

it returns nothing. Instead I have to write code like this:

df[df['city'] == 'Houston '] # note the trailing space after the city name

to get anything useful back, and doing it that way will give me headaches when I do merges or anything similar.

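So, does anyone have tips for handling this? I could not find anything useful on the internet. It is always either a simple split or a simple strip, but I am sure there is a smarter pattern for doing this.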

【Question discussion】:

Tags: python pandas dataframe split strip

【Solution 1】:

Well, why not df['city'] = df['city'].str.strip()?

【Discussion】:

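It doesn't work. If I type it just as you wrote, the result is unchanged (for example, Houston (TX) still comes back as Houston (TX)). If I use something like .str.strip("()"), I get something like Houston (TX. So...
Well, yes: first you split on (, then you strip the ), and then you strip the trailing space.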
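
The three steps described in the comment above can be sketched roughly as follows. This is only an illustration that rebuilds the sample table from the question. The intermediate name `parts` is made up for the example, and `n=1` is an assumption that city names contain no "(" of their own.

```python
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "id": [1, 2, 3],
    "location": ["New York (NY)", "Los Angeles (CA)", "Houston (TX)"],
})

# Step 1: split on the first "(" only, giving two columns (0 and 1)
parts = df["location"].str.split("(", n=1, expand=True)

# Step 2 and 3: drop the closing ")" from the state and the
# trailing space from the city
df["city"] = parts[0].str.strip()
df["state"] = parts[1].str.strip(") ")

# The lookup from the question now matches without a trailing space
print(df[df["city"] == "Houston"])
```

This keeps the original location column, which matches the "three columns" variant mentioned in the question.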

【Solution 2】:

Use str.extract:

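# (.*?) is non-greedy, so the space before "(" is consumed by \s*
# and does not end up at the end of the city name
df = df.join(df.pop('location').str.extract(r'(.*?)\s*\((.*)\)')
               .rename(columns={0: 'location', 1: 'state'}))
print(df)

# Output
   id     location state
0   1     New York    NY
1   2  Los Angeles    CA
2   3      Houston    TX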

【Discussion】:

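Sort of. I have a few rows that are not in the City (XY) form, just City. When I apply the code, those rows give me NaN in return. But it has already helped me a lot.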
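
One possible way to handle rows without a state, sketched under assumptions: the parenthesised part is made optional in the regex, so a plain City still fills the city column and only state is left as NaN. The "Chicago" row is invented here to stand in for such a row, and the group names `city` and `state` are chosen for this example rather than taken from the answer.

```python
import pandas as pd

# Sample data from the question plus one hypothetical row with no state
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "location": ["New York (NY)", "Los Angeles (CA)", "Houston (TX)", "Chicago"],
})

# city:  non-greedy text up to the optional " (XX)" suffix
# state: contents of the parentheses, if they are present at all
pattern = r"^(?P<city>.*?)\s*(?:\((?P<state>[^)]*)\))?\s*$"

# Named groups become the column names of the extracted frame
df = df.join(df["location"].str.extract(pattern))
print(df)
```

Rows without parentheses keep their city name and get NaN only in state, which can then be filled or filtered as needed, for example with fillna.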