【Posted】: 2020-03-19 07:22:35
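【Question】:
I have a dataset with two columns: Industry Classifications and Stock Tickers. A company can have several labels in its Industry Classifications column, separated by the ; delimiter. I only want to keep the first label.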
import pandas as pd
training = pd.read_excel('Training Data.xlsx')
Current file structure (a sample of that column):
Industry Classifications
Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);
Catalog Flowers, Gifts and Novelties (Primary); Catalog Hobbies, Games and Toy Retail (Primary);
Information Technology (Primary); Internet Software and Services (Primary);
Casualty (Primary); Financials (Primary); Fire and Marine Insurance (Primary);
Commercial and Professional Services (Primary); Commercial Services and Supplies (Primary);
Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary);
Application Software (Primary); Information Technology (Primary); Software (Primary);
Commercial and Professional Services (Primary); Consulting Services (Primary); Industrials (Primary);
Banks (Primary); Banks (Primary); Financials (Primary); National and State Commercial Banks (Primary);
Expected output:
Industry Classifications
Beauty Care Products (Primary)
Catalog Flowers
Information Technology (Primary)
Casualty (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)
Application Software (Primary)
Commercial and Professional Services (Primary)
Banks (Primary); Banks (Primary)
【Comments】:
- This is a duplicate; try df['col'].str.split(';').str[0]
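To make the suggestion in the comment concrete, here is a minimal sketch applied to the column from the question. The small in-memory DataFrame is a hypothetical stand-in for 'Training Data.xlsx', and the extra .str.strip() is an added assumption to remove stray whitespace around the label. Note that this keeps everything before the first ;, so the rows shown in the expected output as "Catalog Flowers" and "Banks (Primary); Banks (Primary)" would come out differently (the full first label in each case).

```python
import pandas as pd

# Hypothetical sample standing in for pd.read_excel('Training Data.xlsx')
training = pd.DataFrame({
    "Industry Classifications": [
        "Beauty Care Products (Primary); Consumer Staples (Primary); Hair Care Products (Primary);",
        "Catalog Flowers, Gifts and Novelties (Primary); Catalog Hobbies, Games and Toy Retail (Primary);",
        "Banks (Primary); Banks (Primary); Diversified Banks (Primary); Financials (Primary);",
    ]
})

# Split each cell on ';', take the first piece, and trim surrounding whitespace
first_label = (
    training["Industry Classifications"]
    .str.split(";")
    .str[0]
    .str.strip()
)

# Overwrite the column with only the first label
training["Industry Classifications"] = first_label
print(training["Industry Classifications"].tolist())
```

If the first label should also be cut at a comma, as the "Catalog Flowers" row in the expected output might suggest, that would need a second split on ",", which is not covered by the comment's one-liner.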