【Posted】:2019-10-10 15:26:00
【Question】:
I ran into a problem similar to ones already asked about, but I could not solve mine. Part of my DataFrame looks like this:
     Index  Character  Top 10 by edits           Top 10 by added text
780  NaN    Viradha    David G Brault · 8 (40%)  David G Brault · 1,915 (81.4%)
781  NaN    Viradha    Wiki-uk · 4 (20%)         Risingstar12 · 213 (9.1%)
782  NaN    Viradha    Rich Farmbrough · 1 (5%)  Woohookitty · 44 (1.9%)
783  NaN    Viradha    Woohookitty · 1 (5%)      World8115 · 41 (1.7%)
784  NaN    Viradha    World8115 · 1 (5%)        Rich Farmbrough · 33 (1.4%)
785  NaN    Viradha    141.213.55.83 · 1 (5%)    SmackBot · 31 (1.3%)
786  NaN    Viradha    Omnipaedista · 1 (5%)     Citation bot 1 · 27 (1.1%)
787  NaN    Viradha    Jayarathina · 1 (5%)      Omnipaedista · 20 (0.9%)
788  NaN    Viradha    Risingstar12 · 1 (5%)     Wiki-uk · 17 (0.7%)
789  NaN    Viradha    203.142.46.153 · 1 (5%)   203.142.46.153 · 11 (0.5%)
Now I want to split the two columns "Top 10 by edits" and "Top 10 by added text" on the dot in the middle ("space-dot-space"). To split the first column, I tried:
s = df["Top 10 by edits"].str.split(" . ", n = 1, expand = True)
df["Top 10 by edits"] = s[0]
df["Edits contribution"] = s[1]
However, this produces the following DataFrame:
     Index  Character  Top 10 by edits  Top 10 by added text            Edits contribution
780  NaN    Viradha    David            David G Brault · 1,915 (81.4%)  Brault · 8 (40%)
781  NaN    Viradha    Wiki-uk          Risingstar12 · 213 (9.1%)       4 (20%)
782  NaN    Viradha    Rich Farmbrough  Woohookitty · 44 (1.9%)         1 (5%)
783  NaN    Viradha    Woohookitty      World8115 · 41 (1.7%)           1 (5%)
784  NaN    Viradha    World8115        Rich Farmbrough · 33 (1.4%)     1 (5%)
785  NaN    Viradha    141.213.55.83    SmackBot · 31 (1.3%)            1 (5%)
786  NaN    Viradha    Omnipaedista     Citation bot 1 · 27 (1.1%)      1 (5%)
787  NaN    Viradha    Jayarathina      Omnipaedista · 20 (0.9%)        1 (5%)
788  NaN    Viradha    Risingstar12     Wiki-uk · 17 (0.7%)             1 (5%)
789  NaN    Viradha    203.142.46.153   203.142.46.153 · 11 (0.5%)      1 (5%)
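As you can see, the first row was not split at the dot: it was split at " G " instead. I also tried \. and r" . ", but neither gives me what I need. What exactly is the problem? Thanks in advance.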
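One possible explanation, sketched here rather than confirmed: the separator in the data appears to be the middle dot "·" (U+00B7), not an ASCII period, and a multi-character pattern such as " . " is interpreted as a regular expression in which "." matches any character. The snippet below reproduces that behaviour with Python's standard `re` module on three strings copied from the table above; using `re` as a stand-in for the pandas call is an assumption, and the names `rows`, `regex_split` and `literal_split` are purely illustrative.

```python
import re

# Three sample values copied from the "Top 10 by edits" column.
rows = [
    "David G Brault · 8 (40%)",
    "Wiki-uk · 4 (20%)",
    "Rich Farmbrough · 1 (5%)",
]

# As a regex, " . " means: space, ANY single character, space.
# In the first row the earliest match is " G ", not " · ".
regex_split = [re.split(" . ", value, maxsplit=1) for value in rows]

# Splitting on the literal middle-dot separator avoids the problem.
literal_split = [value.split(" · ", 1) for value in rows]

for before, after in zip(regex_split, literal_split):
    print(before, "->", after)
```

If this is indeed the cause, it would also explain why \. changes nothing: an escaped period matches only a literal ".", which does not occur between the name and the count. In pandas, passing the actual separator, for example `df["Top 10 by edits"].str.split(" · ", n=1, expand=True)`, should then give the intended two columns.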
【Comments】: