【Title】: Split a column in pandas dataframe based on dot
【Posted】: 2019-10-10 15:26:00
【Question】:

I have come across similar questions, but none of them solved my problem. Part of my dataframe looks like this:

     Index Character           Top 10 by edits            Top 10 by added text
780    NaN   Viradha  David G Brault · 8 (40%)  David G Brault · 1,915 (81.4%)
781    NaN   Viradha         Wiki-uk · 4 (20%)       Risingstar12 · 213 (9.1%)
782    NaN   Viradha  Rich Farmbrough · 1 (5%)         Woohookitty · 44 (1.9%)
783    NaN   Viradha      Woohookitty · 1 (5%)           World8115 · 41 (1.7%)
784    NaN   Viradha        World8115 · 1 (5%)     Rich Farmbrough · 33 (1.4%)
785    NaN   Viradha    141.213.55.83 · 1 (5%)            SmackBot · 31 (1.3%)
786    NaN   Viradha     Omnipaedista · 1 (5%)      Citation bot 1 · 27 (1.1%)
787    NaN   Viradha      Jayarathina · 1 (5%)        Omnipaedista · 20 (0.9%)
788    NaN   Viradha     Risingstar12 · 1 (5%)             Wiki-uk · 17 (0.7%)
789    NaN   Viradha   203.142.46.153 · 1 (5%)      203.142.46.153 · 11 (0.5%)

Now I want to split the two columns "Top 10 by edits" and "Top 10 by added text" by matching the dot in the middle ("space-dot-space"). To split the first column, I tried:

s = df["Top 10 by edits"].str.split(" . ", n = 1, expand = True)

df["Top 10 by edits"]  = s[0]
df["Edits contribution"] = s[1]

However, this produces the following dataframe:

     Index Character  Top 10 by edits            Top 10 by added text Edits contribution
780    NaN   Viradha            David  David G Brault · 1,915 (81.4%)   Brault · 8 (40%)
781    NaN   Viradha          Wiki-uk       Risingstar12 · 213 (9.1%)            4 (20%)
782    NaN   Viradha  Rich Farmbrough         Woohookitty · 44 (1.9%)             1 (5%)
783    NaN   Viradha      Woohookitty           World8115 · 41 (1.7%)             1 (5%)
784    NaN   Viradha        World8115     Rich Farmbrough · 33 (1.4%)             1 (5%)
785    NaN   Viradha    141.213.55.83            SmackBot · 31 (1.3%)             1 (5%)
786    NaN   Viradha     Omnipaedista      Citation bot 1 · 27 (1.1%)             1 (5%)
787    NaN   Viradha      Jayarathina        Omnipaedista · 20 (0.9%)             1 (5%)
788    NaN   Viradha     Risingstar12             Wiki-uk · 17 (0.7%)             1 (5%)
789    NaN   Viradha   203.142.46.153      203.142.46.153 · 11 (0.5%)             1 (5%)

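As you can see, the first row was not split at the dot: it was split at " G " instead. I also tried \. and r" . ", but neither did what I need. What exactly is the problem? Thanks in advance.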

【Discussion】:

    Tags: python pandas split


    【Solution 1】:

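    The dot in your "Top 10 by added text" column (and likewise in "Top 10 by edits") is not a period but a middle-dot character (·), whereas your code tries to split on a period. Change one of them so that they match.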
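To see why the first row breaks at " G ", it may help to inspect one cell outside pandas. The sketch below assumes the separator is U+00B7 (MIDDLE DOT), which cannot be confirmed from the post alone, and uses a hand-built copy of the first cell. As far as I know, pandas treats a multi-character pattern passed to `str.split` as a regular expression by default, so the plain `re` module is used here to mimic that behaviour.

```python
import re
import unicodedata

MIDDLE_DOT = "\u00b7"  # assumption: the separator in the data is U+00B7
cell = f"David G Brault {MIDDLE_DOT} 8 (40%)"

# List every non-ASCII character with its code point and Unicode name.
non_ascii = [(hex(ord(ch)), unicodedata.name(ch)) for ch in cell if ord(ch) > 127]
print(non_ascii)

# As a regex, " . " means: space, any one character, space.
# It matches " G " before it ever reaches the middle dot.
regex_parts = re.split(" . ", cell, maxsplit=1)
print(regex_parts)

# An escaped period matches only a literal "." and finds nothing to split on.
period_parts = re.split(r" \. ", cell, maxsplit=1)
print(period_parts)
```

Under these assumptions, the unescaped pattern reproduces the "David" / "Brault · 8 (40%)" split from the question, while the escaped period leaves the string in one piece.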

    【Comments】:

    • Then how do I use the dot character?
    • If needed, you can simply copy and paste it: '·' vs '.'. Just copy the dot character into your code.
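A minimal sketch of the suggested fix, applied to a two-row sample rebuilt from the question. It assumes the separator is U+00B7 and writes it as an escape so the code does not depend on the editor's encoding; pasting the literal "·" into the pattern should behave the same way.

```python
import pandas as pd

MIDDLE_DOT = "\u00b7"  # assumption: the separator is U+00B7 (MIDDLE DOT)

# Small sample rebuilt from the first two rows of the question.
df = pd.DataFrame({
    "Character": ["Viradha", "Viradha"],
    "Top 10 by edits": [
        f"David G Brault {MIDDLE_DOT} 8 (40%)",
        f"Wiki-uk {MIDDLE_DOT} 4 (20%)",
    ],
})

# Split once on the literal " · " separator instead of " . ".
s = df["Top 10 by edits"].str.split(f" {MIDDLE_DOT} ", n=1, expand=True)
df["Top 10 by edits"] = s[0]
df["Edits contribution"] = s[1]
print(df)
```

The same call can be repeated for "Top 10 by added text". The middle dot has no special meaning in a regular expression, so it needs no escaping.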