【Question title】: Reshape pandas dataframe with 3 columns
【Posted】: 2021-07-08 16:31:07
【Question description】:

I have the following code:

import pandas as pd

sentiments = ['He is good', 'He is bad', 'She love her', 'She is fine with it', 'I like going outside', 'Its okay']
positive = [1,0,1,0,1,0]
negative = [0,1,0,0,0,0]
neutral = [0,0,0,1,0,1]
df = pd.DataFrame({'Sentiments':sentiments, 'Positives':positive, 'Negatives': negative, 'Neutrals':neutral})
df.head()

This creates a DataFrame with the columns Sentiments, Positives, Negatives and Neutrals.

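I want only 2 columns: one with the sentiment text and one with the category, which should be the specific sentiment label for that row. That is, the result should look like: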

Sentiment Category
abc positive
xmy negative
poi neutral

【Question comments】:

    Tags: python pandas dataframe nlp sentiment-analysis


    【Solution 1】:

    Try .melt():

    x = df.melt("Sentiments", var_name="Category")
    x = x[x.value != 0].drop(columns="value")
    x["Category"] = x["Category"].str.replace(r"s$", "", regex=True)
    print(x)
    

    Prints:

                  Sentiments  Category
    0             He is good  Positive
    2           She love her  Positive
    4   I like going outside  Positive
    7              He is bad  Negative
    15   She is fine with it   Neutral
    17              Its okay   Neutral
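If the goal is the exact layout from the question (lowercase labels, rows in their original order), the melt approach can be extended roughly as in the sketch below. It assumes pandas 1.1 or later for the `ignore_index` parameter of `melt`. The `flag` column name and the `long` and `result` variables are illustrative choices, not part of the original answer.

```python
import pandas as pd

sentiments = ['He is good', 'He is bad', 'She love her',
              'She is fine with it', 'I like going outside', 'Its okay']
df = pd.DataFrame({
    'Sentiments': sentiments,
    'Positives': [1, 0, 1, 0, 1, 0],
    'Negatives': [0, 1, 0, 0, 0, 0],
    'Neutrals':  [0, 0, 0, 1, 0, 1],
})

# Keep the original row index through the melt so the order can be restored.
long = df.melt(id_vars="Sentiments", var_name="Category",
               value_name="flag", ignore_index=False)

result = (
    long[long["flag"] == 1]            # keep only the category that is set
    .drop(columns="flag")
    .assign(Category=lambda d: d["Category"]
            .str.replace(r"s$", "", regex=True)  # "Positives" -> "Positive"
            .str.lower())                        # "Positive"  -> "positive"
    .sort_index()                      # back to the original row order
    .reset_index(drop=True)
)
print(result)
```

Sorting on the preserved index is what puts 'He is bad' back in second position instead of grouping the rows by category.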
    

    【Comments】:

      【Solution 2】:

      Assuming exactly one column per row has the value 1 (i.e. the columns are dummy/one-hot indicators), try:

      >>> df.set_index("Sentiments").idxmax(axis=1).rename("Category").reset_index()
                   Sentiments   Category
      0            He is good  Positives
      1             He is bad  Negatives
      2          She love her  Positives
      3   She is fine with it   Neutrals
      4  I like going outside  Positives
      5              Its okay   Neutrals
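The one-hot assumption matters: for a row where every indicator is 0, `idxmax` still returns the first column label rather than signalling a problem. A minimal sketch of one way to guard against that is shown below. The third row ('No label here') is a made-up example added only to trigger the case, and the names `flags` and `one_hot` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "Sentiments": ["He is good", "He is bad", "No label here"],  # last row is hypothetical
    "Positives": [1, 0, 0],
    "Negatives": [0, 1, 0],
    "Neutrals":  [0, 0, 0],
})

flags = df.set_index("Sentiments")

# True only for rows where exactly one indicator column is set.
one_hot = flags.sum(axis=1).eq(1)

# idxmax picks the column holding the row maximum; rows that break the
# one-hot assumption are masked to NaN instead of silently becoming "Positives".
result = (
    flags.idxmax(axis=1)
    .where(one_hot)
    .rename("Category")
    .reset_index()
)
print(result)
```

Rows that come back as NaN can then be inspected or dropped, depending on what an unlabeled sentence should mean in the data.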
      

      【Comments】:

        【Solution 3】:

        Another way:

        df = df.set_index('Sentiments').dot(df.columns[1:]).reset_index(name = 'Category')
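The one-liner above works because the dot product multiplies each 0/1 flag by a column name, and in Python `1 * 'Positives'` is `'Positives'` while `0 * 'Positives'` is the empty string. Summing the products across a row therefore concatenates the names of the columns that are set. The self-contained sketch below shows this on the question's data. The `flags` and `result` names are illustrative, and it uses `flags.columns` in place of `df.columns[1:]`, which selects the same three labels.

```python
import pandas as pd

sentiments = ['He is good', 'He is bad', 'She love her',
              'She is fine with it', 'I like going outside', 'Its okay']
df = pd.DataFrame({
    'Sentiments': sentiments,
    'Positives': [1, 0, 1, 0, 1, 0],
    'Negatives': [0, 1, 0, 0, 0, 0],
    'Neutrals':  [0, 0, 0, 1, 0, 1],
})

flags = df.set_index('Sentiments')

# Each row becomes flag_1 * name_1 + flag_2 * name_2 + flag_3 * name_3,
# i.e. the concatenation of the names whose flag is 1.
result = flags.dot(flags.columns).reset_index(name='Category')
print(result)
```

Two consequences follow from the string arithmetic: a row with no flag set gets an empty string, and a row with several flags set gets the names joined with no separator, so this approach also relies on the data being one-hot.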
        

        【Comments】: