【Question title】: How to remove the target variable from a PCA 2D biplot
【Posted】: 2021-05-29 05:33:06
【Question】:

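I want to draw a 2D biplot from my dataset (a credit card churn dataset), but the plot also shows my target variable as one of the features. How can I remove it?

[Image: sample of the dataset showing the column headers]

Here is the code I used: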

import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from bioinfokit.analys import get_data
from bioinfokit.visuz import cluster
from sklearn.decomposition import PCA
# load the credit card churn dataset

df=pd.read_csv(r'G:\\Edu\\My academics\\MSc in CS\\3rd sem\\Research\\Python files\\PCA.csv')
df.head(2)
df.loc[df['Attrition_Flag'] == 'Existing Customer', 'Attrition_Flag'] = 0
df.loc[df['Attrition_Flag'] == 'Attrited Customer', 'Attrition_Flag'] = 1
df.Attrition_Flag = df.Attrition_Flag.astype(int)

X = df.iloc[:,0:4]
target = df['Attrition_Flag'].to_numpy()
X.head(2)


X_st =  StandardScaler().fit_transform(X)
pca_out = PCA().fit(X_st)

# component loadings
loadings = pca_out.components_
print(loadings)


# get eigenvalues (variance explained by each PC)  
print(pca_out.explained_variance_)


# get biplot
pca_scores = PCA().fit_transform(X_st)
cluster.biplot(cscore=pca_scores, loadings=loadings, labels=X.columns.values, var1=round(pca_out.explained_variance_ratio_[0]*100, 2),
    var2=round(pca_out.explained_variance_ratio_[1]*100, 2), colorlist=target)

【Comments on the question】:

    Tags: python pca


    【Solution 1】:

    How about dropping the target column right after target = df['Attrition_Flag'].to_numpy()?

    df.drop(columns=['Attrition_Flag'], inplace=True)
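One thing to watch with this suggestion is ordering: in the question, X is sliced from df before target is extracted, so a later drop on df may not change X at all. The sketch below illustrates that on a tiny made-up frame. Every column name except Attrition_Flag is hypothetical, and it assumes Attrition_Flag sits among the first four columns, which the question does not confirm.

```python
import pandas as pd

# Hypothetical stand-in for PCA.csv; only Attrition_Flag comes from the question.
df = pd.DataFrame({
    "Attrition_Flag": [0, 1, 0, 1],
    "Customer_Age": [45, 49, 51, 40],
    "Credit_Limit": [12691.0, 8256.0, 3418.0, 3313.0],
    "Total_Trans_Amt": [1144, 1291, 1887, 1171],
    "Total_Trans_Ct": [42, 33, 20, 20],
})

# Same order as in the question: X is sliced first, then the target is read.
X = df.iloc[:, 0:4]
target = df["Attrition_Flag"].to_numpy()

# Dropping from df at this point does not remove the column from X.
df.drop(columns=["Attrition_Flag"], inplace=True)
print(list(X.columns))

# Dropping from X itself (or dropping from df before slicing) does.
X_fixed = X.drop(columns=["Attrition_Flag"], errors="ignore")
print(list(X_fixed.columns))
```

If this is what happens with the real data, passing X_fixed to StandardScaler and using X_fixed.columns.values as the biplot labels should keep the target out of the loadings.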

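    【Comments】:

    • I tried that too, but I still get the same plot.
    • Try printing the contents of the data frame X just before X_st = StandardScaler().fit_transform(X) to see whether the target is still there (i.e. check whether the drop actually worked).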
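The check suggested in the last comment could look roughly like the following. split_features_target is a hypothetical helper, not part of pandas or bioinfokit, and the demo frame uses made-up column names; it selects features by name rather than by position, so the target cannot slip into X.

```python
import pandas as pd

def split_features_target(df, target_col):
    """Hypothetical helper: return (X, target) with target_col excluded from X."""
    target = df[target_col].to_numpy()
    # Select by name, then keep numeric columns only (PCA needs numeric input).
    X = df.drop(columns=[target_col]).select_dtypes(include="number")
    # Guard: fail loudly if the target is still among the features.
    assert target_col not in X.columns, "target column is still in X"
    return X, target

# Made-up data standing in for the real CSV.
demo = pd.DataFrame({
    "Attrition_Flag": [0, 1, 0],
    "Customer_Age": [45, 49, 51],
    "Credit_Limit": [12691.0, 8256.0, 3418.0],
})

X, target = split_features_target(demo, "Attrition_Flag")
print(list(X.columns))  # inspect this right before scaling
```

Printing X.columns immediately before the StandardScaler call shows exactly which variables will appear as arrows in the biplot.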