【Question title】: How to find the index of the point closest to K-means cluster centers using sklearn?
【Posted】: 2021-11-08 20:19:01
【Question】:

I am doing K-means clustering with Python's sklearn package. So far, I can get the coordinates of the cluster centers with the following code.

import numpy as np
from sklearn.cluster import KMeans

p50 = np.load('tsnep400.npy')
kmeans = KMeans(n_clusters=50).fit(p50) 
np.savetxt('kmeans_50clusters_centers_tsnep400', kmeans.cluster_centers_, fmt='%1.3f')
np.savetxt('kmeans_50clusters_tsnep400.dat', kmeans.labels_, fmt='%1.1d')

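# Map each cluster label to the indices of its member points (despite the name, these are not centroids).
centroids = {i: np.where(kmeans.labels_ == i)[0] for i in range(kmeans.n_clusters)}
# np.save pickles the dict; read it back with np.load(..., allow_pickle=True).item()
np.save('kmeans_50clusters_memebers_tsnep400.npy',centroids)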

How can I find the index of the point closest to each cluster center?

【Comments】:

    Tags: python scikit-learn k-means centroid


    【Solution 1】:

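    According to the scikit-learn documentation, the .labels_ attribute holds the label of each point, in the same order as the input points. You can therefore use it to group the points by cluster and then compute each point's distance to its cluster center. The following code does this: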

    import numpy as np
    from scipy.spatial.distance import euclidean
    
    # Reuses `p50` and the fitted `kmeans` from the question.
    # For each cluster, find the index (into p50) of the member closest to the cluster center.
    closest_pt_idx = []
    for iclust in range(kmeans.n_clusters):
        # get all points assigned to each cluster:
        cluster_pts = p50[kmeans.labels_ == iclust]
        # get all indices of points assigned to this cluster:
        cluster_pts_indices = np.where(kmeans.labels_ == iclust)[0]
    
        cluster_cen = kmeans.cluster_centers_[iclust]
        min_idx = np.argmin([euclidean(p50[idx], cluster_cen) for idx in cluster_pts_indices])
        
        # Testing:    
        print('closest point to cluster center: ', cluster_pts[min_idx])
        print('closest index of point to cluster center: ', cluster_pts_indices[min_idx])
        print('  ', p50[cluster_pts_indices[min_idx]])
        closest_pt_idx.append(cluster_pts_indices[min_idx])
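
The loop above calls `euclidean` once per point. As a rough sketch of the same idea, the per-cluster distances can be computed in one NumPy call. The helper name `closest_member_indices`, the `-1` marker for an empty cluster, and the toy data are assumptions made for illustration; they are not part of the original answer or of scikit-learn.

```python
import numpy as np


def closest_member_indices(points, labels, centers):
    """For each cluster, return the index (into `points`) of the member
    closest to that cluster's center. Clusters with no members get -1.
    (Hypothetical helper, not a scikit-learn function.)"""
    points = np.asarray(points, dtype=float)
    labels = np.asarray(labels)
    centers = np.asarray(centers, dtype=float)
    closest = np.full(len(centers), -1, dtype=int)
    for k, center in enumerate(centers):
        # Indices of the points assigned to cluster k.
        member_idx = np.flatnonzero(labels == k)
        if member_idx.size == 0:
            continue  # empty cluster: leave the -1 marker
        # Euclidean distance from every member to the center, in one call.
        dists = np.linalg.norm(points[member_idx] - center, axis=1)
        closest[k] = member_idx[np.argmin(dists)]
    return closest


# Toy data (made up): two clusters of three 2-D points each.
points = np.array([[0.0, 0.0], [1.0, 0.0], [4.0, 0.0],
                   [10.0, 10.0], [12.0, 10.0], [10.5, 10.0]])
labels = np.array([0, 0, 0, 1, 1, 1])
centers = np.array([points[labels == k].mean(axis=0) for k in range(2)])

closest_pt_idx = closest_member_indices(points, labels, centers)
print(closest_pt_idx)  # prints [1 5]
```

With the fitted model from the question, the call would be `closest_member_indices(p50, kmeans.labels_, kmeans.cluster_centers_)`. scikit-learn also provides `sklearn.metrics.pairwise_distances_argmin_min`; called as `pairwise_distances_argmin_min(kmeans.cluster_centers_, p50)`, it returns for each center the index of the nearest point among all points. That search is not restricted to the cluster's own members, so the two results can occasionally differ.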
    

    【Comments】:
