【Title】: Sorting 2D array in new 2D array (K-means clustering)
【Posted】: 2020-10-07 16:36:19
【Question】:

As input I have a 2D array PointXY[][] clusters, which looks like this:

[[23.237633,53.78671], [69.15293,17.138134], [23.558687,45.70517]] . . .
[[47.851738,16.525734], [47.802097,16.689285], [47.946404,16.732542]]
[[47.89601,16.638218], [47.833263,16.478987], [47.88203,16.45793]]
[[47.75438,16.549816], [47.915512,16.506475], [47.768547,16.67624]]
.
.
.

So each element of the array is of type PointXY[], where PointXY is defined as follows:

public PointXY(float x, float y) {
    this.x = x;
    this.y = y;
}

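What I want to do is sort the input clusters and write the result into an array PointXY[][] clustersSorted, so that every PointXY in clusters (except those in the first row) is compared with every value in the first row. In other words, the elements of the blue set in the picture are compared with each value circled in red. So I want to compare each value from row 2 onward with every value in the first row.

The comparison is done by calling the Euclidean distance function.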

public double euclidian(PointXY other) {
    return Math.sqrt(Math.pow(this.x - other.x, 2)
            + Math.pow(this.y - other.y, 2));
}
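
For reference, the two fragments above could fit together into one class roughly as follows. This is only a sketch: the question does not show the field declarations, so the `float` fields `x` and `y` here are an assumption inferred from the constructor signature, and the `main` method is added purely as a usage example.

```java
// Hypothetical completion of the question's PointXY class; the field
// declarations are assumed, since the question does not show them.
class PointXY {
    final float x;
    final float y;

    public PointXY(float x, float y) {
        this.x = x;
        this.y = y;
    }

    // Euclidean distance between this point and another point.
    public double euclidian(PointXY other) {
        return Math.sqrt(Math.pow(this.x - other.x, 2)
                + Math.pow(this.y - other.y, 2));
    }

    public static void main(String[] args) {
        PointXY a = new PointXY(0f, 0f);
        PointXY b = new PointXY(3f, 4f);
        // 3-4-5 right triangle, so this prints 5.0
        System.out.println(a.euclidian(b));
    }
}
```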

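The output should be a 2D array of the same type, but under each red-circled point (which keeps its position in the output array) should be the points from the blue part that are closest, by Euclidean distance, to that red-circled value.

So this is a data structure for K-Means clustering: each cluster is a column (green circle), the first point is the center of the cluster (red circle), and all other points in the column (yellow circles) are the points assigned to that center.

So the question is how to iterate over the input array clusters, compare the values as described, and write them into the array clustersSorted.

I want to compute the Euclidean distance between each point in the blue set and each value circled in red, and then sort the points by minimum Euclidean distance. In the output array clustersSorted, each point from the blue set will then sit under its nearest red-circled point.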

【Comments】:

    Tags: java sorting multidimensional-array k-means


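    【Solution 1】:

    See also another answer: Grouping all points starting from the second row by the nearest center point from the first row


    To sort a 2D array of objects by columns, you can first transpose the array, sort each row, and then transpose it back. To sort the elements of a row from the second element onward by their distance from the first element, you can pass a custom comparator to the Arrays.sort(T[],int,int,Comparator) method: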

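    // requires java.util.Arrays, java.util.Comparator, java.util.stream.IntStream
    int m = 4; // number of rows
    int n = 3; // number of columns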
    PointXY[][] clusters = {
            {new PointXY(23.237633, 53.78671),
                    new PointXY(69.15293, 17.138134),
                    new PointXY(23.558687, 45.70517)},
            {new PointXY(47.851738, 16.525734),
                    new PointXY(47.802097, 16.689285),
                    new PointXY(47.946404, 16.732542)},
            {new PointXY(47.89601, 16.638218),
                    new PointXY(47.833263, 16.478987),
                    new PointXY(47.88203, 16.45793)},
            {new PointXY(47.75438, 16.549816),
                    new PointXY(47.915512, 16.506475),
                    new PointXY(47.768547, 16.67624)}};
    
    // transpose a matrix
    PointXY[][] transposed = new PointXY[n][m];
    IntStream.range(0, n).forEach(i ->
            IntStream.range(0, m).forEach(j ->
                    transposed[i][j] = clusters[j][i]));
    
    // sort each line starting from the second
    // element by the distance from the first element
    Arrays.stream(transposed).forEach(cluster ->
            Arrays.sort(cluster, 1, cluster.length,
                    Comparator.comparingDouble(point ->
                            Math.sqrt(Math.pow(cluster[0].x - point.x, 2)
                                    + Math.pow(cluster[0].y - point.y, 2)))));
    
    // transpose a matrix back
    PointXY[][] clustersSorted = new PointXY[m][n];
    IntStream.range(0, m).forEach(i ->
            IntStream.range(0, n).forEach(j ->
                    clustersSorted[i][j] = transposed[j][i]));
    
    // output
    Arrays.stream(clustersSorted).map(Arrays::toString).forEach(System.out::println);
    
    [23.237633,53.78671, 69.15293,17.138134, 23.558687,45.70517]
    [47.75438,16.549816, 47.915512,16.506475, 47.768547,16.67624]
    [47.89601,16.638218, 47.833263,16.478987, 47.946404,16.732542]
    [47.851738,16.525734, 47.802097,16.689285, 47.88203,16.45793]
    

    The PointXY class should look like this:

    public class PointXY {
        double x, y;
    
        public PointXY(double x, double y) {
            this.x = x;
            this.y = y;
        }
    
        @Override
        public String toString() {
            return x + "," + y;
        }
    }
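
The same column-wise sort can also be written with plain loops and no transposition. The following is a minimal sketch rather than part of the answer above: the class name `ColumnSortSketch` and the helper names `sortColumns` and `distanceTo` are made up for illustration, and the input is assumed to be a non-empty rectangular array whose first row holds the centers.

```java
import java.util.Arrays;
import java.util.Comparator;

// Hypothetical class name, used only for this sketch.
final class ColumnSortSketch {

    // Minimal stand-in for the PointXY class shown above.
    static final class PointXY {
        final double x, y;

        PointXY(double x, double y) {
            this.x = x;
            this.y = y;
        }

        // Euclidean distance; Math.hypot(dx, dy) computes sqrt(dx*dx + dy*dy).
        double distanceTo(PointXY other) {
            return Math.hypot(x - other.x, y - other.y);
        }

        @Override
        public String toString() {
            return x + "," + y;
        }
    }

    // Returns a new array: row 0 (the centers) is copied unchanged, and the
    // points below each center are sorted by distance to that center.
    // Assumes a non-empty rectangular input.
    static PointXY[][] sortColumns(PointXY[][] clusters) {
        int rows = clusters.length;
        int cols = clusters[0].length;
        PointXY[][] sorted = new PointXY[rows][cols];
        for (int j = 0; j < cols; j++) {
            PointXY center = clusters[0][j];
            // copy column j without its first element
            PointXY[] column = new PointXY[rows - 1];
            for (int i = 1; i < rows; i++) {
                column[i - 1] = clusters[i][j];
            }
            Arrays.sort(column, Comparator.comparingDouble(center::distanceTo));
            // write the center and the sorted column into the result
            sorted[0][j] = center;
            for (int i = 1; i < rows; i++) {
                sorted[i][j] = column[i - 1];
            }
        }
        return sorted;
    }

    public static void main(String[] args) {
        PointXY[][] clusters = {
                {new PointXY(0, 0), new PointXY(10, 10)},
                {new PointXY(5, 5), new PointXY(10, 13)},
                {new PointXY(1, 1), new PointXY(10, 11)}};
        Arrays.stream(sortColumns(clusters))
                .map(Arrays::toString)
                .forEach(System.out::println);
    }
}
```

Working on a copy of each column leaves the input array untouched, which matches the question's requirement of writing into a separate clustersSorted array.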
    

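    【Comments】:

      【Solution 2】:

      See also another answer: Sorting a 2d array of objects by columns starting from the second row and transposing an array


      Build the sorted clusters. The first row holds the center points. All other points are grouped by their nearest center. In this case, all points end up grouped under the second center point: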

      // requires java.util.Arrays, java.util.Comparator, java.util.stream.Stream
      PointXY[][] clusters = {
              {new PointXY(23.237633, 53.78671),
                      new PointXY(69.15293, 17.138134),
                      new PointXY(23.558687, 45.70517)},
              {new PointXY(47.851738, 16.525734),
                      new PointXY(47.802097, 16.689285),
                      new PointXY(47.946404, 16.732542)},
              {new PointXY(47.89601, 16.638218),
                      new PointXY(47.833263, 16.478987),
                      new PointXY(47.88203, 16.45793)},
              {new PointXY(47.75438, 16.549816),
                      new PointXY(47.915512, 16.506475),
                      new PointXY(47.768547, 16.67624)}};
      
      // array of a center points
      PointXY[] centers = clusters[0];
      
      PointXY[][] clustersSorted = Arrays
              // iterate over array of center points
              .stream(centers)
              // for each center point
              .map(center -> Stream.of(center, Arrays
                      // iterate over array of clusters starting from the second row
                      .stream(clusters, 1, clusters.length)
                      // stream over the full array
                      .flatMap(Arrays::stream)
                      // filter nearest points to the current center point
                      .filter(point -> Arrays
                              // iterate over array of center points
                              .stream(centers)
                              // sort by euclidean distance from current point
                              .sorted(Comparator.comparingDouble(centerXY ->
                                      Math.sqrt(Math.pow(centerXY.x - point.x, 2)
                                              + Math.pow(centerXY.y - point.y, 2))))
                              // find nearest center point to the current point
                              .findFirst()
                              // check this center point is the current center point
                              .get() == center)
                      // array of the nearest points to this center point
                      .toArray(PointXY[]::new))
                      // center point + array of its nearest points
                      .flatMap(element -> element instanceof PointXY ?
                              Stream.of((PointXY) element) :
                              Arrays.stream((PointXY[]) element))
                      // sorted cluster
                      .toArray(PointXY[]::new))
              // sorted array of clusters
              .toArray(PointXY[][]::new);
      
      // output
      Arrays.stream(clustersSorted).map(Arrays::toString).forEach(System.out::println);
      
      [23.237633,53.78671]
      [69.15293,17.138134, 47.851738,16.525734, 47.802097,16.689285, 47.946404,16.732542, 47.89601,16.638218, 47.833263,16.478987, 47.88203,16.45793, 47.75438,16.549816, 47.915512,16.506475, 47.768547,16.67624]
      [23.558687,45.70517]
      

      The PointXY class should look like this:

      public static class PointXY {
          double x, y;
      
          public PointXY(double x, double y) {
              this.x = x;
              this.y = y;
          }
      
          @Override
          public String toString() {
              return x + "," + y;
          }
      }
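
The stream pipeline above rescans every point once per center. As an alternative, here is a loop-based sketch of the same grouping that assigns each point exactly once. It is not part of the original answer: the names `NearestCenterSketch`, `nearestCenterIndex`, and `groupByNearestCenter` are hypothetical, and squared distances are compared because the square root does not change which center is nearest.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical class name, used only for this sketch.
final class NearestCenterSketch {

    // Minimal stand-in for the PointXY class shown above.
    static final class PointXY {
        final double x, y;

        PointXY(double x, double y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return x + "," + y;
        }
    }

    // Index of the center closest to the point; ties go to the first center.
    static int nearestCenterIndex(PointXY point, PointXY[] centers) {
        int best = 0;
        double bestDist = Double.POSITIVE_INFINITY;
        for (int c = 0; c < centers.length; c++) {
            double dx = centers[c].x - point.x;
            double dy = centers[c].y - point.y;
            double squared = dx * dx + dy * dy;
            if (squared < bestDist) {
                bestDist = squared;
                best = c;
            }
        }
        return best;
    }

    // Row c of the result is: center c, followed by every point from
    // rows 1.. of the input whose nearest center is center c.
    static PointXY[][] groupByNearestCenter(PointXY[][] clusters) {
        PointXY[] centers = clusters[0];
        List<List<PointXY>> groups = new ArrayList<>();
        for (PointXY center : centers) {
            List<PointXY> group = new ArrayList<>();
            group.add(center);
            groups.add(group);
        }
        for (int i = 1; i < clusters.length; i++) {
            for (PointXY point : clusters[i]) {
                groups.get(nearestCenterIndex(point, centers)).add(point);
            }
        }
        PointXY[][] result = new PointXY[centers.length][];
        for (int c = 0; c < centers.length; c++) {
            result[c] = groups.get(c).toArray(new PointXY[0]);
        }
        return result;
    }

    public static void main(String[] args) {
        PointXY[][] clusters = {
                {new PointXY(0, 0), new PointXY(10, 10)},
                {new PointXY(1, 2), new PointXY(9, 8)},
                {new PointXY(7, 7), new PointXY(2, 1)}};
        Arrays.stream(groupByNearestCenter(clusters))
                .map(Arrays::toString)
                .forEach(System.out::println);
    }
}
```

As in the stream version, the result is a jagged array: a center with no nearby points gets a row that contains only the center itself.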
      

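      【Comments】:

        【Solution 3】:

        Create a temporary float/double array of size [n][(n-1) * n], where the input matrix has size [n][n-1].

        Compute the Euclidean distance from every point in the lower part of the matrix to every point in the first row, and store each distance at its corresponding position in the temporary array.

        Make a copy of each temporary sub-array.

        Sort each copy with any sorting algorithm, preferably selection sort, because the array only needs to be sorted partially, until the n-1 smallest elements have been found.

        Finally, create a new output array of size [n][n-1], map the smallest Euclidean distances back to their points, and store each sorted group of (n-1) points directly beneath its closest reference point.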
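
The steps above leave some details open, so the following is only one possible reading of them, not the answerer's own code. It builds a distance matrix with one row per center, copies each row, runs a partial selection sort on the copy, and places the k nearest points under each center, where k is the number of rows below the first. The names `DistanceMatrixSketch` and `nearestPerCenter` are hypothetical. Note that under this reading the same point can appear under more than one center.

```java
import java.util.Arrays;

// Hypothetical class name, used only for this sketch.
final class DistanceMatrixSketch {

    // Minimal stand-in for the PointXY class used in this thread.
    static final class PointXY {
        final double x, y;

        PointXY(double x, double y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public String toString() {
            return x + "," + y;
        }
    }

    // Assumes a non-empty rectangular input whose first row holds the centers.
    static PointXY[][] nearestPerCenter(PointXY[][] clusters) {
        int rows = clusters.length;
        int cols = clusters[0].length;
        int k = rows - 1; // points to place under each center
        PointXY[] centers = clusters[0];

        // flatten all points below the first row
        PointXY[] points = new PointXY[k * cols];
        for (int i = 1; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                points[(i - 1) * cols + j] = clusters[i][j];
            }
        }

        // temporary distance matrix: dist[c][p] = distance(center c, point p)
        double[][] dist = new double[cols][points.length];
        for (int c = 0; c < cols; c++) {
            for (int p = 0; p < points.length; p++) {
                dist[c][p] = Math.hypot(centers[c].x - points[p].x,
                        centers[c].y - points[p].y);
            }
        }

        PointXY[][] out = new PointXY[rows][cols];
        for (int c = 0; c < cols; c++) {
            out[0][c] = centers[c];
            // work on copies so the distance matrix stays intact
            double[] d = dist[c].clone();
            PointXY[] candidates = points.clone();
            // partial selection sort: stop after the k smallest distances
            for (int s = 0; s < k; s++) {
                int min = s;
                for (int t = s + 1; t < d.length; t++) {
                    if (d[t] < d[min]) {
                        min = t;
                    }
                }
                double tmpD = d[s];
                d[s] = d[min];
                d[min] = tmpD;
                PointXY tmpP = candidates[s];
                candidates[s] = candidates[min];
                candidates[min] = tmpP;
                out[s + 1][c] = candidates[s];
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PointXY[][] clusters = {
                {new PointXY(0, 0), new PointXY(10, 10)},
                {new PointXY(1, 2), new PointXY(9, 8)},
                {new PointXY(7, 7), new PointXY(2, 1)}};
        Arrays.stream(nearestPerCenter(clusters))
                .map(Arrays::toString)
                .forEach(System.out::println);
    }
}
```

Selection sort fits here because each pass fixes one more of the smallest elements, so the loop can stop after k passes instead of ordering the whole row.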

        【讨论】:
