【Question title】: colMeans is not functioning in R
【Posted】: 2020-10-14 03:22:49
【Question】:

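I need to do this for my assignment: We focus on the following subset of variables: regime, oil, logGDPcp, illit. Remove observations that have missing values in any of these variables. Using the scale() function, scale these variables so that each has a mean of zero and a standard deviation of one. Fit the k-means clustering algorithm with two clusters. How many observations are assigned to each cluster? Using the original unstandardized data, compute the means of these variables in each cluster. This is what I did: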

resources <- read.csv("https://raw.githubusercontent.com/umbertomig/intro-prob-stat-FGV/master/datasets/resources.csv")

#subset
resources.subset <- subset(resources, select = c("cty_name", "year", "regime", "oil", "logGDPcp", "illit"))

#removing missing values
resources1 <- na.omit(resources.subset)

#scaling
scaled.resources <- scale(resources1)
#mean of zero
colMeans(scaled.resources) 
#standard deviation of 1
apply(scaled.resources, 2, sd)

#fitting into two clusters
cluster2 <- kmeans(scaled.resources, centers = 2)

#how many observations are assigned to each cluster?
nrow(scaled.resources)
table(cluster2$cluster)

#means of the variables
cluster2$centers
g1 <- resources1[cluster2$cluster == 1, ]
colMeans(g1)
g2 <- resources1[cluster2$cluster == 2, ]
colMeans(g2)

But I get this error: Error in colMeans(x, na.rm = TRUE) : 'x' must be numeric

How can I fix this?

【Comments】:

    Tags: r scale mean standard-deviation


    【Solution 1】:

    One of the columns is not numeric (cty_name is a character column):

    str(resources1)
    #'data.frame':  417 obs. of  6 variables:
    # $ cty_name: chr  "United Arab Emirates" "Argentina" "Argentina" "Argentina" ...
    # $ year    : int  1975 1970 1975 1980 1985 1990 1995 1997 1970 1970 ...
    # $ regime  : num  -7 -9 6 -9 8 8 8 8 -7 -2 ...
    # $ oil     : num  65.9386 0.0241 0.0279 0.361 0.6939 ...
    # $ logGDPcp: num  9.71 7.64 8.07 8.53 8.58 ...
    # $ illit   : num  40.2 7.3 6.5 6.1 5 4.3 3.7 3.5 80.1 89.1 ...
    # - attr(*, "na.action")= 'omit' Named int [1:4113] 1 2 3 4 5 6 7 8 9 10 ...
    #  ..- attr(*, "names")= chr [1:4113] "1" "2" "3" "4" ...
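
To see why a single character column is enough to trigger the error, here is a minimal sketch on a small made-up data frame (`toy` and its values are hypothetical, not taken from resources.csv). It assumes that `scale()` first converts its input with `as.matrix()`, which turns mixed columns into a character matrix that `colMeans()` then rejects.

```r
# Hypothetical toy data with the same mix of column types as resources1
toy <- data.frame(
  cty_name = c("A", "B", "C", "D"),
  regime   = c(-7, -9, 6, 8),
  oil      = c(65.9, 0.02, 0.03, 0.36),
  stringsAsFactors = FALSE
)

# A data frame with one character column becomes an all-character matrix
m <- as.matrix(toy)
is.character(m)

# scale() on the whole data frame fails; capture the error instead of stopping
attempt <- tryCatch(scale(toy), error = function(e) e)
failed <- inherits(attempt, "error")
if (failed) message("scale() failed: ", conditionMessage(attempt))

# Restricting to the numeric columns works
num_cols <- sapply(toy, is.numeric)
scaled_toy <- scale(toy[num_cols])
colnames(scaled_toy)
```

The same reasoning applies to the later `colMeans(g1)` calls in the question: they also need a purely numeric subset.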
    

    So it may be better to apply scale() only to the numeric columns:

    i1 <- sapply(resources1, is.numeric)
    scaled.resources <- scale(resources1[i1])
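
Note that `is.numeric` also keeps `year`, which the assignment does not list. As a rough sketch of the remaining steps, the example below selects the four variables by name instead, fits k-means, and computes per-cluster means on the unstandardized values. The `toy` data frame, its values, and the seed are assumptions for illustration only; with the real data, `resources1` would take the place of `toy`.

```r
set.seed(1)  # kmeans() starts from random centers; the seed is an arbitrary choice

# Hypothetical stand-in for resources1 (made-up values)
toy <- data.frame(
  cty_name = paste0("country", 1:8),
  year     = rep(c(1970L, 1975L), 4),
  regime   = c(-7, -9, -8, -6, 8, 9, 7, 10),
  oil      = c(60, 55, 58, 62, 0.1, 0.3, 0.2, 0.4),
  logGDPcp = c(9.7, 9.5, 9.6, 9.8, 7.6, 7.9, 8.0, 7.7),
  illit    = c(40, 45, 42, 38, 7, 6, 5, 8),
  stringsAsFactors = FALSE
)

# Select the variables named in the assignment, then standardize them
vars <- c("regime", "oil", "logGDPcp", "illit")
scaled_toy <- scale(toy[vars])

# Fit k-means with two clusters and count observations per cluster
fit <- kmeans(scaled_toy, centers = 2)
table(fit$cluster)

# Per-cluster means computed on the original (unstandardized) values
cluster_means <- aggregate(toy[vars], by = list(cluster = fit$cluster), FUN = mean)
cluster_means

# Same result for cluster 1 via colMeans() on the numeric columns only
g1_means <- colMeans(toy[fit$cluster == 1, vars])
```

Selecting columns by name keeps `cty_name` and `year` out of both the scaling and the means, which is what lets `colMeans()` run without the "must be numeric" error.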
    

    【Comments】:
