【Question title】:c50 code called exit with value 1 on Mushroom data set [duplicate]
【Posted】:2016-11-27 02:09:03
【Question】:

I get an error when running C5.0 on the Mushroom data set. I have accounted for the target class, and there are no missing values.

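# Read the UCI mushroom data; stringsAsFactors = TRUE keeps the columns as factors on R >= 4.0
f <- file("https://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/agaricus-lepiota.data", open = "r")
data <- read.table(f, sep = ",", header = FALSE, stringsAsFactors = TRUE)
close(f)
str(data)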

which gives

'data.frame':   8124 obs. of  23 variables:
$ V1 : Factor w/ 2 levels "e","p": 2 1 1 2 1 1 1 1 2 1 ...
$ V2 : Factor w/ 6 levels "b","c","f","k",..: 6 6 1 6 6 6 1 1 6 1 ...
$ V3 : Factor w/ 4 levels "f","g","s","y": 3 3 3 4 3 4 3 4 4 3 ...
$ V4 : Factor w/ 10 levels "b","c","e","g",..: 5 10 9 9 4 10 9 9 9 10 ...
$ V5 : Factor w/ 2 levels "f","t": 2 2 2 2 1 2 2 2 2 2 ...
$ V6 : Factor w/ 9 levels "a","c","f","l",..: 7 1 4 7 6 1 1 4 7 1 ...
$ V7 : Factor w/ 2 levels "a","f": 2 2 2 2 2 2 2 2 2 2 ...
$ V8 : Factor w/ 2 levels "c","w": 1 1 1 1 2 1 1 1 1 1 ...
$ V9 : Factor w/ 2 levels "b","n": 2 1 1 2 1 1 1 1 2 1 ...
$ V10: Factor w/ 12 levels "b","e","g","h",..: 5 5 6 6 5 6 3 6 8 3 ...
$ V11: Factor w/ 2 levels "e","t": 1 1 1 1 2 1 1 1 1 1 ...
$ V12: Factor w/ 5 levels "?","b","c","e",..: 4 3 3 4 4 3 3 3 4 3 ...
$ V13: Factor w/ 4 levels "f","k","s","y": 3 3 3 3 3 3 3 3 3 3 ...
$ V14: Factor w/ 4 levels "f","k","s","y": 3 3 3 3 3 3 3 3 3 3 ...
$ V15: Factor w/ 9 levels "b","c","e","g",..: 8 8 8 8 8 8 8 8 8 8 ...
$ V16: Factor w/ 9 levels "b","c","e","g",..: 8 8 8 8 8 8 8 8 8 8 ...
$ V17: Factor w/ 1 level "p": 1 1 1 1 1 1 1 1 1 1 ...
$ V18: Factor w/ 4 levels "n","o","w","y": 3 3 3 3 3 3 3 3 3 3 ...
$ V19: Factor w/ 3 levels "n","o","t": 2 2 2 2 2 2 2 2 2 2 ...
$ V20: Factor w/ 5 levels "e","f","l","n",..: 5 5 5 5 1 5 5 5 5 5 ...
$ V21: Factor w/ 9 levels "b","h","k","n",..: 3 4 4 3 4 3 3 4 3 3 ...
$ V22: Factor w/ 6 levels "a","c","n","s",..: 4 3 3 4 1 3 3 4 5 4 ...
$ V23: Factor w/ 7 levels "d","g","l","m",..: 6 2 4 6 2 2 4 4 2 4 ...

When I run

C5.model <- C5.0(data[1:4000,-1],data[1:4000,1],trials = 3)

I get

c50 code called exit with value 1

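I don't know how to track this down. Any ideas on debugging are appreciated.

Edit1: The error is the same, but the solution is different. Note: when I change the data set, it works.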

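【Comments】:

  • That data set had missing values, so that was the problem there. This data set does not have any missing values.
  • Your data is degenerate. For example, the variables V7 and V17 take only one value.
  • @tchakravarty That is correct, but V7 is actually fine if more rows are included, because it has 2 levels.
  • @tchakravarty Thanks, everyone. That worked.
  • @tchakravarty: What is wrong with V7? Aren't its 2 levels suitable for partitioning the data?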
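
The point raised in the comments, that a factor can have 2 levels and still show only one value in the rows actually used, can be checked with a small sketch. The `toy` data frame and the helper `n_observed` below are hypothetical stand-ins for illustration, not the real mushroom file or anything from the C50 package.

```r
# Toy stand-in for the mushroom data (made-up values, not the real file).
toy <- data.frame(
  V1  = factor(c("p", "e", "e", "p", "e", "p")),
  V7  = factor(c("f", "f", "f", "f", "a", "a")),
  V17 = factor(c("p", "p", "p", "p", "p", "p"))
)

# Count the values actually observed in each column of a row subset.
# nlevels() is not enough here: a factor keeps all of its levels after subsetting.
n_observed <- function(df) {
  vapply(df, function(col) length(unique(col)), integer(1))
}

n_observed(toy[1:4, ])  # V7 and V17 each show a single value in the first 4 rows
n_observed(toy)         # V7 shows two values once all rows are included
```

Running the same helper on `data[1:4000, -1]` would be one way to see which predictors are constant in the training subset.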

Tags: r machine-learning decision-tree


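【Solution 1】:
# Read the data as factors (stringsAsFactors = TRUE is needed on R >= 4.0)
f <- file("https://archive.ics.uci.edu/ml/machine-learning-databases/mushroom/agaricus-lepiota.data", open = "r")
data <- read.table(f, sep = ",", header = FALSE, stringsAsFactors = TRUE)
close(f)
str(data)

pacman::p_load(C50)  # installs C50 if needed, then loads it
# Use column 1 as the class and drop column 17 (V17), which has a single level.
# The data has 8124 rows, so all rows are used instead of indexing past the end.
C5.model <- C5.0(data[, c(2:16, 18:23)], data[, 1], trials = 3)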

Column 17 (V17) is what causes the problem: it has only one level, so it shows no variation at all.
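
Rather than hard-coding the column indices, the constant columns could be dropped programmatically. The following is a minimal sketch using base R only; `drop_constant_cols` and `toy_x` are hypothetical names introduced here, and the commented-out `C5.0` call is an untested suggestion for how it might be combined with the question's data.

```r
# Keep only the columns that show more than one distinct value,
# then drop factor levels that no longer occur.
drop_constant_cols <- function(x) {
  keep <- vapply(x, function(col) length(unique(col)) > 1, logical(1))
  droplevels(x[, keep, drop = FALSE])
}

# Toy predictors (made-up values): V17 is constant, like column 17 above.
toy_x <- data.frame(
  V16 = factor(c("w", "w", "g", "w")),
  V17 = factor(c("p", "p", "p", "p")),
  V18 = factor(c("w", "n", "w", "w"))
)

x_clean <- drop_constant_cols(toy_x)
names(x_clean)  # "V16" "V18"

# With the real data and the C50 package installed, the call might look like:
# C5.model <- C50::C5.0(drop_constant_cols(data[1:4000, -1]), data[1:4000, 1], trials = 3)
```

Because the check runs on the rows actually passed in, it would also catch a column such as V7 that has 2 levels overall but is reported in the comments to take a single value in the subset.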

