【Question Title】: How to set the optimizer for logistic regression in Apache Spark MLlib with Python
【Posted】: 2014-06-10 22:31:16
【Question】:

I am starting to run some tests with Apache Spark MLlib:

import numpy
from pyspark.mllib.classification import LogisticRegressionWithSGD

def mapper(line):
    # Parse one CSV line; the last column is the label, which is moved
    # to the front so each record is [label, feature_1, ..., feature_n].
    feats = line.strip().split(',')
    label = feats[len(feats) - 1]
    feats = feats[:len(feats) - 1]
    feats.insert(0, label)
    return numpy.array([float(feature) for feature in feats])

def test3():
    data = sc.textFile('/home/helxsz/Dropbox/exercise/spark/data_banknote_authentication.txt')
    parsed = data.map(mapper)
    logistic = LogisticRegressionWithSGD()
    logistic.optimizer.setNumIterations(200).setMiniBatchFraction(0.1)
    model = logistic.run(parsed)
    labelsAndPreds = parsed.map(lambda points: (int(points[0]), model.predict(points[1:len(points)])))
    trainErr = labelsAndPreds.filter(lambda (v, p): v != p).count() / float(parsed.count())
    print 'training error = ' + str(trainErr)

But when I use LogisticRegressionWithSGD as follows:

logistic = LogisticRegressionWithSGD()
logistic.optimizer.setNumIterations(200).setMiniBatchFraction(0.1)

it raises the error: AttributeError: 'LogisticRegressionWithSGD' object has no attribute 'optimizer'

Here is the API documentation for LogisticRegressionWithSGD and GradientDescent.

【Question Comments】:

Tags: python bigdata apache-spark


【Solution 1】:

In the Python API, you set these parameters when calling 'train':

    model = LogisticRegressionWithSGD.train(parsed, iterations=200, miniBatchFraction=0.1)
    

The only documentation I could find is the source code.
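Putting it together: the Python API exposes no `optimizer` attribute, so the iteration count and mini-batch fraction go to the `train` class method instead. A minimal sketch of the corrected flow is below; the Spark-dependent part is shown as comments since it needs a live SparkContext, and the sample CSV line is an illustrative banknote-authentication record, not taken from the asker's file:

```python
import numpy

def mapper(line):
    # Parse one CSV line and move the trailing label to the front, so the
    # record layout is [label, feature_1, ..., feature_n] as the asker's
    # evaluation code (points[0] vs points[1:]) expects.
    feats = line.strip().split(',')
    label = feats.pop()          # last column is the label
    feats.insert(0, label)
    return numpy.array([float(f) for f in feats])

# Example record (illustrative values in the banknote dataset's format):
row = mapper('3.6216,8.6661,-2.8073,-0.44699,0')
# row[0] is the label 0.0; row[1:] holds the four features.

# With a SparkContext `sc` available, training then passes the optimizer
# settings directly to `train` instead of touching an `optimizer` attribute:
#
#   from pyspark.mllib.classification import LogisticRegressionWithSGD
#   parsed = sc.textFile('data_banknote_authentication.txt').map(mapper)
#   model = LogisticRegressionWithSGD.train(parsed, iterations=200,
#                                           miniBatchFraction=0.1)
```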

【Discussion】:
