Building a Convolutional Neural Network with TensorFlow

This article shows how to build a convolutional neural network (CNN) with TensorFlow to solve the digit-recognizer problem on Kaggle.

Data specification

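The dataset Kaggle provides is derived from the MNIST handwritten digit data (whose training set has 60,000 samples). Each handwritten digit is represented as a 28 × 28 grayscale image.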
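As a minimal sketch of that layout, assuming each CSV row stores the 784 pixel values in row-major order (pixel k sits at row k // 28, column k % 28), a flat row can be turned back into a 28 × 28 grid as follows. The helper name `row_to_image` and the sample data are hypothetical and not part of the original script, which does the same job with `tf.reshape`.

```python
WIDTH = HEIGHT = 28

def row_to_image(pixels):
    """Split a flat list of 784 grayscale values into 28 rows of 28 values."""
    if len(pixels) != WIDTH * HEIGHT:
        raise ValueError("expected %d values, got %d" % (WIDTH * HEIGHT, len(pixels)))
    return [pixels[r * WIDTH:(r + 1) * WIDTH] for r in range(HEIGHT)]

# Hypothetical row for illustration: pixel k simply holds the value k.
flat = list(range(WIDTH * HEIGHT))
image = row_to_image(flat)

print(len(image), len(image[0]))  # 28 28
print(image[1][0])                # 28, the first pixel of the second row
```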

Model structure

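The CNN in this article consists of one input layer, two convolutional layers, two pooling layers, one fully connected layer, and an output layer of size 10. The convolutional layers use the ReLU activation function to introduce nonlinearity. The pooling layers use max pooling with a 2 × 2 window and a stride of 2. The output layer uses the softmax activation function to produce floating-point values between 0 and 1 (the outputs of all nodes in the last layer sum to 1).
The model construction follows the TensorFlow Tutorials.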
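To see why the flattened feature vector fed to the fully connected layer has 7 * 7 * 64 entries, the shape of the data can be traced layer by layer. The sketch below is plain arithmetic, not TensorFlow code. It assumes stride-1 convolutions with "same" padding and pooling with "valid" padding, and takes the filter counts 32 and 64 from the code in this article. The helper names `conv_same` and `max_pool` are hypothetical.

```python
def conv_same(height, width, filters):
    """A 'same'-padded, stride-1 convolution keeps height and width."""
    return height, width, filters

def max_pool(height, width, channels, pool=2, stride=2):
    """Max pooling with 'valid' padding shrinks height and width."""
    out_h = (height - pool) // stride + 1
    out_w = (width - pool) // stride + 1
    return out_h, out_w, channels

shape = (28, 28, 1)                         # input layer
shape = conv_same(shape[0], shape[1], 32)   # conv 1 -> (28, 28, 32)
shape = max_pool(*shape)                    # pool 1 -> (14, 14, 32)
shape = conv_same(shape[0], shape[1], 64)   # conv 2 -> (14, 14, 64)
shape = max_pool(*shape)                    # pool 2 -> (7, 7, 64)

flat_size = shape[0] * shape[1] * shape[2]
print(shape, flat_size)  # (7, 7, 64) 3136
```

Each pooling layer halves the spatial size (28 to 14 to 7), while the number of channels is set by the preceding convolution.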

Code implementation

"""
File Name: tf_cnn.py
Author: ce39906
Created Time: 2018-10-25 10:55:56
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import numpy as np
import pandas as pd
import tensorflow as tf
from sklearn.preprocessing import StandardScaler

# NOTE: this script targets the TensorFlow 1.x API (tf.layers, tf.logging, tf.estimator.inputs)
tf.logging.set_verbosity(tf.logging.INFO)

def cnn_model_fn(features, labels, mode):
    # input layer
    input_layer = tf.reshape(features["x"], [-1, 28, 28, 1])
    # convolutional layer 1
    conv1 = tf.layers.conv2d(
        inputs = input_layer,
        filters  = 32,
        kernel_size = [5, 5],
        padding = "same",
        activation = tf.nn.relu)
    # pooling layer 1
    pool1 = tf.layers.max_pooling2d(
        inputs = conv1,
        pool_size = [2, 2],
        strides = 2)
    # convolutional layer 2
    conv2 = tf.layers.conv2d(
        inputs = pool1,
        filters = 64,
        kernel_size = [5, 5],
        padding = "same",
        activation = tf.nn.relu)
    # pooling layer 2
    pool2 = tf.layers.max_pooling2d(
        inputs = conv2,
        pool_size = [2, 2],
        strides = 2)
    # dense layer
    pool2_flat = tf.reshape(pool2, [-1, 7 * 7 * 64])
    dense = tf.layers.dense(
        inputs = pool2_flat,
        units = 1024,
        activation = tf.nn.relu)
    dropout = tf.layers.dropout(
        inputs = dense,
        rate = 0.4,
        training = mode == tf.estimator.ModeKeys.TRAIN)
    # logits layer
    logits = tf.layers.dense(
        inputs = dropout,
        units = 10)
    # predictions: class index and per-class probabilities
    predictions = {
        "classes" : tf.argmax(input = logits, axis = 1),
        "probabilities" : tf.nn.softmax(logits, name = "softmax_tensor")
    }

    if mode == tf.estimator.ModeKeys.PREDICT:
        return tf.estimator.EstimatorSpec(
                    mode = mode,
                    predictions = predictions)

    # calculate loss (both TRAIN and EVAL mode)
    loss = tf.losses.sparse_softmax_cross_entropy(labels = labels, logits = logits)

    # configure the training op
    if mode == tf.estimator.ModeKeys.TRAIN:
        optimizer = tf.train.GradientDescentOptimizer(learning_rate = 0.001)
        train_op = optimizer.minimize(
            loss = loss,
            global_step = tf.train.get_global_step())

        return tf.estimator.EstimatorSpec(
                    mode = mode,
                    loss = loss,
                    train_op = train_op)


    # add evaluation metrics (for EVAL mode)
    eval_metric_ops = {
        "accuracy" : tf.metrics.accuracy(
            labels = labels, predictions = predictions["classes"])}

    return tf.estimator.EstimatorSpec(
            mode = mode,
            loss = loss,
            eval_metric_ops = eval_metric_ops)

def main():
    train_data = pd.read_csv('train.csv')
    test_data = pd.read_csv('test.csv')

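    labels = np.array(train_data.pop('label'))
    # fit the scaler once on the labeled data and reuse it for the test set,
    # so both are standardized with the same mean and variance
    scaler = StandardScaler()
    data = scaler.fit_transform(np.float32(train_data.values))
    validation_size = 10000

    # hold out the last `validation_size` rows for validation
    train_data, valid_data = data[: -validation_size], data[-validation_size :]
    train_labels, valid_labels = labels[: -validation_size], labels[-validation_size :]

    test_data = scaler.transform(np.float32(test_data.values))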

    classifier = tf.estimator.Estimator(
        model_fn = cnn_model_fn,
        model_dir = "/tmp/cnn_model")
    # Set up logging for predictions
    tensors_to_log = {"probabilities" : "softmax_tensor"}
    logging_hook = tf.train.LoggingTensorHook(
        tensors = tensors_to_log, every_n_iter = 50)

    # train the model
    train_input_fn = tf.estimator.inputs.numpy_input_fn(
        x = {"x" : train_data},
        y = train_labels,
        batch_size = 100,
        num_epochs = None,
        shuffle = True)

    classifier.train(
        input_fn = train_input_fn,
        steps = 20000,
        hooks = [logging_hook])

    # evaluate the model and print results
    eval_input_fn = tf.estimator.inputs.numpy_input_fn(
        x = {"x" : valid_data},
        y = valid_labels,
        num_epochs = 1,
        shuffle = False)

    eval_results = classifier.evaluate(input_fn = eval_input_fn)
    print(eval_results)
    # predict labels for the test set
    predict_input_fn = tf.estimator.inputs.numpy_input_fn(
        x = {"x" : test_data},
        num_epochs = 1,
        shuffle = False)

    predict_results = classifier.predict(input_fn = predict_input_fn)

    test_labels = []
    for predict_result in predict_results:
        test_labels.append(predict_result['classes'])
    test_labels = np.array(test_labels)

    submission = pd.DataFrame({'ImageId' : (np.arange(test_labels.shape[0]) + 1),'Label' : test_labels})
    submission.to_csv('submission.csv', index = False)

if __name__ == '__main__':
    main()
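
The prediction and loss steps in `cnn_model_fn` can be illustrated without TensorFlow. The sketch below is a simplified stand-in for what `tf.nn.softmax`, `tf.argmax`, and `tf.losses.sparse_softmax_cross_entropy` compute for a single example. TensorFlow additionally reduces the loss over the whole batch, which is not shown here. The function names and the three-class logits are hypothetical.

```python
import math

def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_cross_entropy(logits, label):
    """Negative log-probability of the true class, given an integer label."""
    return -math.log(softmax(logits)[label])

logits = [2.0, 1.0, 0.1]             # hypothetical scores for 3 classes
probs = softmax(logits)
predicted = probs.index(max(probs))  # plays the role of tf.argmax
loss = sparse_cross_entropy(logits, label=0)

print(predicted, round(sum(probs), 6))  # 0 1.0
```

The label is "sparse" in the sense that it is a single integer class index rather than a one-hot vector, which matches the `label` column read from train.csv.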

Results

Accuracy on the held-out validation set is 97.58%.
After submitting to Kaggle, the accuracy is 97.35%.
