
Softmax regression on MNIST data

Posted on 2017-9-18 07:49:34
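This post is based on the tutorial on the TensorFlow website. The input data is MNIST, short for Modified National Institute of Standards and Technology: a dataset of scanned handwritten digits, each with a matching label, collected by that institute and then modified so that machine learning algorithms can read it easily. The dataset can be downloaded from the website of the renowned Professor Yann LeCun.
I first train with sklearn's LogisticRegression(). Plotting the estimated parameters gives the picture below (red means the estimate is negative, blue means it is positive, green means it is zero):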
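For readers who have not seen the file layout before: the MNIST files use the IDX format described on LeCun's page, a header of big-endian 32-bit integers followed by one unsigned byte per pixel. The snippet below is only a sketch of that layout on a tiny hand-made buffer. The function name `parse_idx_images` and the 2x3 toy "images" are made up for illustration; the real files are gzip-compressed and hold 28x28 images.

```python
import struct

def parse_idx_images(buf):
    """Parse an uncompressed IDX3 image buffer into a list of 2-D pixel grids."""
    # Header: magic number, image count, rows, columns (big-endian uint32 each).
    magic, count, rows, cols = struct.unpack_from('>IIII', buf, 0)
    if magic != 2051:  # 0x00000803 marks an image file
        raise ValueError('not an IDX3 image buffer: magic=%d' % magic)
    offset = struct.calcsize('>IIII')
    images = []
    for _ in range(count):
        # Each image is rows*cols unsigned bytes, stored row by row.
        pixels = struct.unpack_from('>%dB' % (rows * cols), buf, offset)
        offset += rows * cols
        images.append([list(pixels[r * cols:(r + 1) * cols]) for r in range(rows)])
    return images

# Hypothetical toy buffer: two 2x3 images whose pixel values are 0..11.
toy = struct.pack('>IIII', 2051, 2, 2, 3) + bytes(bytearray(range(12)))
images = parse_idx_images(toy)
print(images[1])  # [[6, 7, 8], [9, 10, 11]]
```

The label files follow the same idea with a two-integer header (magic number 2049, then the item count) and one byte per label.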

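In the picture, the outline traced by the blue points is fairly close to the shape of the corresponding digit.
I then train a softmax regression on the same dataset with tensorflow. Plotting the estimated parameters gives: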
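The plotting code is not part of this post, so the following is only a guess at how such a picture could be produced: take the weights of one class, lay them out on the image grid, and colour each cell by the sign of its weight. The function name `sign_colors` and the tolerance `tol` (how close to zero counts as "zero") are my own assumptions.

```python
def sign_colors(weights, rows, cols, tol=1e-6):
    """Map a flat weight vector to a rows x cols grid of (R, G, B) tuples by sign."""
    if len(weights) != rows * cols:
        raise ValueError('expected %d weights, got %d' % (rows * cols, len(weights)))
    red, green, blue = (255, 0, 0), (0, 255, 0), (0, 0, 255)
    grid = []
    for r in range(rows):
        row = []
        for w in weights[r * cols:(r + 1) * cols]:
            if w > tol:
                row.append(blue)    # positive estimate
            elif w < -tol:
                row.append(red)     # negative estimate
            else:
                row.append(green)   # estimate is (numerically) zero
        grid.append(row)
    return grid

# Hypothetical 2x3 weight vector; for MNIST each class has 784 weights (28x28).
grid = sign_colors([0.5, 0.0, -0.2, -1.0, 0.3, 0.0], rows=2, cols=3)
```

For the real data you would pass one row of the saved coefficient file with rows=28 and cols=28, then hand the grid to whatever image-plotting routine you prefer.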

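Again the outline traced by the blue points is fairly close to the corresponding digit. Comparing the two pictures, though, the tensorflow result looks somewhat smoother. In terms of test-set accuracy, both are around 92%, with sklearn slightly ahead. Note that 92% may look decent, but it is actually a very low accuracy for MNIST; in the words of the official tutorial, it is something to be embarrassed about.
Code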
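As a reminder of what both libraries are fitting, here is a small plain-Python sketch of the softmax function and the cross-entropy loss for a single sample. It is not taken from the code in this post: the max-subtraction trick is an addition of mine for numerical safety, whereas the tensorflow code in this post takes the log of the softmax output directly.

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    m = max(logits)  # subtracting the max avoids overflow in exp()
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, one_hot):
    """Cross-entropy between predicted probabilities and a one-hot label."""
    # Only the true class (where the label is 1) contributes to the sum.
    return -sum(t * math.log(p) for t, p in zip(one_hot, probs) if t > 0)

# Hypothetical scores for a 3-class problem whose true class is class 0.
probs = softmax([2.0, 1.0, 0.1])
loss = cross_entropy(probs, [1, 0, 0])
```

The loss is simply minus the log of the probability assigned to the true class, so it shrinks towards zero as the model becomes more confident in the right answer.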
#!/usr/bin/env python
# -*- coding: utf-8 -*-
# @author: 陈水平
# @date: 2017-01-10
# @description: implement a softmax regression model upon MNIST handwritten digits
# @ref: http://yann.lecun.com/exdb/mnist/

import gzip
import struct
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn import preprocessing
from sklearn.metrics import accuracy_score
import tensorflow as tf

# MNIST data is stored in a binary format (IDX),
# and we turn it into numpy ndarray objects with the following two utility functions
def read_image(file_name):
    with gzip.open(file_name, 'rb') as f:
        buf = f.read()
    # Header: magic number, image count, rows, columns (big-endian uint32)
    magic, images, rows, columns = struct.unpack_from('>IIII', buf, 0)
    offset = struct.calcsize('>IIII')

    # The pixels follow as unsigned bytes; read them straight from the buffer
    # instead of unpacking millions of values into a Python tuple
    im_array = np.frombuffer(buf, dtype=np.uint8,
                             count=images * rows * columns, offset=offset)
    return im_array.reshape(images, rows, columns)

def read_label(file_name):
    with gzip.open(file_name, 'rb') as f:
        buf = f.read()
    # Header: magic number, label count (big-endian uint32)
    magic, count = struct.unpack_from('>II', buf, 0)
    offset = struct.calcsize('>II')

    # One unsigned byte per label (digits 0 to 9)
    label_array = np.frombuffer(buf, dtype=np.uint8, count=count, offset=offset)
    return label_array

print("Start processing MNIST handwritten digits data...")
# Flatten each 28x28 image into a 784-element float vector
train_x_data = read_image("MNIST_data/train-images-idx3-ubyte.gz")
train_x_data = train_x_data.reshape(train_x_data.shape[0], -1).astype(np.float32)
train_y_data = read_label("MNIST_data/train-labels-idx1-ubyte.gz")
test_x_data = read_image("MNIST_data/t10k-images-idx3-ubyte.gz")
test_x_data = test_x_data.reshape(test_x_data.shape[0], -1).astype(np.float32)
test_y_data = read_label("MNIST_data/t10k-labels-idx1-ubyte.gz")

# Scale pixel values from [0, 255] to [0, 1]
train_x_minmax = train_x_data / 255.0
test_x_minmax = test_x_data / 255.0

# Of course you can also use the utility function to read in MNIST provided by tensorflow
# from tensorflow.examples.tutorials.mnist import input_data
# mnist = input_data.read_data_sets("MNIST_data/", one_hot=False)
# train_x_minmax = mnist.train.images
# train_y_data = mnist.train.labels
# test_x_minmax = mnist.test.images
# test_y_data = mnist.test.labels

# We evaluate the softmax regression model by sklearn first
eval_sklearn = True  # must be True to get the sklearn lines shown in the output
if eval_sklearn:
    print("Start evaluating softmax regression model by sklearn...")
    reg = LogisticRegression(solver="lbfgs", multi_class="multinomial")
    reg.fit(train_x_minmax, train_y_data)
    np.savetxt('coef_softmax_sklearn.txt', reg.coef_, fmt='%.6f')  # Save coefficients to a text file
    test_y_predict = reg.predict(test_x_minmax)
    print("Accuracy of test set: %f" % accuracy_score(test_y_data, test_y_predict))

eval_tensorflow = True
batch_gradient = False  # True: full-batch gradient descent; False: random mini-batches of 100
if eval_tensorflow:
    print("Start evaluating softmax regression model by tensorflow...")
    # reformat y into one-hot encoding style
    lb = preprocessing.LabelBinarizer()
    lb.fit(train_y_data)
    train_y_data_trans = lb.transform(train_y_data)
    test_y_data_trans = lb.transform(test_y_data)

    # Model: y = softmax(x W + b), with 784 pixels in and 10 classes out
    x = tf.placeholder(tf.float32, [None, 784])
    W = tf.Variable(tf.zeros([784, 10]))
    b = tf.Variable(tf.zeros([10]))
    V = tf.matmul(x, W) + b
    y = tf.nn.softmax(V)

    # Placeholder for the true one-hot labels
    y_ = tf.placeholder(tf.float32, [None, 10])

    # Cross-entropy loss, averaged over the batch
    loss = tf.reduce_mean(-tf.reduce_sum(y_ * tf.log(y), reduction_indices=[1]))
    optimizer = tf.train.GradientDescentOptimizer(0.5)
    train = optimizer.minimize(loss)

    # tf.initialize_all_variables() is deprecated in favour of this call
    init = tf.global_variables_initializer()

    sess = tf.Session()
    sess.run(init)

    if batch_gradient:
        for step in range(300):
            sess.run(train, feed_dict={x: train_x_minmax, y_: train_y_data_trans})
            if step % 10 == 0:
                print("Batch Gradient Descent processing step %d" % step)
        print("Finally got the estimated results; that took a long time...")
    else:
        for step in range(1000):
            # Draw 100 random training samples (with replacement) for this step
            sample_index = np.random.choice(train_x_minmax.shape[0], 100)
            batch_xs = train_x_minmax[sample_index, :]
            batch_ys = train_y_data_trans[sample_index, :]
            sess.run(train, feed_dict={x: batch_xs, y_: batch_ys})
            if step % 100 == 0:
                print("Stochastic Gradient Descent processing step %d" % step)
    np.savetxt('coef_softmax_tf.txt', np.transpose(sess.run(W)), fmt='%.6f')  # Save coefficients to a text file
    # A prediction is correct when the most probable class matches the label
    correct_prediction = tf.equal(tf.argmax(y, 1), tf.argmax(y_, 1))
    accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
    print("Accuracy of test set: %f" % sess.run(accuracy, feed_dict={x: test_x_minmax, y_: test_y_data_trans}))
The output is as follows:
Start processing MNIST handwritten digits data...
Start evaluating softmax regression model by sklearn...
Accuracy of test set: 0.926300
Start evaluating softmax regression model by tensorflow...
Stochastic Gradient Descent processing step 0
Stochastic Gradient Descent processing step 100
Stochastic Gradient Descent processing step 200
Stochastic Gradient Descent processing step 300
Stochastic Gradient Descent processing step 400
Stochastic Gradient Descent processing step 500
Stochastic Gradient Descent processing step 600
Stochastic Gradient Descent processing step 700
Stochastic Gradient Descent processing step 800
Stochastic Gradient Descent processing step 900
Accuracy of test set: 0.917400
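To make the mini-batch branch less of a black box, here is a rough plain-Python sketch of what one gradient-descent update does for softmax regression with a cross-entropy loss. It is not what tensorflow runs internally (tensorflow derives the gradients automatically); it only uses the standard result that the gradient with respect to each logit is the predicted probability minus the one-hot label. The helper names, the toy 2-D data, the batch size of 8 and the 200 steps are all made up for illustration.

```python
import math
import random

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def predict(W, b, x):
    """Class probabilities for one sample: softmax(x . W + b)."""
    logits = [sum(x[i] * W[i][k] for i in range(len(x))) + b[k] for k in range(len(b))]
    return softmax(logits)

def sgd_step(W, b, batch_x, batch_y, lr):
    """One gradient-descent update on the mean cross-entropy of a mini-batch."""
    n_feat, n_cls, n = len(W), len(b), len(batch_x)
    grad_W = [[0.0] * n_cls for _ in range(n_feat)]
    grad_b = [0.0] * n_cls
    for x, y in zip(batch_x, batch_y):
        p = predict(W, b, x)
        for k in range(n_cls):
            err = (p[k] - y[k]) / n  # d(loss)/d(logit_k), averaged over the batch
            grad_b[k] += err
            for i in range(n_feat):
                grad_W[i][k] += x[i] * err
    for i in range(n_feat):
        for k in range(n_cls):
            W[i][k] -= lr * grad_W[i][k]
    for k in range(n_cls):
        b[k] -= lr * grad_b[k]

def mean_loss(W, b, xs, ys):
    """Mean cross-entropy over a dataset with one-hot labels."""
    return -sum(math.log(predict(W, b, x)[y.index(1)]) for x, y in zip(xs, ys)) / len(xs)

# Hypothetical toy data: three well-separated 2-D points, one per class.
xs = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]] * 10
ys = [[1, 0, 0], [0, 1, 0], [0, 0, 1]] * 10

random.seed(0)
W = [[0.0] * 3 for _ in range(2)]  # zero initialisation, as in the post
b = [0.0] * 3
loss_before = mean_loss(W, b, xs, ys)
for step in range(200):
    idx = [random.randrange(len(xs)) for _ in range(8)]  # sample with replacement
    sgd_step(W, b, [xs[i] for i in idx], [ys[i] for i in idx], lr=0.5)
loss_after = mean_loss(W, b, xs, ys)
```

With all-zero weights every class gets probability 1/3, so the starting loss is log(3); the updates then push it down. The learning rate 0.5 matches the one passed to GradientDescentOptimizer in the code above.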