Step 2 — Importing the MNIST Dataset
The dataset we will be using in this tutorial is called the MNIST dataset,
and it is a classic in the machine learning community. This dataset is
made up of images of handwritten digits, 28x28 pixels in size. Here are
some examples of the digits included in the dataset:

[Figure: Examples of MNIST images]

Let’s create a Python program to work with this dataset. We will use
one file for all of our work in this tutorial. Create a new file called
main.py:

(tensorflow-demo) $ touch main.py

Now open this file in your text editor of choice and add this line of
code to the file to import the TensorFlow library:

main.py
import tensorflow as tf

Add the following lines of code to your file to import the MNIST
dataset and store it in the variable mnist. Note that the
tensorflow.examples.tutorials module is part of TensorFlow 1.x and is
not available in TensorFlow 2:

main.py
...
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("MNIST_data/", one_hot=True)  # y labels are one-hot encoded

When reading in the data, we use one-hot encoding to represent the
labels (the actual digit drawn, e.g. “3”) of the images. One-hot
encoding uses a vector of binary values to represent numeric or
categorical values. As our labels are the digits 0-9, the vector
contains ten values, one for each possible digit. One of these values is
set to 1, to represent the digit at that index of the vector, and the
rest are set to 0. For example, the digit 3 is represented using the
vector [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]. As the value at index 3 is
stored as 1, the vector represents the digit 3.
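
The encoding described above can be sketched in plain Python. This is a
minimal illustration, not the code TensorFlow uses internally; the
helper names one_hot and decode are hypothetical and chosen only for
this example.

```python
NUM_CLASSES = 10  # one slot for each digit 0-9


def one_hot(digit, num_classes=NUM_CLASSES):
    """Return a list with a 1 at index `digit` and 0 everywhere else."""
    if not 0 <= digit < num_classes:
        raise ValueError("digit must be between 0 and num_classes - 1")
    vector = [0] * num_classes
    vector[digit] = 1
    return vector


def decode(vector):
    """Recover the digit by finding the index that holds the 1."""
    return vector.index(max(vector))


encoded = one_hot(3)
print(encoded)          # [0, 0, 0, 1, 0, 0, 0, 0, 0, 0]
print(decode(encoded))  # 3
```

Decoding by the position of the largest value is the same idea that is
commonly applied to a network's output vector, where the largest entry
is taken as the predicted digit.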

To represent the images themselves, the 28x28 pixels are flattened
into a 1D vector of 784 values. In the original MNIST files, each of the
784 pixels is stored as a value between 0 and 255, which gives the
grayscale intensity of the pixel, as the images are in black and white
only: a black pixel is represented by 255, a white pixel by 0, and the
shades of gray fall in between. Note that read_data_sets, with its
default settings, rescales these values to floats between 0 and 1 when
it loads the images.
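
To make the flattening concrete, here is a small sketch in plain Python.
The image below is a made-up blank image with a single dark pixel, not
real MNIST data, and dividing by 255 is shown only as one common way to
bring pixel values into the range 0 to 1.

```python
HEIGHT, WIDTH = 28, 28

# Hypothetical image: all white (0) except one black pixel (255).
image = [[0] * WIDTH for _ in range(HEIGHT)]
image[5][10] = 255  # row 5, column 10

# Flatten row by row: the pixel at (row, col) lands at row * WIDTH + col.
flat = [pixel for row in image for pixel in row]
index = 5 * WIDTH + 10

# Rescale from the 0-255 range to the 0-1 range.
scaled = [pixel / 255 for pixel in flat]

print(len(flat))      # 784
print(index)          # 150
print(flat[index])    # 255
print(scaled[index])  # 1.0
```

Flattening discards the explicit row and column layout, so the 2D
position of a pixel is only recoverable through the index arithmetic
shown in the comments.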

We can use the mnist variable to find out the size of the dataset we
have just imported. Looking at the num_examples for each of the three
subsets, we can determine that the dataset has been split into 55,000
images for training, 5,000 for validation, and 10,000 for testing. Add
the following lines to your file:

main.py
...
n_train = mnist.train.num_examples            # 55,000
n_validation = mnist.validation.num_examples  # 5,000
n_test = mnist.test.num_examples              # 10,000
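
As a quick sanity check on these numbers, the sketch below works out
what share of the data each subset holds. It uses the sizes quoted above
as plain constants rather than reading them from the mnist object, so it
runs without TensorFlow; the variable names are illustrative only.

```python
# Subset sizes as stated in the text above.
split_sizes = {"train": 55000, "validation": 5000, "test": 10000}

total = sum(split_sizes.values())  # 70000 images overall
fractions = {name: size / total for name, size in split_sizes.items()}

for name, size in split_sizes.items():
    print(f"{name:<10} {size:>6,} images  ({fractions[name]:.1%})")
```

In main.py itself, the same calculation could use n_train,
n_validation, and n_test in place of the constants.
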
Now that we have our data imported, it’s time to think about the
neural network.

Last modified: Wednesday, 25 June 2025, 11:55 AM