Machine Learning
A training set of examples with the correct responses (targets) is provided and, based on this
training set, the algorithm generalises to respond correctly to all possible inputs. This is also called
learning from exemplars. Supervised learning is the machine learning task of learning a function that
maps an input to an output based on example input-output pairs.
In supervised learning, each example in the training set is a pair consisting of an input object
(typically a vector) and an output value. A supervised learning algorithm analyzes the training data and
produces a function, which can be used for mapping new examples. In the optimal case, the function
will correctly determine the class labels for unseen instances. Both classification and regression
problems are supervised learning problems. A wide range of supervised learning algorithms is
available, each with its strengths and weaknesses. There is no single learning algorithm that works best
on all supervised learning problems.
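
As a minimal sketch of the idea that a learner turns input-output pairs into a function usable on new inputs, the Python snippet below builds a 1-nearest-neighbour classifier. The algorithm choice, the toy data, the labels "a" and "b", and the name fit_nearest_neighbour are all illustrative assumptions, not something prescribed by the text.

```python
import math

# Hypothetical training set: each example is an (input vector, target) pair.
training_set = [
    ((1.0, 1.0), "a"),
    ((1.5, 2.0), "a"),
    ((5.0, 5.0), "b"),
    ((6.0, 4.5), "b"),
]


def fit_nearest_neighbour(examples):
    """Return a function that maps an input vector to a predicted label."""
    def predict(x):
        # Use the label of the stored example closest to x (Euclidean distance).
        _, label = min(examples, key=lambda pair: math.dist(pair[0], x))
        return label
    return predict


# "Learning" yields a function; applying it to unseen inputs is prediction.
model = fit_nearest_neighbour(training_set)
print(model((1.2, 1.4)))  # unseen input close to the "a" examples
print(model((5.5, 5.0)))  # unseen input close to the "b" examples
```

Nearest neighbour is used here only because it is short. Any other supervised learner would fill the same role of producing a mapping from inputs to outputs.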
Figure 1.4: Supervised learning
Remarks
Supervised learning is so called because the process of an algorithm learning from the
training dataset can be thought of as a teacher supervising the learning process. We know the correct
answers (that is, the correct outputs); the algorithm iteratively makes predictions on the training data
and is corrected by the teacher. Learning stops when the algorithm achieves an acceptable level of
performance.
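
The predict, correct, and stop cycle described in this remark can be sketched with the perceptron learning rule. This is one illustrative choice of algorithm, not one named in the text. The toy data, the function names, and the parameters learning_rate, target_accuracy and max_epochs are assumptions made for the example.

```python
# Hypothetical, linearly separable training set: (input vector, target in {0, 1}).
training_set = [
    ((2.0, 1.0), 0),
    ((1.0, 3.0), 0),
    ((6.0, 5.0), 1),
    ((7.0, 8.0), 1),
]


def predict(weights, bias, x):
    """Threshold a weighted sum of the inputs to get a 0/1 prediction."""
    activation = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1 if activation > 0 else 0


def train(examples, learning_rate=1.0, target_accuracy=1.0, max_epochs=1000):
    """Perceptron rule: predict, compare with the known answer, correct, repeat."""
    weights, bias = [0.0, 0.0], 0.0
    accuracy, epoch = 0.0, 0
    for epoch in range(1, max_epochs + 1):
        for x, target in examples:
            # The "teacher" signal: difference between the correct answer and the guess.
            error = target - predict(weights, bias, x)
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
        # Measure performance on the training data after each pass.
        correct = sum(predict(weights, bias, x) == t for x, t in examples)
        accuracy = correct / len(examples)
        if accuracy >= target_accuracy:
            break  # acceptable performance reached, so learning stops
    return weights, bias, accuracy, epoch


weights, bias, accuracy, epochs = train(training_set)
print(f"stopped after {epochs} epoch(s) with training accuracy {accuracy:.2f}")
```

The stopping test here uses accuracy on the training data only, mirroring the remark. In practice, performance is usually judged on held-out data as well.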
Example
Consider the following data regarding patients entering a clinic. The data consists of the gender
and age of each patient, and each patient is labeled as “healthy” or “sick”.