For any learning system, we need to know three elements: T (task), P (performance measure), and E (training experience). At a high level, the learning process is outlined below.

The learning process starts with task T, performance measure P, and training experience E, and the objective is to find an unknown target function. The target function is the exact knowledge to be learned from the training experience, and it is unknown. For example, in credit approval, the learning system has customer application records as its experience, and the task is to classify whether a given customer application is eligible for a loan. In this case, the training examples can be represented as (x1, y1), (x2, y2), ..., (xn, yn), where each xi holds the details of one customer application and yi is the corresponding credit approval status.

With these details, what is the exact knowledge to be learned from the training experience?

The target function to be learned in the credit approval learning system is a mapping function f: X → y. This function represents the exact knowledge defining the relationship between the input variable X and the output variable y.
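
As a minimal sketch of this setup, the Python snippet below represents a handful of training examples as (x, y) pairs and defines one candidate approximation of f. The feature names (annual income, existing debt), the data values, and the threshold rule are all assumptions made for illustration; none of them come from the text, and a real learning system would fit the rule from data rather than hard-code it.

```python
from typing import Callable, List, Tuple

# Hypothetical application details x: (annual_income, existing_debt).
Application = Tuple[float, float]
# Each example pairs x with the approval status y (True = approved).
Example = Tuple[Application, bool]

# Made-up training experience E: pairs (x_i, y_i).
training_examples: List[Example] = [
    ((60000.0, 5000.0), True),
    ((25000.0, 20000.0), False),
    ((80000.0, 10000.0), True),
    ((30000.0, 28000.0), False),
]


def hypothesis(x: Application) -> bool:
    """A candidate approximation of the unknown f: X -> y (illustrative rule only)."""
    income, debt = x
    return debt < 0.5 * income


def accuracy(h: Callable[[Application], bool], examples: List[Example]) -> float:
    """Fraction of examples on which h agrees with the recorded label."""
    correct = sum(1 for x, y in examples if h(x) == y)
    return correct / len(examples)


print(accuracy(hypothesis, training_examples))
```

The true f stays unknown throughout; the sketch only checks how well one hand-picked hypothesis agrees with the recorded examples.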
Design of a learning system
We have now looked at the learning process and the goal of learning. To design a learning system that follows this process, we need to make a few design choices. These choices determine the following key components:
1. Type of training experience
2. Choosing the target function
3. Choosing a representation for the target function
4. Choosing an approximation algorithm for the target function
5. The final design
We will use the checkers learning problem as an example and apply the above design choices. For the checkers learning problem, the three elements are:
1. Task T: Playing checkers
2. Performance measure P: Percentage of games won in the tournament
3. Training experience E: A set of games played against itself
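
To make the performance measure P concrete, here is a small Python sketch that computes the percentage of games won from a list of outcomes. The function name `percent_games_won`, the outcome encoding ("win", "loss", "draw"), and the sample tournament record are hypothetical choices for this illustration, not part of the original problem statement.

```python
from typing import List


def percent_games_won(results: List[str]) -> float:
    """Performance measure P: percentage of games won.

    `results` holds one outcome per game: "win", "loss", or "draw"
    (this encoding is an assumption made for the sketch).
    """
    if not results:
        raise ValueError("at least one game result is required")
    wins = sum(1 for r in results if r == "win")
    return 100.0 * wins / len(results)


# Hypothetical tournament record: 3 wins out of 5 games.
tournament = ["win", "loss", "win", "draw", "win"]
print(percent_games_won(tournament))  # 60.0
```

Draws count as games not won here; a different convention (for example, excluding draws) would be an equally valid reading of P and would change the result.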

Last modified: Wednesday, 18 June 2025, 2:54 PM