The final design of our checkers learning system can be described naturally in terms of four distinct
program modules, which represent the central components of many learning systems.

1. The Performance System: takes a new board as input and outputs a trace of the game it played
against itself.
2. The Critic: takes the trace of a game as input and outputs a set of training examples of the
target function.
3. The Generalizer: takes training examples as input and outputs a hypothesis that estimates the
target function. Good generalization to new cases is crucial.
4. The Experiment Generator: takes the current hypothesis (the currently learned function) as input and
outputs a new problem (an initial board state) for the Performance System to explore.
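
The way the four modules feed one another can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the design's actual implementation: a toy single-pile subtraction game (each player removes 1 to 3 stones, and whoever takes the last stone wins) stands in for checkers, the hypothesis is assumed to be a linear function over two hand-picked features, the Critic is assumed to label every board with the final game outcome, and the Generalizer is assumed to use a simple LMS-style weight update. All function names, the feature set, and the learning rate are hypothetical.

```python
import random
from typing import List, Tuple

Board = int                    # toy stand-in for a checkers board: stones left in the pile
Hypothesis = List[float]       # assumed form: weights of a linear evaluation function
Example = Tuple[Board, float]  # (board, training value for the player who just moved)


def features(board: Board) -> List[float]:
    # Hypothetical features: a bias term and "pile size is a multiple of 4".
    return [1.0, 1.0 if board % 4 == 0 else 0.0]


def evaluate(h: Hypothesis, board: Board) -> float:
    # Estimated value of `board` for the player who just moved to it.
    return sum(w * x for w, x in zip(h, features(board)))


def performance_system(board: Board, h: Hypothesis) -> List[Board]:
    # Plays the game against itself, always moving to the successor the
    # current hypothesis rates highest, and returns the trace of boards.
    trace = []
    while board > 0:
        successors = [board - k for k in (1, 2, 3) if board - k >= 0]
        board = max(successors, key=lambda b: evaluate(h, b))
        trace.append(board)
    return trace


def critic(trace: List[Board]) -> List[Example]:
    # Assumption: each board is labelled with the final outcome, +1 if the
    # player who produced it went on to win and -1 otherwise. The player
    # who produced the last board (an empty pile) is the winner.
    last = len(trace) - 1
    return [(b, 1.0 if (last - i) % 2 == 0 else -1.0) for i, b in enumerate(trace)]


def generalizer(examples: List[Example], h: Hypothesis, lr: float = 0.1) -> Hypothesis:
    # Assumption: an LMS-style update that nudges each weight in proportion
    # to the prediction error, starting from the current hypothesis.
    weights = list(h)
    for board, target in examples:
        x = features(board)
        error = target - sum(w * xi for w, xi in zip(weights, x))
        weights = [w + lr * error * xi for w, xi in zip(weights, x)]
    return weights


def experiment_generator(h: Hypothesis, rng: random.Random) -> Board:
    # Simplest possible strategy: ignore the hypothesis and propose a
    # random starting pile. A smarter generator would inspect `h`.
    return rng.randint(5, 20)


def train(iterations: int = 200, seed: int = 0) -> Hypothesis:
    rng = random.Random(seed)
    h: Hypothesis = [0.0, 0.0]
    for _ in range(iterations):
        board = experiment_generator(h, rng)   # new problem
        trace = performance_system(board, h)   # self-play game trace
        examples = critic(trace)               # training examples
        h = generalizer(examples, h)           # updated hypothesis
    return h
```

Calling `train()` runs the loop end to end and returns the learned weights. The point of the sketch is the data flow: each module consumes exactly what the previous one produces, so any one of them can be replaced (for example, a Critic that uses successor estimates instead of final outcomes) without touching the others.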

Last modified: Wednesday, 18 June 2025, 22:00