机器学习: 1.7.2 列表-然后-消除算法（The LIST-THEN-ELIMINATE algorithm）

列表-然后-消除算法首先将版本空间初始化为包含 H 中所有假设，然后消除任何与训练示例不一致的假设。

VersionSpace 包含 H 中所有假设的列表。
对于每个训练示例 (x,c(x))，从 VersionSpace 中删除任何满足 $h (x) \neq= c (x)$ 的假设 h。
输出 VersionSpace 中的假设列表。

只要版本空间是有限的，列表-然后-消除算法在原则上是可行的。
然而，由于它在实践中需要穷举所有假设，因此不可行。

版本空间的更紧凑表示

版本空间由其最一般成员和最不一般成员表示。这些成员构成了一般边界集和特殊边界集，在部分有序的假设空间中划定了版本空间的范围。

定义：一般边界 G，是相对于假设空间 H 和训练数据 D 而言，与 D 一致的 H 中最大一般成员的集合。

$G \equiv {g \in H ∣ Consistent (g, D) \land (\neg\exists g' \in H) [(g' > g g) \land Consistent (g', D)]}$

定义：特殊边界 S，是相对于假设空间 H 和训练数据 D 而言，与 D 一致的 H 中最小一般（即最大特殊）成员的集合。

$S \equiv {s \in H ∣ Consistent (s, D) \land (\neg\exists s' \in H) [(s > g s') \land Consistent (s', D)]}$

定理：版本空间表示定理

定理： 设 X 是一个任意实例集合，设 H 是定义在 X 上的布尔值假设集合。设 $c : X \to {0, 1}$ 是定义在 X 上的任意目标概念，设 D 是一个由 {(x,c(x))} 构成的任意训练示例集。对于所有 X,H,c, 和 D，使得 S 和 G 定义良好，有：

$V S H, D = {h \in H ∣ (\exists s \in S) (\exists g \in G) (g \geq g h \geq g s)}$

证明草图：

1. 证明满足上述表达式右侧的每个 h 都属于 VSH,D

设 g,h,s 分别是 G,H,S 的任意成员，且满足 $g \geq g h \geq g s$ 。

根据 S 的定义，s 必须被 D 中的所有正例满足。因为 $h \geq g s$ ，所以 h 也必须被 D 中的所有正例满足。
根据 G 的定义，g 不能被 D 中的任何负例满足，并且因为 $g \geq g h$ ，所以 h 也不能被 D 中的任何负例满足。由于 h 被 D 中的所有正例满足，并且不被 D 中的任何负例满足，因此 h 与 D 一致，从而 h 是 VSH,D 的成员。

2. 证明 VSH,D 的每个成员都满足表达式的右侧

这可以通过假设 VSH,D 中存在不满足表达式右侧的某个 h，然后证明这会导致不一致来完成。

The LIST-THEN-ELIMINATE algorithm first initializes the version space to contain all hypotheses in H
and then eliminates any hypothesis found inconsistent with any training example.
1.
2.
3.
VersionSpace c a list containing every hypothesis in H
For each training example, (x, c(x)) remove from VersionSpace any hypothesis h for which h(x) ≠ c(x)
Output the list of hypotheses in VersionSpace


List-Then-Eliminate works in principle, so long as version space is finite.
However, since it requires exhaustive enumeration of all hypotheses in practice it is not feasible.
A More Compact Representation for Version Spaces
The version space is represented by its most general and least general members. These members form general
and specific boundary sets that delimit the version space within the partially ordered hypothesis space.
Definition: The general boundary G, with respect to hypothesis space H and training data D, is the set of
maximally general members of H consistent with D
G {g  H | Consistent (g, D)(g'  H)[(g'  g)  Consistent(g', D)]}
g
Definition: The specific boundary S, with respect to hypothesis space H and training data D, is the set of
minimally general (i.e., maximally specific) members of H consistent with D.
S {s  H | Consistent (s, D)(s'  H)[(s  s')  Consistent(s', D)]}
g
Theorem: Version Space representation theorem
Theorem: Let X be an arbitrary set of instances and Let H be a set of Boolean-valued hypotheses defined over X.
Let c: X →{O, 1} be an arbitrary target concept defined over X, and let D be an arbitrary set of training exa{(x, c(x))). For all X, H, c, and D such that S and G are well defined,
VS
={ h  H | (s  S ) (g  G ) ( g  h  s )}
H,D
g
g
To Prove:
1. Every h satisfying the right hand side of the above expression is in VS
H, D
2. Every member of VS satisfies the right-hand side of the expression
H, D
Sketch of proof:
1. let g, h, s be arbitrary members of G, H, S respectively with g  h  s
g g By the definition of S, s must be satisfied by all positive examples in D. Because h  s,
gh must also be satisfied by all positive examples in D.
 By the definition of G, g cannot be satisfied by any negative example in D, and because g  h
gh cannot be satisfied by any negative example in D. Because h is satisfied by all positive
examples in D and by no negative examples in D, h is consistent with D, and therefore h is a
member of VSH,D.
2. It can be proven by assuming some h in VSH,D ,that does not satisfy the right-hand side of
the expression, then showing that this leads to an inconsistency

Last modified: Thursday, 19 June 2025, 9:13 AM