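Machine Learning
Logical models use logical expressions to divide the instance space into segments and thereby construct grouping models. A logical expression is an expression that evaluates to a Boolean value, i.e., True or False. Once the data has been grouped by such expressions, each group is homogeneous with respect to the problem being solved. In a classification problem, for example, all instances in a group belong to the same class.
There are two main kinds of logical models: tree models and rule models.
Rule models consist of a collection of implications, or IF-THEN rules. In a tree-based model, the "if-part" defines a segment and the "then-part" defines the behaviour of the model for that segment; each path from the root to a leaf can be read this way. Rule models follow the same reasoning.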
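The IF-THEN reading above can be sketched in a few lines of Python. This is a minimal illustration, not a method from the text: the rules, the attribute names `sky` and `wind`, and the default label are all hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Instance = Dict[str, str]
# A rule pairs an if-part (a Boolean test on an instance) with a then-part (a label).
Rule = Tuple[Callable[[Instance], bool], str]

# Hypothetical rules, invented for illustration only.
RULES: List[Rule] = [
    (lambda x: x["sky"] == "Sunny" and x["wind"] == "Strong", "yes"),
    (lambda x: x["sky"] == "Rainy", "no"),
]
DEFAULT_LABEL = "no"  # assumed label for instances that no rule covers


def predict(instance: Instance) -> str:
    """Return the then-part of the first rule whose if-part is true."""
    for if_part, then_part in RULES:
        if if_part(instance):
            return then_part
    return DEFAULT_LABEL


print(predict({"sky": "Sunny", "wind": "Strong"}))  # yes
print(predict({"sky": "Cloudy", "wind": "Weak"}))   # no (default)
```

Each if-part is a logical expression, so the set of instances for which it is true forms one segment of the instance space, and every instance in that segment receives the same prediction.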
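Logical models and concept learning
To understand logical models further, we first need the idea of concept learning.
Concept learning involves learning logical expressions, or concepts, from examples. It fits well with the general idea of machine learning, which is to infer a general function from specific training examples, and it forms the basis of both tree-based and rule-based models. More formally, concept learning means acquiring the definition of a general category from a given set of positive and negative training examples of that category. A formal definition of concept learning is "the inferring of a Boolean-valued function from training examples of its input and output." In concept learning, we learn a description of the positive class only and label everything that does not satisfy that description as negative.
The following example explains this idea in more detail.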
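The concept learning task called "Enjoy Sport", as shown above, is defined by a data set of example days. Each day is described by six attributes. The task is to learn to predict the value of Enjoy Sport for an arbitrary day from its attribute values. The problem can be represented by a series of hypotheses, each described by a conjunction of constraints on the attributes. The training data is a set of positive and negative examples of the target function. In this example, each hypothesis is a vector of six constraints, one for each of the six attributes: Sky, AirTemp, Humidity, Wind, Water, and Forecast. The training phase involves learning the set of days (expressed as a conjunction of attribute constraints) for which Enjoy Sport = yes.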
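A hypothesis of this kind can be sketched as a tuple of six constraints that is checked against a day. The `"?"` wildcard, meaning "any value is acceptable", is an assumption borrowed from common textbook notation and is not defined in the text; the hypothesis and the two days below are made up for illustration.

```python
from typing import Tuple

ATTRIBUTES = ("Sky", "AirTemp", "Humidity", "Wind", "Water", "Forecast")
ANY = "?"  # assumed wildcard: the attribute may take any value

Hypothesis = Tuple[str, ...]  # one constraint per attribute, in ATTRIBUTES order
Day = Tuple[str, ...]         # one value per attribute, in ATTRIBUTES order


def matches(hypothesis: Hypothesis, day: Day) -> bool:
    """True if the day satisfies every constraint (a conjunction)."""
    return all(c == ANY or c == v for c, v in zip(hypothesis, day))


# Hypothetical hypothesis: Sky = Sunny AND AirTemp = Warm AND Wind = Strong.
h = ("Sunny", "Warm", ANY, "Strong", ANY, ANY)

# Made-up example days, not taken from the data set in the text.
day_a = ("Sunny", "Warm", "Normal", "Strong", "Warm", "Same")
day_b = ("Rainy", "Cold", "High", "Strong", "Warm", "Change")

print(matches(h, day_a))  # True
print(matches(h, day_b))  # False
```

A day that satisfies the hypothesis is classified as a positive example (Enjoy Sport = yes); any other day is classified as negative.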
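Thus, the problem can be formulated as:
- Given instances X, the set of all possible days, each described by the attributes:
  - Sky (values: Sunny, Cloudy, Rainy),
  - AirTemp (values: Warm, Cold),
  - Humidity (values: Normal, High),
  - Wind (values: Strong, Weak),
  - Water (values: Warm, Cold),
  - Forecast (values: Same, Change).
- Identify a function that predicts the target variable Enjoy Sport as yes/no, i.e., 1 or 0.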
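The formulation above can be made concrete by enumerating the instance space and writing a candidate target function over it. This is a sketch under stated assumptions: the attribute domains come from the list above, while `enjoy_sport` is a hypothetical candidate function, not a learned or correct one.

```python
from itertools import product
from typing import Dict, List

# Attribute domains as listed in the problem formulation.
DOMAINS = {
    "Sky": ("Sunny", "Cloudy", "Rainy"),
    "AirTemp": ("Warm", "Cold"),
    "Humidity": ("Normal", "High"),
    "Wind": ("Strong", "Weak"),
    "Water": ("Warm", "Cold"),
    "Forecast": ("Same", "Change"),
}


def all_days() -> List[Dict[str, str]]:
    """Enumerate the instance space X: every combination of attribute values."""
    names = list(DOMAINS)
    return [dict(zip(names, values)) for values in product(*DOMAINS.values())]


def enjoy_sport(day: Dict[str, str]) -> int:
    """Hypothetical candidate function: 1 (yes) on sunny, warm days, else 0 (no)."""
    return int(day["Sky"] == "Sunny" and day["AirTemp"] == "Warm")


X = all_days()
print(len(X))                           # 96 = 3 * 2 * 2 * 2 * 2 * 2
print(sum(enjoy_sport(d) for d in X))   # 16 days are labelled 1 by this candidate
```

Learning then amounts to choosing, among such Boolean-valued functions, one that agrees with the positive and negative training examples.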