Machine Learning
Ensemble Learning
Ensemble learning is a technique that creates multiple models and then combines them to produce improved results. Ensemble learning usually yields more accurate solutions than a single model would.
- Ensemble learning methods can be applied to both regression and classification problems.
- Ensemble learning for regression creates multiple regressors, i.e. multiple regression models such as linear, polynomial, etc.
- Ensemble learning for classification creates multiple classifiers, i.e. multiple classification models such as logistic regression, decision trees, KNN, SVM, etc. (see the sketch after Figure 1).
Figure 1: Ensemble learning view
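To make the classification case concrete, here is a minimal sketch, assuming scikit-learn and using the Iris dataset purely for illustration, of an ensemble that combines the four model types named above (logistic regression, decision tree, KNN, SVM) by majority vote:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import VotingClassifier

# Toy data and split, chosen only to keep the sketch runnable.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Four different learning algorithms serve as the base models.
ensemble = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier()),
        ("knn", KNeighborsClassifier()),
        ("svm", SVC()),
    ],
    voting="hard",  # majority vote over the predicted class labels
)
ensemble.fit(X_train, y_train)
print("ensemble accuracy:", ensemble.score(X_test, y_test))

With "hard" voting, each base classifier casts one vote for a class label and the most frequent label wins; the dataset, estimator names and split above are placeholders, not part of the original material.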
Which components to combine?
- different learning algorithms
- the same learning algorithm trained in different ways (sketched below)
- the same learning algorithm trained in the same way
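The second option, the same learning algorithm trained in different ways, is commonly realised by giving each copy of the learner a different bootstrap sample of the training data. Below is a minimal hand-rolled sketch, assuming scikit-learn and NumPy; the dataset, loop count and variable names are illustrative only:

import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

rng = np.random.default_rng(0)
models = []
for _ in range(10):
    # Same algorithm (a decision tree), trained in a different way:
    # each copy sees a different bootstrap sample of the training set.
    idx = rng.integers(0, len(X_train), size=len(X_train))
    models.append(DecisionTreeClassifier().fit(X_train[idx], y_train[idx]))

# Combine the base models by majority vote over their predictions.
votes = np.array([m.predict(X_test) for m in models])
majority = np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
print("bagged accuracy:", (majority == y_test).mean())

Each tree uses the same algorithm; only its training sample differs, which is exactly the idea behind the bagging methods listed later in this section.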
There are two steps in ensemble learning:
1. Generate multiple machine learning models using the same or different machine learning algorithms; these are called "base models".
2. Make the final prediction by combining the predictions of the base models.
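To illustrate both steps for the regression case, the sketch below, assuming scikit-learn and synthetic data invented just for this example, first generates two base regressors (linear and polynomial) and then combines their predictions by simple averaging:

import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import PolynomialFeatures
from sklearn.pipeline import make_pipeline

# Synthetic 1-D regression data, used only to keep the sketch self-contained.
rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(200, 1))
y = np.sin(X).ravel() + rng.normal(scale=0.2, size=200)

# Step 1: generate the base models (here, two regressors).
base_models = [
    LinearRegression(),                                                # linear regressor
    make_pipeline(PolynomialFeatures(degree=3), LinearRegression()),   # polynomial regressor
]
for model in base_models:
    model.fit(X, y)

# Step 2: predict with every base model and combine (here, by averaging).
X_new = np.linspace(-3, 3, 5).reshape(-1, 1)
combined = np.mean([m.predict(X_new) for m in base_models], axis=0)
print(combined)

Simple averaging is only one possible combination rule; the techniques listed below replace it with more elaborate schemes such as voting, weighted combinations, or a meta-model.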
Techniques/Methods in ensemble learning
Voting, Error-Correcting Output Codes, Bagging (e.g., Random Forest), Boosting (e.g., AdaBoost), Stacking.
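Most of these techniques are available off the shelf in common libraries. The sketch below, assuming scikit-learn and a toy dataset, runs one representative of each of bagging (Random Forest), boosting (AdaBoost) and stacking; voting was sketched earlier, and error-correcting output codes are provided by scikit-learn's OutputCodeClassifier (omitted here for brevity):

from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import (
    RandomForestClassifier,   # bagging of randomized decision trees
    AdaBoostClassifier,       # boosting
    StackingClassifier,       # stacking
)

X, y = load_iris(return_X_y=True)

methods = {
    "Random Forest (bagging)": RandomForestClassifier(n_estimators=100),
    "AdaBoost (boosting)": AdaBoostClassifier(n_estimators=50),
    "Stacking": StackingClassifier(
        estimators=[("knn", KNeighborsClassifier()),
                    ("rf", RandomForestClassifier(n_estimators=50))],
        final_estimator=LogisticRegression(max_iter=1000),  # meta-model
    ),
}
for name, clf in methods.items():
    scores = cross_val_score(clf, X, y, cv=5)
    print(f"{name}: mean CV accuracy {scores.mean():.3f}")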