Ensemble Learning

Ensemble learning is a technique that creates multiple models and combines them to produce improved results. Ensemble methods usually yield more accurate solutions than any single model would.

  • Ensemble learning methods apply to both regression and classification problems.
    • Ensemble learning for regression creates multiple regressors, i.e. multiple regression models such as linear, polynomial, etc.
    • Ensemble learning for classification creates multiple classifiers, i.e. multiple classification models such as logistic regression, decision trees, KNN, SVM, etc.
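To make the regression case concrete, here is a minimal pure-Python sketch of an ensemble of regressors whose predictions are averaged. The helper names (`fit_constant`, `fit_linear`, `ensemble_predict`) and the toy data are illustrative, not from the source.

```python
def fit_constant(xs, ys):
    """Base regressor 1: always predicts the mean of the training targets."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y

def fit_linear(xs, ys):
    """Base regressor 2: 1-D least-squares line y = a*x + b."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / \
        sum((x - mx) ** 2 for x in xs)
    b = my - a * mx
    return lambda x: a * x + b

def ensemble_predict(models, x):
    """Combine the base regressors by averaging their predictions."""
    preds = [m(x) for m in models]
    return sum(preds) / len(preds)

xs, ys = [1, 2, 3, 4], [2, 4, 6, 8]   # toy data: y = 2x
models = [fit_constant(xs, ys), fit_linear(xs, ys)]
print(ensemble_predict(models, 5))     # average of 5.0 and 10.0 -> 7.5
```

Averaging is the simplest combination rule for regression; weighted averages or a learned combiner (stacking) are common refinements.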

Figure 1: Ensemble learning view


Which components to combine?

  • different learning algorithms
  • the same learning algorithm trained in different ways
  • the same learning algorithm trained in the same way
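The second option above, the same algorithm trained in different ways, is typically achieved by training each copy on a different bootstrap resample of the data (the idea behind bagging). A minimal sketch, assuming a trivial "predict the mean" base learner chosen only for illustration:

```python
import random

def bootstrap_sample(data, rng):
    """Draw a sample of the same size as the data, with replacement."""
    return [rng.choice(data) for _ in data]

def fit_mean_model(sample):
    """Hypothetical base learner: predicts the mean of its training sample."""
    m = sum(sample) / len(sample)
    return lambda: m

rng = random.Random(0)
data = [1.0, 2.0, 3.0, 4.0, 5.0]

# Same learning algorithm, trained in different ways (different resamples):
models = [fit_mean_model(bootstrap_sample(data, rng)) for _ in range(10)]

# The ensemble estimate averages the base models' outputs.
ensemble_estimate = sum(m() for m in models) / len(models)
```

Each base model sees a slightly different dataset, so their errors are partly independent; averaging them reduces variance.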

Ensemble learning involves two steps:

1. Generate multiple machine learning models using the same or different learning algorithms. These are called "base models".
2. Combine the predictions of the base models to produce the final prediction.


Techniques/Methods in Ensemble Learning

Voting, Error-Correcting Output Codes, Bagging (e.g. Random Forests), Boosting (e.g. AdaBoost), Stacking.
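Voting is the simplest of these techniques for classification: each base classifier casts one vote and the majority label wins. A minimal sketch (the base classifiers' predictions here are hypothetical placeholders):

```python
from collections import Counter

def majority_vote(predictions):
    """Voting: each base classifier casts one vote; the majority label wins."""
    return Counter(predictions).most_common(1)[0][0]

# Hypothetical predictions from three base classifiers for one test sample:
base_predictions = ["cat", "dog", "cat"]
print(majority_vote(base_predictions))   # -> cat
```

This is "hard" voting on class labels; a common variant ("soft" voting) averages the classifiers' predicted probabilities instead.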




Last modified: Friday, 20 June 2025, 10:14