Chapter Outline

    Reinforcement learning is a subfield within control theory, which
    concerns controlling systems that change over time and broadly includes
    applications such as self-driving cars, robotics, and bots for games.
    Throughout this guide, you will use reinforcement learning to build a bot
    for Atari video games. This bot is not given access to internal information
    about the game. Instead, it’s only given access to the game’s rendered
    display and the reward for that display, meaning that it can only see
    what a human player would see.

    In machine learning, a bot is formally known as an agent. In this
    tutorial, an agent is a “player” in the system that acts according to a
    decision-making function called a policy. The primary goal is to develop
    strong agents by arming them with strong policies; in other words, to
    build intelligent bots by giving them strong decision-making
    capabilities.

    You will begin this tutorial by building a basic reinforcement learning
    agent that takes random actions when playing Space Invaders, the classic
    Atari arcade game. This random agent will serve as your baseline for
    comparison. Following this, you will explore several other techniques,
    including Q-learning, deep Q-learning, and least squares, while building
    agents that play Space Invaders and Frozen Lake, a simple game
    environment included in Gym (https://gym.openai.com/), a reinforcement
    learning toolkit released by OpenAI (https://openai.com/). By following
    this tutorial, you will gain an understanding of the fundamental
    concepts that govern one’s choice of model complexity in machine
    learning.
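
    To make the ideas of an agent, a policy, and a random-action baseline
    concrete, here is a minimal sketch in Python. It deliberately avoids
    Gym and Space Invaders: ToyCorridor is a hypothetical stand-in
    environment invented for this illustration, and the names Policy,
    make_random_policy, run_episode, and average_reward are likewise
    assumptions rather than part of any library. The sketch only shows the
    general shape of the loop: the policy maps an observation to an action,
    the environment returns a new observation and a reward, and the average
    reward of the random policy gives a number to compare later agents
    against.

```python
import random
from typing import Callable, Tuple


class ToyCorridor:
    """Hypothetical toy environment (NOT the Gym API).

    The agent starts in cell 0 of a short corridor and earns a reward of 1
    when it reaches the last cell. An episode also ends after max_steps.
    """

    ACTIONS = (0, 1)  # 0 = step left, 1 = step right

    def __init__(self, length: int = 5, max_steps: int = 20) -> None:
        self.length = length
        self.max_steps = max_steps
        self.position = 0
        self.steps = 0

    def reset(self) -> int:
        """Start a new episode and return the initial observation."""
        self.position = 0
        self.steps = 0
        return self.position

    def step(self, action: int) -> Tuple[int, float, bool]:
        """Apply an action; return (observation, reward, done)."""
        move = 1 if action == 1 else -1
        # Clamp the position so the agent cannot leave the corridor.
        self.position = min(max(self.position + move, 0), self.length - 1)
        self.steps += 1
        reached_goal = self.position == self.length - 1
        reward = 1.0 if reached_goal else 0.0
        done = reached_goal or self.steps >= self.max_steps
        return self.position, reward, done


# A policy is just a decision-making function: observation -> action.
Policy = Callable[[int], int]


def make_random_policy(rng: random.Random) -> Policy:
    """Build a policy that ignores the observation and acts at random."""
    def policy(observation: int) -> int:
        return rng.choice(ToyCorridor.ACTIONS)
    return policy


def run_episode(env: ToyCorridor, policy: Policy) -> float:
    """Play one episode with the given policy and return the total reward."""
    observation = env.reset()
    total_reward = 0.0
    done = False
    while not done:
        action = policy(observation)
        observation, reward, done = env.step(action)
        total_reward += reward
    return total_reward


def average_reward(env: ToyCorridor, policy: Policy, episodes: int = 100) -> float:
    """Average total reward over several episodes, used as the baseline."""
    return sum(run_episode(env, policy) for _ in range(episodes)) / episodes


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs are comparable
    baseline = average_reward(ToyCorridor(), make_random_policy(rng))
    print(f"Random baseline, average reward per episode: {baseline:.2f}")
```

    Any stronger policy can be passed to run_episode in place of the random
    one, which is what makes the random agent a convenient point of
    comparison.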