Processing

Please wait...

Settings

Settings

Goto Application

1. CN110546653 - ACTION SELECTION FOR REINFORCEMENT LEARNING USING NEURAL NETWORKS

Office China
Application Number 201880013632.8
Application Date 19.02.2018
Publication Number 110546653
Publication Date 06.12.2019
Publication Kind A
IPC
G06N 3/04
GPHYSICS
06COMPUTING; CALCULATING OR COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
02using neural network models
04Architecture, e.g. interconnection topology
G06N 3/08
GPHYSICS
06COMPUTING; CALCULATING OR COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
02using neural network models
08Learning methods
G06N 3/00
GPHYSICS
06COMPUTING; CALCULATING OR COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
CPC
G06N 3/006
GPHYSICS
06COMPUTING; CALCULATING; COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
004Artificial life, i.e. computers simulating life
006based on simulated virtual individual or collective life forms, e.g. single "avatar", social simulations, virtual worlds or particle swarm optimisation
G06N 3/0445
GPHYSICS
06COMPUTING; CALCULATING; COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
02using neural network models
04Architectures, e.g. interconnection topology
0445Feedback networks, e.g. hopfield nets, associative networks
G06N 3/0454
GPHYSICS
06COMPUTING; CALCULATING; COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
02using neural network models
04Architectures, e.g. interconnection topology
0454using a combination of multiple neural nets
G06N 3/08
GPHYSICS
06COMPUTING; CALCULATING; COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
02using neural network models
08Learning methods
G06N 3/04
GPHYSICS
06COMPUTING; CALCULATING; COUNTING
NCOMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3Computer systems based on biological models
02using neural network models
04Architectures, e.g. interconnection topology
Applicants DEEPMIND TECHNOLOGIES LIMITED
渊慧科技有限公司
Inventors OSINDERO SIMON
S.奥新德罗
KAVUKCUOGLU KORAY
K.卡夫库格鲁
VEZHNEVETS ALEXANDER
A.维兹尼韦茨
Agents 北京市柳沈律师事务所 11105
Priority Data 01624635 25.07.1932 US
Title
(EN) ACTION SELECTION FOR REINFORCEMENT LEARNING USING NEURAL NETWORKS
(ZH) 使用神经网络的用于强化学习的动作选择
Abstract
(EN)
The present invention discloses methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for a system configured to select actions to be performed by an agentthat interacts with an environment. The system comprises a manager neural network subsystem and a worker neural network subsystem. The manager subsystem is configured to, at each of the multiple timesteps, generate a final goal vector for the time step. The worker subsystem is configured to, at each of multiple time steps, use the final goal vector generated by the manager subsystem to generatea respective action score for each action in a predetermined set of actions.

(ZH)
公开了方法、系统、和装置,包括在计算机存储介质上编码的计算机程序,用于被配置为选择要由与环境交互的代理执行的动作的系统。系统包括管理者神经网络子系统和工作者神经网络子系统。管理者子系统被配置为在多个时间步中的每一个时间步处生成时间步的最终目标向量。工作者子系统被配置为在多个时间步中的每一个时间步处,使用由管理者子系统生成的最终目标向量来为预定动作集中的每个动作生成相应的动作得分。