1. WO2021067057 - NEURAL NETWORK TRAINING IN A DISTRIBUTED SYSTEM

Publication Number WO/2021/067057
Publication Date 08.04.2021
International Application No. PCT/US2020/051791
International Filing Date 21.09.2020
IPC
G06N 3/04 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architecture, e.g. interconnection topology
G06N 3/063 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
063 using electronic means
G06N 3/08 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
CPC
G06N 3/0454
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
0454 using a combination of multiple neural nets
G06N 3/063
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
063 using electronic means
G06N 3/084
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
084 Back-propagation
G06N 3/10
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
10 Simulation on general purpose computers
Applicants
  • AMAZON TECHNOLOGIES, INC. [US]/[US]
Inventors
  • VIVEKRAJA, Vignesh
  • HAH, Thiam Khean
  • HUANG, Randy Renfu
  • DIAMANT, Ron
  • HEATON, Richard John
Agents
  • WYLIE, Roger D.
  • CHOY, Ming W.
Priority Data
16/588,603 30.09.2019 US
Publication Language English (EN)
Filing Language English (EN)
Title
(EN) NEURAL NETWORK TRAINING IN A DISTRIBUTED SYSTEM
(FR) ENTRAÎNEMENT DE RÉSEAU NEURONAL DANS UN SYSTÈME DISTRIBUÉ
Abstract
(EN)
Methods and systems for performing a training operation of a neural network are provided. In one example, a method comprises: performing backward propagation computations for a second layer of a neural network to generate second weight gradients; splitting the second weight gradients into portions; causing a hardware interface to exchange a first portion of the second weight gradients with a second computer system; performing backward propagation computations for a first layer of the neural network to generate first weight gradients while the exchange of the first portion of the second weight gradients is underway, the first layer being a lower layer than the second layer in the neural network; causing the hardware interface to transmit the first weight gradients to the second computer system; and causing the hardware interface to transmit the remaining portions of the second weight gradients to the second computer system.
(FR)
La présente invention concerne des systèmes et des procédés pour réaliser une opération d'entraînement d'un réseau neuronal. Dans un exemple, un procédé consiste à : effectuer des calculs de propagation vers l'arrière pour une seconde couche d'un réseau neuronal pour générer des seconds gradients de poids ; à séparer les seconds gradients de poids en parties ; à amener une interface matérielle à échanger une première partie des seconds gradients de poids avec le second système informatique ; à effectuer des calculs de propagation vers l'arrière pour une première couche du réseau neuronal pour générer des premiers gradients de poids lorsque l'échange de la première partie des seconds gradients de poids est en cours, la première couche étant une couche inférieure à celle de la seconde couche dans le réseau neuronal ; à amener l'interface matérielle à transmettre les premiers gradients de poids au second système informatique ; et à amener l'interface matérielle à transmettre les parties restantes des seconds gradients de poids au second système informatique.
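The ordering of steps in the abstract can be illustrated with a small Python sketch. It is not the implementation from the application: `RecordingInterface`, `train_step`, `split_into_portions`, and the `num_portions` parameter are hypothetical names chosen for illustration, the gradients are plain lists, and the "hardware interface" only logs each request rather than transferring data in the background. The sketch shows only the sequence: exchange the first portion of the higher layer's gradients, compute the lower layer's gradients, send those, then send the remaining higher-layer portions.

```python
from typing import Callable, List, Sequence, Tuple

Log = List[Tuple[str, str]]


def split_into_portions(grads: Sequence[float], num_portions: int) -> List[List[float]]:
    """Split gradients into at most `num_portions` contiguous chunks."""
    if num_portions < 1:
        raise ValueError("num_portions must be >= 1")
    if not grads:
        raise ValueError("grads must be non-empty")
    size = -(-len(grads) // num_portions)  # ceiling division
    return [list(grads[i:i + size]) for i in range(0, len(grads), size)]


class RecordingInterface:
    """Hypothetical stand-in for the hardware interface: it only logs requests."""

    def __init__(self, log: Log) -> None:
        self.log = log

    def start_exchange(self, label: str, payload: List[float]) -> None:
        # Assumption: a real interface would return immediately and move the
        # payload in the background; here we just record that it was requested.
        self.log.append(("exchange", label))


def train_step(
    backward_layer2: Callable[[], List[float]],
    backward_layer1: Callable[[], List[float]],
    iface: RecordingInterface,
    log: Log,
    num_portions: int = 2,
) -> Log:
    """Issue backward passes and exchanges in the order the abstract describes."""
    # Backward pass for the higher (second) layer comes first.
    grads2 = backward_layer2()
    log.append(("compute", "layer2"))

    # Split the second-layer gradients and start exchanging only the first portion.
    portions = split_into_portions(grads2, num_portions)
    iface.start_exchange("layer2[0]", portions[0])

    # Backward pass for the lower (first) layer, issued after the first exchange starts.
    grads1 = backward_layer1()
    log.append(("compute", "layer1"))

    # First-layer gradients go out before the rest of the second-layer gradients.
    iface.start_exchange("layer1", grads1)
    for i, portion in enumerate(portions[1:], start=1):
        iface.start_exchange(f"layer2[{i}]", portion)
    return log
```

Calling `train_step` with two callables that return gradient lists fills `log` with the compute and exchange events in the order they were issued, which makes it easy to check that the first-layer gradients are queued ahead of the remaining second-layer portions.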