WO2020142192 - NEURAL NETWORK ACTIVATION COMPRESSION WITH NARROW BLOCK FLOATING-POINT

Publication Number: WO/2020/142192
Publication Date: 09.07.2020
International Application No.: PCT/US2019/066675
International Filing Date: 17.12.2019
IPC
G06N 3/063 (2006.01): Neural network models; physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons, using electronic means
G06N 3/08 (2006.01): Neural network models; learning methods
G06N 3/04 (2006.01): Neural network models; architecture, e.g. interconnection topology
CPC
G06F 7/49915: Mantissa overflow or underflow in handling floating-point numbers
G06F 9/30025: Format conversion instructions, e.g. floating-point to integer, decimal conversion
G06F 9/5027: Allocation of resources to service a request, the resource being a machine, e.g. CPUs, servers, terminals
G06N 20/00: Machine learning
G06N 3/0445: Neural network architectures; feedback networks, e.g. Hopfield nets, associative networks
G06N 3/063: Physical realisation of neural networks using electronic means
Applicants
  • MICROSOFT TECHNOLOGY LICENSING, LLC [US]/[US]
Inventors
  • LO, Daniel
  • PHANISHAYEE, Amar
  • CHUNG, Eric S.
  • ZHAO, Yiren
  • ZHAO, Ritchie
Agents
  • MINHAS, Sandip S.
  • ADJEMIAN, Monica
  • BARKER, Doug
  • CHATTERJEE, Aaron C.
  • CHEN, Wei-Chen Nicholas
  • CHOI, Daniel
  • CHURNA, Timothy
  • DINH, Phong
  • EVANS, Patrick
  • GABRYJELSKI, Henry
  • GOLDSMITH, Micah P.
  • GUPTA, Anand
  • HINOJOSA-SMITH, Brianna L.
  • HWANG, William C.
  • JARDINE, John S.
  • LEE, Sunah
  • LEMMON, Marcus
  • MARQUIS, Thomas
  • MEYERS, Jessica
  • ROPER, Brandon
  • SPELLMAN, Steven
  • SULLIVAN, Kevin
  • SWAIN, Cassandra T.
  • TABOR, Ben
  • WALKER, Matt
  • WIGHT, Stephen A.
  • WISDOM, Gregg
  • WONG, Ellen
  • WONG, Thomas S.
  • ZHANG, Hannah
  • TRAN, Kimberly
Priority Data
16/237,197    31.12.2018    US
Publication Language: English (EN)
Filing Language: English (EN)
Designated States
Title
(EN) NEURAL NETWORK ACTIVATION COMPRESSION WITH NARROW BLOCK FLOATING-POINT
(FR) COMPRESSION D'ACTIVATION DE RÉSEAU NEURONAL DOTÉE D'UNE VIRGULE FLOTTANTE DE BLOC ÉTROIT
Abstract
(EN)
Apparatus and methods for training a neural network accelerator using quantized precision data formats are disclosed, in particular for storing activation values from a neural network in a compressed format for use during forward- and backward-propagation training of the neural network. In certain examples of the disclosed technology, a computing system includes processors, memory, and a compressor in communication with the memory. The computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by the compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. The compressed activation values are stored in the memory, where they can be retrieved for use during back propagation.
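The abstract's scheme (quantize a block of activation values into a block floating-point format that shares one exponent across the block, then re-quantize into a format with a narrower mantissa for storage between the forward and backward passes) can be sketched as follows. This is a minimal illustrative sketch, not the patent's implementation: the function names, the block granularity, and the 8-bit/4-bit mantissa widths are all assumptions chosen for clarity.

```python
import numpy as np

def to_block_fp(values, mantissa_bits):
    """Quantize a block of floats to block floating-point: one shared
    exponent for the whole block, each value reduced to a narrow signed
    mantissa of `mantissa_bits` bits (sign included)."""
    values = np.asarray(values, dtype=np.float64)
    max_abs = np.max(np.abs(values))
    if max_abs == 0.0:
        return np.zeros(values.shape, dtype=np.int32), 0
    # Shared exponent taken from the largest magnitude in the block.
    shared_exp = int(np.floor(np.log2(max_abs)))
    # Scale so every value fits the signed mantissa range.
    scale = 2.0 ** (shared_exp - (mantissa_bits - 2))
    limit = 2 ** (mantissa_bits - 1) - 1
    mantissas = np.clip(np.round(values / scale), -limit - 1, limit)
    return mantissas.astype(np.int32), shared_exp

def from_block_fp(mantissas, shared_exp, mantissa_bits):
    """Reconstruct approximate floats from a block floating-point block."""
    scale = 2.0 ** (shared_exp - (mantissa_bits - 2))
    return mantissas.astype(np.float64) * scale

# Forward pass produces activations in a "first" (wider) BFP format;
# the compressor re-encodes them in a narrower format for storage,
# and they are decompressed when retrieved for back propagation.
acts = np.array([0.5, -1.25, 0.75, 2.0])
m8, e8 = to_block_fp(acts, mantissa_bits=8)            # first format
m4, e4 = to_block_fp(from_block_fp(m8, e8, 8), 4)      # narrower stored format
restored = from_block_fp(m4, e4, 4)                    # used in backward pass
```

Storing the 4-bit mantissas halves (or better) the activation memory traffic relative to the 8-bit format, at the cost of quantization error bounded by half the narrower format's scale step; per-block shared exponents keep that error small when values within a block have similar magnitudes.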