
1. WO2020142183 - NEURAL NETWORK ACTIVATION COMPRESSION WITH OUTLIER BLOCK FLOATING-POINT

Publication Number WO/2020/142183
Publication Date 09.07.2020
International Application No. PCT/US2019/066433
International Filing Date 16.12.2019
IPC
H03M 7/24 (2006.01)
  H     ELECTRICITY
  03    BASIC ELECTRONIC CIRCUITRY
  M     CODING, DECODING OR CODE CONVERSION, IN GENERAL
  7     Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information is represented by a different sequence or number of digits
  14    Conversion to or from non-weighted codes
  24    Conversion to or from floating-point codes
H03M 7/30 (2006.01)
  H     ELECTRICITY
  03    BASIC ELECTRONIC CIRCUITRY
  M     CODING, DECODING OR CODE CONVERSION, IN GENERAL
  7     Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information is represented by a different sequence or number of digits
  30    Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
G06N 3/06 (2006.01)
  G     PHYSICS
  06    COMPUTING; CALCULATING OR COUNTING
  N     COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
  3     Computer systems based on biological models
  02    using neural network models
  06    Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
CPC
G06F 7/49915
  G     PHYSICS
  06    COMPUTING; CALCULATING; COUNTING
  F     ELECTRIC DIGITAL DATA PROCESSING
  7     Methods or arrangements for processing data by operating upon the order or content of the data handled
  38    Methods or arrangements for performing computations using exclusively denominational number representation, e.g. using binary, ternary, decimal representation
  48    using non-contact-making devices, e.g. tube, solid state device; using unspecified devices
  499   Denomination or exception handling, e.g. rounding, overflow
  49905 Exception handling
  4991  Overflow or underflow
  49915 Mantissa overflow or underflow in handling floating-point numbers
G06F 9/30025
  G     PHYSICS
  06    COMPUTING; CALCULATING; COUNTING
  F     ELECTRIC DIGITAL DATA PROCESSING
  9     Arrangements for program control, e.g. control units
  06    using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
  30    Arrangements for executing machine instructions, e.g. instruction decode
  30003 Arrangements for executing specific machine instructions
  30007 to perform operations on data operands
  30025 Format conversion instructions, e.g. Floating-Point to Integer, decimal conversion
G06F 9/5027
  G     PHYSICS
  06    COMPUTING; CALCULATING; COUNTING
  F     ELECTRIC DIGITAL DATA PROCESSING
  9     Arrangements for program control, e.g. control units
  06    using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
  46    Multiprogramming arrangements
  50    Allocation of resources, e.g. of the central processing unit [CPU]
  5005  to service a request
  5027  the resource being a machine, e.g. CPUs, Servers, Terminals
G06N 20/00
  G     PHYSICS
  06    COMPUTING; CALCULATING; COUNTING
  N     COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
  20    Machine learning
G06N 3/0481
  G     PHYSICS
  06    COMPUTING; CALCULATING; COUNTING
  N     COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
  3     Computer systems based on biological models
  02    using neural network models
  04    Architectures, e.g. interconnection topology
  0481  Non-linear activation functions, e.g. sigmoids, thresholds
G06N 3/063
  G     PHYSICS
  06    COMPUTING; CALCULATING; COUNTING
  N     COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
  3     Computer systems based on biological models
  02    using neural network models
  06    Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
  063   using electronic means
Applicants
  • MICROSOFT TECHNOLOGY LICENSING, LLC [US]/[US]
Inventors
  • LO, Daniel
  • PHANISHAYEE, Amar
  • CHUNG, Eric S.
  • ZHAO, Yiren
  • ZHAO, Ritchie
Agents
  • MINHAS, Sandip S.
  • ADJEMIAN, Monica
  • BARKER, Doug
  • CHATTERJEE, Aaron C.
  • CHEN, Wei-Chen Nicholas
  • CHOI, Daniel
  • CHURNA, Timothy
  • DINH, Phong
  • EVANS, Patrick
  • GABRYJELSKI, Henry
  • GOLDSMITH, Micah P.
  • GUPTA, Anand
  • HINOJOSA-SMITH, Brianna L.
  • HWANG, William C.
  • JARDINE, John S.
  • LEE, Sunah
  • LEMMON, Marcus
  • MARQUIS, Thomas
  • MEYERS, Jessica
  • ROPER, Brandon
  • SPELLMAN, Steven
  • SULLIVAN, Kevin
  • SWAIN, Cassandra T.
  • TABOR, Ben
  • WALKER, Matt
  • WIGHT, Stephen A.
  • WISDOM, Gregg
  • WONG, Ellen
  • WONG, Thomas S.
  • ZHANG, Hannah
  • TRAN, Kimberly
Priority Data
16/237,202    31.12.2018    US
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) NEURAL NETWORK ACTIVATION COMPRESSION WITH OUTLIER BLOCK FLOATING-POINT
(FR) COMPRESSION D'ACTIVATION DE RÉSEAU NEURONAL AVEC VIRGULE FLOTTANTE DE BLOC ABERRANT
Abstract
(EN)
Apparatus and methods for training a neural network accelerator using quantized precision data formats having outlier values are disclosed, and in particular for storing activation values from a neural network in a compressed format for use during forward and backward propagation training of the neural network. In certain examples of the disclosed technology, a computing system is configured to perform forward propagation for a layer of a neural network to produce first activation values in a first block floating-point format. In some examples, activation values generated by forward propagation are converted by a compressor to a second block floating-point format having a narrower numerical precision than the first block floating-point format. Outlier values, comprising additional bits of mantissa and/or exponent, are stored in ancillary storage for a subset of the activation values. The compressed activation values are stored in memory, where they can be retrieved for use during back propagation.
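The compression scheme the abstract describes (a shared-exponent block floating-point format with narrow mantissas, plus an ancillary store of wider encodings for a few outlier values) can be sketched in Python. This is a minimal illustration, not the patented implementation: the function names, bit widths, and the outlier-selection rule (largest quantization error per block) are assumptions chosen for clarity.

```python
import numpy as np

def compress_block(values, narrow_bits=4, outlier_bits=8, n_outliers=2):
    """Quantize one block of activations to a shared-exponent (block
    floating-point) format, keeping wider mantissas for the worst-hit
    values in an ancillary outlier store. Illustrative sketch only."""
    values = np.asarray(values, dtype=np.float64)
    max_mag = np.max(np.abs(values))
    # Shared exponent chosen so every mantissa in the block fits the narrow width.
    shared_exp = int(np.floor(np.log2(max_mag))) + 1 if max_mag > 0 else 0
    scale = 2.0 ** (shared_exp - (narrow_bits - 1))
    lo, hi = -(2 ** (narrow_bits - 1)), 2 ** (narrow_bits - 1) - 1
    mantissas = np.clip(np.round(values / scale), lo, hi).astype(np.int64)
    # Values with the largest quantization error become outliers; they get
    # extra mantissa bits stored separately (index -> wide mantissa).
    err = np.abs(values - mantissas * scale)
    fine_scale = 2.0 ** (shared_exp - (outlier_bits - 1))
    outlier_idx = np.argsort(err)[-n_outliers:]
    outliers = {int(i): int(np.round(values[i] / fine_scale)) for i in outlier_idx}
    return shared_exp, mantissas, outliers

def decompress_block(shared_exp, mantissas, outliers,
                     narrow_bits=4, outlier_bits=8):
    """Reconstruct activations, substituting the wider outlier encodings."""
    out = mantissas.astype(np.float64) * 2.0 ** (shared_exp - (narrow_bits - 1))
    fine_scale = 2.0 ** (shared_exp - (outlier_bits - 1))
    for i, m in outliers.items():
        out[i] = m * fine_scale  # outlier values use the wider mantissa
    return out
```

In training, `compress_block` would run after forward propagation for a layer and `decompress_block` before that layer's backward pass; the narrow mantissas dominate storage while the small outlier dictionary limits the precision loss on hard-to-quantize values.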