WO2021062029 - JOINT PRUNING AND QUANTIZATION SCHEME FOR DEEP NEURAL NETWORKS

Publication Number WO/2021/062029
Publication Date 01.04.2021
International Application No. PCT/US2020/052544
International Filing Date 24.09.2020
IPC
G06N 3/063 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
063 using electronic means
G06N 3/08 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
G06N 3/04 2006.01
G PHYSICS
06 COMPUTING; CALCULATING OR COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architecture, e.g. interconnection topology
CPC
G06N 3/04
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
G06N 3/0472
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
0472 using probabilistic elements, e.g. p-rams, stochastic processors
G06N 3/0481
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
04 Architectures, e.g. interconnection topology
0481 Non-linear activation functions, e.g. sigmoids, thresholds
G06N 3/063
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
06 Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
063 using electronic means
G06N 3/082
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
082 modifying the architecture, e.g. adding or deleting nodes or connections, pruning
G06N 3/084
G PHYSICS
06 COMPUTING; CALCULATING; COUNTING
N COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
3 Computer systems based on biological models
02 using neural network models
08 Learning methods
084 Back-propagation
Applicants
  • QUALCOMM INCORPORATED [US]/[US]
Inventors
  • LU, Yadong
  • WANG, Ying
  • BLANKEVOORT, Tijmen Pieter Frederik
  • LOUIZOS, Christos
  • REISSER, Matthias
  • HOU, Jilei
Agents
  • LENKIN, Alan M.
  • LUTZ, Joseph
  • PARTOW-NAVID, Puya
  • CROSBY, Cornell D.
Priority Data
17/030,315  23.09.2020  US
20190100410  24.09.2019  GR
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) JOINT PRUNING AND QUANTIZATION SCHEME FOR DEEP NEURAL NETWORKS
(FR) SCHÉMA D'ÉLAGAGE ET DE QUANTIFICATION D'ARTICULATION POUR RÉSEAUX DE NEURONES ARTIFICIELS PROFONDS
Abstract
(EN)
A method for compressing a deep neural network includes determining a pruning ratio for a channel and a mixed-precision quantization bit-width based on an operational budget of a device implementing the deep neural network. The method further includes quantizing a weight parameter of the deep neural network and/or an activation parameter of the deep neural network based on the quantization bit-width. The method also includes pruning the channel of the deep neural network based on the pruning ratio.
(FR)
La présente invention concerne un procédé de compression d'un réseau de neurones artificiels profond qui consiste à déterminer un rapport d'élagage pour un canal et une largeur binaire de quantification de précision mixte sur la base d'un budget opérationnel d'un dispositif mettant en œuvre le réseau de neurones artificiels profond. Le procédé consiste en outre à quantifier un paramètre de poids du réseau de neurones artificiels profond et/ou un paramètre d'activation du réseau de neurones artificiels profond sur la base de la largeur binaire de quantification. Le procédé consiste également à élaguer le canal du réseau à neurones artificiels profond sur la base du rapport d'élagage.
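The abstract's three steps (choose a pruning ratio and a bit-width under a device budget, quantize, prune channels) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the application: the budget is modelled as a weight-storage size in bits, channels are ranked by L1 norm, quantization is uniform and symmetric, and the candidate ratios and bit-widths are searched exhaustively with a single bit-width for the whole layer (so it is not truly mixed-precision). Only weights are handled; activations are left out. The function names `quantize`, `prune_channels`, `cost_bits`, and `compress` are hypothetical.

```python
from itertools import product


def quantize(values, bits):
    """Uniform symmetric quantization of a list of floats to `bits` bits."""
    levels = 2 ** (bits - 1) - 1  # e.g. 127 for 8 bits
    max_abs = max((abs(v) for v in values), default=0.0)
    if max_abs == 0.0 or levels == 0:
        return [0.0 for _ in values]
    scale = max_abs / levels
    return [round(v / scale) * scale for v in values]


def prune_channels(weights, ratio):
    """Zero the fraction `ratio` of channels with the smallest L1 norm."""
    n_prune = int(len(weights) * ratio)
    order = sorted(range(len(weights)),
                   key=lambda i: sum(abs(w) for w in weights[i]))
    pruned = set(order[:n_prune])
    kept = [[0.0] * len(ch) if i in pruned else list(ch)
            for i, ch in enumerate(weights)]
    return kept, pruned


def cost_bits(weights, ratio, bits):
    """Assumed budget model: storage of the remaining weights, in bits."""
    kept = len(weights) - int(len(weights) * ratio)
    return kept * len(weights[0]) * bits


def compress(weights, budget_bits,
             ratios=(0.0, 0.25, 0.5, 0.75), bit_widths=(8, 4, 2)):
    """Return (error, ratio, bits, new_weights) for the lowest-error
    configuration that fits the budget, or None if none fits."""
    best = None
    for ratio, bits in product(ratios, bit_widths):
        if cost_bits(weights, ratio, bits) > budget_bits:
            continue  # configuration exceeds the operational budget
        pruned_w, pruned = prune_channels(weights, ratio)
        out = [ch if i in pruned else quantize(ch, bits)
               for i, ch in enumerate(pruned_w)]
        # Squared reconstruction error against the original weights.
        err = sum((a - b) ** 2
                  for c_old, c_new in zip(weights, out)
                  for a, b in zip(c_old, c_new))
        if best is None or err < best[0]:
            best = (err, ratio, bits, out)
    return best


if __name__ == "__main__":
    layer = [[1.0, -0.5], [0.01, 0.02], [0.8, 0.3], [0.0, 0.05]]
    result = compress(layer, budget_bits=32)
    if result is not None:
        err, ratio, bits, _ = result
        print(f"ratio={ratio}, bits={bits}, error={err:.6f}")
```

Here `weights` is a list of channels, each a list of floats. A grid search is used only because it is easy to follow; the abstract does not say how the ratio and bit-width are determined.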