WO2020092281 - PROBABILISTIC NEURAL NETWORK ARCHITECTURE GENERATION

Publication Number WO/2020/092281
Publication Date 07.05.2020
International Application No. PCT/US2019/058418
International Filing Date 29.10.2019
IPC
  • G06N 3/08 (2006.01)
    G: PHYSICS
    06: COMPUTING; CALCULATING OR COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    3: Computer systems based on biological models
    02: using neural network models
    08: Learning methods
  • G06N 3/04 (2006.01)
    G: PHYSICS
    06: COMPUTING; CALCULATING OR COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    3: Computer systems based on biological models
    02: using neural network models
    04: Architecture, e.g. interconnection topology
  • G06N 5/00 (2006.01)
    G: PHYSICS
    06: COMPUTING; CALCULATING OR COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    5: Computer systems using knowledge-based models
  • G06N 7/00 (2006.01)
    G: PHYSICS
    06: COMPUTING; CALCULATING OR COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    7: Computer systems based on specific mathematical models
CPC
  • G06K 9/6257
    G: PHYSICS
    06: COMPUTING; CALCULATING; COUNTING
    K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    9: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    62: Methods or arrangements for recognition using electronic means
    6217: Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    6256: Obtaining sets of training patterns; Bootstrap methods, e.g. bagging, boosting
    6257: characterised by the organisation or the structure of the process, e.g. boosting cascade
  • G06K 9/6262
    G: PHYSICS
    06: COMPUTING; CALCULATING; COUNTING
    K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    9: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints
    62: Methods or arrangements for recognition using electronic means
    6217: Design or setup of recognition systems and techniques; Extraction of features in feature space; Clustering techniques; Blind source separation
    6262: Validation, performance evaluation or active pattern learning techniques
  • G06N 3/0454
    G: PHYSICS
    06: COMPUTING; CALCULATING; COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    3: Computer systems based on biological models
    02: using neural network models
    04: Architectures, e.g. interconnection topology
    0454: using a combination of multiple neural nets
  • G06N 3/0472
    G: PHYSICS
    06: COMPUTING; CALCULATING; COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    3: Computer systems based on biological models
    02: using neural network models
    04: Architectures, e.g. interconnection topology
    0472: using probabilistic elements, e.g. p-rams, stochastic processors
  • G06N 3/08
    G: PHYSICS
    06: COMPUTING; CALCULATING; COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    3: Computer systems based on biological models
    02: using neural network models
    08: Learning methods
  • G06N 3/082
    G: PHYSICS
    06: COMPUTING; CALCULATING; COUNTING
    N: COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
    3: Computer systems based on biological models
    02: using neural network models
    08: Learning methods
    082: modifying the architecture, e.g. adding or deleting nodes or connections, pruning
Applicants
  • MICROSOFT TECHNOLOGY LICENSING, LLC [US]/[US]
Inventors
  • FUSI, Nicolo
  • CASALE, Francesco Paolo
  • GORDON, Jonathan
Agents
  • MINHAS, Sandip S.
  • CHEN, Wei-Chen Nicholas
  • HINOJOSA-SMITH, Brianna L.
  • SWAIN, Cassandra T.
  • WONG, Thomas S.
  • CHOI, Daniel
  • HWANG, William C.
  • WIGHT, Stephen A.
  • CHATTERJEE, Aaron C.
  • JARDINE, John S.
  • GOLDSMITH, Micah P.
  • TRAN, Kimberly
  • ADJEMIAN, Monica
  • BARKER, Doug
  • CHURNA, Timothy
  • DINH, Phong
  • EVANS, Patrick
  • GABRYJELSKI, Henry
  • GUPTA, Anand
  • LEE, Sunah
  • LEMMON, Marcus
  • MARQUIS, Thomas
  • MEYERS, Jessica
  • ROPER, Brandon
  • SPELLMAN, Steven
  • SULLIVAN, Kevin
  • TABOR, Ben
  • WALKER, Matt
  • WISDOM, Gregg
  • WONG, Ellen
  • ZHANG, Hannah
Priority Data
16/179,433  02.11.2018  US
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) PROBABILISTIC NEURAL NETWORK ARCHITECTURE GENERATION
Abstract
(EN)
Examples of the present disclosure describe systems and methods for probabilistic neural network architecture generation. In an example, an underlying distribution over neural network architectures, based on various parameters, is sampled using probabilistic modeling. Training data is evaluated to iteratively update the underlying distribution, thereby generating a probability distribution over the neural network architectures. The distribution is trained iteratively until the parameters associated with the neural network architecture converge. Once the parameters have converged, the resulting probability distribution may be used to generate a final neural network architecture. As a result, intermediate architectures need not be fully trained, which dramatically reduces memory usage and/or processing time. Further, in some instances, it is possible to evaluate larger architectures and/or larger batch sizes while also reducing neural network architecture generation time and maintaining or improving neural network accuracy.
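The abstract describes a generic loop: sample architectures from a parameterised distribution, score them against training data, update the distribution, and stop once its parameters converge. As a rough illustration only, the sketch below implements such a loop with a REINFORCE-style update over a toy categorical search space; the search space, proxy score, learning rate, and convergence test are all assumptions for illustration and are not taken from the patent.

```python
# Minimal sketch of a distribution-based architecture search loop, assuming a
# toy categorical search space and a noisy proxy score in place of real
# training data; all names and constants are illustrative, not from the patent.
import numpy as np

rng = np.random.default_rng(0)

NUM_LAYERS = 4   # architecture decisions (one categorical choice per layer)
NUM_OPS = 3      # candidate operations per layer, e.g. conv3x3 / conv5x5 / skip

# Underlying distribution over architectures, parameterised by per-layer logits.
logits = np.zeros((NUM_LAYERS, NUM_OPS))


def softmax(x):
    e = np.exp(x - x.max(axis=1, keepdims=True))
    return e / e.sum(axis=1, keepdims=True)


def proxy_score(arch):
    # Stand-in for evaluating a batch of training data with the sampled
    # architecture; here op index 1 is (arbitrarily) best at every layer.
    return -np.abs(arch - 1).sum() + rng.normal(scale=0.1)


LR = 0.5
baseline = 0.0  # running mean of scores, to reduce update variance
for step in range(2000):
    probs = softmax(logits)
    # Sample an architecture from the current distribution.
    arch = np.array([rng.choice(NUM_OPS, p=p) for p in probs])
    advantage = proxy_score(arch) - baseline
    baseline += 0.1 * advantage
    # REINFORCE-style update: shift probability mass toward architectures
    # that score well, without fully training each candidate.
    logits += LR * advantage * (np.eye(NUM_OPS)[arch] - probs)
    # Treat the distribution as converged once it is nearly deterministic.
    if probs.max(axis=1).min() > 0.99:
        break

print("final architecture:", logits.argmax(axis=1))
```

In this sketch, convergence of the per-layer distributions plays the role of the parameter convergence described in the abstract: once the distribution is nearly deterministic, its mode is read off as the generated architecture, and no intermediate candidate is ever trained to completion.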
Latest bibliographic data on file with the International Bureau