WO2020117926 - PIPELINED MATRIX MULTIPLICATION AT A GRAPHICS PROCESSING UNIT

Publication Number WO/2020/117926
Publication Date 11.06.2020
International Application No. PCT/US2019/064454
International Filing Date 04.12.2019
IPC
G06F 17/16 (2006.01)
  G    PHYSICS
  06   COMPUTING; CALCULATING OR COUNTING
  F    ELECTRIC DIGITAL DATA PROCESSING
  17   Digital computing or data processing equipment or methods, specially adapted for specific functions
  10   Complex mathematical operations
  16   Matrix or vector computation

G06F 9/38 (2006.01)
  G    PHYSICS
  06   COMPUTING; CALCULATING OR COUNTING
  F    ELECTRIC DIGITAL DATA PROCESSING
  9    Arrangements for program control, e.g. control units
  06   using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
  30   Arrangements for executing machine instructions, e.g. instruction decode
  38   Concurrent instruction execution, e.g. pipeline, look ahead

G06T 1/20 (2006.01)
  G    PHYSICS
  06   COMPUTING; CALCULATING OR COUNTING
  T    IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
  1    General purpose image data processing
  20   Processor architectures; Processor configuration, e.g. pipelining

G06N 3/02 (2006.01)
  G    PHYSICS
  06   COMPUTING; CALCULATING OR COUNTING
  N    COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
  3    Computer systems based on biological models
  02   using neural network models
CPC
G06F 17/16
  G    PHYSICS
  06   COMPUTING; CALCULATING; COUNTING
  F    ELECTRIC DIGITAL DATA PROCESSING
  17   Digital computing or data processing equipment or methods, specially adapted for specific functions
  10   Complex mathematical operations
  16   Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization

G06F 9/4843
  G    PHYSICS
  06   COMPUTING; CALCULATING; COUNTING
  F    ELECTRIC DIGITAL DATA PROCESSING
  9    Arrangements for program control, e.g. control units
  06   using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
  46   Multiprogramming arrangements
  48   Program initiating; Program switching, e.g. by interrupt
  4806 Task transfer initiation or dispatching
  4843 by program, e.g. task dispatcher, supervisor, operating system

G06N 3/0445
  G    PHYSICS
  06   COMPUTING; CALCULATING; COUNTING
  N    COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
  3    Computer systems based on biological models
  02   using neural network models
  04   Architectures, e.g. interconnection topology
  0445 Feedback networks, e.g. Hopfield nets, associative networks

G06T 1/20
  G    PHYSICS
  06   COMPUTING; CALCULATING; COUNTING
  T    IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
  1    General purpose image data processing
  20   Processor architectures; Processor configuration, e.g. pipelining
Applicants
  • ADVANCED MICRO DEVICES, INC. [US]/[US]
Inventors
  • NEMLEKAR, Milind N.
Agents
  • SHEEHAN, Adam D.
Priority Data
16/211,954  06.12.2018  US
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) PIPELINED MATRIX MULTIPLICATION AT A GRAPHICS PROCESSING UNIT
(FR) MULTIPLICATION MATRICIELLE EN PIPELINE AU NIVEAU D'UN PROCESSEUR GRAPHIQUE
Abstract
(EN)
A graphics processing unit (GPU) [100] schedules recurrent matrix multiplication operations at different subsets of compute units (CUs) [110, 111, 112, 113] of the GPU. The GPU includes a scheduler [104] that receives sets of recurrent matrix multiplication operations [103, 114], such as multiplication operations associated with a recurrent neural network (RNN). The multiple operations associated with, for example, an RNN layer are fused into a single kernel, which the scheduler schedules such that one workgroup is assigned per compute unit, thus assigning different ones of the recurrent matrix multiplication operations to different subsets of the CUs of the GPU. In addition, via software synchronization of the different workgroups, the GPU pipelines the assigned matrix multiplication operations so that each subset of CUs provides its multiplication results to a different subset, and so that each subset of CUs executes at least a portion of the multiplication operations concurrently.
(FR)
Un processeur graphique (GPU) [100] planifie des opérations de multiplication matricielle récurrente au niveau de différents sous-ensembles de CU [110, 111, 112, 113] du GPU. Le GPU comprend un planificateur [104] qui reçoit des ensembles d'opérations de multiplication matricielle récurrente [103, 114], telles que des opérations de multiplication associées à un réseau de neurones bouclé (RNN). Les multiples opérations associées, par exemple, à une couche RNN sont fusionnées en un noyau unique, qui est planifié par le programmateur de telle sorte qu'un groupe de travail est attribué par unité de calcul, ce qui permet d'attribuer différentes opérations de multiplication matricielle récurrente à différents sous-ensembles de CU du GPU. De plus, par l'intermédiaire d'une synchronisation logicielle des différents groupes de travail, le GPU effectue en pipeline les opérations de multiplication matricielle attribuées de sorte que chaque sous-ensemble de CU fournit des résultats de multiplication correspondants à un sous-ensemble différent, et de sorte que chaque sous-ensemble de CU exécute au moins une partie des opérations de multiplication simultanément.
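The scheduling scheme the abstract describes (a fused kernel whose workgroups act as pipeline stages, with each CU subset handing its result to the next under software synchronization) can be sketched on the CPU with ordinary threads and queues. This is a toy illustration, not the patented implementation: the function and variable names are invented, threads stand in for CU subsets, and blocking queues stand in for the inter-workgroup software synchronization. A real GPU kernel would additionally overlap partial tiles of each multiplication across subsets rather than passing whole hidden states.

```python
import threading
import queue

def matmul(a, b):
    """Plain matrix multiply over lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def cu_subset(w, my_steps, inbox, outbox, results):
    """One simulated CU subset: for each timestep it owns, wait for the
    previous subset's hidden state, multiply, and hand the result on."""
    for t in my_steps:
        h = inbox.get()        # software synchronization: block on predecessor
        h = matmul(w, h)       # this subset's recurrent multiplication
        results[t] = h
        outbox.put(h)          # provide result to the next subset in the ring

def pipelined_recurrent_matmul(w, h0, num_steps, num_subsets=2):
    """Compute h_t = W @ h_{t-1} for num_steps steps, with timesteps
    assigned round-robin to num_subsets simulated CU subsets."""
    results = [None] * num_steps
    links = [queue.Queue() for _ in range(num_subsets)]
    threads = []
    for s in range(num_subsets):
        th = threading.Thread(
            target=cu_subset,
            args=(w, range(s, num_steps, num_subsets),
                  links[s], links[(s + 1) % num_subsets], results))
        threads.append(th)
        th.start()
    links[0].put(h0)           # seed the pipeline with the initial state
    for th in threads:
        th.join()
    return results
```

Because each recurrent step depends on the previous one, the handoff forms a ring of subsets; the concurrency in the claimed scheme comes from overlapping partial results within each step, which this whole-step sketch deliberately simplifies away. For example, `pipelined_recurrent_matmul([[1, 1], [0, 1]], [[1], [1]], 3)` returns the same three hidden states as running the three multiplications sequentially.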
Latest bibliographic data on file with the International Bureau