WO2013148440 - MANAGING COHERENT MEMORY BETWEEN AN ACCELERATED PROCESSING DEVICE AND A CENTRAL PROCESSING UNIT

Publication Number WO/2013/148440
Publication Date 03.10.2013
International Application No. PCT/US2013/033158
International Filing Date 20.03.2013
IPC
G06F 12/08 (2006.01)
  G      PHYSICS
  06     COMPUTING; CALCULATING OR COUNTING
  F      ELECTRIC DIGITAL DATA PROCESSING
  12     Accessing, addressing or allocating within memory systems or architectures
  02     Addressing or allocation; Relocation
  08     in hierarchically structured memory systems, e.g. virtual memory systems
CPC
G06F 12/0804; G06F 12/0806; G06F 12/0815; G06F 12/0835; G06F 12/0837; G06F 12/0848
  G      PHYSICS
  06     COMPUTING; CALCULATING; COUNTING
  F      ELECTRIC DIGITAL DATA PROCESSING
  12     Accessing, addressing or allocating within memory systems or architectures
  02     Addressing or allocation; Relocation
  08     in hierarchically structured memory systems, e.g. virtual memory systems
  0802   Addressing of a memory level in which the access to the desired data or data block requires associative addressing means, e.g. caches
    0804   with main memory updating
    0806   Multiuser, multiprocessor or multiprocessing cache systems
      0815   Cache consistency protocols
        0831   using a bus scheme, e.g. with bus monitoring or watching means
          0835   for main memory peripheral accesses (e.g. I/O or DMA)
        0837   with software control, e.g. non-cacheable data
    0844   Multiple simultaneous or quasi-simultaneous cache accessing
      0846   Cache with multiple tag or data arrays being simultaneously accessible
        0848   Partitioned cache, e.g. separate instruction and operand caches
Applicants
  • ADVANCED MICRO DEVICES, INC. [US]/[US]
  • ATI TECHNOLOGIES ULC [CA]/[CA]
Inventors
  • ASARO, Anthony
  • NORMOYLE, Kevin
  • HUMMEL, Mark
Agents
  • SPECHT, Michael D.
Priority Data
  • 13/601,126    31.08.2012    US
  • 61/617,479    29.03.2012    US
Publication Language English (EN)
Filing Language English (EN)
Designated States
Title
(EN) MANAGING COHERENT MEMORY BETWEEN AN ACCELERATED PROCESSING DEVICE AND A CENTRAL PROCESSING UNIT
(FR) GESTION DE MÉMOIRE COHÉRENTE ENTRE UN DISPOSITIF DE TRAITEMENT ACCÉLÉRÉ ET UNE UNITÉ CENTRALE DE TRAITEMENT
Abstract
(EN)
Existing multiprocessor computing systems often have insufficient memory coherency and, consequently, are unable to efficiently utilize separate memory systems. Specifically, a CPU cannot effectively write to a block of memory and then have a GPU access that memory unless there is explicit synchronization. In addition, because the GPU is forced to statically split memory locations between itself and the CPU, existing multiprocessor computing systems are unable to efficiently utilize the separate memory systems. Embodiments described herein overcome these deficiencies by receiving a notification within the GPU that the CPU has finished processing data that is stored in coherent memory, and invalidating data in the CPU caches that the GPU has finished processing from the coherent memory. Embodiments described herein also include dynamically partitioning a GPU memory into coherent memory and local memory through use of a probe filter.
(FR)
Les systèmes informatiques multiprocesseurs existants présentent souvent une cohérence de mémoire insuffisante et, par conséquent, ne peuvent pas utiliser de manière efficace des systèmes de mémoire séparés. De façon spécifique, une unité centrale de traitement (UC) ne peut pas écrire efficacement sur un bloc de mémoire puis avoir un accès à cette mémoire depuis une unité de traitement graphique (GPU), à moins qu'une synchronisation explicite ait lieu. De plus, étant donné que la GPU est forcée à diviser de manière statique des emplacements de mémoire entre elle-même et l'UC, les systèmes informatiques multiprocesseurs existants ne sont pas capables d'utiliser efficacement les systèmes de mémoire séparés. Certains modes de réalisation décrits ici permettent de surmonter ces problèmes par réception d'une notification dans la GPU indiquant que l'UC a terminé de traiter des données qui sont stockées dans la mémoire cohérente, et par invalidation de données dans les mémoires caches de l'UC que la GPU a terminé de traiter à partir de la mémoire cohérente. Certains modes de réalisation de l'invention consistent également à diviser dynamiquement une mémoire de GPU en une mémoire cohérente et une mémoire locale par l'intermédiaire d'un filtre de vérification.
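Illustrative sketch (not part of the official record): the abstract describes two mechanisms, a notification received by the GPU that the CPU has finished writing data into coherent memory, followed by invalidation of the now-stale CPU-cache copies once the GPU is done with that data, and a dynamic split of GPU memory into coherent and local regions driven by a probe filter. The C++ sketch below illustrates only the first hand-off, under the assumption that a software flag can stand in for the hardware notification; every name in it (coherent_buffer, cpu_done, invalidate_cpu_cached_copies) is hypothetical, and it does not reproduce the patent's implementation, in which invalidation is driven by cache probes and a probe filter in hardware.

// Minimal sketch of the coherent-memory hand-off outlined in the abstract.
// All names are hypothetical; a std::atomic flag stands in for the hardware
// notification, and an empty stub marks where CPU-cache invalidation would occur.
#include <atomic>
#include <cstddef>
#include <cstdio>
#include <thread>
#include <vector>

static std::vector<int> coherent_buffer(16);   // region treated as "coherent memory"
static std::atomic<bool> cpu_done{false};      // notification: CPU finished writing

// Hypothetical stand-in for invalidating stale CPU-cache copies of a region.
// In the scheme the abstract describes, this is done by cache probes / a probe
// filter in hardware, not by application code.
static void invalidate_cpu_cached_copies(const void* /*addr*/, std::size_t /*bytes*/) {}

static void cpu_producer() {
    for (std::size_t i = 0; i < coherent_buffer.size(); ++i)
        coherent_buffer[i] = static_cast<int>(i);          // CPU fills the buffer
    cpu_done.store(true, std::memory_order_release);       // "notify" the GPU
}

static void gpu_consumer() {
    while (!cpu_done.load(std::memory_order_acquire)) {}   // wait for the notification
    long sum = 0;
    for (int v : coherent_buffer) sum += v;                 // GPU "processes" the data
    // GPU is finished with the region: stale CPU-cache lines could now be invalidated.
    invalidate_cpu_cached_copies(coherent_buffer.data(),
                                 coherent_buffer.size() * sizeof(int));
    std::printf("GPU consumed coherent buffer, sum = %ld\n", sum);
}

int main() {
    std::thread cpu(cpu_producer);
    std::thread gpu(gpu_consumer);
    cpu.join();
    gpu.join();
    return 0;
}

The release/acquire pair makes the CPU's writes visible before the flag is observed, which is the software analogue of the abstract's notification; the empty invalidation stub marks the point at which stale CPU-cache lines for the region would be discarded. The probe-filter-based partitioning of GPU memory into coherent and local regions is not sketched here.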
Latest bibliographic data on file with the International Bureau