1. (WO2018204910) LOSS-SCALING FOR DEEP NEURAL NETWORK TRAINING WITH REDUCED PRECISION

Pub. No.: WO/2018/204910
International Application No.: PCT/US2018/031356
Publication Date: Fri Nov 09 00:59:59 CET 2018
International Filing Date: Tue May 08 01:59:59 CEST 2018
IPC: G06N 3/08
Applicants: NVIDIA CORPORATION
Inventors: WU, Hao
ALBEN, Jonah
MICIKEVICIUS, Paulius
Title: LOSS-SCALING FOR DEEP NEURAL NETWORK TRAINING WITH REDUCED PRECISION
Abstract:
When a deep neural network is trained at reduced precision, loss scaling lets the gradient computation operate on larger values without affecting the rest of the training procedure. One technique runs the network's forward pass to produce a loss, scales the loss, computes gradients at reduced precision, and then reduces the magnitude of the computed gradients to compensate for the scaling. In one non-limiting example, the forward pass multiplies the loss value by a factor S, and the weight update multiplies the weight-gradient contribution by 1/S. Several techniques can be used to select the scaling factor S and to adjust the weight update.
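
The scale-then-unscale arrangement in the abstract can be illustrated with a minimal sketch. It is not taken from the application. It assumes a one-weight model with loss 0.5 * (w*x - t)**2, emulates half precision by round-tripping values through Python's `struct` format `"e"` (IEEE 754 binary16), and keeps the weight and the 1/S compensation in full precision. The names `to_fp16`, `fp16_weight_grad`, and `sgd_step`, and the choice S = 1024, are hypothetical and chosen only for illustration.

```python
import struct


def to_fp16(x: float) -> float:
    """Round x to the nearest IEEE 754 half-precision value.

    The result is returned as an ordinary Python float, so later
    arithmetic on it is done in double precision until rounded again.
    """
    return struct.unpack("e", struct.pack("e", x))[0]


def fp16_weight_grad(error: float, x: float, loss_scale: float) -> float:
    """Backward pass for loss = 0.5 * (w*x - t)**2, rounded to fp16 at each step.

    `error` is the forward-pass residual w*x - t. Scaling the loss by S
    scales the gradient entering the backward pass by S as well.
    """
    d_out = to_fp16(to_fp16(error) * loss_scale)  # d(S * loss) / d(output)
    return to_fp16(d_out * to_fp16(x))            # scaled weight gradient in fp16


def sgd_step(w: float, error: float, x: float, lr: float, loss_scale: float) -> float:
    """One SGD update on a full-precision weight (an assumption of this sketch)."""
    scaled_grad = fp16_weight_grad(error, x, loss_scale)
    grad = scaled_grad / loss_scale  # compensate for the loss scaling (factor 1/S)
    return w - lr * grad


if __name__ == "__main__":
    error, x = 1e-4, 1e-4  # true weight gradient is error * x = 1e-8

    # Without scaling, 1e-8 is below the smallest fp16 subnormal and flushes to zero.
    print("unscaled fp16 gradient:", fp16_weight_grad(error, x, 1.0))

    # With S = 1024 the intermediate value is representable; dividing by S
    # afterwards recovers a gradient close to 1e-8.
    S = 1024.0
    print("recovered gradient:   ", fp16_weight_grad(error, x, S) / S)
```

In this sketch the unscaled gradient is lost entirely, so the weight would never move, whereas the scaled run yields an update close to the full-precision one. A power-of-two S is used here because multiplying and dividing by it changes only the exponent; the abstract itself does not prescribe how S is chosen.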