Traitement en cours

Veuillez attendre...

Paramétrages

Paramétrages

Aller à Demande

1. WO2020112188 - ARCHITECTURE D'ORDINATEUR POUR LA GÉNÉRATION D'IMAGES ARTIFICIELLES

Note: Texte fondé sur des processus automatiques de reconnaissance optique de caractères. Seule la version PDF a une valeur juridique

[ EN ]

CLAIMS

What is claimed is:

1. An image processing apparatus, the apparatus comprising:

processing circuitry and memory; the processing circuitry to:

receive a voxel model of a target object, wherein the target object is to be recognized using an image recognizer;

generate, based on the voxel model, a set of TSB (target shadow background-mask) images of the target object;

receive, at an auto-encoder, a set of real images of the target object; generate, using the auto-encoder and based on the set of real images, one or more artificial images of the target object based on the set of TSB images, wherein the auto encoder encodes, using a sub-encoder, the set of TSB images into a latent vector and decodes, using a sub-decoder, the latent vector to generate the one or more artificial images; and

provide, as output, the generated one or more artificial images of the target object.

2. The apparatus of claim 1, wherein the sub-encoder comprises a plurality of convolution layers and a plurality of pooling layers interspersed with the convolution layers, and wherein the sub-encoder is trained, using a machine learning training algorithm, to generate the latent vector based on the set of TSB images.

3. The apparatus of claim 1, wherein the sub-decoder comprises a plurality of deconvolution layers and a plurality of depooling layers interspersed with the deconvolution layers, and wherein the sub-decoder is trained, using a machine learning training algorithm, to generate the one or more artificial images based on the latent vector.

4. The apparatus of claim 1, wherein the processing circuitry is further to:

train, using the set of real images and the generated one or more artificial images, the image recognizer to recognize the target object; and

provide, as output, an indication that the image recognizer has been trained.

5. The apparatus of claim 4, wherein the processing circuitry is further to:

use the trained image recognizer to recognize a new image of the target object.

6. The apparatus of claim 4, wherein the image recognizer comprises a ResNet (residual neural network).

7. The apparatus of claim 1, wherein the sub-encoder comprises a plurality of convolution pools, each convolution pool being followed by a batch normalization, and each batch normalization being followed by a ReLU

(rectified linear unit).

8. The apparatus of claim 7, wherein a kernel size of each convolution pool is larger than a kernel size of a previous convolution pool.

9. The apparatus of claim 1, wherein the sub-decoder comprises a plurality of skip connects, each skip connect being followed by a batch normalization, each batch normalization being followed by a ReLU (rectified linear unit), and each ReLU being followed by a decode convolution.

10. The apparatus of claim 9, wherein a kernel size of each decode convolution is smaller than a kernel size of a previous decode convolution.

11. A non-transitory machine-readable medium for image processing, the machine-readable medium storing instructions which, when executed by processing circuitry of one or more machines, cause the processing circuitry to: receive, a voxel model of a target object, wherein the target object is to be recognized using an image recognizer;

generate, based on the voxel model, a set of TSB (target shadow background-mask) images of the target object;

receive, at an auto-encoder, a set of real images of the target object; generate, using the auto-encoder and based on the set of real images, one or more artificial images of the target object based on the set of TSB images, wherein the auto encoder encodes, using a sub-encoder, the set of TSB images into a latent vector and decodes, using a sub-decoder, the latent vector to generate the one or more artificial images; and

provide, as output, the generated one or more artificial images of the target obj ect.

12. The machine-readable medium of claim 11, wherein the sub encoder comprises a plurality of convolution layers and a plurality of pooling layers interspersed with the convolution layers, and wherein the sub-encoder is trained, using a machine learning training algorithm, to generate the latent vector based on the set of TSB images.

13. The machine-readable medium of claim 11, wherein the sub decoder comprises a plurality of deconvolution layers and a plurality of depooling layers interspersed with the deconvolution layers, and wherein the sub -decoder is trained, using a machine learning training algorithm, to generate the one or more artificial images based on the latent vector.

14. The machine-readable medium of claim 11, wherein the processing circuitry is further to :

train, using the set of real images and the generated one or more artificial images, the image recognizer to recognize the target object; and

provide, as output, an indication that the image recognizer has been trained.

15. The machine-readable medium of claim 14, wherein the processing circuitry is further to :

use the trained image recognizer to recognize a new image of the target object.