WO2020112733 - ONLINE CALIBRATION OF 3D SCAN DATA FROM MULTIPLE VIEWPOINTS

Note: Text produced by automatic optical character recognition. Only the PDF version has legal value.

CLAIMS

What is claimed is:

1. A calibration system, comprising:

circuitry configured to:

receive a set of depth scans and a corresponding set of color images of a scene comprising a human-object as part of a foreground of the scene;

extract a first three-dimensional (3D) representation of the foreground based on a first depth scan of the set of depth scans, wherein the first 3D representation is associated with a first viewpoint in a 3D environment;

spatially align the extracted first 3D representation with a second 3D representation of the foreground, wherein the second 3D representation is associated with a second viewpoint in the 3D environment;

update the spatially aligned first 3D representation based on the corresponding set of color images and a set of structural features of the human-object, as a human-prior; and

reconstruct a 3D mesh of the human-object based on the updated first 3D representation of the foreground.
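As a point of orientation, the sketch below mirrors claim 1's receive/extract/align/update/reconstruct flow in minimal numpy. Every function name, threshold, and data layout is an assumption made for illustration; the claims do not prescribe any implementation.

```python
# Illustrative-only skeleton of the claim-1 pipeline; not the patent's method.
import numpy as np

def extract_foreground(depth_points, max_depth=3.0):
    """Keep points nearer than an assumed depth cutoff as the foreground.
    depth_points: (N, 3) array in the scanner's camera frame."""
    return depth_points[depth_points[:, 2] < max_depth]

def apply_transform(points, T):
    """Apply a 4x4 rigid transform T to an (N, 3) point set."""
    return (np.c_[points, np.ones(len(points))] @ T.T)[:, :3]

def process_view(depth_scan, T_align):
    fg = extract_foreground(depth_scan)      # claim-1 "extract"
    aligned = apply_transform(fg, T_align)   # claim-1 "spatially align"
    # The "update" (color images + human-prior) and "reconstruct" steps are
    # elaborated by the dependent claims and sketched after them below.
    return aligned
```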

2. The calibration system according to claim 1, further comprising a plurality of scanning devices configured to acquire the set of depth scans and the corresponding set of color images from a corresponding set of viewpoints in the 3D environment.

3. The calibration system according to claim 2, wherein the plurality of scanning devices, collectively, form a multi-camera network having a combined field-of-view that covers an entire surface of the human-object.

4. The calibration system according to claim 1, wherein the circuitry is further configured to:

estimate a set of candidate transformations for the spatial alignment of the extracted first 3D representation with the second 3D representation;

compute a visibility error-metric for each candidate transformation of the estimated set of candidate transformations;

select, from the estimated set of candidate transformations, a candidate transformation for which the computed visibility error-metric is a minimum; and

spatially align the extracted first 3D representation with the second 3D representation based on the selected candidate transformation.
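The selection step of claim 4 amounts to scoring each candidate rigid transform and keeping the minimizer. In this sketch the claimed visibility error-metric, which the claims do not define, is stood in for by a plain nearest-neighbor residual; that substitution is an assumption.

```python
import numpy as np

def nn_residual(src, dst):
    """Mean distance from each source point to its nearest destination point
    (brute force; stand-in for the undefined visibility error-metric)."""
    d2 = ((src[:, None, :] - dst[None, :, :]) ** 2).sum(-1)
    return np.sqrt(d2.min(axis=1)).mean()

def select_transform(src, dst, candidates):
    """Return the candidate 4x4 transform whose residual is a minimum."""
    def apply(T):
        return (np.c_[src, np.ones(len(src))] @ T.T)[:, :3]
    errors = [nn_residual(apply(T), dst) for T in candidates]
    return candidates[int(np.argmin(errors))]
```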

5. The calibration system according to claim 1, wherein the circuitry is further configured to:

extract a third 3D representation of a ground surface in the scene based on the first depth scan associated with the first viewpoint; and

spatially align the extracted third 3D representation of the ground surface with a fourth 3D representation of the ground surface, the fourth 3D representation being associated with the second viewpoint of the scene.

6. The calibration system according to claim 5, wherein the circuitry is further configured to update the spatially aligned first 3D representation further based on the spatial alignment of the extracted third 3D representation with the fourth 3D representation.
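One standard way to realize the ground-surface alignment of claims 5 and 6, assumed here purely for illustration, is to fit a plane to each view's ground points by SVD and rotate one plane normal onto the other via the Rodrigues formula.

```python
import numpy as np

def fit_plane(points):
    """Least-squares plane through (N, 3) points: (centroid, unit normal)."""
    c = points.mean(axis=0)
    _, _, vt = np.linalg.svd(points - c)
    return c, vt[-1]  # normal = direction of least variance

def rotation_between(n_src, n_dst):
    """Rodrigues rotation taking unit vector n_src onto unit vector n_dst."""
    v = np.cross(n_src, n_dst)
    s, c = np.linalg.norm(v), float(n_src @ n_dst)
    if s < 1e-9:
        if c > 0:
            return np.eye(3)  # already aligned
        # antiparallel: rotate 180 degrees about an axis orthogonal to n_src
        a = np.eye(3)[np.argmin(np.abs(n_src))]
        u = np.cross(n_src, a)
        u /= np.linalg.norm(u)
        return 2.0 * np.outer(u, u) - np.eye(3)
    K = np.array([[0, -v[2], v[1]],
                  [v[2], 0, -v[0]],
                  [-v[1], v[0], 0]])
    return np.eye(3) + K + K @ K * ((1 - c) / s**2)
```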

7. The calibration system according to claim 1, wherein the set of structural features comprises a skeleton joint prior, a hand prior, and a face prior of the human-object.

8. The calibration system according to claim 1, wherein the circuitry is further configured to:

estimate a first set of two-dimensional (2D) feature points for the set of structural features based on a first color image of the corresponding set of color images; and

estimate a second set of 2D feature points for the set of structural features based on a second color image of the corresponding set of color images, wherein the first color image corresponds to the first depth scan associated with the first viewpoint, and

the second color image corresponds to a second depth scan associated with the second viewpoint.

9. The calibration system according to claim 8, wherein the circuitry is further configured to compute a set of 3D feature points for the set of structural features based on the estimated first set of 2D feature points and the estimated second set of 2D feature points.

10. The calibration system according to claim 9, wherein the circuitry is further configured to update the spatially aligned first 3D representation based on the computed set of 3D feature points.
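Claims 8 through 10 describe lifting matched 2D feature points from two color views into 3D. A textbook way to do this, assumed here, is linear (DLT) triangulation; P1 and P2 are hypothetical 3x4 projection matrices for the first and second viewpoints.

```python
import numpy as np

def triangulate(P1, P2, x1, x2):
    """DLT triangulation of one 2D-2D correspondence: (u, v) pixels -> 3D point."""
    A = np.stack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]          # null vector of A, in homogeneous coordinates
    return X[:3] / X[3]  # dehomogenize
```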

11. The calibration system according to claim 9, wherein the extracted first 3D representation comprises a first plurality of 3D points representing at least a surface portion of the human-object from the first viewpoint.

12. The calibration system according to claim 11, wherein the circuitry is further configured to:

compute a distance between the computed set of 3D feature points and a portion of the first plurality of 3D points corresponding to the set of structural features in the extracted first 3D representation;

estimate a global energy function based on the computed distance; and

update the spatially aligned first 3D representation further based on the estimated global energy function being a minimum.
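Claim 12's energy can be read as an aggregate of feature-to-surface distances that the update drives toward a minimum. The sum-of-squared-distances form and uniform weighting below are assumptions; the claim only requires an energy based on the computed distance.

```python
import numpy as np

def global_energy(feature_points_3d, surface_points):
    """Sum of squared distances from each triangulated 3D feature point to
    the nearest point of the aligned first 3D representation."""
    d2 = ((feature_points_3d[:, None, :] - surface_points[None, :, :]) ** 2).sum(-1)
    return d2.min(axis=1).sum()
```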

13. The calibration system according to claim 1, wherein the reconstructed 3D mesh corresponds to a watertight mesh that at least partially captures a texture of the human-object in the corresponding set of color images.

14. The calibration system according to claim 13, wherein the circuitry is further configured to:

transfer texture values from reliable textured regions on temporally neighboring or distant meshes of the human-object to unreliable textured regions on the reconstructed 3D mesh; and

refine the 3D mesh based on the transfer,

wherein the refined 3D mesh corresponds to one temporal frame of a free-viewpoint video.
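Claim 14's texture transfer can be pictured as copying colors from reliably textured vertices of a temporally neighboring mesh onto unreliable vertices of the current mesh, matched by nearest vertex. The per-vertex color representation and the reliability mask are assumptions for this sketch.

```python
import numpy as np

def transfer_texture(verts, colors, unreliable, src_verts, src_colors):
    """Overwrite colors of vertices flagged unreliable with the color of the
    nearest vertex in a (temporally neighboring) source mesh."""
    out = colors.copy()
    for i in np.flatnonzero(unreliable):
        j = np.argmin(((src_verts - verts[i]) ** 2).sum(-1))
        out[i] = src_colors[j]
    return out
```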

15. A method, comprising:

receiving a set of depth scans and a corresponding set of color images of a scene comprising a human-object as part of a foreground of the scene;

extracting a first three-dimensional (3D) representation of the foreground based on a first depth scan of the set of depth scans, wherein the first 3D representation is associated with a first viewpoint in a 3D environment;

spatially aligning the extracted first 3D representation with a second 3D representation of the foreground, wherein the second 3D representation is associated with a second viewpoint in the 3D environment;

updating the spatially aligned first 3D representation based on the corresponding set of color images and a set of structural features of the human-object, as a human-prior; and

reconstructing a 3D mesh of the human-object based on the updated first 3D representation of the foreground.

16. The method according to claim 15, further comprising:

extracting a third 3D representation of a ground surface in the scene based on the first depth scan associated with the first viewpoint;

spatially aligning the extracted third 3D representation of the ground surface with a fourth 3D representation of the ground surface, the fourth 3D representation being associated with the second viewpoint of the scene; and

updating the spatially aligned first 3D representation further based on the spatial alignment of the extracted third 3D representation of the ground surface with the fourth 3D representation.

17. The method according to claim 15, further comprising:

estimating a first set of two-dimensional (2D) feature points for the set of structural features based on a first color image of the corresponding set of color images;

estimating a second set of 2D feature points for the set of structural features based on a second color image of the corresponding set of color images, wherein the first color image corresponds to the first depth scan associated with the first viewpoint, and

the second color image corresponds to a second depth scan associated with the second viewpoint; and

computing a set of 3D feature points for the set of structural features based on the estimated first set of 2D feature points and the estimated second set of 2D feature points.

18. The method according to claim 17, further comprising updating the spatially aligned first 3D representation based on the computed set of 3D feature points.

19. A non-transitory computer-readable medium having stored thereon, computer-executable instructions that, when executed by a calibration system, cause the calibration system to execute operations, the operations comprising:

receiving a set of depth scans and a corresponding set of color images of a scene comprising a human-object as part of a foreground of the scene;

extracting a first three-dimensional (3D) representation of the foreground based on a first depth scan of the set of depth scans, wherein the first 3D representation is associated with a first viewpoint in a 3D environment;

spatially aligning the extracted first 3D representation with a second 3D representation of the foreground, wherein the second 3D representation is associated with a second viewpoint in the 3D environment;

updating the spatially aligned first 3D representation based on the corresponding set of color images and a set of structural features of the human-object, as a human-prior; and

reconstructing a 3D mesh of the human-object based on the updated first 3D representation of the foreground.

20. The non-transitory computer-readable medium according to claim 19, wherein the operations further comprise:

extracting a third 3D representation of a ground surface in the scene based on the first depth scan associated with the first viewpoint;

spatially aligning the extracted third 3D representation of the ground surface with a fourth 3D representation of the ground surface, the fourth 3D representation being associated with the second viewpoint of the scene; and

updating the spatially aligned first 3D representation further based on the spatial alignment of the extracted third 3D representation of the ground surface with the fourth 3D representation.