WO2020113326 - AUTOMATIC IMAGE-BASED SKIN DIAGNOSTICS USING DEEP LEARNING


Claims

What we claim is:

1. A skin diagnostic device comprising:

a storage unit to store and provide a convolutional neural network (CNN) configured to classify pixels of an image to determine a plurality (N) of respective skin sign diagnoses for each of a plurality (N) of respective skin signs wherein the CNN comprises a deep neural network for image classification configured to generate the N respective skin sign diagnoses and wherein the CNN is trained using skin sign data for each of the N respective skin signs; and

a processing unit coupled to the storage unit configured to receive the image and process the image using the CNN to generate the N respective skin sign diagnoses.

2. The skin diagnostic device according to claim 1, wherein the CNN comprises:

an encoder phase defined from a pre-trained network for image classification and configured to encode features to a final encoder phase feature net; and

a decoder phase configured to receive the final encoder phase feature net for decoding by a plurality (N) of respective parallel skin sign branches to generate each of the N respective skin sign diagnoses.

3. The skin diagnostic device according to claim 2, wherein the decoder phase includes a global pooling operation to process the final encoder phase feature net to provide to each of the N respective parallel skin sign branches.

4. The skin diagnostic device according to one of claims 2 and 3, wherein the CNN is further configured to classify the pixels to determine an ethnicity vector and the CNN is trained using skin sign data for each of the N respective skin signs and a plurality of ethnicities.

5. The skin diagnostic device according to claim 4, wherein the decoder phase comprises a further parallel branch for ethnicity to generate the ethnicity vector.
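
For illustration only, the encoder/decoder arrangement of claims 2 to 5 (together with the adaptation of a pre-trained network recited in claim 11) can be sketched in PyTorch. Everything beyond the claim language is an assumption: the ResNet-50 backbone, the 2048-channel feature net, the 256-unit branch width, the ReLU activations and the number of ethnicity classes are hypothetical choices, not taken from the patent.

import torch
import torch.nn as nn
import torchvision.models as models

class SkinSignCNN(nn.Module):
    """Sketch of claims 2-5: a pre-trained encoder phase, global pooling of the
    final encoder phase feature net, N parallel skin sign branches and a further
    parallel branch for the ethnicity vector."""

    def __init__(self, n_signs: int, n_ethnicities: int = 4):  # class count is an assumption
        super().__init__()
        backbone = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
        # Encoder phase: pre-trained classification network with its fully
        # connected layer removed (claim 11).
        self.encoder = nn.Sequential(*list(backbone.children())[:-2])
        # Global pooling operation applied to the final encoder phase feature
        # net before it is handed to every branch (claim 3).
        self.pool = nn.AdaptiveAvgPool2d(1)
        # Decoder phase: N respective parallel skin sign branches (claim 2).
        self.sign_branches = nn.ModuleList(
            [self._branch(2048, out_dim=1) for _ in range(n_signs)]
        )
        # Further parallel branch for ethnicity (claim 5).
        self.ethnicity_branch = self._branch(2048, out_dim=n_ethnicities)

    @staticmethod
    def _branch(in_dim: int, out_dim: int) -> nn.Sequential:
        # Two fully connected layers, each followed by an activation (claim 6);
        # the final activation layer (e.g. LeakyClamp, claim 7) would be appended here.
        return nn.Sequential(
            nn.Linear(in_dim, 256), nn.ReLU(),
            nn.Linear(256, out_dim), nn.ReLU(),
        )

    def forward(self, x: torch.Tensor):
        feats = self.pool(self.encoder(x)).flatten(1)   # final encoder phase feature net
        signs = torch.cat([b(feats) for b in self.sign_branches], dim=1)  # N diagnoses
        ethnicity = self.ethnicity_branch(feats)        # ethnicity vector (logits)
        return signs, ethnicity

A device per claim 1 would hold such weights in its storage unit; for instance, SkinSignCNN(n_signs=8) assumes eight tracked skin signs, a number invented here for illustration.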

6. The skin diagnostic device according to any one of claims 2 to 5, wherein each branch of the N respective parallel skin sign branches comprises in succession: a first fully connected layer followed by a first activation layer, a second fully connected layer, a second activation layer and a final activation layer to output a final value comprising one of the N respective skin sign diagnoses and the ethnicity vector.

7. The skin diagnostic device according to claim 6, wherein the final activation layer is defined in accordance with a function of equation (1) for an input score x received from the second activation layer:

\[
\mathrm{LeakyClamp}(x) =
\begin{cases}
x & \text{if } x \in [a, b] \\
a + \alpha\,(x - a) & \text{if } x < a \\
b + \alpha\,(x - b) & \text{if } x > b
\end{cases}
\tag{1}
\]

where α is a slope, a is a lower bound and b is an upper bound of a respective score range for each of the N respective skin sign diagnoses.
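
Read literally, equation (1) is an identity inside the score range with a leaky slope outside it. Below is a minimal PyTorch sketch of that reading; the default slope value and the example bounds are assumptions, and the per-sign ranges [a, b] would come from whatever rating scale each skin sign uses.

import torch
import torch.nn as nn

class LeakyClamp(nn.Module):
    """Final activation layer of claim 7, per equation (1): pass x through
    unchanged on [a, b], apply a small slope alpha outside the bounds."""

    def __init__(self, a: float, b: float, alpha: float = 0.01):  # alpha value is an assumption
        super().__init__()
        self.a, self.b, self.alpha = a, b, alpha

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        below = self.a + self.alpha * (x - self.a)   # case x < a
        above = self.b + self.alpha * (x - self.b)   # case x > b
        return torch.where(x < self.a, below,
                           torch.where(x > self.b, above, x))

For a hypothetical sign scored on [0, 5], LeakyClamp(a=0.0, b=5.0)(torch.tensor([-1.0, 2.5, 7.0])) returns approximately tensor([-0.0100, 2.5000, 5.0200]).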

8. The skin diagnostic device according to any one of claims 4 to 7:

wherein the CNN is trained using multiple samples in the form (x_i, y_i), with x_i being the i-th training image and y_i being a corresponding vector of ground truth skin sign diagnoses; and

wherein the CNN is trained to minimize a loss function for each respective branch of the N parallel skin sign branches and the further parallel branch for ethnicity.

9. The skin diagnostic device according to claim 8, wherein the CNN is further trained to minimize a loss function L, comprising an L2 loss function for each of the N respective skin sign branches in a weighted combination with a standard cross-entropy classification loss L_ethnicity for the further parallel branch for ethnicity, according to equation (3):

\[
L = L_2 + \lambda\, L_{\mathrm{ethnicity}} \tag{3}
\]

where λ controls a balance between the score regression and ethnicity classification losses.
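
A hedged sketch of equation (3) follows; the use of mean-squared error for the L2 term, the logits-plus-label form of the cross-entropy, and the default λ = 0.1 are assumptions rather than values stated in the claims.

import torch
import torch.nn.functional as F

def combined_loss(pred_signs: torch.Tensor, true_signs: torch.Tensor,
                  eth_logits: torch.Tensor, eth_labels: torch.Tensor,
                  lam: float = 0.1) -> torch.Tensor:
    """Equation (3): L = L2 + lambda * L_ethnicity."""
    l2 = F.mse_loss(pred_signs, true_signs)          # L2 loss over the N sign scores
    l_eth = F.cross_entropy(eth_logits, eth_labels)  # standard cross-entropy loss
    return l2 + lam * l_eth                          # lam balances the two terms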

10. The skin diagnostic device according to any one of claims 1 to 9, wherein the storage unit stores a face and landmark detector to pre-process the image and wherein the processing unit is configured to generate a normalized image from the image using the face and landmark detector and use the normalized image when using the CNN.
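
The claims do not specify how the face and landmark detector normalizes the image. As one possible sketch, assuming a detector has already returned at least three landmark points (say, the two eye centres and the mouth centre), OpenCV can warp them onto a fixed template; the canonical coordinates and the 256x256 output size below are invented for illustration.

import cv2
import numpy as np

def normalize_face(image: np.ndarray, landmarks: np.ndarray) -> np.ndarray:
    """Map three detected landmarks onto canonical positions so every face
    reaches the CNN at the same scale, rotation and translation (claim 10)."""
    # Hypothetical template: left eye, right eye, mouth centre in a 256x256 frame.
    canonical = np.float32([[80, 100], [176, 100], [128, 200]])
    m = cv2.getAffineTransform(np.float32(landmarks[:3]), canonical)
    return cv2.warpAffine(image, m, (256, 256))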

11. The skin diagnostic device according to any one of claims 1 to 10, wherein the CNN comprises a pre-trained network for image classification which is adapted to generate the N respective skin sign diagnoses such that:

the fully connected layers of the pre-trained network are removed; and

N respective groups of layers are defined to decode a same feature net for each of the N respective skin sign diagnoses in parallel.

12. The skin diagnostic device according to any one of claims 1 to 11, configured as one of:

a computing device for personal use comprising a mobile device; and

a server providing skin diagnostic services via a communications network.

13. The skin diagnostic device according to any one of claims 1 to 12, wherein the storage unit stores code which when executed by the processing unit provides a treatment product selector responsive to at least some of the N skin sign diagnoses to obtain a recommendation for at least one of a product and a treatment plan.

14. The skin diagnostic device according to any one of claims 1 to 13, wherein the storage unit stores code which when executed by the processing unit provides an image acquisition function to receive the image.

15. The skin diagnostic device according to any one of claims 1 to 14, wherein the storage unit stores code which when executed by the processing unit provides a treatment monitor to monitor treatment for at least one skin sign.

16. The skin diagnostic device according to claim 15, wherein the processing unit is configured to at least one of remind, instruct and/or record treatment activities associated with a product application for respective treatment sessions.

17. The skin diagnostic device according to any one of claims 1 to 16, wherein the processing unit is configured to process a second image, received following a treatment session, using the CNN to generate a subsequent skin diagnosis.

18. The skin diagnostic device according to claim 17, wherein the storage unit stores code which when executed by the processing unit provides a presentation of comparative results using the subsequent skin diagnosis.

19. A computer-implemented method of skin diagnosis comprising:

providing a storage unit to store and provide a convolutional neural network (CNN) configured to classify pixels of an image to determine a plurality (N) of respective skin sign diagnoses for each of a plurality (N) of respective skin signs wherein the CNN comprises a deep neural network for image classification configured to generate the N respective skin sign diagnoses and wherein the CNN is trained using skin sign data for each of the N respective skin signs; and

performing by a processing unit coupled to the storage unit:

receiving the image; and

processing the image using the CNN to generate the N respective skin sign diagnoses.

20. The method according to claim 19, wherein the CNN comprises:

an encoder phase defined from a pre-trained network for image classification and configured to encode features to a final encoder phase feature net; and

a decoder phase configured to receive the final encoder phase feature net for decoding by a plurality (N) of respective parallel skin sign branches to generate each of the N respective skin sign diagnoses.

21. The method according to claim 20, wherein the decoder phase includes a global pooling operation to process the final encoder phase feature net to provide to each of the N respective parallel skin sign branches.

22. The method according to one of claims 20 and 21, wherein the CNN is further configured to classify the pixels to determine an ethnicity vector and the CNN is trained using skin sign data for each of the N respective skin signs and a plurality of ethnicities and wherein the processing of the image by the CNN generates the ethnicity vector.

23. The method according to claim 22, wherein the decoder phase comprises a further parallel branch for ethnicity to generate the ethnicity vector.

24. The method according to any one of claims 20 to 23, wherein each branch of the N respective parallel skin sign branches comprises in succession: a first fully connected layer followed by a first activation layer, a second fully connected layer, a second activation layer and a final activation layer to output a final value comprising one of the N respective skin sign diagnoses and the ethnicity vector.

25. The method according to claim 24, wherein the final activation layer is defined in accordance with a function of equation (1) for an input score x received from the second activation layer:

\[
\mathrm{LeakyClamp}(x) =
\begin{cases}
x & \text{if } x \in [a, b] \\
a + \alpha\,(x - a) & \text{if } x < a \\
b + \alpha\,(x - b) & \text{if } x > b
\end{cases}
\tag{1}
\]
where α is a slope, a is a lower bound and b is an upper bound of a respective score range for each of the N respective skin sign diagnoses.

26. The method according to any one of claims 22 to 25:

wherein the CNN is trained using multiple samples in the form (x_i, y_i), with x_i being the i-th training image and y_i being a corresponding vector of ground truth skin sign diagnoses; and

wherein the CNN is trained to minimize a loss function for each respective branch of the N parallel skin sign branches and the further parallel branch for ethnicity.

27. The method according to claim 26, wherein the CNN is further trained to minimize a loss function L, comprising an L2 loss function for each of the N respective skin sign branches in a weighted combination with a standard cross-entropy classification loss L_ethnicity for the further parallel branch for ethnicity, according to equation (3):

\[
L = L_2 + \lambda\, L_{\mathrm{ethnicity}} \tag{3}
\]

where λ controls a balance between the score regression and ethnicity classification losses.

28. The method according to any one of claims 19 to 27, wherein the storage unit stores a face and landmark detector to pre-process the image and wherein the method comprises pre-processing the image by the processing unit using the face and landmark detector to generate a normalized image from the image and using the normalized image when using the CNN.

29. The method according to any one of claims 19 to 28, wherein the CNN comprises a pre-trained network for image classification which is adapted to generate the N respective skin sign diagnoses such that:

the fully connected layers of the pre-trained network are removed; and

N respective groups of layers are defined to decode a same feature net for each of the N respective skin sign diagnoses in parallel.

30. The method according to any one of claims 19 to 29, wherein the storage unit and processing unit are components of one of:

a computing device for personal use comprising a mobile device; and

a server providing skin diagnostic services via a communications network.

31. The method according to any one of claims 19 to 30, wherein the storage unit stores code which when executed by the processing unit provides a treatment product selector responsive to at least some of the N skin sign diagnoses to obtain a recommendation for at least one of a product and a treatment plan; and wherein the method comprises executing the code of the treatment product selector by the processing unit to obtain a recommendation for at least one of a product and a treatment plan.

32. The method according to any one of claims 19 to 31, wherein the storage unit stores code which when executed by the processing unit provides an image acquisition function to receive the image; and wherein the method comprises executing the code of the image acquisition function by the processing unit to receive the image.

33. The method according to any one of claims 19 to 32, wherein the storage unit stores code which when executed by the processing unit provides a treatment monitor to monitor treatment for at least one skin sign; and wherein the method comprises executing the code of the treatment monitor by the processing unit to monitor treatment for at least one skin sign.

34. The method according to claim 33, wherein the method comprises, via the processing unit, at least one of reminding, instructing and/or recording treatment activities associated with a product application for respective treatment sessions.

35. The method according to any one of claims 19 to 34 comprising, via the processing unit, processing a second image, received following a treatment session, using the CNN to generate a subsequent skin diagnosis.

36. The method according to claim 35 comprising, via the processing unit, providing a presentation of comparative results using the subsequent skin diagnosis.

37. A method comprising:

training a convolutional neural network (CNN) configured to classify pixels of an image to determine a plurality (N) of respective skin sign diagnoses for each of a plurality (N) of respective skin signs wherein the CNN comprises a deep neural network for image classification configured to generate the N respective skin sign diagnoses and wherein the training is performed using skin sign data for each of the N respective skin signs.

38. The method according to claim 37, wherein the CNN comprises:

an encoder phase defined from a pre-trained network for image classification and configured to encode features to a final encoder phase feature net; and

a decoder phase configured to receive the final encoder phase feature net for decoding by a plurality (N) of respective parallel skin sign branches to generate each of the N respective skin sign diagnoses.

39. The method according to claim 38, wherein the decoder phase includes a global pooling operation to process the final encoder phase feature net to provide to each of the N respective parallel skin sign branches.

40. The method according to one of claims 38 and 39, wherein the CNN is further configured to classify the pixels to determine an ethnicity vector and the method comprises training the CNN using skin sign data for each of the N respective skin signs and a plurality of ethnicities.

41. The method according to claim 40, wherein the decoder phase comprises a further parallel branch for ethnicity to generate the ethnicity vector.

42. The method according to any one of claims 38 to 41, wherein each branch of the N respective parallel skin sign branches comprises in succession: a first fully connected layer followed by a first activation layer, a second fully connected layer, a second activation layer and a final activation layer to output a final value comprising one of the N respective skin sign diagnoses and the ethnicity vector.

43. The method according to claim 42, wherein the final activation layer is defined in accordance with a function of equation (1) for an input score x received from the second activation layer:

\[
\mathrm{LeakyClamp}(x) =
\begin{cases}
x & \text{if } x \in [a, b] \\
a + \alpha\,(x - a) & \text{if } x < a \\
b + \alpha\,(x - b) & \text{if } x > b
\end{cases}
\tag{1}
\]

where α is a slope, a is a lower bound and b is an upper bound of a respective score range for each of the N respective skin sign diagnoses.

44. The method according to any one of claims 40 to 43:

wherein training trains the CNN using multiple samples in the form (x_i, y_i), with x_i being the i-th training image and y_i being a corresponding vector of ground truth skin sign diagnoses; and

wherein training trains the CNN to minimize a loss function for each respective branch of the N parallel skin sign branches and the further parallel branch for ethnicity.

45. The method according to claim 44, wherein training trains the CNN to minimize a loss function L, comprising an L2 loss function for each of the N respective skin sign branches in a weighted combination with a standard cross-entropy classification loss L_ethnicity for the further parallel branch for ethnicity, according to equation (3):

\[
L = L_2 + \lambda\, L_{\mathrm{ethnicity}} \tag{3}
\]

where λ controls a balance between the score regression and ethnicity classification losses.
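
Claims 37 to 45 together describe conventional supervised training. A minimal loop sketch is given below; it assumes a model and loss function shaped like the earlier sketches, a DataLoader yielding (x_i, y_i, e_i) triples of image, ground-truth sign vector and ethnicity label, and the Adam optimizer with a placeholder learning rate, none of which are dictated by the claims.

import torch
from torch.utils.data import DataLoader

def train(model: torch.nn.Module, loader: DataLoader, loss_fn,
          epochs: int = 20) -> None:
    """Minimize the combined loss of equation (3) for every parallel branch
    (claim 44) over the training samples (x_i, y_i)."""
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # optimizer/lr are assumptions
    model.train()
    for _ in range(epochs):                # epoch count is a placeholder
        for x_i, y_i, e_i in loader:       # i-th image, sign vector, ethnicity label
            pred_signs, eth_logits = model(x_i)
            loss = loss_fn(pred_signs, y_i, eth_logits, e_i)
            optimizer.zero_grad()
            loss.backward()                # gradients flow to all branches jointly
            optimizer.step()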

46. The method according to any one of claims 37 to 45, wherein the CNN is configured to receive a normalized image pre-processed by face and landmark detection.

47. The method according to any one of claims 37 to 46, wherein the CNN initially comprises a pre-trained network for image classification which is adapted to generate the N respective skin sign diagnoses such that:

the fully connected layers of the pre-trained network are removed; and

N respective groups of layers are defined to decode a same feature net for each of the N respective skin sign diagnoses in parallel.