WO2020005305 - AUDIO PROCESSING IN A LOW-BANDWIDTH NETWORKED SYSTEM

Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters.

CLAIMS

What is claimed:

1. A system to detect activation phrases in remote devices, comprising:

a natural language processor component executed by a first client device to:

receive a first instance of a first input audio signal detected by a first microphone of a sensing device;

parse the first instance of the first input audio signal to identify a first candidate activation phrase in the first instance of the first input audio signal;

determine that the first candidate activation phrase does not contain a predetermined activation phrase;

receive a second instance of the first input audio signal detected by a second microphone of the sensing device;

parse the second instance of the first input audio signal to identify a second candidate activation phrase in the second instance of the first input audio signal;

determine that the second candidate activation phrase contains the predetermined activation phrase; and

an interface of the first client device to:

transmit, based on a determination that the second candidate activation phrase contains the predetermined activation phrase, an audio signal associated with at least one of the first instance of the first input audio signal and the second instance of the first input audio signal to a data processing system comprising a second natural language processor component to identify a request in the at least one of the first instance of the first input audio signal and the second instance of the first input audio signal.
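The client-side check in claim 1 can be illustrated with a minimal sketch. It assumes the candidate activation phrase is available as a text transcript and that "contains" means a case-insensitive substring match. The phrase `"ok device"`, the `transcribe` callback, and the function names are hypothetical and are not taken from the claims.

```python
from typing import Callable, Optional, Sequence

# Hypothetical predetermined activation phrase, chosen for illustration only.
ACTIVATION_PHRASE = "ok device"


def contains_activation(candidate: str, phrase: str = ACTIVATION_PHRASE) -> bool:
    """Return True if the candidate transcript contains the activation phrase."""
    return phrase in candidate.lower()


def select_instance_to_forward(
    instances: Sequence[bytes],
    transcribe: Callable[[bytes], str],
    phrase: str = ACTIVATION_PHRASE,
) -> Optional[int]:
    """Check each microphone's instance in order.

    Returns the index of the first instance whose candidate phrase contains
    the activation phrase, or None if no instance does. A caller would then
    forward audio to the data processing system only when an index is returned.
    """
    for index, audio in enumerate(instances):
        candidate = transcribe(audio)  # assumed speech-to-text step
        if contains_activation(candidate, phrase):
            return index
    return None


# Usage: the first microphone's instance is misheard, the second is not.
fake_transcripts = {
    b"mic1": "okay the vice",
    b"mic2": "OK device what's the weather",
}
chosen = select_instance_to_forward([b"mic1", b"mic2"], fake_transcripts.__getitem__)
```

In this sketch the first instance fails the check and the second passes, which is the condition under which the interface of claim 1 transmits to the data processing system.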

2. The system of claim 1, comprising:

the interface to transmit, from the first client device to the sensing device, a request for the second instance of the first input audio signal based on a determination that the first candidate activation phrase does not contain the predetermined activation phrase.

3. The system of claim 1 or 2, comprising:

the natural language processor component to:

receive a first instance of a second input audio signal detected by the first microphone of the sensing device;

parse the first instance of the second input audio signal to identify a third candidate activation phrase;

determine that the third candidate activation phrase contains the predetermined activation phrase; and

the interface to transmit, to the sensing device, a request for a third input audio signal based on a determination that the third candidate activation phrase contains the predetermined activation phrase.

4. The system of any preceding claim, comprising:

the natural language processor component to:

receive a first instance of a second input audio signal detected by the first microphone of the sensing device;

parse the first instance of the second input audio signal to identify a third candidate activation phrase;

determine that the third candidate activation phrase contains the predetermined activation phrase; and

the interface to terminate a reception of a second instance of the second input audio signal based on a determination that the third candidate activation phrase contains the predetermined activation phrase.

5. The system of any preceding claim, comprising:

the interface to establish a Bluetooth connection between the first client device and the sensing device.

6. The system of any preceding claim, comprising the natural language processor component to:

receive a third instance of the first input audio signal from a sensor of the first client device;

parse the third instance of the first input audio signal to identify a third candidate activation phrase in the third instance of the first input audio signal; and

determine that the first input audio signal contains the predetermined activation phrase based at least on the third candidate activation phrase and the second candidate activation phrase.
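Claim 6 does not state how the candidate phrases are combined. One possible reading, sketched below, counts how many candidates contain the phrase and compares that count with a threshold. The `min_matches` parameter, the function name, and the substring match are assumptions made for illustration.

```python
from typing import Iterable


def phrase_confirmed(candidates: Iterable[str], phrase: str, min_matches: int = 2) -> bool:
    """Return True when at least `min_matches` candidates contain the phrase.

    The candidates could come from the sensing device's microphone and from
    the client device's own sensor. The threshold rule is hypothetical.
    """
    matches = sum(1 for candidate in candidates if phrase in candidate.lower())
    return matches >= min_matches


# Usage: both the sensing-device candidate and the client-sensor candidate agree.
both_agree = phrase_confirmed(["ok device play music", "OK device play music"], "ok device")
```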

7. The system of any preceding claim, comprising:

the natural language processor component to:

receive a first instance of a second input audio signal detected by a first microphone of a second sensing device;

parse the first instance of the second input audio signal to identify a third candidate activation phrase in the first instance of the second input audio signal;

determine that the third candidate activation phrase does not contain a predetermined activation phrase;

receive a second instance of the second input audio signal detected by the first microphone of the second sensing device, the second instance of the second input audio signal having a lower compression rate than the first instance of the second input audio signal;

parse the second instance of the second input audio signal to identify a fourth candidate activation phrase in the second instance of the second input audio signal;

determine that the fourth candidate activation phrase contains the predetermined activation phrase; and

the interface of the first client device to:

transmit, based on a determination that the fourth candidate activation phrase contains the predetermined activation phrase, at least one of the first instance of the second input audio signal and the second instance of the second input audio signal to the data processing system comprising the second natural language processor component to identify a second request in the at least one of the first instance of the second input audio signal and the second instance of the second input audio signal.

8. A system to transmit data in a voice-activated network, comprising:

a first microphone, of a sensing device, to receive a first instance of a first input audio signal;

a second microphone, of the sensing device, to receive a second instance of the first input audio signal;

a natural language processor component executed by the sensing device, to parse the first instance of the first input audio signal to identify an activation phrase;

an interface, of the sensing device, to transmit, at a first time point, the first instance of the first input audio signal to a client device based on identification of the activation phrase in the first instance of the first input audio signal, the client device comprising a second natural language processor component;

the interface, of the sensing device, to transmit, at a second time point after the first time point, the second instance of the first input audio signal to the client device; and

the interface, of the sensing device, to transmit an audio signal associated with at least one of the first instance of the first input audio signal and the second instance of the first input audio signal to the client device based on a confirmation message from the client device of an identification of the activation phrase in the second instance of the first input audio signal.

9. The system of claim 8, comprising the interface to:

transmit, at the first time point, the first instance of the first input audio signal to the client device at a first compression level; and

transmit, at the second time point, the second instance of the first input audio signal to the client device at a second compression level lower than the first compression level.
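The two-stage transmission of claim 9 can be sketched as follows. Sample decimation stands in for a real audio codec here, and the level values 4 and 2 are arbitrary. Both are assumptions, as are the function names and the `"t1"`/`"t2"` labels.

```python
from typing import List, Sequence, Tuple


def decimate(samples: Sequence[int], keep_every: int) -> List[int]:
    """Keep one sample out of every `keep_every`; larger values compress more.

    This is a stand-in for a real codec, used only to make payload sizes differ.
    """
    if keep_every < 1:
        raise ValueError("keep_every must be >= 1")
    return list(samples[::keep_every])


def plan_transmissions(
    first_instance: Sequence[int],
    second_instance: Sequence[int],
    first_level: int = 4,
    second_level: int = 2,
) -> List[Tuple[str, List[int]]]:
    """Return payloads in send order.

    At the first time point the first instance goes out heavily compressed.
    At the second time point the second instance goes out at a lower
    compression level, so it carries more of the original samples.
    """
    return [
        ("t1", decimate(first_instance, first_level)),
        ("t2", decimate(second_instance, second_level)),
    ]


# Usage: 16 samples per microphone.
plan = plan_transmissions(list(range(16)), list(range(16)))
```

The second payload is larger than the first, which reflects the trade-off in the claim: a small payload is sent first, and a higher-fidelity one follows.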

10. The system of claim 8 or 9, comprising the interface to:

transmit, at the first time point, the second instance of the first input audio signal to the client device at a first compression level; and

transmit, at the second time point, the first instance of the first input audio signal and the second instance of the input audio signal to the client device at a second compression level lower than the first compression level.

11. The system of any one of claims 8, 9 or 10, comprising the interface to:

transmit, at the second time point, the second instance of the first input audio signal to the client device based on a confirmation message that the activation phrase is not in the first instance of the input audio signal.

12. The system of any one of claims 8 to 11, comprising the interface to:

transmit, at the second time point, the second instance of the first input audio signal to the client device prior to receipt of a confirmation message that the activation phrase is not in the first instance of the input audio signal.

13. The system of claim 12, comprising:

the interface to terminate the transmission of the second instance of the first input audio signal based on a confirmation message that the activation phrase is in the first instance of the input audio signal.
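Claims 12 and 13 together describe a speculative send that is cut short once the client device confirms the activation phrase. A minimal sketch follows, assuming the audio is sent in chunks and that the confirmation can be polled between chunks. The `send` and `activation_confirmed` callbacks are hypothetical.

```python
from typing import Callable, Iterable, List


def stream_second_instance(
    chunks: Iterable[bytes],
    send: Callable[[bytes], None],
    activation_confirmed: Callable[[], bool],
) -> int:
    """Send chunks until the client confirms the activation phrase.

    Transmission starts before any confirmation message arrives and stops
    as soon as `activation_confirmed()` returns True. Returns the number
    of chunks actually sent.
    """
    sent = 0
    for chunk in chunks:
        if activation_confirmed():
            break  # confirmation received: the remaining audio is not needed
        send(chunk)
        sent += 1
    return sent


# Usage: the confirmation "arrives" after two chunks have gone out.
sent_chunks: List[bytes] = []
count = stream_second_instance(
    [b"a", b"b", b"c", b"d"],
    sent_chunks.append,
    lambda: len(sent_chunks) >= 2,
)
```

Stopping early avoids spending link bandwidth on audio that the client no longer needs.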

14. The system of any one of claims 8 to 13, comprising the interface to:

establish a Bluetooth connection with the client device;

transmit, over the Bluetooth connection, the first instance of the first input audio signal and the second instance of the first input audio signal.

15. A method to transmit data in a voice-activated network, comprising:

receiving, by a first microphone of a sensing device, a first instance of a first input audio signal;

receiving, by a second microphone of the sensing device, a second instance of the first input audio signal;

parsing, by a natural language processor component executed by the sensing device, the first instance of the first input audio signal to identify an activation phrase;

transmitting, by an interface of the sensing device at a first time point, the first instance of the first input audio signal to a client device based on identification of the activation phrase in the first instance of the first input audio signal, the client device comprising a second natural language processor component;

transmitting, by the interface of the sensing device at a second time point after the first time point, the second instance of the first input audio signal to the client device; and

transmitting, by the interface of the sensing device, an audio signal associated with at least one of the first instance of the first input audio signal and the second instance of the first input audio signal based on a confirmation message from the client device of an identification of the activation phrase in the second instance of the first input audio signal.

16. The method of claim 15, comprising:

transmitting, by the interface at the first time point, the first instance of the first input audio signal to the client device at a first compression level; and

transmitting, by the interface at the second time point, the second instance of the first input audio signal to the client device at a second compression level lower than the first compression level.

17. The method of claim 15 or 16, comprising:

transmitting, by the interface at the first time point, the second instance of the first input audio signal to the client device at a first compression level; and

transmitting, by the interface at the second time point, the first instance of the first input audio signal and the second instance of the input audio signal to the client device at a second compression level lower than the first compression level.

18. The method of any one of claims 15 to 17, comprising:

transmitting, by the interface at the second time point, the second instance of the first input audio signal to the client device based on a confirmation message that the activation phrase is not in the first instance of the input audio signal.

19. The method of any one of claims 15 to 18, comprising:

transmitting, by the interface at the second time point, the second instance of the first input audio signal to the client device prior to receipt of a confirmation message that the activation phrase is not in the first instance of the input audio signal.

20. The method of claim 19, comprising:

terminating, by the interface, the transmission of the second instance of the first input audio signal based on a confirmation message that the activation phrase is in the first instance of the input audio signal.

21. A system to detect activation phrases in remote devices, comprising:

a natural language processor component executed by a first client device to:

receive a first instance of a first input audio signal detected by a first microphone of a sensing device;

parse the first instance of the first input audio signal to identify a first candidate activation phrase in the first instance of the first input audio signal;

determine that the first candidate activation phrase does not contain a predetermined activation phrase;

receive a second instance of the first input audio signal detected by a second microphone of the sensing device;

parse the second instance of the first input audio signal to identify a second candidate activation phrase in the second instance of the first input audio signal;

determine that the second candidate activation phrase contains the predetermined activation phrase; and

an interface of the first client device to:

transmit, based on a determination that the second candidate activation phrase contains the predetermined activation phrase, at least one of the first instance of the first input audio signal and the second instance of the first input audio signal to a data processing system comprising a second natural language processor component to identify a request in the at least one of the first instance of the first input audio signal and the second instance of the first input audio signal.

22. The system of claim 21, comprising:

the interface to transmit, from the first client device to the sensing device, a request for the second instance of the first input audio signal based on a determination that the first candidate activation phrase does not contain the predetermined activation phrase.

23. The system of claim 21, comprising:

the natural language processor component to:

receive a first instance of a second input audio signal detected by the first microphone of the sensing device;

parse the first instance of the second input audio signal to identify a third candidate activation phrase;

determine that the third candidate activation phrase contains the predetermined activation phrase; and

the interface to transmit, to the sensing device, a request for a third input audio signal based on a determination that the third candidate activation phrase contains the predetermined activation phrase.

24. The system of claim 21, comprising:

the natural language processor component to:

receive a first instance of a second input audio signal detected by the first microphone of the sensing device;

parse the first instance of the second input audio signal to identify a third candidate activation phrase;

determine that the third candidate activation phrase contains the predetermined activation phrase; and

the interface to terminate a reception of a second instance of the second input audio signal based on a determination that the third candidate activation phrase contains the predetermined activation phrase.

25. The system of claim 21, comprising:

the interface to establish a Bluetooth connection between the first client device and the sensing device.

26. The system of claim 21, comprising the natural language processor component to:

receive a third instance of the first input audio signal from a sensor of the first client device;

parse the third instance of the first input audio signal to identify a third candidate activation phrase in the third instance of the first input audio signal; and

determine that the first input audio signal contains the predetermined activation phrase based at least on the third candidate activation phrase and the second candidate activation phrase.

27. The system of claim 21, comprising:

the natural language processor component to:

receive a first instance of a second input audio signal detected by a first microphone of a second sensing device;

parse the first instance of the second input audio signal to identify a third candidate activation phrase in the first instance of the second input audio signal;

determine that the third candidate activation phrase does not contain a predetermined activation phrase;

receive a second instance of the second input audio signal detected by the first microphone of the second sensing device, the second instance of the second input audio signal having a lower compression rate than the first instance of the second input audio signal;

parse the second instance of the second input audio signal to identify a fourth candidate activation phrase in the second instance of the second input audio signal;

determine that the fourth candidate activation phrase contains the predetermined activation phrase; and

the interface of the first client device to:

transmit, based on a determination that the fourth candidate activation phrase contains the predetermined activation phrase, at least one of the first instance of the second input audio signal and the second instance of the second input audio signal to the data processing system comprising the second natural language processor component to identify a second request in the at least one of the first instance of the second input audio signal and the second instance of the second input audio signal.

28. A system to transmit data in a voice-activated network, comprising:

a first microphone, of a sensing device, to receive a first instance of a first input audio signal and a first instance of a second input audio signal;

a second microphone, of the sensing device, to receive a second instance of the first input audio signal and a second instance of the second input audio signal;

a natural language processor component executed by the sensing device, to parse the first instance of the first input audio signal to identify an activation phrase;

an interface, of the sensing device, to transmit, at a first time point, the first instance of the first input audio signal to a client device based on identification of the activation phrase in the first instance of the first input audio signal, the client device comprising a second natural language processor component;

the interface, of the sensing device, to transmit, at a second time point after the first time point, the second instance of the first input audio signal to the client device; and

the interface, of the sensing device, to transmit the first instance of the second input audio signal to the client device based on a confirmation message from the client device of an identification of the activation phrase in the second instance of the first input audio signal.

29. The system of claim 28, comprising the interface to:

transmit, at the first time point, the first instance of the first input audio signal to the client device at a first compression level; and

transmit, at the second time point, the second instance of the first input audio signal to the client device at a second compression level lower than the first compression level.

30. The system of claim 28, comprising the interface to:

transmit, at the first time point, the second instance of the first input audio signal to the client device at a first compression level; and

transmit, at the second time point, the first instance of the first input audio signal and the second instance of the input audio signal to the client device at a second compression level lower than the first compression level.

31. The system of claim 28, comprising the interface to:

transmit, at the second time point, the second instance of the first input audio signal to the client device based on a confirmation message that the activation phrase is not in the first instance of the input audio signal.

32. The system of claim 28, comprising the interface to:

transmit, at the second time point, the second instance of the first input audio signal to the client device prior to receipt of a confirmation message that the activation phrase is not in the first instance of the input audio signal.

33. The system of claim 32, comprising:

the interface to terminate the transmission of the second instance of the first input audio signal based on a confirmation message that the activation phrase is in the first instance of the input audio signal.

34. The system of claim 28, comprising the interface to:

establish a Bluetooth connection with the client device;

transmit, over the Bluetooth connection, the first instance of the first input audio signal and the second instance of the first input audio signal.

35. A method to transmit data in a voice-activated network, comprising:

receiving, by a first microphone of a sensing device, a first instance of a first input audio signal and a first instance of a second input audio signal;

receiving, by a second microphone of the sensing device, a second instance of the first input audio signal and a second instance of the second input audio signal;

parsing, by a natural language processor component executed by the sensing device, the first instance of the first input audio signal to identify an activation phrase;

transmitting, by an interface of the sensing device at a first time point, the first instance of the first input audio signal to a client device based on identification of the activation phrase in the first instance of the first input audio signal, the client device comprising a second natural language processor component;

transmitting, by the interface of the sensing device at a second time point after the first time point, the second instance of the first input audio signal to the client device; and

transmitting, by the interface of the sensing device, the first instance of the second input audio signal to the client device based on a confirmation message from the client device of an identification of the activation phrase in the second instance of the first input audio signal.

36. The method of claim 35, comprising:

transmitting, by the interface at the first time point, the first instance of the first input audio signal to the client device at a first compression level; and

transmitting, by the interface at the second time point, the second instance of the first input audio signal to the client device at a second compression level lower than the first compression level.

37. The method of claim 35, comprising:

transmitting, by the interface at the first time point, the second instance of the first input audio signal to the client device at a first compression level; and

transmitting, by the interface at the second time point, the first instance of the first input audio signal and the second instance of the input audio signal to the client device at a second compression level lower than the first compression level.

38. The method of claim 35, comprising:

transmitting, by the interface at the second time point, the second instance of the first input audio signal to the client device based on a confirmation message that the activation phrase is not in the first instance of the input audio signal.

39. The method of claim 35, comprising:

transmitting, by the interface at the second time point, the second instance of the first input audio signal to the client device prior to receipt of a confirmation message that the activation phrase is not in the first instance of the input audio signal.

40. The method of claim 39, comprising:

terminating, by the interface, the transmission of the second instance of the first input audio signal based on a confirmation message that the activation phrase is in the first instance of the input audio signal.