(WO2014202647) JITTER BUFFER CONTROL, AUDIO DECODER, METHOD AND COMPUTER PROGRAM
Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

Claims

1. A jitter buffer control (100; 350; 490) for controlling a provision of a decoded audio content (312; 412) on the basis of an input audio content (310; 410),

wherein the jitter buffer control is configured to select a frame-based time scaling or a sample-based time scaling in a signal-adaptive manner.

2. The jitter buffer control (100; 350; 490) according to claim 1, wherein audio frames are dropped or inserted to control a depth of a jitter buffer (320; 430) when the frame-based time scaling is used, and wherein a time-shifted overlap-and-add (954; 1068) of audio signal portions is performed when the sample-based time scaling is used.

3. The jitter buffer control (100; 350; 490) according to claim 1 or claim 2, wherein the jitter buffer control is configured to switch between a frame-based time scaling, a sample-based time scaling and a deactivation of a time scaling in a signal-adaptive manner.

4. The jitter buffer control (100; 350; 490) according to one of claims 1 to 3, wherein the jitter buffer control is configured to select the frame-based time scaling or the sample-based time scaling in order to control a depth of a de-jitter buffer (320;430).

5. The jitter buffer control (100; 350; 490) according to one of claims 1 to 4, wherein the jitter buffer control is configured to select a comfort noise insertion or a comfort noise deletion (856) if a previous frame was inactive.

6. The jitter buffer control (100; 350; 490) according to claim 5, wherein a comfort noise insertion results in an insertion of a comfort noise frame into a de-jitter buffer (320;430), and wherein a comfort noise deletion results in a removal of a comfort noise frame from the de-jitter buffer.
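For illustration only, the comfort noise insertion and deletion of claims 5 to 7 can be sketched as operations on a queue of frames. This is a minimal sketch, not the claimed implementation: the `Frame` type, its `seq` and `comfort_noise` fields, the placeholder sequence number `-1`, and the choice to insert at the head of the buffer are all assumptions made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Frame:
    seq: int             # hypothetical sequence number, not taken from the claims
    comfort_noise: bool  # True if the frame signals comfort noise generation (claim 7)


def insert_comfort_noise(buffer: deque) -> None:
    """Frame-based stretching: add one comfort noise frame to the de-jitter buffer.

    Inserting at the head is an assumption; the claims only require an insertion.
    """
    buffer.appendleft(Frame(seq=-1, comfort_noise=True))


def delete_comfort_noise(buffer: deque) -> bool:
    """Frame-based shrinking: remove the first comfort noise frame, if there is one.

    Returns True when a frame was removed and False otherwise.
    """
    index = next((i for i, f in enumerate(buffer) if f.comfort_noise), None)
    if index is None:
        return False
    del buffer[index]
    return True
```

Both operations change the buffer depth by exactly one frame and leave frames that carry active audio untouched.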

7. The jitter buffer control (100; 350; 490) according to claim 5 or claim 6, wherein a respective frame is considered inactive when the respective frame carries a signaling information indicating a generation of comfort noise.

8. The jitter buffer control (100; 350; 490) according to one of claims 1 to 7, wherein the jitter buffer control is configured to select a time-shifted overlap-and-add (954; 1068) of audio signal portions if a previous frame was active.

9. The jitter buffer control (100; 350; 490) according to claim 8, wherein the time-shifted overlap-and-add (954; 1068) of audio signal portions is adapted to allow for an adjustment of a time shift between blocks of audio samples obtained on the basis of subsequent frames of the input audio content with a resolution which is smaller than a length of the blocks of audio samples, or which is smaller than a quarter of the length of the blocks of audio samples, or which is smaller than or equal to two audio samples.
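As a hedged illustration of the time-shifted overlap-and-add in claims 8 and 9, the sketch below joins two blocks of samples with a time shift that can be set in steps of one sample. The function name `overlap_add`, the representation of blocks as Python lists, and the linear cross-fade are assumptions made for the example; the claims do not prescribe a window shape.

```python
def overlap_add(prev_block, next_block, shift):
    """Join two sample blocks, starting next_block `shift` samples into prev_block.

    shift == len(prev_block) gives plain concatenation (no time scaling);
    a smaller shift overlaps the blocks and shortens the output.
    The output length is always shift + len(next_block).
    """
    if not 0 <= shift <= len(prev_block):
        raise ValueError("shift must lie within the previous block")

    overlap = min(len(prev_block) - shift, len(next_block))
    out = list(prev_block[:shift])

    # Linear cross-fade over the overlapping region (assumed window shape).
    for i in range(overlap):
        w = (i + 1) / (overlap + 1)
        out.append((1.0 - w) * prev_block[shift + i] + w * next_block[i])

    out.extend(next_block[overlap:])
    return out
```

Because `shift` is an integer sample count, the output length changes by one sample per step, which is a far finer resolution than the whole-frame steps of frame-based time scaling.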

10. The jitter buffer control (100; 350; 490) according to claim 8 or claim 9, wherein the jitter buffer control is configured to determine (930,936; 1010, 1014) whether a block of audio samples represents an active but silent audio signal portion, and wherein the jitter buffer control is configured to select an overlap-and-add mode (962; 1018), in which a time shift between the block of audio samples representing a silent audio signal portion and a previous or subsequent block of audio samples is set to a predetermined maximum value, for a block of audio samples representing a silent audio signal portion.

11. The jitter buffer control (100; 350; 490) according to one of claims 8 to 10, wherein the jitter buffer control is configured to determine (930,936; 1010, 1014) whether a block of audio samples represents an active and non-silent audio signal portion, and to select an overlap-and-add mode (942,950,954; 1030, 1060, 1064, 1068), in which the time shift between blocks of audio samples determined on the basis of subsequent frames of the input audio content is determined in a signal adaptive manner.

12. The jitter buffer control (100; 350; 490) according to one of claims 1 to 11, wherein the jitter buffer control is configured to select an insertion of a concealed frame in response to a determination that a time stretching is required and that a jitter buffer is empty.

13. The jitter buffer control (100; 350; 490) according to one of claims 1 to 12, wherein the jitter buffer control is configured to select the frame-based time scaling or the sample-based time scaling in dependence on whether a discontinuous transmission in conjunction with comfort noise generation is currently used or was used for a previous frame.

14. The jitter buffer control (100; 350; 490) according to one of claims 1 to 13, wherein the jitter buffer control is configured to select a frame-based time scaling if a comfort noise generation is currently used or was used for a previous frame and to select a sample-based time scaling if a comfort noise generation is not currently used or was not used for a previous frame.

15. The jitter buffer control (100; 350; 490) according to one of claims 1 to 14,

wherein the jitter buffer control is configured to select a comfort noise insertion or a comfort noise deletion (856) for a time scaling if a discontinuous transmission in conjunction with comfort noise generation is currently used or was used for a previous frame,

wherein the jitter buffer control is configured to select an overlap-add-operation using a predetermined time shift (962, 1018) for a time scaling if a current audio signal portion is active but comprises a signal energy which is smaller than or equal to an energy threshold value, and if a jitter buffer is not empty, or if a previous audio signal portion was active but comprises a signal energy which is smaller than or equal to the energy threshold value, and if the jitter buffer is not empty;

wherein the jitter buffer control is configured to select an overlap-add-operation using a signal-adaptive time shift (954; 1068) for a time scaling if a current audio signal portion is active and comprises a signal energy which is larger than or equal to the energy threshold value and if the jitter buffer is not empty, or if a previous audio signal portion was active and comprises a signal energy which is larger than or equal to the energy threshold value and if the jitter buffer is not empty; and

wherein the jitter buffer control is configured to select an insertion of a concealed frame for a time scaling if a current audio signal portion is active and if the jitter buffer is empty, or if a previous audio signal portion was active and if the jitter buffer is empty.
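The four-way selection recited in claim 15 can be sketched as a small decision function. This is a minimal sketch under stated assumptions: the names `ScalingMode` and `select_time_scaling` and all parameter names are hypothetical, and the inputs are taken to describe whichever audio signal portion (current or previous) the control examines. Claim 15 uses "smaller than or equal to" for the predetermined shift and "larger than or equal to" for the signal-adaptive shift, so the two conditions overlap at equality; the sketch resolves that tie toward the predetermined shift, which is an assumption.

```python
from enum import Enum


class ScalingMode(Enum):
    COMFORT_NOISE_INSERT_OR_DELETE = "comfort noise insertion or deletion"
    OVERLAP_ADD_PREDETERMINED_SHIFT = "overlap-add with predetermined time shift"
    OVERLAP_ADD_ADAPTIVE_SHIFT = "overlap-add with signal-adaptive time shift"
    CONCEALED_FRAME_INSERTION = "insertion of a concealed frame"


def select_time_scaling(dtx_cng_used: bool,
                        signal_energy: float,
                        energy_threshold: float,
                        buffer_empty: bool) -> ScalingMode:
    """Pick a time scaling mode from the conditions listed in claim 15."""
    # Discontinuous transmission with comfort noise: frame-based scaling.
    if dtx_cng_used:
        return ScalingMode.COMFORT_NOISE_INSERT_OR_DELETE

    # Active signal but nothing left in the jitter buffer: conceal.
    if buffer_empty:
        return ScalingMode.CONCEALED_FRAME_INSERTION

    # Active but (near-)silent signal: fixed time shift is good enough.
    # Tie at equality goes here by assumption (see the note above).
    if signal_energy <= energy_threshold:
        return ScalingMode.OVERLAP_ADD_PREDETERMINED_SHIFT

    # Active, non-silent signal: choose the time shift from the signal.
    return ScalingMode.OVERLAP_ADD_ADAPTIVE_SHIFT
```

For example, `select_time_scaling(False, 0.5, 0.1, False)` returns `ScalingMode.OVERLAP_ADD_ADAPTIVE_SHIFT`, since the portion is active, above the threshold, and the buffer holds frames.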

16. The jitter buffer control (100; 350; 490) according to one of claims 1 to 15, wherein the jitter buffer control is configured to select an overlap-add-operation (942,950,954; 1030, 1060, 1064,1068, 1072, 1084) using a signal-adaptive time shift and a quality control mechanism (950; 1060, 1064, 1072, 1084) for a time scaling if a current audio signal portion is active and comprises a signal energy which is larger than or equal to the energy threshold value and if the jitter buffer is not empty, or if a previous audio signal portion was active and comprises a signal energy which is larger than or equal to the energy threshold value and if the jitter buffer is not empty.

17. An audio decoder (300;400) for providing a decoded audio content (312;412) on the basis of an input audio content (310;410), the audio decoder comprising:

a jitter buffer (320;430) configured to buffer a plurality of audio frames representing blocks of audio samples;

a decoder core (330;440) configured to provide blocks (332;442) of audio samples on the basis of audio frames (322;432) received from the jitter buffer;

a sample-based time scaler (340; 450), wherein the sample based time scaler is configured to provide time-scaled blocks of audio samples (342;448) on the basis of blocks of audio samples provided by the decoder core; and

a jitter buffer control (100;350;490) according to one of claims 1 to 15,

wherein the jitter buffer control is configured to select a frame-based time scaling, which is performed by the jitter buffer, or a sample-based time scaling, which is performed by the sample-based time scaler, in a signal-adaptive manner.

18. The audio decoder (300;400) according to claim 17, wherein the jitter buffer (320;430) is configured to drop or insert audio frames in order to perform a frame-based time scaling.

19. The audio decoder according to claim 17 or claim 18, wherein the decoder core (330;440) is configured to perform a comfort noise generation in response to a frame carrying a signaling information indicating a generation of comfort noise, and

wherein the decoder core is configured to perform a concealing in response to an empty jitter buffer.

20. The audio decoder (300;400) according to one of claims 17 to 19, wherein the sample-based time scaler (340; 450) is configured to perform the time scaling of the input audio signal in dependence on a computation or an estimation (950; 1060) of the quality of the time scaled version of the input audio signal obtainable by the time scaling.
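Claim 20 makes the sample-based time scaling depend on a computed or estimated quality of the time-scaled signal, without naming the measure. The sketch below uses a normalized cross-correlation between the two segments that would be overlapped as a stand-in quality estimate; that choice, the function names, and the `min_quality` parameter are assumptions made for illustration and are not taken from the claims.

```python
import math


def normalized_correlation(a, b):
    """Normalized cross-correlation of two sample segments, in [-1, 1].

    Returns 0.0 when either segment has no energy.
    """
    n = min(len(a), len(b))
    a, b = a[:n], b[:n]
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return dot / norm if norm > 0.0 else 0.0


def should_time_scale(prev_tail, next_head, min_quality):
    """Allow time scaling only if the overlapped segments are similar enough.

    A high correlation suggests the overlap-and-add will not produce
    audible artifacts (assumed quality criterion).
    """
    return normalized_correlation(prev_tail, next_head) >= min_quality
```

When the check fails, a control could postpone the time scaling to a later block where the segments match better.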

21. A method (1400) for controlling a provision of a decoded audio content on the basis of an input audio content,

wherein the method comprises selecting (1410) a frame-based time scaling or a sample-based time scaling in a signal-adaptive manner.

22. A computer program for performing the method according to claim 21 when the computer program is running on a computer.