1. WO2020141258 - AN APPARATUS, A METHOD AND A COMPUTER PROGRAM FOR VIDEO CODING AND DECODING

Note: Text based on automatic Optical Character Recognition processes. Please use the PDF version for legal matters

[ EN ]

CLAIMS:

1. A method comprising:

encoding at least four bitstream versions of a same content divided into segments of independently coded tile sets representing a plurality of spatial regions, wherein a first and a second bitstream comprise independently coded tile sets encoded at a first quality, and a third and a fourth bitstream comprise independently coded tile sets encoded at a second quality, wherein the first and the third bitstream have first random access picture interval and the second and the fourth bitstream have second random access picture interval, which is an integer multiple of the first random access picture interval;

grouping the independently coded tile sets of all four bitstreams representing a common spatial region into a plurality of groups of collocated sub-picture tracks, wherein only one of said tile sets per group is intended to be received and/or decoded per any segment; and

generating at least one instruction for merging tile sets of different spatial locations into at least one coded picture, the at least one instruction causing a tile set originating from a random access picture to be decoded as a tile set originating from a non-random-access picture when merged with a tile set originating from a non-random-access picture.
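The encoding and grouping steps of claim 1 can be illustrated with a minimal sketch. It assumes two quality labels (`"q1"`, `"q2"`), a short random access picture interval and a long one that is an integer multiple of it, and it models each independently coded tile set as one sub-picture track. The names `SubPictureTrack`, `build_groups` and `is_random_access` are hypothetical and do not come from the claims.

```python
from collections import defaultdict
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class SubPictureTrack:
    """Hypothetical model of one independently coded tile set carried as a track."""
    region: int        # index of the spatial region the tile set covers
    quality: str       # "q1" (first quality) or "q2" (second quality)
    rap_interval: int  # random access picture interval, in pictures


def build_groups(num_regions: int, short_interval: int, multiple: int) -> dict:
    """Group the tracks of all four bitstream versions by common spatial region."""
    if num_regions < 1 or short_interval < 1 or multiple < 1:
        raise ValueError("all arguments must be positive integers")
    # The second interval is an integer multiple of the first, as in claim 1.
    long_interval = short_interval * multiple
    groups = defaultdict(list)
    # Two qualities times two intervals gives the four bitstream versions.
    for region, quality, interval in product(
        range(num_regions), ("q1", "q2"), (short_interval, long_interval)
    ):
        groups[region].append(SubPictureTrack(region, quality, interval))
    return dict(groups)


def is_random_access(track: SubPictureTrack, picture_index: int) -> bool:
    """True if the picture at this index is a random access picture in the track."""
    return picture_index % track.rap_interval == 0


groups = build_groups(num_regions=4, short_interval=8, multiple=4)
```

Each group then holds four collocated candidates, and a receiver would pick exactly one of them per segment.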

2. An apparatus comprising:

means for encoding at least four bitstream versions of a same content divided into segments of independently coded tile sets representing a plurality of spatial regions, wherein a first and a second bitstream comprise independently coded tile sets encoded at a first quality, and a third and a fourth bitstream comprise independently coded tile sets encoded at a second quality, wherein the first and the third bitstream have first random access picture interval and the second and the fourth bitstream have second random access picture interval, which is an integer multiple of the first random access picture interval;

means for grouping the independently coded tile sets of all four bitstreams representing a common spatial region into a plurality of groups of collocated sub-picture tracks, wherein only one of said tile sets per group is intended to be received and/or decoded per any segment; and

means for generating at least one instruction for merging tile sets of different spatial locations into at least one coded picture, the at least one instruction causing a tile set originating from a random access picture to be decoded as a tile set originating from a non-random-access picture when merged with a tile set originating from a non-random-access picture.

3. The apparatus according to claim 2, further comprising

means for encapsulating the at least one instruction into a collector track.

4. The apparatus according to claim 3, further comprising

means for forming a collector representation element in a streaming manifest from the collector track.

5. The apparatus according to claim 4, further comprising

means for indicating, in the streaming manifest, the first random access picture interval and the second random access picture interval; and

means for indicating, in the streaming manifest, a mapping of the first random access picture interval and the second random access picture interval to the sub-picture representation elements.

6. The apparatus according to any of claims 2 - 5, further comprising

means for including, as said at least one instruction for merging tile sets of different spatial locations into at least one coded picture, an indication into a container file indicating a possibility to rewrite network abstraction layer (NAL) unit types.
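As an illustration of what rewriting a NAL unit type could involve, the sketch below assumes the two-byte HEVC NAL unit header, in which `nal_unit_type` occupies the six bits after the leading forbidden-zero bit. The claims do not name a codec, so the HEVC layout, the constant `TRAIL_R` and the helper names are assumptions made for the example. The sketch only changes the header field; any slice-header differences between picture types are outside its scope.

```python
TRAIL_R = 1                 # HEVC type for a non-random-access trailing picture
IRAP_TYPES = range(16, 24)  # HEVC types 16..23 are random access (IRAP) types


def nal_unit_type(nal: bytes) -> int:
    """Read the 6-bit nal_unit_type from the first header byte."""
    if len(nal) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    return (nal[0] >> 1) & 0x3F


def rewrite_nal_unit_type(nal: bytes, new_type: int) -> bytes:
    """Return a copy of the NAL unit with nal_unit_type replaced."""
    if len(nal) < 2:
        raise ValueError("an HEVC NAL unit header is two bytes long")
    if not 0 <= new_type <= 63:
        raise ValueError("nal_unit_type is a 6-bit field")
    # Keep bit 7 (forbidden_zero_bit) and bit 0 (top bit of nuh_layer_id).
    first = (nal[0] & 0x81) | (new_type << 1)
    return bytes([first]) + nal[1:]


# Hypothetical NAL unit: type 21 (CRA), layer 0, temporal id 0, one payload byte.
cra = bytes([0x2A, 0x01, 0xAA])
as_trailing = rewrite_nal_unit_type(cra, TRAIL_R)
```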

7. The apparatus according to any of claims 3 - 6, further comprising

means for including, as said at least one instruction for merging tile sets of different spatial locations into at least one coded picture, an in-line picture-level indication in the collector track.

8. The apparatus according to any of claims 2 - 7, further comprising

means for including, as said at least one instruction for merging tile sets of different spatial locations into at least one coded picture, an indication into a file and/or in a Media Presentation Description (MPD) indicating which track or representation contains picture-level syntax structures that apply to all bitstreams.

9. The apparatus according to any of claims 2 - 8, further comprising

means for indicating a set of switch-point pictures considered as random-access pictures for the first random access picture interval that are not integer multiples of the second random access picture interval as non-random-access pictures, the switch-point pictures being intra-coded and causing the same reference picture selection implications as respective random-access pictures.

10. A method comprising

obtaining sub-picture tracks representing a plurality of spatial regions and grouping information of the sub-picture tracks, the grouping information being indicative of groups comprising sub-picture tracks of a common spatial resolution, wherein content for one sub-picture track per group is intended to be received and/or decoded per any segment, the content for sub-picture tracks being derived from at least four bitstream versions of a same content divided into segments of independently coded tile sets representing the plurality of spatial regions, wherein a first and a second bitstream comprise independently coded tile sets encoded at a first quality, and a third and a fourth bitstream comprise independently coded tile sets encoded at a second quality, wherein the first and the third bitstream have first random access picture interval and the second and the fourth bitstream have second random access picture interval;

selecting sub-picture tracks of different random access picture intervals from groups to be received or decoded for a segment;

obtaining or inferring at least one instruction for merging tile sets of different spatial locations into at least one coded picture, the at least one instruction causing a tile set originating from a random access picture to be decoded as a tile set originating from a non-random-access picture when merged with a tile set originating from a non-random-access picture; and

processing the at least one instruction to form the at least one coded picture.
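The receiver-side steps of claim 10 can be sketched as a per-picture merge plan. The sketch assumes the receiver has already selected one sub-picture track per group and knows each selected track's random access picture interval. The function name `plan_merge` and the three action labels are hypothetical.

```python
def plan_merge(rap_interval_by_region: dict, picture_index: int) -> dict:
    """Decide, per spatial region, how its tile set enters the merged picture.

    rap_interval_by_region maps a region index to the random access picture
    interval of the track selected for that region in the current segment.
    """
    if not rap_interval_by_region:
        raise ValueError("at least one region must be selected")
    is_rap = {
        region: picture_index % interval == 0
        for region, interval in rap_interval_by_region.items()
    }
    all_rap = all(is_rap.values())
    plan = {}
    for region, rap in is_rap.items():
        if rap and not all_rap:
            # Mixed picture: a tile set from a random access picture is
            # decoded as if it came from a non-random-access picture.
            plan[region] = "decode_as_non_rap"
        elif rap:
            plan[region] = "keep_rap"
        else:
            plan[region] = "keep_non_rap"
    return plan


# Region 0 uses the short interval (8), region 1 the long interval (32).
mixed = plan_merge({0: 8, 1: 32}, picture_index=8)
```

At picture index 8 only region 0 is at a random access picture, so its tile set is marked for decoding as non-random-access. At picture index 32 both regions are at random access pictures and the merged picture can stay a random access picture.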

11. An apparatus comprising

means for obtaining sub-picture tracks representing a plurality of spatial regions and grouping information of the sub-picture tracks, the grouping information being indicative of groups comprising sub-picture tracks of a common spatial resolution, wherein content for one sub-picture track per group is intended to be received and/or decoded per any segment, the content for sub-picture tracks being derived from at least four bitstream versions of a same content divided into segments of independently coded tile sets representing the plurality of spatial regions, wherein a first and a second bitstream comprise independently coded tile sets encoded at a first quality, and a third and a fourth bitstream comprise independently coded tile sets encoded at a second quality, wherein the first and the third bitstream have first random access picture interval and the second and the fourth bitstream have second random access picture interval;

means for selecting sub-picture tracks of different random access picture intervals from groups to be received or decoded for a segment;

means for obtaining or inferring at least one instruction for merging tile sets of different spatial locations into at least one coded picture, the at least one instruction causing a tile set originating from a random access picture to be decoded as a tile set originating from a non-random-access picture when merged with a tile set originating from a non-random-access picture; and

means for processing the at least one instruction to form the at least one coded picture.

12. The apparatus according to claim 11, further comprising

means for obtaining the at least one instruction from a collector track.

13. The apparatus according to claim 12, further comprising

means for obtaining a collector representation element from a streaming manifest.

14. The apparatus according to any of claims 11 - 13, further comprising

means for inferring, upon receiving network abstraction layer (NAL) units from both random-access and non-random-access pictures for a single time instance, said at least one instruction for merging tile sets of different spatial locations into at least one coded picture.
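The inference in claim 14 can be sketched as a check over the NAL unit types received for one time instance. The sketch assumes HEVC type numbering, where types 16 to 23 denote random access pictures, and assumes the input contains only the types of coded slice (VCL) NAL units. The helper names and the choice of `TRAIL_R` as the rewrite target are assumptions made for the example.

```python
from typing import Iterable, Optional

TRAIL_R = 1  # HEVC type for a non-random-access trailing picture


def is_irap(nal_type: int) -> bool:
    """True for HEVC random access (IRAP) NAL unit types 16..23."""
    return 16 <= nal_type <= 23


def infer_rewrite_target(vcl_nal_types: Iterable[int]) -> Optional[int]:
    """Infer a merge instruction from the VCL NAL unit types of one time instance.

    Returns the type that random access tile sets should be rewritten to
    when the time instance mixes random access and non-random-access NAL
    units, or None when no rewriting is needed.
    """
    types = list(vcl_nal_types)
    has_irap = any(is_irap(t) for t in types)
    has_non_irap = any(not is_irap(t) for t in types)
    return TRAIL_R if has_irap and has_non_irap else None


# One CRA tile set (type 21) arrives together with one trailing tile set (type 1).
target = infer_rewrite_target([21, 1])
```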

15. The apparatus according to any of claims 12 - 14, further comprising

means for inferring, upon receiving NAL units from both random-access and non-random-access pictures for a single time instance, an in-line picture-level indication from the collector track as said at least one instruction for merging tile sets of different spatial locations into at least one coded picture.