
Image Processing Transform Coding Using the Residual Quadtree (RQT)

2014-07-17 19:10
In HEVC, each picture is divided into coding tree blocks (CTBs). A CTB is a square block and represents the root of a quadtree, i.e., the coding tree. The CTB size may range from 8×8 to 64×64 luma samples,
but typically 64×64 is used. Each CTB can be further split into smaller square blocks called coding blocks (CBs). After the CTB is split recursively into CBs, each CB is further divided into prediction blocks (PBs) and transform blocks (TBs). The partitioning
of the CBs into TBs is carried out recursively based on a quadtree approach. The corresponding structure, i.e. the residual quadtree (RQT), allows TB sizes from 4×4 up to 32×32 luma samples. The figure below shows an example where a CB includes 10 TBs, labeled
with the letters a to j, and the corresponding block partitioning. The individual TBs are processed in alphabetical order, which follows a recursive Z-scan with depth-first traversal. The quadtree approach
enables the adaptation of the transform to the varying space-frequency characteristics of the residual signal. Larger transform block sizes, which have larger spatial support, provide better frequency resolution, whereas smaller transform block sizes, which
have smaller spatial support, provide better spatial resolution. The trade-off between spatial and frequency resolution is chosen by the encoder control, for example based on Lagrangian optimization techniques.
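
The recursive Z-scan with depth-first traversal described above can be sketched in a few lines of Python. This is only an illustration: the `Node` class, the helper names, and the particular layout (a 32×32 CB whose top-right and bottom-left quadrants are split once more, giving 10 leaves) are assumptions made for the example and are not necessarily the partitioning shown in the original figure.

```python
from dataclasses import dataclass
from typing import Iterator, List, Optional


@dataclass
class Node:
    """One quadtree node: top-left corner (x, y) and edge length in luma samples."""
    x: int
    y: int
    size: int
    children: Optional[List["Node"]] = None  # None means this node is a leaf (a TB)


def split(node: Node) -> List[Node]:
    """Split a node into four equally sized children, stored in Z order."""
    half = node.size // 2
    node.children = [
        Node(node.x, node.y, half),                # top-left
        Node(node.x + half, node.y, half),         # top-right
        Node(node.x, node.y + half, half),         # bottom-left
        Node(node.x + half, node.y + half, half),  # bottom-right
    ]
    return node.children


def leaves_in_z_order(node: Node) -> Iterator[Node]:
    """Depth-first traversal that visits children in Z order and yields leaves."""
    if node.children is None:
        yield node
    else:
        for child in node.children:
            yield from leaves_in_z_order(child)


# Hypothetical layout: split the root, then split two of its four quadrants.
root = Node(0, 0, 32)
top_left, top_right, bottom_left, bottom_right = split(root)
split(top_right)
split(bottom_left)

order = [(label, tb.x, tb.y, tb.size)
         for label, tb in zip("abcdefghij", leaves_in_z_order(root))]
for entry in order:
    print(entry)
```

Because a split quadrant is fully traversed before its right or lower neighbour is visited, the labels follow the processing order: a is the unsplit top-left 16×16 block, b to e are the 8×8 blocks of the top-right quadrant, f to i those of the bottom-left quadrant, and j is the bottom-right 16×16 block.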




Parameter Signaling

The RQT is defined by three parameters: the maximum depth of the tree, the minimum allowed transform size and the maximum allowed transform size. The minimum and maximum transform sizes can vary within the
range from 4×4 to 32×32 samples, which corresponds to the supported block transform sizes mentioned in the previous section. The maximum allowed depth of the RQT restricts the number of subdivisions. A maximum depth equal to zero means that a CB cannot be split
any further and thus contains only one TB.

All these parameters interact and influence the subdivision of the RQT. Consider a case in which the root CB size is 64×64, the maximum depth is equal to zero and the maximum transform size is equal to 32×32. In this case, the CB has to be subdivided at least
once despite the depth limit of zero, since otherwise it would lead to a 64×64 TB, which is not allowed. The RQT parameters, i.e. the maximum RQT depth and the minimum and maximum transform sizes, are transmitted in the bitstream at the sequence parameter set level. Regarding the RQT depth, different
values can be specified and signaled for intra and inter coded CUs.
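
The interaction of the three parameters can be illustrated with a small decision function. This is a simplified sketch under stated assumptions: the function name, argument names and return labels are hypothetical, and it models only the rules described in the text (forced split above the maximum transform size, no split at the minimum transform size or at the maximum depth). Any further conditions in the standard are not covered.

```python
def tb_split_decision(tb_size: int, depth: int, max_depth: int,
                      min_tb_size: int, max_tb_size: int) -> str:
    """Classify what may happen to a block of edge length tb_size at RQT depth `depth`.

    Returns one of:
      "inferred_split"  - block exceeds the maximum transform size, so it must split
      "no_split"        - minimum transform size or maximum depth reached
      "encoder_choice"  - both options are allowed; the encoder decides
    """
    if tb_size > max_tb_size:
        # A TB of this size is not allowed, so the split is forced
        # even if the depth limit has already been reached.
        return "inferred_split"
    if tb_size <= min_tb_size or depth >= max_depth:
        return "no_split"
    return "encoder_choice"


# The case from the text: 64x64 CB, maximum depth 0, maximum transform size 32x32.
print(tb_split_decision(64, depth=0, max_depth=0, min_tb_size=4, max_tb_size=32))
# After the forced split, each 32x32 block is at depth 1 and cannot split again.
print(tb_split_decision(32, depth=1, max_depth=0, min_tb_size=4, max_tb_size=32))
# With a larger depth budget, a 16x16 block at depth 1 is left to the encoder.
print(tb_split_decision(16, depth=1, max_depth=2, min_tb_size=4, max_tb_size=32))
```

Checking the maximum transform size before the depth limit is the design choice that reproduces the example above: the 64×64 CB is split once even though the maximum depth is zero.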


Fast Encoder Control

In order to determine the optimal partitioning of a CU into TUs, the encoder has to exhaustively evaluate all possible RQT structures, corresponding to all possible TU partitionings for the given CU. Since
the number of possible RQT structures grows exponentially with the maximum allowed tree depth, the encoder complexity (e.g. runtime) required to obtain the optimal TU partitioning in the rate-distortion (RD) sense also grows exponentially with
the RQT depth. This would limit the application of the RQT approach in transform coding. Therefore, in addition to the exhaustive search as it is done by the HM reference encoder software, we developed a fast RQT encoder control that limits the number of candidates.
This reduces encoder runtime at the cost of slightly inferior coding performance. It is designed as follows.
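
To get a feeling for how quickly the number of RQT structures grows, the following sketch counts them for a given maximum depth. It assumes that every block may independently either stay whole or split into four, and it ignores the minimum and maximum transform size limits; the function name is hypothetical.

```python
def count_rqt_structures(max_depth: int) -> int:
    """Number of distinct quadtrees whose depth does not exceed max_depth.

    A block is either left unsplit (1 option) or split into four sub-blocks,
    each of which is again a quadtree with one level less available.
    """
    if max_depth == 0:
        return 1
    sub_trees = count_rqt_structures(max_depth - 1)
    return 1 + sub_trees ** 4


for depth in range(4):
    print(depth, count_rqt_structures(depth))
```

Under these assumptions the count follows N(0) = 1 and N(d) = 1 + N(d-1)^4, which gives 1, 2, 17 and 83522 structures for maximum depths 0 to 3.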

The encoder starts at the RQT root, corresponding to the maximum possible TB size, and continues evaluation at the next RQT level, corresponding to the next smaller TB size, until either an early-termination criterion is fulfilled or the maximum allowed RQT
depth is reached. For the early-termination criterion, it is checked whether the absolute values of all unquantized transform coefficients are below a certain threshold. If this is the case, then the evaluation stops at the current level, and smaller TB sizes are not
taken into consideration. A QP-dependent threshold is used which, relative to the quantizer step size, is higher for smaller QP values and lower for larger QP values, such that the percentage reduction of encoder runtime is approximately the same over the whole QP range. For QP values
below 24, the threshold is equal to 125% of the quantizer step size; for QP values above 48, it is 50%; and for QP values between 24 and 48, it decreases linearly from 125% to 50% of the quantizer step size.
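
The threshold rule and the early-termination check can be sketched as follows. This is a minimal illustration, not the authors' implementation. Three points are assumptions: the function names, the treatment of QP values exactly equal to 24 and 48 as end points of the linear segment, and the quantizer step size approximation Qstep = 2^((QP - 4) / 6), which is commonly used for HEVC. The coefficients are also assumed to be on the same scale as the step size.

```python
from typing import Iterable


def quantizer_step_size(qp: int) -> float:
    """Commonly used approximation: the step size doubles every 6 QP values."""
    return 2.0 ** ((qp - 4) / 6.0)


def threshold_factor(qp: int) -> float:
    """Threshold as a fraction of the quantizer step size.

    1.25 below QP 24, 0.50 above QP 48, linear transition in between.
    """
    if qp < 24:
        return 1.25
    if qp > 48:
        return 0.50
    return 1.25 + (0.50 - 1.25) * (qp - 24) / (48 - 24)


def stop_at_current_level(coefficients: Iterable[float], qp: int) -> bool:
    """True if all absolute unquantized coefficients are below the threshold,
    meaning that smaller TB sizes are not evaluated."""
    threshold = threshold_factor(qp) * quantizer_step_size(qp)
    return all(abs(c) < threshold for c in coefficients)


print(threshold_factor(20), threshold_factor(36), threshold_factor(50))
print(stop_at_current_level([3.0, -2.5, 0.0], qp=28))
print(stop_at_current_level([20.0, 1.0], qp=28))
```

At QP 28 the factor is 1.125 and the approximate step size is 16, so the threshold is 18: the first coefficient set stops the search at the current level, while the second one continues to the next smaller TB size.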


References

D. Marpe, H. Schwarz, S. Bosse, B. Bross, P. Helle, T. Hinz, H. Kirchhoffer, H. Lakshman, T. Nguyen, S. Oudin, M. Siekmann, K. Sühring, M. Winken, and T. Wiegand, "Video Compression Using Nested Quadtree Structures, Leaf Merging and Improved Techniques for Motion Representation and Entropy Coding," IEEE Transactions on Circuits and Systems for Video Technology, Vol. 20, No. 12, pp. 1676-1687, Dec. 2010.
M. Winken, P. Helle, D. Marpe, H. Schwarz, and T. Wiegand, "Transform Coding in the HEVC Test Model," 18th IEEE International Conference on Image Processing (ICIP), pp. 3693-3696, 2011.
M. Siekmann, H. Schwarz, B. Bross, D. Marpe, and T. Wiegand, "Fast encoder control for RQT," JCTVC-E425, Mar. 2011.