8.4 Unequal Error Protection

As mentioned earlier in this chapter, the output of the source encoder in a video communication system is a bit stream that usually has several distinct parts and a great deal of structure. Headers and markers separate frames and groups of frames. Beyond this, the portion of the bit stream corresponding to a single frame has its own structure, in which it is usually possible to recognize the different outputs of the source encoder operation. For example, one part typically consists of entropy-coded information resulting from the compression of the frame texture, if the frame is intra-predicted, or of the motion-compensated prediction error, if it is not. For non-intra-predicted frames, the bit stream also contains another part carrying the motion vector information, which is frequently encoded differentially.

In essence, channel errors in different parts of the compressed video bit stream produce different amounts of reconstruction distortion. To reduce the likelihood of these errors, forward error control (FEC) is added to the compressed video bit stream, that is, redundant bits that help identify and correct some errors. Because the redundant bits increase the bit rate required to transmit the video, the question is how much error protection redundancy to add. Adding enough redundancy to give the whole bit stream the strong error correcting code required by the parts that yield the most distortion wastes redundancy on the other parts. Adding redundancy matched to the average error-related distortion across all parts is excessive for the parts that need little protection and insufficient for the parts that need the most. Adding only the small amount of redundancy that most parts need is cheap in terms of transmission bit rate, but it leaves the few parts that require much more protection exposed to severe distortion. In summary, the best tradeoff between distortion performance and added error control redundancy is achieved by assigning different levels of error protection to the different parts of the source encoded bit stream, according to the impact that channel errors in each part would have on the end-to-end distortion. This technique is called “unequal error protection” (UEP).

With 3D video, different frames have different importance, depending on their temporal dependency and on the effects that their loss or corruption produces. For example, I-frames are the most important because their loss propagates errors into the reconstruction of all dependent P-frames. Also, if P-frames from one view, say, the right view, are used to predict and differentially encode the other view, then it is more important for these frames to be received without errors. These observations, common to 3D video, naturally lend themselves to a UEP scheme design. This is the problem approached in [10], where both rate-distortion optimized rate allocation and a UEP scheme are designed for a 3D video codec derived from the H.264 codec. Figure 8.10 shows a simplified block diagram of this video codec.

Figure 8.10 Simplified block diagram for the H.264-based 3D video encoder and decoder from [11].

Figure 8.11 Frame types and their interdependency for the H.264-based 3D video encoder and decoder from [11].

Figure 8.10 shows a typical configuration for a video codec that outputs I-frames and P-frames, implemented with the H.264 codec and expanded to provide 3D functionality by adding a decoded picture buffer that allows differential encoding of one view based on the other. The interdependency of the different frames is shown in Figure 8.11, which introduces, for ease of discussion, a classification of frames into three types of different importance. Frames of type 0 (I-frames) are the most important because their loss affects all P-frames from both views. Frames of type 1 (left view, P-frames) are second in importance because their loss affects P-frames from both views. Frames of type 2 (right view, P-frames) are the least important because their loss only affects other frames of the same type.
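The importance ordering described above can be illustrated with a small dependency sketch. The mapping below is an assumption read off the prose (type 0 taken to be the I-frames, type 1 the left-view P-frames, type 2 the right-view P-frames); the names `PREDICTS` and `affected_types` are hypothetical and do not come from [10] or [11].

```python
# Assumed prediction relationships between the three frame types:
# an entry "a: {b, c}" means frames of type a are used to predict types b and c.
PREDICTS = {
    0: {1, 2},  # I-frames feed the P-frames of both views
    1: {1, 2},  # left-view P-frames feed later left and right P-frames
    2: {2},     # right-view P-frames feed only later right P-frames
}


def affected_types(lost_type):
    """Return the frame types whose reconstruction degrades when a frame
    of `lost_type` is lost, following the prediction chain transitively."""
    seen = set()
    stack = [lost_type]
    while stack:
        current = stack.pop()
        for nxt in PREDICTS[current]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


for frame_type in (0, 1, 2):
    print(frame_type, sorted(affected_types(frame_type)))
```

Under this assumed mapping, a loss in type 0 or type 1 reaches the P-frames of both views, while a loss in type 2 stays within type 2, which matches the ordering given in the text.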

As mentioned, the design in [10] follows two main steps. In the first step, the video encoding rate is calculated for the three types of frames by minimizing the overall distortion subject to a total rate constraint. This is the well-known problem of using knowledge of the rate-distortion (RD) curve of the source codec to allocate bit rate so as to minimize distortion. The new twist is the presence of the different views of the 3D video. Nevertheless, considering the different views is not too different from considering the different predictive frame types in 2D video. To solve this design problem, it is first necessary to derive the RD curve. In [10], this is done by extending the model for 2D RD curves. Consequently, the RD curve for a type 0 frame is found to be:

[Equation image not available: RD curve for type 0 frames, D₀ as a function of R₀]

where D₀ is the distortion of a type 0 frame when encoding at a rate R₀. The variables θ₀, R_c0, and D_c0 are codec-specific adjustment values for the model that need to be calculated through simulations and curve fitting. Similarly, the RD curve for a type 1 frame is:

[Equation image not available: RD curve for type 1 frames, D₁ as a function of R₁ and R₀]

where D₁ is the distortion of a type 1 frame when encoding at a rate R₁, and θ₁, C₁, R_c1, and D_c1 are the adjustment variables that need to be computed to match the model to the codec performance. Note that the RD curve for a type 1 frame also depends on the rate used for type 0 frames, making explicit the need for type 0 frames in the encoding and decoding of type 1 frames. Finally, reasoning in the same way, the RD curve for a type 2 frame is:

[Equation image not available: RD curve for type 2 frames, D₂ as a function of R₂]

where D₂ is the distortion of a type 2 frame when encoding at a rate R₂, and θ₂, C₂, C₃, R_c2, and D_c2 are the adjustment variables that need to be computed to match the model to the codec performance. Knowing the RD performance of the three types of frames allows the total RD model to be calculated:

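[Equation image not available: total RD model D_T, combining the three per-type RD models with constant term D_cT]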

where D_cT = D_c0 + D_c1 + D_c2. With this intermediate result it is now possible to write the rate allocation design problem:

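[Equation image not available: rate allocation problem, minimizing D_T over R₀, R₁, and R₂ subject to a constraint involving p and R_CH]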

where p is the proportional increase in bit rate due to the added redundancy and R_CH is the total bit rate supported for transmission over the channel. One advantage of the model used for the RD function is that the design problem can be solved simply with the Lagrange multiplier method. The result is the optimal allocation of encoding rate to each type of frame. One debatable assumption in the above formulation is that the total distortion equals the sum of the distortions from each frame type, D_T = D₀ + D₁ + D₂. As it turns out, the assumption is reasonable in this case: results presented in [10] show that the Lagrange multiplier based rate assignment is very close to the optimal solution.
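The Lagrange multiplier step can be sketched numerically. The sketch below assumes that each frame type follows the hyperbolic form D_i = θ_i / (R_i − R_ci) + D_ci often used for 2D RD models, and it drops the cross-type coupling terms (C₁, C₂, C₃) so that a closed-form solution exists. The parameter values, the function names, and the reading of the source-rate budget as (1 − p)·R_CH are illustrative assumptions, not formulas or values taken from [10].

```python
import math


def allocate_rates(theta, r_c, budget):
    """Minimize sum(theta[i] / (R[i] - r_c[i])) subject to sum(R) == budget.

    Setting the derivative of the Lagrangian to zero gives
    R[i] - r_c[i] = sqrt(theta[i] / lam), so the rate above each offset
    is shared in proportion to sqrt(theta[i]).
    """
    spare = budget - sum(r_c)
    if spare <= 0:
        raise ValueError("budget must exceed the sum of the rate offsets")
    total_root = sum(math.sqrt(t) for t in theta)
    return [rc + spare * math.sqrt(t) / total_root for t, rc in zip(theta, r_c)]


def total_distortion(rates, theta, r_c, d_c):
    """Sum of per-type distortions under the assumed hyperbolic model."""
    return sum(t / (r - rc) + dc for r, t, rc, dc in zip(rates, theta, r_c, d_c))


# Made-up parameters for frame types 0, 1, 2 (rates in kbit/s, distortion as MSE).
theta = [9000.0, 4000.0, 1000.0]
r_c = [20.0, 10.0, 5.0]
d_c = [1.0, 1.5, 2.0]
r_ch, p = 1000.0, 0.2        # channel rate and redundancy fraction (assumed)
budget = (1.0 - p) * r_ch    # assumed source-rate budget

optimal = allocate_rates(theta, r_c, budget)
equal = [budget / 3.0] * 3
print("optimal rates:", [round(r, 1) for r in optimal])
print("D_T optimal  :", round(total_distortion(optimal, theta, r_c, d_c), 3))
print("D_T equal    :", round(total_distortion(equal, theta, r_c, d_c), 3))
```

With the coupling terms restored, the stationarity conditions become coupled across frame types and are more naturally solved numerically; the square-root sharing rule above holds only for the decoupled simplification.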

The 3D video codec in this case retains many of the characteristics and much of the structure of the H.264 video codec because it was implemented as an adaptation of it. In particular, the output of the encoder is organized into data packets called “network abstraction layer” (NAL) units. Losing a NAL unit during transmission means losing a number of macroblocks from a type 0, type 1, or type 2 frame. Yet, as discussed earlier, the loss of a NAL unit matters most when the unit contains macroblock data from a type 0 frame, less when it contains data from a type 1 frame, and least when it contains data from a type 2 frame. Therefore, a UEP scheme that accounts for this effect will achieve a more efficient redundancy allocation. In [10] this is done by first calculating the average distortion linked with losing a NAL unit associated with each of the three frame types. For NAL units associated with type 0 frames, the average distortion D_L0 is the mean squared difference between the original macroblocks and their replacement using spatial error concealment. For NAL units associated with type 1 and type 2 frames, the average distortions D_L1 and D_L2 are the mean squared differences between the original macroblocks and their replacement using temporal error concealment. With this result, the UEP design problem consists of distributing redundancy among the three frame types. Formally, this can be written as:

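[Equation image not available: UEP design problem, minimizing the expected loss distortion over p₀, p₁, and p₂ subject to p₀R₀ + p₁R₁ + p₂R₂ = pR_CH]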

where p₀, p₁, and p₂ are the proportions of the bit rate allocated to frame types 0, 1, and 2, respectively, that are used for redundancy. Also, P₀, P₁, and P₂ are the loss probabilities for frame types 0, 1, and 2, respectively, which depend on the FEC technique used and on the channel conditions and parameters. Note that in the expression for the problem formulation:

[expression image not available]

is the probability that a NAL unit is associated with a type 0 frame,

[expression image not available]

is the probability that a NAL unit is associated with a type 1 frame, and

[expression image not available]

is the probability that a NAL unit is associated with a type 2 frame. The constraint p₀R₀ + p₁R₁ + p₂R₂ = pR_CH in the UEP design formulation expresses that the combined redundancy from NAL units associated with all the frame types should equal the total transmitted redundancy.
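A minimal sketch of the UEP step follows, under several loudly stated assumptions. The probability that a NAL unit belongs to frame type i is taken to be R_i / (R₀ + R₁ + R₂); the loss probability P_i is modeled by an ideal erasure code that decodes whenever any k of n packets arrive over a channel with independent packet losses; and p_i is read as redundancy relative to the source rate R_i. None of these choices, nor the numbers or function names, come from [10], which uses Raptor codes; a brute-force grid search stands in for whatever solver the authors used.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def block_failure_prob(k, n, eps):
    """P(fewer than k of n packets arrive) with i.i.d. packet loss rate eps.
    Stand-in for P_i: an ideal erasure code that decodes from any k packets."""
    return sum(comb(n, j) * (1 - eps) ** j * eps ** (n - j) for j in range(k))


def expected_loss_distortion(p_red, rates, d_loss, eps, k=40):
    """Sum over frame types of (assumed NAL share) * D_Li * P_i(p_i)."""
    total_rate = sum(rates)
    cost = 0.0
    for p_i, r_i, d_i in zip(p_red, rates, d_loss):
        n = k + int(round(p_i * k))  # k source packets plus redundancy packets
        cost += (r_i / total_rate) * d_i * block_failure_prob(k, n, eps)
    return cost


def search_uep(rates, d_loss, redundancy, eps, divisions=100):
    """Grid search over p0 and p1 in [0, 1]; p2 follows from the constraint
    p0*R0 + p1*R1 + p2*R2 == redundancy."""
    best_cost, best_p = None, None
    for i in range(divisions + 1):
        for j in range(divisions + 1):
            p0, p1 = i / divisions, j / divisions
            rest = redundancy - p0 * rates[0] - p1 * rates[1]
            if rest < 0:
                continue  # this pair already exceeds the redundancy budget
            p = (p0, p1, rest / rates[2])
            cost = expected_loss_distortion(p, rates, d_loss, eps)
            if best_cost is None or cost < best_cost:
                best_cost, best_p = cost, p
    return best_cost, best_p


# Made-up inputs: source rates R_i, loss distortions D_Li, and p * R_CH.
rates = (402.5, 265.0, 132.5)
d_loss = (900.0, 400.0, 150.0)
redundancy = 200.0
eps = 0.1

equal_p = (redundancy / sum(rates),) * 3  # same protection for every type
equal_cost = expected_loss_distortion(equal_p, rates, d_loss, eps)
best_cost, best_p = search_uep(rates, d_loss, redundancy, eps)
print("equal protection :", equal_p, equal_cost)
print("unequal (search) :", best_p, best_cost)
```

Because the equal-protection point lies on the search grid, the searched allocation can never do worse than it under this toy model; how much better it does depends entirely on the assumed loss model and distortion values.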

The error protection in [10] is implemented using systematic Raptor codes [12]. With this setting, the UEP scheme shows a 3–8 dB improvement in PSNR compared to the equivalent scheme with no UEP (in which the redundancy is distributed evenly among the three types of frames) and is approximately 4 dB away from the PSNR performance with no channel errors.
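Because the loss distortions above are mean squared differences while the results are quoted in PSNR, the following sketch converts between the two. It assumes 8-bit samples (peak value 255), which the text does not state; the function names are illustrative.

```python
import math


def psnr_db(mse, peak=255.0):
    """PSNR in dB for a given mean squared error and peak sample value."""
    if mse <= 0:
        raise ValueError("mse must be positive")
    return 10.0 * math.log10(peak ** 2 / mse)


def mse_ratio_for_gain(gain_db):
    """Factor by which the MSE shrinks for a PSNR gain of gain_db decibels."""
    return 10.0 ** (gain_db / 10.0)


for gain in (3.0, 4.0, 8.0):
    print(gain, "dB gain -> MSE divided by", round(mse_ratio_for_gain(gain), 2))
```

In these terms, a 3 dB gain corresponds to roughly halving the mean squared error, and an 8 dB gain to reducing it by a factor of about 6.3.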
