1.4 Challenges and Opportunities

As 3D visual communication has become one of the main research focus areas for the coming decade, we highlight some potential research directions in this chapter. Streaming 3D video over 4G wireless networks has become a feasible and practical application. 4G wireless network standards include mechanisms to establish connections with different scheduling types and priorities, which support, for example, traffic with guaranteed maximum and minimum rates. The standards also include mechanisms that allow the uplink scheduler at the base station to learn the status of the buffers at the mobiles. Since the resource allocation mechanism is not yet specified in the 4G wireless communication standards, an important topic is the joint design of schedulers and resource allocation mechanisms together with the QoS-related mechanisms in the standards, so as to meet end users' QoE requirements. The combination of the flexibility offered by 4G physical layer technologies, such as OFDM and AMC, and 4G features at higher layers, such as QoS support for heterogeneous services and an all-IP architecture, results in a framework that is suitable for delivering 3D video services. Such services require high bit rates, low latencies, and the ability to deploy adaptive application layer adjustments, such as unequal error protection, that tailor the error control and transmission parameters to the very different requirements of the different components of 3D video streams.

Figure 1.3 Evolution of wireless communication systems.

Given such great flexibility in 3D video services over 4G networks, the major challenge becomes how to efficiently utilize the system resources available in different layers to maximize users' 3D viewing experience. From the perspective of the communication system, the allocation of system resources in a cross-layer framework is constrained both across different communication layers and among users who simultaneously share the same spectrum [3]. From the perspective of 3D video QoE, real-time 3D video transmission has additional delay constraints for real-time playback, a minimal requirement for binocular quality, more coding parameters with more decoding dependencies to consider, and limited computational capability on the mobile side. Moreover, the allocation of system resources should be conducted dynamically to reflect the time-varying characteristics of the channel condition and the time-heterogeneity of the video source. To acquire up-to-date information on system resource availability, the resource allocator often incurs extra computation costs and communication overhead, which grow with the required accuracy and update frequency. A formal way to resolve the resource allocation problem is to formulate 3D video delivery over a 4G network as a cross-layer optimization problem that maximizes user QoE subject to system QoS constraints. A problem we often encounter is how to develop a good model to map between system QoS and user QoE. We also often have to deal with resources that have both continuous and integer-valued parameters. The overall optimization problem may also have nonlinear and/or nonconvex objective functions and constraints, and many local optima may exist in the feasible range. Thus, obtaining the optimal solution is often NP-hard. How to choose the parameter sets in different layers as the optimization search space, and how to develop fast solvers to attain optimal or suboptimal values in real time, remain challenging issues.
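
To make the structure of such a problem concrete, the following minimal Python sketch treats a heavily simplified version: each user chooses one operating point from a small discrete set, and the summed bit rate must not exceed a shared capacity. The user names, the operating points, and the QoE scores are all hypothetical values chosen for illustration; the exhaustive search is not a method proposed in this chapter, and its cost grows exponentially with the number of users, which hints at why fast solvers are needed.

```python
from itertools import product

# Hypothetical operating points per user: (bit rate in Mbit/s, QoE score).
# The numbers are illustrative assumptions, not measured data.
OPERATING_POINTS = {
    "user_a": [(1, 2.0), (2, 3.2), (4, 4.1)],
    "user_b": [(1, 1.5), (3, 3.6), (5, 4.4)],
}


def best_allocation(points, capacity):
    """Exhaustively search one operating point per user.

    Returns (total_qoe, choice) for the feasible combination with the
    highest summed QoE, or None if no combination fits the capacity.
    """
    users = sorted(points)
    best = None
    for combo in product(*(points[u] for u in users)):
        total_rate = sum(rate for rate, _ in combo)
        if total_rate > capacity:
            continue  # violates the shared-capacity (QoS) constraint
        total_qoe = sum(qoe for _, qoe in combo)
        if best is None or total_qoe > best[0]:
            best = (total_qoe, dict(zip(users, combo)))
    return best


if __name__ == "__main__":
    print(best_allocation(OPERATING_POINTS, capacity=6))
```

A realistic formulation would replace the lookup table with a QoS-to-QoE model and add the per-layer parameters discussed above, which is where the continuous and integer-valued variables and the nonconvexity enter.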

As more and more applications require high bandwidth, these services increase the demands on 4G networks. A growing trend for increasing network capacity is to adopt architectures that reduce the range of the radio links, thus improving the quality of the link (data rate and reliability) and increasing spatial reuse. In WiMAX, the proposed solution is the use of multi-hop networks. Another solution with an analogous rationale is the use of femtocells [4]. Femtocells are base stations, typically installed at a home or office, which operate with low power within a licensed spectrum to serve a small number of users. Work on femtocells was initiated for 3G networks, and it focused on addressing the main challenges in implementing this technology: interference mitigation, ease of management and configuration, and integration with the macrocell network. The progress achieved with 3G networks is being carried over to 4G systems, for both LTE and WiMAX. With the growing popularity of femtocells, 3D streaming over femtocells (or with femtocells as the first/last mile) will become an important service. Therefore, there is a strong need to study how to efficiently allocate resources and conduct rate control in this setting.

In the multiuser video communications scenario, a single base station may serve several users who receive the same 3D video program but request different views in free-viewpoint systems or multi-view video plus depth systems. As the information received by different users is highly correlated, it is not efficient to stream the video in a simulcast fashion. A more efficient way to utilize the communication resources is to jointly consider the correlation among all users' received 3D video programs and send only a subset of the video streams, corresponding to a selection of views. The views actually transmitted are chosen in such a way that they can be used to synthesize the intermediate views. How to maximize all users' QoE by selecting the representative views and choosing the video encoding parameters and network parameters to meet different users' viewing preferences under different channel conditions becomes an important issue.
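
The view selection idea can be sketched in a few lines of Python under a deliberately simple assumption: views are positions on a one-dimensional camera line, and a requested view can be synthesized if it lies between two transmitted views that are at most `max_gap` apart. This synthesis rule, the function names, and the parameter `max_gap` are hypothetical; the sketch ignores encoding parameters, channel conditions, and QoE, and only shows how a small transmitted subset can serve all requests.

```python
from itertools import combinations


def can_serve(view, sent, max_gap):
    """Return True if `view` is sent directly, or lies between two sent
    views that are at most `max_gap` apart (assumed synthesis rule)."""
    if view in sent:
        return True
    left = [s for s in sent if s < view]
    right = [s for s in sent if s > view]
    return bool(left) and bool(right) and min(right) - max(left) <= max_gap


def smallest_view_set(camera_views, requested, max_gap):
    """Return the smallest tuple of camera views that serves every
    requested view, or None if even sending all views is not enough."""
    for size in range(1, len(camera_views) + 1):
        for sent in combinations(sorted(camera_views), size):
            if all(can_serve(v, sent, max_gap) for v in requested):
                return sent
    return None


if __name__ == "__main__":
    # Five cameras; users request a mix of real and virtual viewpoints.
    print(smallest_view_set([0, 1, 2, 3, 4], [0.5, 1.5, 2.0, 3.5], max_gap=2))
```

In this example three of the five camera views are enough to serve all four requests, whereas simulcast would spend bandwidth on a separate stream per user.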

Most of the approaches in a 3D video communications pipeline take advantage of the intrinsic correlation among different views. In networks of many users receiving 3D video streams, the view being received by one user could serve other users by providing the same view, if needed, or a view that could be used to synthesize a virtual view. This scenario could arise in multimedia social networks based on video streaming through a peer-to-peer (P2P) network architecture. Some recent publications have studied incentive mechanisms in multimedia live streaming P2P networks that enable users to cooperate in establishing a distributed, scalable, and robust platform. These techniques, nevertheless, belong to an incipient research area in which more work is still needed. The properties of 3D video make it an application area that could benefit in the future from techniques that incentivize collaboration between users.

The features and quality offered by 4G systems result in an ecosystem in which user equipment is expected to have vastly heterogeneous capabilities. Against this background, scalability and universal 3D access become rich fields with plenty of potential and opportunities. With the introduction of 3D video services, video streams will need to offer scalability in the number of views in addition to the traditional spatial, temporal, and quality scalabilities. For this to be realizable, it will be necessary to design algorithms that select views in a hierarchy consistent with scalability properties and that are able to scale up the number of views through interpolation. Considering scalability from the bandwidth perspective, it is also important to recognize the challenges for 3D streaming services over low bit-rate networks. Under a strict bit-rate budget, the coding artifacts (e.g., blocking and ringing artifacts) caused by a higher compression ratio become much more severe and degrade the binocular quality. Furthermore, channel resources for error protection are limited, so channel-induced distortion can further degrade the final 3D viewing experience. A joint rate control and error protection mechanism should be carefully designed to preserve the quality of the objects that matter most to binocular quality (such as foreground objects) and to mitigate coding artifacts and channel errors. In addition, in order to achieve universal 3D access, 3D video analysis and abstraction techniques will attract more attention, as will the transmission of 3D abstract data.

The development of distributed source coding paves the way for 3D video transmission over wireless networks. Distributed source coding addresses the problem of lossy source compression with side information. Applied to video coding, the technique can be summarized as transmitting both a coarse description of the video source and extra data that completes the representation of the source, where the extra data is compressed using a distributed source coding technique (also known as Wyner–Ziv coding). The coarse description contains side information that is used to decode the extra data and obtain a representation of the reconstructed video. The main property exploited by a distributed video coding system is the correlation between the distributed source-coded data and the side information. The coarse description of the video can be either a highly compressed frame or an intra-predictive frame. In the latter case, the combination of the distributed source-coded data with the side information is able to recover the time evolution of the video sequence. Distributed video coding is a technique of interest for wireless video transmission because there is a duality between the distributed source-coded data and error-correcting redundancy, which results in an inherent resiliency for the compressed video stream.
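
The role of side information can be illustrated with a classic toy example, sketched below in Python. It is a lossless binning scheme in the Slepian–Wolf style rather than a lossy Wyner–Ziv video coder, and it assumes that the 3-bit source word `x` and the side information `y` held by the decoder differ in at most one bit. Under that assumption the encoder sends only a 2-bit syndrome instead of all 3 bits. The function names are hypothetical and chosen for illustration.

```python
from itertools import product


def syndrome(x):
    """Encoder: map a 3-bit tuple to its 2-bit syndrome, i.e. the index
    of its coset with respect to the repetition code {000, 111}."""
    return (x[0] ^ x[1], x[1] ^ x[2])


def hamming(a, b):
    """Number of positions in which two equal-length tuples differ."""
    return sum(p != q for p, q in zip(a, b))


def decode(s, y):
    """Decoder: among the words sharing syndrome `s`, return the one
    closest to the side information `y`.

    This recovers x exactly when x and y differ in at most one bit,
    because the two words in each coset are complements (distance 3).
    """
    candidates = [c for c in product((0, 1), repeat=3) if syndrome(c) == s]
    return min(candidates, key=lambda c: hamming(c, y))


if __name__ == "__main__":
    x = (1, 0, 1)        # source word, known only to the encoder
    y = (1, 1, 1)        # correlated side information at the decoder
    print(decode(syndrome(x), y))
```

The encoder never sees `y`, yet the decoder recovers `x` from fewer bits than a stand-alone code would need. The same syndrome structure is what links distributed source-coded data to error-correcting redundancy.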

We can extend the principle of distributed video coding to multi-view 3D video, since we can exploit the redundancies already present in mono-view video and add the new ones found in multi-view video. Taking the simplest scenario, consisting of two views, we can encode one view using distributed video coding and use the second view as side information. Another way to construct such a system is to deploy distributed video coding for both views and use the implicit correlation between the two views to extract their time-dependent difference as side information. For a more generic setting involving more than two views, the multi-view video structure can be exploited by generating the side information from a combination of inter-view texture correlation and time-dependent motion correlation. Owing to its distributed nature, distributed 3D video coding can be combined with relay nodes that support cooperative communications. Such a combination of distributed video coding and cooperative communications sets up a flexible framework that can be applied in a variety of ways. For example, different types of data in the multi-view video (or even V+D) can be sent via different channels or paths. Furthermore, if the relay is equipped with high computation ability, it can perform application layer processing, such as transcoding, video post-processing, error concealment, or view synthesis, to facilitate the 3D visual communication.

We often rely on objective measurements, such as throughput, goodput, and mean-squared error (MSE), to evaluate system resource utilization from the communication layer to the 3D video application layer. One reason is that such objective measurements simplify the problem formulation by excluding the highly nonlinear human visual system (HVS) factors, so that optimal solutions exist for the formulated linear/nonlinear, continuous/integer optimization problem. However, the final 3D video quality is evaluated by human eyes, and objective measurements do not always align with what human beings perceive. In other words, understanding the 3D human vision system and quantifying the QoE become extremely important. More specifically, we need to find the critical features and statistics that affect the 3D QoE, as well as an effective objective measurement for 3D QoE that reflects subjective measurement. It is also important to have a quantitative measurement mechanism to evaluate the impact of the distortion introduced in each stage of the 3D communication pipeline on the end-to-end 3D QoE. Having such QoE metrics will enable a QoE-based optimization framework for 3D visual communications.
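
As a point of reference, the MSE mentioned above and the peak signal-to-noise ratio (PSNR) commonly derived from it can be computed as in the following Python sketch. The function names and the flat pixel-list representation are assumptions made for illustration. Both metrics compare pixel values of a single view only, so they carry no notion of binocular perception, which is exactly the limitation discussed in this paragraph.

```python
import math


def mse(reference, distorted):
    """Mean-squared error between two equal-length pixel sequences."""
    if len(reference) != len(distorted) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    total = sum((r - d) ** 2 for r, d in zip(reference, distorted))
    return total / len(reference)


def psnr(reference, distorted, peak=255):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs.

    `peak` is the maximum possible pixel value (255 for 8-bit samples).
    """
    err = mse(reference, distorted)
    if err == 0:
        return math.inf
    return 10 * math.log10(peak ** 2 / err)


if __name__ == "__main__":
    original = [52, 55, 61, 66]
    decoded = [54, 55, 60, 64]
    print(mse(original, decoded), psnr(original, decoded))
```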
