Chapter 22

HDR Image Watermarking

F. Guerrini*; M. Okuda†; N. Adami*; R. Leonardi*
* University of Brescia, Brescia, Italy
† University of Kitakyushu, Kitakyushu, Japan

Abstract

In this chapter we survey available solutions for high dynamic range (HDR) image watermarking. First, we briefly discuss watermarking in general terms, with particular emphasis on its requirements, which primarily include security, robustness, imperceptibility, capacity, and the availability of the original image during recovery. With respect to traditional image watermarking, however, HDR images possess a unique set of features, such as an extended range of luminance values to work with and tone-mapping operators against which it is essential to be robust. These clearly affect the HDR watermarking algorithms proposed in the literature, which we extensively review next, including a thorough analysis of the reported experimental results. As a working example, we also describe the HDR watermarking system that we recently proposed and that focuses on combining imperceptibility, security, and robustness to tone-mapping operators at the expense of capacity. We conclude the chapter with a critical analysis of the current state and future directions of watermarking applications in the HDR domain.

Keywords

Digital watermarking; High dynamic range; Steganography; Robustness; Imperceptibility; Security; Quantization index modulation; Tone-mapping operators

22.1 A Brief Introduction to Digital Watermarking

This section introduces the basic notions of digital watermarking, to give the reader a self-contained, succinct yet quite complete coverage of digital watermarking basics while presenting the watermarking system in a way congenial to the rest of this chapter.

Digital watermarking (Barni and Bartolini, 2004; Cox et al., 2008) belongs to the data hiding field. Data (or information) hiding is a field as old as history itself (Cox et al., 2008, Chapter 1). One can hide an object, such as a piece of information, for many possible reasons and with quite disparate intended results. The commonest reason for hiding information is to protect it from inappropriate or prohibited or sometimes even perfectly licit use by people who do not have the authority to use it. A different but somewhat correlated reason is secrecy: one may want to hide information to keep its very existence unknown (secrecy is correlated with protection because the former can be a means for protection on its own and because secrecy and protection are often simultaneously present, and sometimes confused, in real-world applications). Hidden information may also be used in surveillance systems to trigger some actions in response to people who, unaware of its presence, perform (potentially illicit) operations, thus constituting a repressive rather than a preventative form of protection. Finally, information could be hidden because, in spite of its perhaps indispensable presence, leaving it in plain sight would in some cases degrade the perceptual value of the object it corresponds to. An excellent introduction to the history of data hiding as a whole can be found in Petitcolas et al. (1999). For a mathematical analysis of the problem of secret communication, see the classic article by Shannon (1949).

Digital data hiding is the direct extension of these concepts to the digital world. Since digital content is only another representation of the same information the human senses naturally perceive, it is quite natural that the same needs as those previously mentioned arise in the digital world as well. Hence, digital information hiding applications attempt to hide some kind of digital information in digital documents.

Digital information hiding can be thought of as a combination of three main techniques: cryptography, steganography, and digital watermarking. The best known branch of information hiding is probably cryptography (Menezes et al., 1996), which is the art of hiding the content of a transmission between two subjects from a potentially malicious eavesdropper by making it unintelligible to all except the intended recipients. Another very old information hiding technique, although less well known than cryptography, is steganography (Provos and Honeyman, 2003), sometimes also called covert communication. In steganography, the very existence of the communication is hidden; the information is conveyed by proper, imperceptible modifications of an innocent-looking object. The most recent information hiding technique, digital watermarking, is the subject of this chapter, applied in the high dynamic range (HDR) imaging context.

Loosely speaking, digital watermarking tries to introduce information in a certain domain (this process is called watermark embedding, Barni and Bartolini, 2004, Chapter 4) inside a digital object while preserving its perceptual content (like steganography) and while being in a “hostile” environment (like cryptography) — that is, populated by intelligent attackers (embodying the so-called watermark channel, Barni and Bartolini, 2004, Chapter 7) interested in disrupting the watermarking system operativeness while preserving the perceptual quality of the host object. This information will then be retrieved by another entity, or perhaps the same one that performed the embedding process, thus performing the so-called watermark recovery (Barni and Bartolini, 2004, Chapter 6).

The need for digital watermarking first arose as an answer to the inherent deficiencies of the existing information hiding techniques to counter the digital multimedia piracy problem. In this case, a digital content owner wants to protect it (eg, against unauthorized copying). Steganography is useless in this case because the potential pirate is well aware of the owner’s intentions and methods, thus failing the main hypothesis of the steganographic model; and cryptography can only ensure the encrypted content is not subject to eavesdropping during its distribution but can do nothing more when the content is finally decrypted for consumption by its intended recipient. Digital watermarking, then, could offer a solution to this problem, at least in principle, because it is contained within the content itself.

Later, it became clear that digital watermarking could also be used in totally different application contexts (Cox et al., 2008, Chapter 2; Barni and Bartolini, 2004, Chapter 2). For example, broadcast monitoring refers to the ability by a broadcaster to have precise reports of which shows are aired and how often they are aired, for reasons ranging from collection of royalties to marketing studies. It includes as applicative environments digital TV broadcasting and Internet TV services and has recently surfaced as a critical problem because of the proliferation of available channels. The main target of these recently released products is the identification of copyrighted content made available on the Internet by online viewers or peer-to-peer networks, at the very least to ask for its removal.

Other possible watermarking applications do not deal with intellectual property rights protection at all. As the watermark represents a side channel to convey information that is attached to the content, it could also carry useful metadata instead. For example, metadata watermarks could simplify content-based retrieval of multimedia objects. A related application which is experiencing growing interest is the embedding of linking information to enable a range of e-commerce services. The distinct advantage of the use of watermarking to attach metadata to a piece of content is its robustness to digital-to-analog and analog-to-digital conversions, processes that usually result in loss of any other type of metadata such as header-based metadata. The ability to insert as much information as possible into the host object is the main requirement, along with issues related to implementation such as complexity and speed of execution.

A rather ambitious proposed application of digital watermarking is enhanced coding of digital content, both source coding and channel coding. It is not clear from a theoretical point of view whether it could boost coding performance, especially in practical scenarios; nevertheless, some authors argue that some advantage is to be expected when data hiding techniques are applied for content coding. For the source coding part, it has been proposed that watermarking could help the compression process, achieving better compression by replacing (a lossless operation) the imperceptible part of the content with information data on the content that therefore no longer has to be stored in the bitstream. Digital watermarking could also be used for channel coding (ie, to counter transmission errors that are especially harmful to compressed content). It could be a valuable alternative to error concealment on the decoder side (to approximate lost information by means of some sort of filtering) and/or redundant coding on the encoder side (via error correcting codes), which usually suffer from backward-compatibility issues. When a digital watermark is used to embed redundant information, compatibility is automatically achieved because the watermark can be safely ignored by the decoder. It is not clear at this point whether the peak signal-to-noise ratio (PSNR) distortion (the traditional way to evaluate the goodness of a coding algorithm) introduced by the watermark is better than that achievable by means of other techniques.

The first domain considered by the digital watermarking community, and undoubtedly the most studied even today, is the one pertaining to still digital low dynamic range (LDR) images. Digital watermarking has been applied to other domains as well, especially audio, but also video (often by use of the methods designed for still images) and more exotic domains such as text and three-dimensional meshes. As we will see, the most recent entry in this list is still digital HDR images.

22.1.1 Digital Watermarking Requirements

The requirements of a digital watermarking system, whose simultaneous presence distinguishes it from the other data hiding techniques, are capacity, robustness/security, and imperceptibility. All of these requirements are discussed in the following.

As is often the case, the system designer must handle an application-dependent trade-off between the requirements. To picture their conflicting nature, they are often drawn on a so-called trade-off triangle, as in Fig. 22.1, to stress the fact that trying to favor one of these requirements always damages to some extent one or both of the other two. Note that robustness and security have been drawn at the same vertex; this can be acceptable in general, given that the boundary between these two requirements is very subtle (and even not considered by some authors). However, to be specific, security and robustness are sometimes themselves conflicting requirements, when treated as separate objectives (and we argue below that this should indeed be the case), so the trade-off triangle could actually be a trade-off tetrahedron.

Figure 22.1 Digital watermarking trade-off triangle depicting how the requirements at the vertices conflict.

The capacity (Barni and Bartolini, 2004, Chapter 3) is the quantity of information (usually measured in bits) that the watermark is able to convey. It is generally dependent on the size of the host object; thus, sometimes the capacity is expressed as a relative quantity (eg, in the case of images the unit is a bit of information per pixel, or bpp). This way of expressing capacity is particularly common in steganography, where capacity is often the primary concern. Conversely, the capacity of a watermark depends more strongly on its application; for example, an image watermark capacity could range from a single bit in the case of detectable watermarking (see later) to some thousands of bits for a single host image, while in steganography it is usually some fraction of a bit of information per pixel or more (which, given the number of pixels in an ordinary image, is several orders of magnitude higher). Hence, the watermark capacity must be tailored to the aim of the intended application.

Once the watermark has been embedded into a host object, sooner or later it must be retrieved. How this can be achieved even after the host watermarked object has been possibly processed is referred to under the terms watermark security and watermark robustness. There is some confusion in the early literature about the definition of these two requirements. The philosophy adopted here — that is, to clearly distinguish security and robustness as separate and conflicting requirements — is probably the most acceptable in our opinion.

Robustness (Cox et al., 2008, Chapter 9) refers to the ability of the watermark to survive any nonmalicious data processing the host object happens to undergo. Such processing is not intended to remove the watermark but is applied to the host object for some other purpose. The set of acceptable data processing must be decided before the watermarking system is designed and obviously depends on the nature of the host object. For classic LDR images, lossy compression, noise addition, digital-to-analog and analog-to-digital conversion, geometric transformations (such as zooming, rotation, and cropping), linear and nonlinear filtering for image enhancement, and histogram modifications are all examples of possibly nonmalicious processing that can occur.

Security (Cox et al., 2008, Chapter 10) is related to the inability by a hostile entity (referred to as the attacker) to remove the watermark or disable in some way its recovery. The most basic instance of security, which is usually present even in watermarking systems which do not explicitly address security, is that only authorized entities are allowed to embed watermarks in or recover watermarks from host objects. This, in turn, implies that both the embedding process and the recovery process must in some way depend on knowledge of some secret key. In a well-designed system, security must obey Kerckhoffs’s principle (Kerchoffs, 1883; Menezes et al., 1996) much like in cryptography — that is, the security of the system must be assured considering the attacker is aware of all the system details; the only thing he/she ignores is the secret key.

Some authors refer to security in a way related to cryptanalysis, which means it should be impossible for an attacker to guess the secret key. However, we prefer to use our previous definition of security because the former is only a (particularly harmful) instance of the latter, because an attacker could also be interested in simply disabling or removing any watermark present without knowing the secret key.

Finally, it is worth noting that repeated or strong use of a certain processing tool initially considered as nonmalicious (maybe the one against which the system is the most vulnerable) could be effectively considered a security attack, making the boundary between security and robustness very fuzzy. Therefore, it can be tempting to put both these requirements at the same vertex of the trade-off triangle, but we believe that neglecting the more sophisticated intelligence of a determined attacker can hurt the deployability of a watermarking system in a security-critical application. In our view, robustness and security are conflicting requirements in the sense that robustness tends to exploit any possible perceptual niche of the host object to survive processing, while security tries to make the watermark characteristics and location as unpredictable as possible.

Last, the watermark has to be imperceptible (Cox et al., 2008, Chapter 8); that is, it should not degrade the perceptual content of the host object. Starting from this definition, it is obvious that perceptibility is a subjective matter, so it is impossible to give a universal measure of perceptibility. The best way to handle perceptibility is to study how humans perceive the environment in which they live by means of models that approximate the mechanisms underlying perception. These models, primarily the human auditory system and the human visual system (HVS), have been extensively studied in the field of digital compression. Multimedia data compression tries to remove perceptually irrelevant parts of the original data to decrease the amount of information that must be conveyed to reproduce an acceptable “quality” of the data, so it is very important to include perceptual cues. In a sense, digital watermarking and data compression can be thought of as dual problems: the watermark, because it has to be imperceptible, must reside in the imperceptible part of the data, which is precisely the part that compression tries to eliminate from the original data. Unfortunately (or luckily, depending on the point of view) perfect perceptual compression is not achievable, and therefore watermarks can be accommodated in the imperceptible “niches” left by compression. Hence, to achieve imperceptibility, digital watermarking can exploit a large body of knowledge borrowed from multimedia compression.

Many systems do not explicitly use a perceptual model (which is usually difficult to implement), but instead rely on more classic approaches based on standard metrics (eg, PSNR) to minimize perceptual impairments; however, care has to be taken when one is comparing human perception with these kinds of absolute distortion measures. It is also very common to guarantee imperceptibility by the mere selection of an appropriate watermark domain.

An exception to the general rule of imperceptibility is visible watermarking, where the watermark is rendered perceptible to assess its presence (maybe for informative purposes, much like a logo) while retaining all the other watermark characteristics. In any case, even if the watermark is visible, there is a certain amount of distortion that the embedding process cannot exceed on the host object, so even in this case it is possible to define the imperceptibility requirement with some slight modification.

22.1.2 Watermarking System Examples

To help convey the basics of watermarking systems as described above, we first provide a brief explanation of a couple of classic algorithms before introducing the watermarking system structure in general and abstract terms in the following sections.

Cox et al. (1997) introduced a widely used watermarking paradigm known as spread-spectrum watermarking. The simplified flowcharts of the watermark embedding and recovery processes are depicted in Fig. 22.2. One constructs the watermark by seeding a pseudorandom number generator with the secret key and then extracting a Gaussian random sequence w. It is argued that to render the watermark robust against common processing such as compression, the best way is to insert the watermark in the most perceptually significant portion of the host object. For the still image case, a full-frame two-dimensional discrete cosine transform (DCT) is computed and represents the watermark domain. Next, a perceptual mask is computed to identify the most perceptually significant coefficients, referred to as the vector v. The watermark is then introduced to obtain the watermarked coefficients v′ by application of one of the following formulas:

$v_{i}^{'} = v_{i} + \alpha_{i} w_{i},$

(22.1)

$v_{i}^{'} = v_{i} (1 + \alpha_{i} w_{i}),$

(22.2)

for i = 1,…,n, where n is the watermark length. The scaling factors α_i should be selected so as to ensure imperceptibility and thus can be dependent on the value v_i. The scheme’s target is only to assess if a particular watermark is present or not (see the discussion of detectable watermarking later). The above formulas are sometimes referred to as additive spread-spectrum and multiplicative spread-spectrum watermarking, respectively.

Figure 22.2 Example of a spread-spectrum watermarking process, the one presented in Cox et al. (1997). IDCT, inverse DCT.

The watermark recovery is nonblind (see later); that is, the original, unwatermarked object is required by the entity that performs the recovery. To determine whether the watermark is present, the received watermarked object (which has possibly been further processed) undergoes the same two-dimensional DCT. The difference between the coefficients obtained and the original ones gives the recovered coefficients w*, and a normalized correlation (or similarity measure) is then computed as

$\mathrm{Sim} (w, w^{*}) = \frac{w \cdot w^{*}}{\| w \| \cdot \| w^{*} \|} .$

(22.3)

Use of Gaussian distributed watermark coefficients increases security because it counters so-called collusion attacks, in which the attacker averages several watermarked images in the hope of obtaining an unwatermarked object: in this case all the watermarks are still present simultaneously. However, it should be noted that an attacker can apply the same perceptual mask to identify which coefficients carry the watermark and then perform subtler attacks in the DCT domain. Choosing just a random portion of all coefficients would increase security in this sense, but would surely hurt robustness.
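The additive and multiplicative rules of Eqs. (22.1) and (22.2), together with the similarity measure of Eq. (22.3), can be sketched in a few lines. This is a minimal illustration under stated assumptions and not the implementation of Cox et al. (1997): the function names are hypothetical, the vector `v` is a random stand-in for the perceptually significant DCT coefficients (no DCT or perceptual mask is computed), the scaling factor is a single constant, and the recovered coefficients are taken as received minus original so that a matching watermark gives a positive similarity.

```python
import numpy as np


def make_watermark(key: int, n: int) -> np.ndarray:
    """Draw a Gaussian watermark w of length n from a PRNG seeded with the secret key."""
    return np.random.default_rng(key).standard_normal(n)


def embed_additive(v, w, alpha):
    """Eq. (22.1): v'_i = v_i + alpha_i * w_i."""
    return v + alpha * w


def embed_multiplicative(v, w, alpha):
    """Eq. (22.2): v'_i = v_i * (1 + alpha_i * w_i)."""
    return v * (1.0 + alpha * w)


def recover(v_received, v_original):
    """Nonblind recovery for the additive rule (assumed sign: received minus original)."""
    return v_received - v_original


def similarity(w, w_star):
    """Eq. (22.3): normalized correlation between w and the recovered w*."""
    return float(np.dot(w, w_star) / (np.linalg.norm(w) * np.linalg.norm(w_star)))


# Illustrative run with made-up numbers.
rng = np.random.default_rng(0)
v = rng.normal(0.0, 50.0, 1000)            # stand-in for the selected DCT coefficients
w = make_watermark(key=1234, n=1000)
v_received = embed_additive(v, w, alpha=1.0) + rng.normal(0.0, 0.5, 1000)  # mild channel noise
w_star = recover(v_received, v)

print("similarity, correct key:", similarity(w, w_star))
print("similarity, wrong key:  ", similarity(make_watermark(key=9999, n=1000), w_star))
```

With the correct key the similarity should be close to 1, whereas a watermark generated from a different key should give a value near 0; in practice the decision is taken by comparing the similarity with a threshold.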

The second watermarking system example we discuss is quantization index modulation (QIM), the precursor of many techniques that have appeared and are still appearing in the literature. It was proposed by Chen and Wornell (2001). QIM consists of quantizing the host object features using a particular codebook Q associated with a given watermark. To be more specific, suppose we have a set U of 2^|b| different quantizers, each associated with a specific binary string b of length |b|. If we wish to embed the watermark code $\bar{b}$ in the host object, the latter’s features must be quantized with use of the corresponding quantizer from the set U, giving the quantized (watermarked) features. When the watermark is to be retrieved, the received object features are requantized with use of the entire set U (since the retriever does not know in advance which quantizer was used in the first place), and one then identifies the quantizer in U whose quantized value lies at the minimum distance from the received features, thus retrieving $\bar{b}$.

Scalar QIM (SQIM), also referred to as dither modulation watermarking, is a very common subset of QIM algorithms. In these systems the reconstruction values associated with all the codebooks forming U are arranged in a regular, rectangular lattice; in this way, one can perform scalar feature quantization, one feature at a time, thus avoiding vector quantization processes. A rearrangement of QIM to make it adhere to the SQIM paradigm is illustrated in Fig. 22.3, where two features are represented, one on each axis. As can be observed, it is now possible to perform feature quantization separately along the two axes, in general with two different quantization steps, as every bit of b is embedded in a single feature. In this case, the watermark code b is only two bits long, so U consists of four distinct quantizers, each represented by a different symbol. The key on the right describes the relation between each watermark code and its corresponding quantizer. The host feature point f_A is represented by the filled diamond. Assuming that we wish to embed the code $\bar{b} = 01$, only the quantizer identified by the squares is used for the embedding, and so the filled square is selected as the quantized feature $\bar{h}$ (with the watermark embedded because it is a square). Note that the quantized feature point $\bar{h}$ is by no means the closest to f_A over the entire codebook set U, but is so only for the selected quantizer Q. When, because of the watermark channel, this feature point is moved (hopefully within a close neighborhood), its requantization by the decoder using all the symbols will output which symbol was used during the embedding process, hence allowing us to identify $\bar{b}$. Note that as long as the codebook is known during the recovery phase, there is no need to use the original unwatermarked object (blind recovery, see later).

Figure 22.3 QIM example, specifically a two-bit scalar QIM is depicted.

The trade-off triangle is easily observable in QIM systems. Robustness, heuristically, depends on the mutual distance between the reconstruction values belonging to different quantizers; imperceptibility depends on the quantization step Δ because it is related to how much the host feature point is moved; and capacity refers to the number of different codebooks available. It is obvious they are conflicting requirements, because increasing robustness means moving away reconstruction values of different codebooks, but this in turn affects imperceptibility; and increasing capacity increases the density of symbols, worsening robustness, whereas lowering the density increases the quantization step, harming imperceptibility. As a final note, one usually achieves security in QIM systems by shifting the codebook U by a random quantity, extracted from a uniform distribution to increase the uncertainty on its value, and depending on the secret key, so that an attacker cannot tell which symbol has been used in the quantization process. Because addition of this shift, which can be shared thanks to knowledge of the secret key, has no effect on the other requirements and is easily implemented, this solution is very commonly adopted.
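A scalar QIM with a key-dependent codebook shift can be sketched as follows. This is only an illustration under our own assumptions, not a reference implementation: one bit is embedded per feature, the two quantizers of each feature share the step `delta` and are offset by `delta / 2` from each other, and the shift is drawn uniformly in [0, `delta`) from a generator seeded with the key. The names `dither`, `sqim_embed`, and `sqim_decode` are hypothetical.

```python
import numpy as np


def dither(key: int, n: int, delta: float) -> np.ndarray:
    """Key-dependent shift of the codebook, uniform in [0, delta), one value per feature."""
    return np.random.default_rng(key).uniform(0.0, delta, n)


def sqim_embed(features, bits, delta, key):
    """Quantize each feature with the quantizer selected by its bit.

    Bit 0 uses the lattice delta*Z + d, bit 1 the lattice delta*Z + d + delta/2,
    where d is the secret shift.
    """
    features = np.asarray(features, dtype=float)
    offset = dither(key, features.size, delta) + np.asarray(bits) * (delta / 2.0)
    return delta * np.round((features - offset) / delta) + offset


def sqim_decode(received, delta, key):
    """Requantize with both quantizers and return the bit of the nearest one (blind)."""
    received = np.asarray(received, dtype=float)
    base = dither(key, received.size, delta)
    distances = []
    for bit in (0, 1):
        offset = base + bit * (delta / 2.0)
        requantized = delta * np.round((received - offset) / delta) + offset
        distances.append(np.abs(received - requantized))
    return (distances[1] < distances[0]).astype(int)


# Illustrative run with made-up numbers.
rng = np.random.default_rng(0)
host = rng.normal(0.0, 20.0, 16)
message = rng.integers(0, 2, 16)
marked = sqim_embed(host, message, delta=4.0, key=11)
attacked = marked + rng.uniform(-0.8, 0.8, 16)   # perturbation smaller than delta / 4

print("max embedding distortion:", np.max(np.abs(marked - host)))
print("bits recovered, right key:", np.array_equal(sqim_decode(attacked, 4.0, 11), message))
```

In this sketch the trade-off is visible in the single parameter `delta`: the embedding moves each feature by at most `delta / 2`, while decoding is guaranteed only for perturbations smaller than `delta / 4`. A decoder that uses a different key requantizes against a differently shifted codebook, so its output is essentially unrelated to the embedded bits.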

22.1.3 Structure of a Watermarking System

In the literature the watermarking system structure has been proposed in many ways; here we adopt the view of the watermarking game as a digital communication one. The basic flowchart of a digital watermarking system is illustrated in Fig. 22.4. In this high-level flowchart, only mandatory variables (ie, variables always present in every watermarking system) are depicted (an exception is the dashed optional input, which is represented because it is almost always used).

Figure 22.4 Elementary structure of a digital watermarking system. A watermark message m is embedded into a host object A, optionally with use of a secret key K. The watermarked object A_w goes through the channel in which it possibly undergoes some attacks. The recovery is then applied on the resulting A_w′. The output is the estimated watermark message m′.

The message (which can be represented as a bit string m without loss of generality) is the main input to the system; the objective of the watermarking game is to guarantee that it is correctly received at the end of the transmission chain. The watermark embedder introduces the message into the host object A following the suitable mix of watermarking requirements a priori selected by the system designer, producing a watermarked object A_w. As previously mentioned, this process is very often driven by a secret key K to ensure a certain amount of security: for example, the seed of a pseudorandom generator of the watermark as in spread-spectrum watermarking. After the embedding stage, the watermarked object A_w possibly undergoes some processing (both malicious and nonmalicious attacks, or no attacks at all) which is overall modeled by the so-called watermark channel. Finally, the watermark recovery is performed on the “received” object A_w′, aided by the secret key K which was shared with the embedder (if used), giving as output a bit string m′ which represents an estimate of the original message.

22.1.3.1 Watermark embedding

The embedder block objective is to produce the watermarked object A_w; this can be summarized by the following formula:

$A_{w} = E (m, A [, K]),$

(22.4)

where $E (\cdot)$ is referred to as the embedding function; Eq. (22.1) constitutes an example of an embedding function. In Eq. (22.4) the notation [, K] indicates that the secret key K is an optional parameter, whose presence depends on the watermarking paradigm considered, coherently with Fig. 22.4. The same convention applies to the other formulas of this chapter. Depending on how the task of Eq. (22.4) is implemented, we can distinguish between two different types of watermark embedding (and, by extension, watermarking systems): waveform-based watermarking and direct embedding watermarking.

Fig. 22.5 depicts the typical steps of a waveform-based embedding stage. Additive and multiplicative spread-spectrum watermarking can be classified as based on a waveform-based embedding process. First the message code m is coded into a bit string b (the watermark code or simply the watermark) with use of a code $C$; this operation is not always present, and in this latter case m = b.

Figure 22.5 Waveform-based watermark embedding steps.

After the preliminary message coding step, the embedding function is applied. Usually, the watermark domain is different from the host object domain; that is, the watermark embedding is accompanied by a feature extraction process $F (\cdot)$ which transforms the host object A into a set of original host features f_A (the feature space is the watermark domain). For example, the full-frame two-dimensional DCT coefficients represent the watermark domain in our example spread-spectrum system (Cox et al., 1997). Analogously to $F (\cdot)$, a watermark coding $W$ must take place beforehand; it transforms the watermark b into a suitable watermark signal w (alternatively called a watermark waveform; see Eqs. 22.1 and 22.2 for examples), expressed in the feature domain, which is well suited to be embedded in the description of the host object A carried by its features. The embedding function $E (\cdot)$ can then be thought of in this case as a mixing ⊕ of some kind of watermark signal w with the host features f_A to obtain the watermarked features $f_{A_{w}}$. Note that not all the host features f_A need to be mixed with the watermark signal w (ie, the watermark signal dimensionality need not be the same as that of the host features). The watermarked object A_w is finally obtained by a reverse mapping function $F^{- 1} (\cdot)$ from the watermarked features $f_{A_{w}}$ to the host object domain (the inverse DCT in our spread-spectrum example). When the watermark domain coincides with the host object domain (eg, image watermarking in the pixel domain, which works directly on pixel values), the feature extraction and its inverse reduce to the identity function. This whole procedure can be summarized as follows (optional arguments are again enclosed in square brackets):

$b = C (m),$

(22.5)

$w = W (b [, A, K]),$

(22.6)

$f_{A} = F (A [, K]),$

(22.7a)

$f_{A_{w}} = f_{A} \oplus_{[K]} w,$

(22.7b)

$\begin{array}{l} A_{w} & = F^{- 1} (f_{A_{w}} [, K]) . \end{array}$ $\begin{array}{l} A_{w} & = F^{- 1} (f_{A_{w}} [, K]) . \end{array}$

(22.7c)

Eqs. (22.5) and (22.6) pertain to the message coding and watermark coding blocks depicted in Fig. 22.5, respectively, while the operations referred to as Eq. (22.7) are performed by the mixing block.
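The waveform-based pipeline of Eqs. (22.5)–(22.7) can be sketched in a few lines. This is only an illustrative toy under stated assumptions, not the system of Cox et al. (1997): a short one-dimensional signal stands in for the host image, $F(\cdot)$ is an orthonormal DCT written out by hand, $C(\cdot)$ is a bit-repetition code, $W(\cdot)$ produces a key-seeded ±1 sequence scaled by a hypothetical strength `alpha`, and the mixing ⊕ is plain addition on a subset of the coefficients.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II, playing the role of the feature extraction F."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        out.append(scale * s)
    return out

def idct(c):
    """Inverse of dct above, playing the role of F^-1."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def message_coding(m, repeat=4):
    """C: repeat each message bit (Eq. 22.5). The repetition code is an assumption."""
    return [bit for bit in m for _ in range(repeat)]

def watermark_coding(b, key, alpha=2.0):
    """W: key-seeded +/-1 carrier modulated by the bits (Eq. 22.6)."""
    rng = random.Random(key)
    return [alpha * rng.choice((-1.0, 1.0)) * (1.0 if bit else -1.0) for bit in b]

def embed(host, m, key):
    b = message_coding(m)
    w = watermark_coding(b, key)
    f_a = dct(host)                      # Eq. 22.7a
    f_aw = list(f_a)
    for j, w_j in enumerate(w):          # Eq. 22.7b: only coefficients 1..len(w) are mixed
        f_aw[j + 1] = f_a[j + 1] + w_j
    return idct(f_aw)                    # Eq. 22.7c

host = [100.0 + 20.0 * math.sin(0.3 * i) for i in range(32)]
watermarked = embed(host, m=[1, 0, 1], key=1234)
```

Because the transform is orthonormal, the embedding distortion measured in the signal domain equals the energy of w in the feature domain, which is one reason such transforms are convenient when the distortion has to be controlled.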

The secret key K could drive both the watermark coding process and the watermark mixing process. For example, the watermark signal w could be randomly selected from a set of possible waveforms according to K; the secret key could also randomly select which features the watermark signal is mixed with and which it is not, or randomize the feature extraction process itself by choosing it from a predetermined set.

On the other hand, in direct embedding techniques there is no watermark signal defined before the manipulation of the host features. These techniques are described in Fig. 22.6, where the watermark coding step of Fig. 22.5 is missing. Hence, the bit string b is embedded directly into the host object A by modifying the host features f_A in a controlled way. The QIM paradigm falls into this category. The set of equations describing the direct embedding paradigm is as follows:

$b = C(m),$

(22.8)

$f_{A} = F(A [, K]),$

(22.9a)

$f_{A_{w}} = E^{'}(b, f_{A} [, K]),$

(22.9b)

$A_{w} = F^{-1}(f_{A_{w}} [, K]).$

(22.9c)

Figure 22.6 Direct watermark embedding steps.

Message coding of Eq. (22.8) is the same and serves the same purpose as Eq. (22.5). In this case, therefore, what is really different with respect to the waveform-based approach is the absence of any operation similar to Eq. (22.7b). Instead of mixing with a predefined signal w, there is a function $E^{'}(b, f_{A})$, described by Eq. (22.9b), which moves the host features to the watermarked features $f_{A_{w}}$ in a way depending both on the initial position f_A and on b. Referring to Fig. 22.3, the host feature point f_A must be moved to the new watermarked feature point $\bar{h}$ without our explicitly defining a watermark signal w. This process usually involves minimizing a cost function tied to the introduced perceptual distortion, possibly using an iterative algorithm (for SQIM, searching for the “nearest” quantized value).
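As a minimal sketch of direct embedding, the following toy scalar QIM moves each host feature to the nearest point of one of two interleaved uniform quantizers, selected by the bit to embed. The step `delta`, the half-step offset between the two quantizers, and the identity feature extraction are assumptions made for illustration; they are not the parameters of any system discussed in this chapter.

```python
def qim_embed(feature, bit, delta=8.0):
    """E'(b, f_A): move the feature to the nearest point of the lattice for `bit`."""
    offset = 0.0 if bit == 0 else delta / 2.0
    return delta * round((feature - offset) / delta) + offset

def qim_decode(feature, delta=8.0):
    """Blind decoding: return the bit whose lattice has the nearest point."""
    d0 = abs(feature - qim_embed(feature, 0, delta))
    d1 = abs(feature - qim_embed(feature, 1, delta))
    return 0 if d0 <= d1 else 1

host_features = [13.2, 40.7, 77.1, 101.9]
bits = [1, 0, 0, 1]
marked = [qim_embed(f, b) for f, b in zip(host_features, bits)]

# Additive noise smaller than delta / 4 leaves every decision unchanged.
noisy = [f + 1.5 for f in marked]
decoded = [qim_decode(f) for f in noisy]
```

No watermark signal w is ever constructed: the displacement applied to each feature depends on both its initial value and the bit. Note that a fixed-step quantizer of this kind is sensitive to amplitude scaling; in this example, multiplying the marked features by 1.2 already changes the first decoded bit.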

22.1.3.2 Watermark channel

The watermark channel represents all the transformations applied to the watermarked object A_w before the watermark recovery is performed, so the latter is actually done on an attacked object A_w′. As previously mentioned, the attacks applied on a watermarked object could be either malicious (targeting watermarking security) or nonmalicious (targeting robustness instead), although this separation is not always simple. Nevertheless, if we keep this overlapping of meanings in mind, it is possible to define nonmalicious attacks as robustness attacks and malicious attacks as security attacks. In turn, malicious attacks can be further classified as blind security attacks, in which the attacker does not exploit any knowledge of the watermarking system but instead attacks the watermarked object with some operation which is not considered usual, hoping from the attacker’s point of view that the system designer did not take it into account, and nonblind security attacks, in which the attacker knows all of the watermarking system, except the secret key used, and exploits this knowledge to attack the system at its weak spots — for example, by mounting special attacks allowed by the particular system implementation (such as the availability of watermark recovery tools and the ability to apply them to arbitrary objects) and/or exploiting knowledge of watermark localization or spectral properties to alter synchronization between the embedder and the recovery block or to filter out the watermark. These latter attacks are the most dangerous and at the same time they cannot be ignored by the system designer if the already mentioned Kerckhoffs’s principle is to be respected; however, in some applications, it is reasonable to safely ignore such attacks simply because the intentional watermark removal or disabling is not in the attacker’s own interest.

Robustness attacks are defined as all transformations which are applied on the host object without the aim of explicitly disabling or removing the watermark; consequently, they comprise all processing which belongs to the “normal” course of life of the host object. Which attacks one should consider as nonmalicious strongly depends on the nature of the host object A; the application intended for the digital watermarking system at hand also plays an important role, because some robustness attacks are more important than others in certain contexts. There are so many types of robustness attacks that usually the system designer takes into account only a certain number of them (if any) and then hopes for the best for all the others.

Common signal processing, usually aimed at digital objects’ perceptual content enhancement, is another form of robustness attack. It could be some kind of simple manipulation of features (eg, the so-called constant gain attack where the watermarked features are multiplied by a certain factor) or some more complex processing (eg, for digital images histogram modification is a classic type of image processing similar to a constant gain attack); it could also be represented by filtering, either linear (eg, low-pass filters) or nonlinear (eg, noise-suppressing filters).

Geometric manipulations are another very important class of robustness attacks, where geometry refers to any coordinate of the host object (spatial coordinates for images, temporal coordinates for audio, and both spatial and temporal coordinates for video): for this reason they are also called synchronization attacks. These attacks can be very tricky as they tend to break the consistency between the coordinates of the embedder and those used during the recovery process. To undo synchronization attacks, the system could either try to be as invariant as possible to modification of coordinates (losing robustness with respect to other types of attacks) or leave the task of recovering the original coordinate configuration to the recovery block (usually by exhaustive search).

Finally, object editing processes, such as cropping for images, can be considered robustness attacks; they usually intrinsically contain some amount of geometric manipulations when they are applied.

Among blind security attacks, the exhaustive use of some robustness attacks is the first thing that comes to mind, as the attacker may want to break the system by going beyond the level of distortion the watermark can tolerate.

Another way of attacking the system without delving into its details is to treat the watermark as noise and hence to use some noise-suppressing process to completely remove it in the simplest cases (in particular when the watermark is well modeled by an additive noise) or at times at least roughly estimate the watermark itself to remove it or to illicitly embed it into other objects (another example of how security and robustness concepts overlap). Other, more sophisticated threats should also be taken into account (eg, the already mentioned collusion attack where many watermarked copies of the same host object or many objects with the same watermark embedded are compared to estimate the watermark).

With regard to nonblind security attacks, the first observation that has to be made is that the secret key K is the most sensitive parameter of the system, because its unwanted leakage to an attacker aware of the system design could nullify the intended task of the watermark. This is a very tough problem in applications where the recovery process is to be performed by any user willing to do so (which means the user has to use the secret key). This prompted the rise of asymmetric key schemes, which use different keys during the embedding and recovery stages, although they introduce other problems (most notably a tendency toward reduced robustness).

Furthermore, when the attacker can repetitively perform the recovery process on any object, the so-called sensitivity attack can be adopted. In this attack, the attacker modifies little by little the watermarked object and then performs the watermark recovery; this will give a rather precise estimation of the detection/decoding boundaries, allowing the attacker to choose the most convenient unwatermarked object (eg, the one that minimizes distortion), and maybe to learn some information about the secret key used. Using complex detection boundaries or making the attack construction unfeasible thanks to computational complexity issues are the most common countermeasures to the sensitivity attack.
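The core step of a sensitivity attack, locating a point on the detection boundary using only the detector's yes/no answers, can be sketched as follows. The linear correlation detector, its threshold, the embedding strength, and the choice of an all-zero signal as the known undetected point are all assumptions made for this toy example; a complete attack would repeat such searches along many directions to estimate the shape of the boundary.

```python
import random

def make_detector(secret_w, threshold):
    """Oracle the attacker may query; its internals are hidden from the attack."""
    def detect(x):
        corr = sum(a * b for a, b in zip(x, secret_w)) / len(x)
        return corr > threshold
    return detect

def boundary_search(detect, marked, unmarked, steps=40):
    """Bisect along the segment marked -> unmarked until the decision flips."""
    lo, hi = 0.0, 1.0                 # detected at lo, not detected at hi
    for _ in range(steps):
        mid = (lo + hi) / 2.0
        point = [(1 - mid) * a + mid * b for a, b in zip(marked, unmarked)]
        if detect(point):
            lo = mid
        else:
            hi = mid
    return hi

rng = random.Random(7)
n = 256
secret_w = [rng.choice((-1.0, 1.0)) for _ in range(n)]
host = [rng.gauss(0.0, 10.0) for _ in range(n)]
marked = [h + 5.0 * w for h, w in zip(host, secret_w)]
flat = [0.0] * n                      # a signal the detector rejects
detect = make_detector(secret_w, threshold=1.0)

t = boundary_search(detect, marked, flat)
attacked = [(1 - t) * a for a in marked]   # just past the boundary
```

Each bisection step costs one detector query, so forty queries pin the boundary down to about one part in 10¹² along this direction. This is why limiting or obscuring detector access matters when recovery tools are publicly available.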

Obviously, a nonblind attacker could also use one of the attacks previously classified as blind if it is identified as a weak point of the system after an appropriate analysis of the system operations. As a last note, to better counter security attacks on a critical application, it is arguably mandatory to couple watermarking and cryptography technologies on a protocol level. Several examples are found in security-oriented applications, where cryptography is usually used to secure content distribution and to limit as much as possible unauthorized access to the object at hand and watermarking is used to tie security information with the content perception. A good survey of security issues in digital watermarking can be found in Cayre et al. (2005a,b).

22.1.3.3 Watermark recovery

The recovery stage is responsible for extracting, from the possibly attacked object A_w′, an estimate m′ of the original message m (both being represented as bit strings). Hence, the most general aspect of the recovery process is described by the following equation:

$m^{'} = D(A_{w}^{'} [, K]).$

(22.10)

A recovery function $D(\cdot)$ is applied on A_w′ to obtain the recovered message m′. The form of the recovery function $D(\cdot)$ very much depends on the nature of the watermarking algorithm; that is, whether the recovery consists in assessing that a certain watermark is present or in choosing which watermark among those possible has been embedded. Therefore, Eq. (22.10) can be specialized into two different forms belonging, respectively, to detectable and decodable watermarking, which are separately depicted in Fig. 22.7.

Figure 22.7 Watermark recovery process forms: (A) watermark detector; (B) watermark decoder.

In detectable watermarking, the watermark recovery process (now called the detector block) can be schematized as in Fig. 22.7A. Here we are interested only in assessing the presence or the absence of the watermark, so the message m can be reduced to a binary variable (its bit-string representation then coincides with m itself, as the string has unit length); the fact that we are embedding a watermark inside a host object A means that m = “1.” Consequently, the coding $C(\cdot)$ of Eq. (22.5) is better described as a codeword selection (such as bit repetition, rather than a channel code), as the watermark b is embedded into A only to mean that A_w is watermarked, regardless of the meaning of b. The detector, then, looks for the watermark b in the attacked object A_w′ and takes a decision about its presence or absence, thus outputting an estimated message m′ which is 1 if the detector believes the watermark b has been embedded into A_w′ and 0 otherwise. Eq. (22.3) exemplifies this process: a threshold applied to the similarity measure between the supposedly embedded watermark and the received one implements the detection decision. Therefore, the watermark detection can be expressed as

$D(b, A_{w}^{'} [, A, K]) = m^{'} \in \{1, 0\}.$

(22.11)

Notice how in this case the detector must know b in advance so it can achieve its task. In some literature, it is stated that since the message m conveys only one bit of information (the watermark presence or its absence), detectable schemes are also defined as one-bit watermarking. This could be confusing: it would be more correct to say that the detector obtains one bit of information, but it has to be kept in mind that there is only one possible message m.

Decodable watermarking is depicted in Fig. 22.7B. In this case, the recovery block, conveniently called the decoder, does not know the watermark b in advance, so it has to read it from the attacked object A_w′ (for this reason, this scheme is also called readable watermarking) to form an estimated watermark b′. Here the original message m is meaningfully represented by a bit string, and the message coding $b = C(m)$, if present, could be, for example, a channel coding of the message m into the watermark b. The first operation of the decoder is to decode an estimated watermark b′ from the attacked object A_w′ (using a function which is referred to as $D^{'}(\cdot)$); then, given that the first stage of the embedding process is the message coding, the last stage of the decoding process is the message decoding of the recovered string, $m^{'} = C^{-1}(b^{'})$. As is easily imagined, decodable watermarking is also called multibit watermarking. Fig. 22.3 shows an example of multibit watermarking. Depending on the symbol to which the received feature point is quantized, a different pair of bits is decoded (if we embedded the “square” as depicted by $\bar{h}$, the received feature point will hopefully still be nearer to it than to the other symbols, ensuring correct decoding). The whole process is illustrated as follows:

$D^{'}(A_{w}^{'} [, A, K]) = b^{'},$

(22.12a)

$m^{'} = C^{-1}(b^{'}).$

(22.12b)
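The second decoding stage, the message decoding $C^{-1}(\cdot)$ applied to the estimated watermark b′, can be illustrated with the simplest possible code. The sketch below assumes a repetition code with majority-vote decoding and a handful of hand-picked bit errors standing in for the effect of the watermark channel; a real system would typically use a stronger channel code.

```python
def encode(m, repeat=5):
    """C: repetition code mapping the message bits m to the watermark b."""
    return [bit for bit in m for _ in range(repeat)]

def decode(b_est, repeat=5):
    """C^-1: majority vote over each group of `repeat` recovered bits."""
    groups = [b_est[i:i + repeat] for i in range(0, len(b_est), repeat)]
    return [1 if 2 * sum(g) > len(g) else 0 for g in groups]

m = [1, 0, 1, 1]
b = encode(m)

# Assumed channel errors in the watermark b' read by the first decoding stage.
b_est = list(b)
for pos in (0, 3, 7, 12, 19):
    b_est[pos] ^= 1

raw_ber = sum(x != y for x, y in zip(b, b_est)) / len(b)   # errors before C^-1
m_est = decode(b_est)                                       # errors after C^-1
```

Here a quarter of the watermark bits are wrong, yet the message is recovered exactly because no group of five bits contains more than two errors.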

Looking at Fig. 22.7, we see that the host object A plays the role of an optional input to the recovery process. In some applications the recovery process has access to the original host object A (perhaps because the embedding and the recovery are performed by the same entity), so in Fig. 22.7 we may add the original host object A as an additional input to the detector/decoder block. If this is the case, the watermarking scheme (or equivalently the recovery process) is said to be nonblind (and the optional input A in Eqs. (22.11) and (22.12) has to be considered); otherwise, if the recovery block does not have any knowledge of the host object A, it is called blind. The spread-spectrum watermarking technique described above and in Cox et al. (1997) is an example of nonblind watermarking, because the original unwatermarked image is needed during the recovery phase. On the other hand, the QIM paradigm as illustrated in Chen and Wornell (2001) is based on a blind watermark recovery stage.

Under realistic assumptions on the system, nonblind systems surely achieve better robustness than their blind counterparts, but one must keep in mind that such a framework is not always applicable in real-world applications; moreover, the performance gap is not as large as one would intuitively expect.

To summarize, the recovery function can assume one of the forms listed in Table 22.1.

Table 22.1

Watermark Recovery Function Forms

	Blind Recovery	Nonblind Recovery
Detectable watermarking	$D(b, A_{w}^{'} [, K]) = 1 / 0$	$D(b, A_{w}^{'}, A [, K]) = 1 / 0$
Decodable watermarking	$D(A_{w}^{'} [, K]) = m^{'}$	$D(A_{w}^{'}, A [, K]) = m^{'}$

22.1.4 Watermarking System Evaluation

Once a watermarking system has been implemented, its performance should be evaluated as objectively as possible (Cox et al., 2008, Chapter 7). Although additional parameters can play a role, such as complexity and memory usage, here we will focus on the satisfaction of the main requirements that we discussed earlier.

Imperceptibility of the watermarked image refers to its indistinguishability from the original, unwatermarked image, and it can be evaluated either subjectively or objectively. The former usually involves user tests performed in controlled environments, but they are seldom used in watermarking contexts. The latter relies on the computation of objective metrics which may or may not be driven by HVS models. In almost all the studies that are the subject of this chapter, imperceptibility was evaluated through either the PSNR, which, although providing a rough estimate of the embedding distortion, is not very well correlated with actual human perception, or more sophisticated metrics for HDR data such as HDR-VDP (Mantiuk et al., 2005) and its successor HDR-VDP-2 (Mantiuk et al., 2011) (see Chapter 17).

Evaluating security is a much more challenging issue and is totally ignored in many studies, even in security-critical applications such as copyright protection. In most cases, the secret key is used only as the seed of a pseudorandom number generator responsible for constructing the watermark sequence. In this sense, only recipients with the correct secret keys can read the watermark; however, all the other security aspects are not covered by this approach. For example, an intelligent attacker can be interested in disabling the watermark recovery by simply trying to delete the watermark (eg, by embedding a spurious watermark, which is always possible if one assumes that he/she knows the details of the watermarking system and the location of the watermark in the embedding domain is invariant). Although theoretical analysis can be done when the problem is simplified, the security of a watermarking system is mostly evaluated empirically.

Finally, evaluating robustness is related to how the watermark is recovered. In every system, a threshold (or more than one) needs to be chosen at some point to differentiate between watermarked and unwatermarked images, or to decode bits of the embedded message. The threshold should be selected by minimizing an average loss, where the loss is a function describing the damage caused if an error occurs in the decision process. In particular, for decodable watermarking, the bit error rate (BER) of the recovered watermark is usually considered. In the case of detectable watermarking, if the system believes that the watermark is present even when this is not the case, we are in the presence of a false alarm; conversely, if the system wrongly believes that the watermark is absent, a miss has occurred. The goal of the designer is to estimate the probability of a false alarm and the probability of a miss of the watermarking system given any host object and any other input (eg, any secret key). Watermarking systems are usually designed according to the Neyman-Pearson criterion (Neyman and Pearson, 1933), where the threshold is selected such that the false alarm probability is less than a given target figure and then the resulting miss probability is evaluated. Both probabilities are then depicted on a receiver operating characteristic (ROC), which one usually obtains by letting the false alarm probability vary, calculating the corresponding threshold (the order is inverted when experiments correspond to practical rather than theoretical evaluations), and finally observing the miss probability. Sometimes, the equal error rate, which is the point in the ROC where miss probability and false alarm probability are equal, is provided. An example of an ROC is given in the next section.
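The Neyman-Pearson procedure in its practical, empirical form can be sketched as follows. The detector scores are simulated here as Gaussian samples (zero mean for unwatermarked objects, mean 3 for watermarked ones); these distributions, the sample sizes, and the target false alarm rate are assumptions chosen only to make the example concrete.

```python
import random

def np_threshold(null_scores, target_pfa):
    """Lowest observed score whose empirical false alarm rate is <= target_pfa."""
    ordered = sorted(null_scores)
    n = len(ordered)
    allowed = int(target_pfa * n)          # false alarms we can afford
    return ordered[max(0, n - allowed - 1)]

def rates(null_scores, marked_scores, threshold):
    """Empirical (false alarm, miss) probabilities for a given threshold."""
    pfa = sum(s > threshold for s in null_scores) / len(null_scores)
    pmiss = sum(s <= threshold for s in marked_scores) / len(marked_scores)
    return pfa, pmiss

rng = random.Random(0)
null_scores = [rng.gauss(0.0, 1.0) for _ in range(2000)]     # unwatermarked objects
marked_scores = [rng.gauss(3.0, 1.0) for _ in range(2000)]   # watermarked objects

t = np_threshold(null_scores, target_pfa=0.01)
pfa, pmiss = rates(null_scores, marked_scores, t)

# Sweeping the threshold traces the ROC as a list of (pfa, pmiss) pairs.
roc = [rates(null_scores, marked_scores, x / 10.0) for x in range(-30, 61)]
```

With only a few thousand samples, false alarm probabilities much below 10⁻³ cannot be measured this way, which is why very low operating points are usually derived from a statistical model of the detector output rather than from counting errors.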

22.2 Digital Watermarking for HDR Images

As HDR images gain ground in day-to-day applications, the first efforts to address more sophisticated application requirements have appeared. Digital watermarking has carved itself a niche in security-oriented applications, but its deployment could represent a challenging problem, so it should not be surprising that at the time of this writing only a handful of studies have been published on the subject. In what follows, we will give a brief description of these studies, following an approximate temporal order, that includes their classification according to the criteria described earlier to give a better idea of how they relate to each other. Before that, we will discuss, in general terms, the requirements for HDR image watermarking.

22.2.1 Requirements for HDR Image Watermarking

In the case of HDR images, some peculiarities exist with respect to classic LDR images. These should be taken into account when one is designing a watermarking system. For this reason, it is generally impossible to directly port an established LDR image watermarking technique to the HDR domain because both imperceptibility and robustness would suffer greatly. It is obvious that a small modification in the LDR domain could become huge when ported to the HDR domain because the range of pixel values there is far more extended with respect to the usual eight-bit, [0, 255] range. Also, the pixel value itself may be redundant — that is, if one suitably modifies the exponent and mantissa of the floating-point representation, sometimes there is more than one way to express a given pixel value.

The difference in pixel value range between the two domains also has far-reaching implications for how humans perceive HDR images, so perceptual models should be corrected when we are dealing with them. That generally implies tuning the specific parameters of the usual perceptual masks, considering how much sharper and richer the visual representation of HDR images is with respect to LDR images. For example, the contrast masking effect of the HVS is surely higher in the HDR domain given its richness of details.

One way to tackle these issues is to first transform the HDR image into an LDR one, watermark it by an LDR image watermarking process, and then revert to the HDR domain by some inverse transformation, as many systems do. In doing so, one should exercise particular caution to guarantee that the modifications in the LDR domain will not be perceptible in the HDR domain. An example of such reasoning can be found in the following when we describe Fig. 22.8.

Figure 22.8 Conceptual representation of the framework in which the watermarking system of Guerrini et al. (2011) is expected to operate.

Moreover, it is worth noting that the high visual value of HDR images is always put at a premium. Therefore, when one is considering what processing HDR images can (or should) undergo, only processing that does not severely alter their perceptual value should be studied. Most systems try to be robust against a single type of manipulation: tone-mapping operators. They are common nowadays to permit the rendering of HDR images on LDR displays, so naturally they are considered common signal processing in HDR imaging. Therefore, watermarking systems should be robust against tone mapping and possibly permit watermark recovery in both the original HDR image domain and the LDR domain of the tone-mapped versions, whatever specific operator has been used.

More comments on the applicability of digital watermarking in the HDR domain can be found in the last section, where we make some remarks based on the current state of the art, which we review next.

22.2.2 Survey of the Current State of the Art

The work by Guerrini et al. (2008), further expanded in Guerrini et al. (2011), is to the best of our knowledge the first published work on data hiding for HDR images. It is a blind detectable watermarking method, so its capacity is a single bit. The capacity is sacrificed to favor the other requirements: robustness against tone mapping and noise addition, security, and imperceptibility.

The rationale behind this watermarking scheme is depicted in Fig. 22.8. Tone-mapping operators are well modeled through a logLUV process (L), so the HDR image is first transformed into the logLUV domain and then an LDR watermarking robust against nonlinear attacks and invariant to constant gain modifications is applied to the luminance component only. The HDR watermarked image is obtained by exponentiation L⁻¹. It is argued that each tone-mapping operator (TMⁱ) is only a mild nonlinear transformation away from the logLUV image, so in the end I_TMⁱ^W, the tone-mapped version of the HDR watermarked image, still retains the watermark.
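The benefit of embedding in a logarithmic domain can be checked numerically. The sketch below uses a plain natural logarithm of the luminance as a stand-in for the logLUV transform (the actual logLUV encoding also quantizes the logarithm, which is omitted here), and `delta` is a hypothetical log-domain perturbation rather than a parameter of Guerrini et al. (2011).

```python
import math

def embed_log_domain(luminance, delta):
    """Apply L (log), perturb additively by delta, map back with L^-1 (exp)."""
    return math.exp(math.log(luminance) + delta)

dark, bright = 0.01, 5000.0      # luminance values spanning an HDR range
delta = 0.02                     # hypothetical log-domain modification

dark_w = embed_log_domain(dark, delta)
bright_w = embed_log_domain(bright, delta)

# The same additive change yields the same relative change at any luminance.
rel_dark = (dark_w - dark) / dark
rel_bright = (bright_w - bright) / bright

# A constant gain in the HDR domain is only an additive offset in the log domain.
gain = 4.0
offset = math.log(gain * bright_w) - math.log(bright_w)
```

Both pixels change by about 2% of their own value, so the absolute modification scales with the luminance instead of being fixed. In addition, a gain applied to the HDR image shifts every log-domain value by the same amount, which is why an LDR scheme that is invariant to constant gain is a natural fit at this stage.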

The embedding method is quite complicated: the flowchart is shown in Fig. 22.9A. It is based on the QIM paradigm, which in this case encodes information into the shift of a nonuniform quantizer. The quantization is applied to the kurtosis of the approximation coefficients resulting from a wavelet decomposition, computed over randomly positioned, randomly shaped blocks. Imperceptibility is aided by use of an HVS-derived perceptual mask, computed with use of the detail subbands as well (which are left untouched by the watermarking process), taking into account brightness, neighborhood activity, and the presence of edges. Security is very high, because it relies on both the shift of the quantizer and the random position and shape of the blocks on which the kurtosis feature is computed; to disable the watermark, the attacker needs to use an attack that is at least as visually perceptible as the mask. Watermark recovery, illustrated in Fig. 22.9B, is relatively straightforward. The secret key drives the extraction of blocks, and the kurtosis feature computed in each of them is quantized by means of the same codebook used in the embedding phase. If the number of blocks correctly decoded is higher than a threshold T, the image is judged as watermarked.

Figure 22.9 Watermark system flowchart for the system of Guerrini et al. (2011): (A) watermark embedder; (B) watermark detector.

The experiments were performed on 15 HDR images and with seven tone-mapping operators. The imperceptibility was measured with HDR-VDP, the average of which is below 0.5%, signifying that the watermark is almost imperceptible. If the false alarm probability is fixed at 10⁻⁵, the miss probability can be as high as 10⁻² for small images but goes well below 10⁻⁶ for larger images, which is the commonest case for HDR imagery. One of the ROCs obtained can be seen in Fig. 22.10, for an image watermarked with use of N = 700 blocks (which is reasonable for a medium-sized HDR image). As is usual for ROCs, the axes are drawn on a logarithmic scale, and the curve for each robustness attack (tone-mapping operators in this case) is obtained by varying the detection threshold. It is worth noting that no unrealistic assumptions are made with regard to the distribution of errors; instead, a binomial distribution based on the actual block decoding error probability is assumed. Fig. 22.10 also suggests how widely the scale of the introduced distortion can differ between tone-mapping processes.

Figure 22.10 Example of an ROC. This is obtained with use of the method in Guerrini et al. (2011) setting the number of blocks N = 700. Each curve represents a different tone-mapping operator (see Guerrini et al., 2011 for further details).
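A binomial error model of the kind mentioned for these ROCs can be sketched as follows. The number of blocks N = 700 and the target false alarm probability of 10⁻⁵ come from the text; the per-block probabilities `P_CHANCE` (a block matching by chance in an unwatermarked image) and `P_CORRECT` (a block decoded correctly after an attack) are hypothetical values chosen for illustration, not figures reported by Guerrini et al. (2011).

```python
import math

def binom_pmf(n, k, p):
    """P(X = k) for X ~ Binomial(n, p), computed in the log domain for stability."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)

def prob_above(n, t, p):
    """P(X > t): more than t blocks decoded correctly."""
    return sum(binom_pmf(n, k, p) for k in range(t + 1, n + 1))

def prob_at_most(n, t, p):
    """P(X <= t): at most t blocks decoded correctly."""
    return sum(binom_pmf(n, k, p) for k in range(0, t + 1))

def threshold_for(n, p_chance, target_pfa):
    """Smallest T such that P(X > T) <= target_pfa on unwatermarked images."""
    for t in range(n + 1):
        if prob_above(n, t, p_chance) <= target_pfa:
            return t

N = 700
P_CHANCE = 0.5       # assumption: chance match probability per block
P_CORRECT = 0.7      # assumption: per-block success rate after tone mapping

T = threshold_for(N, P_CHANCE, 1e-5)
p_false_alarm = prob_above(N, T, P_CHANCE)
p_miss = prob_at_most(N, T, P_CORRECT)
```

Under these assumed values the miss probability falls far below 10⁻⁶, which illustrates why a larger number of blocks (that is, a larger image) improves the operating point so markedly: the two binomial distributions separate as N grows.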

The work in Guerrini et al. (2008) went mostly unnoticed until the appearance of the work in Guerrini et al. (2011). Meanwhile, other work appeared that was actually targeted toward steganography, so caution should be adopted when one is evaluating such work for watermarking purposes. Nevertheless, it is interesting to report here those efforts that, exploiting the characteristics of HDR images to hide data, inspired later work on HDR image watermarking.

The work in Cheng and Wang (2009) is the first that tackled the subject of steganography for HDR imagery. As is customary for steganography, the emphasis is put on capacity and imperceptibility at the expense of robustness. In this case, no manipulation is even expected between the message embedding and its subsequent recovery, so robustness has not been tested. The method is based on the so-called least significant bit (LSB) embedding, in which the LSB or LSBs carry the embedded message. LSB embedding can be considered a waveform-based, additive scheme dependent on the original image pixel values and is a popular choice in steganographic applications. Early watermarking techniques also proposed LSB embedding until robustness and security issues mostly barred further efforts in this direction. The embedded message recovery is blind, another common feature of steganography.
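Plain LSB embedding on integer samples can be sketched as follows. This is the generic textbook technique on 8-bit values with a fixed number `k` of LSBs per sample; it is not the adaptive RGBe scheme of Cheng and Wang (2009), and the sample values and message are made up for the example.

```python
def lsb_embed(samples, bits, k=1):
    """Overwrite the k least significant bits of each sample with message bits."""
    out = list(samples)
    mask = (1 << k) - 1
    for i in range(len(bits) // k):
        chunk = bits[i * k:(i + 1) * k]
        value = int("".join(str(bit) for bit in chunk), 2)
        out[i] = (out[i] & ~mask) | value
    return out

def lsb_extract(samples, n_bits, k=1):
    """Blind recovery: read the k LSBs back from the first n_bits // k samples."""
    bits = []
    for sample in samples[:n_bits // k]:
        value = sample & ((1 << k) - 1)
        bits.extend(int(c) for c in format(value, "0{}b".format(k)))
    return bits

pixels = [200, 13, 77, 142, 255, 0]
message = [1, 0, 1, 1, 0, 0, 1, 0]
stego = lsb_embed(pixels, message, k=2)
recovered = lsb_extract(stego, len(message), k=2)
```

Each sample changes by at most 2ᵏ − 1, which bounds the distortion, but any later processing that alters the low-order bits (requantization, filtering, noise) erases the message, consistent with the robustness limitations noted above.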

Imperceptibility is pursued through some heuristics: no explicit HVS model is used. First, 32-bit, RGBe-format pixels (Reinhard et al., 2005) are classified into flat and nonflat image classes, and different embedding methods are used for each. This classification is performed by comparison of the exponent terms in neighboring pixels. Then, the number of LSBs that are modified to carry the message is computed so as to be higher for dark and high-contrast image areas and smaller for bright and smooth areas. Also, RGB channels are weighted to reflect the higher sensitivity of the human eye to the red and green channels.

Security requirements are basic in this case. The (symmetric) secret key is used to encrypt and decrypt the plain text message, and a form of authentication through message digesting (eg, MD5) is also advised and implemented.

Given that this work is not concerned with robustness, the experiments focus on imperceptibility with PSNR as a metric and capacity. The tests were performed on seven standard HDR images and report values ranging from one to three bits per color channel per pixel (with a total capacity per image in the megabit range) and a PSNR above 30 dB.

The work in Li et al. (2011) is inspired by the work in Cheng and Wang (2009) and therefore embraces most of its premises. For instance, the suggested application is still steganography and, as such, in this work too, robustness is not taken into account. Security concerns are neglected as well, but one can assume that the same kind of basic requirements can be used for this scheme. Of course, if the same type of message digesting security as in Cheng and Wang (2009) is to be included, the net capacity is bound to decrease through the inclusion of security information in the embedded bits.

Again, no HVS model is considered. Instead, a simple assumption is made: to minimize the embedded information perceptibility, the total variation of each pixel’s value should be minimized. The HDR format considered is logLUV TIFF (Reinhard et al., 2005), where the luminance value is floating point, and each image is first normalized to a given set of luminance exponent values before the embedding to cope with variable HDR ranges across different images.

The embedding strategy is once again based on the LSB paradigm. The information is embedded in the mantissa for luminance values and the exponent is then selected to minimize the difference between the new luminance value and the original one. The chrominance channels have a direct value representation; hence, classic LSB is used there. The best trade-off between capacity and imperceptibility is reported as 6-bit embedding for the luminance channel and 10-bit embedding for each chrominance channel.

The experiments once again consider only capacity and imperceptibility, using a testbed of 10 HDR images. Moreover, the PSNR is still chiefly used to measure imperceptibility. Li et al. (2011) reported an increase in capacity (more than doubled) and PSNR (up to 2–3 dB) with respect to the findings of Cheng and Wang (2009). However, the set of test images (10 in this case) is different, so these conclusions should be taken with caution. For a single image, HDR-VDP₇₅ and HDR-VDP₉₅ were also computed and both were under 1%.

The work in Yu et al. (2011) is again on HDR image steganography, but it considers a different set of requirements. Robustness is neglected in this work as well, and basic security is achieved by simple pseudorandom scrambling of the order of the pixels. The same assumptions as above about increasing security at the expense of net capacity still apply. Furthermore, the capacity decreases further when an intelligent attacker trying to detect the presence of a message is assumed, because only a subset of the available pixels is then used (see later). The embedded message recovery is blind.

With respect to the previously described methods, imperceptibility in this case is almost maximized, but the capacity is much lower. This method exploits the redundancy in the RGBe representation format: for a given pixel, adding 1 to the exponent and halving the other channels (or subtracting 1 from the exponent and doubling the other channels), with some obvious extra assumptions on overflowing and underflowing and rounding, does not change the pixel value. Embedding information into pixels, therefore, consists of choosing one of these equivalent representations, once they have been sorted by means of the exponent value, using the embedded bits as the index. Obviously, not all pixels admit equivalent representations, and the number of embeddable bits depends on the number of these representations.
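The idea of selecting among equivalent RGBe representations can be sketched as follows. This is only an illustration under stated assumptions: 8-bit mantissas, the value convention mantissa × 2^(e − 136), exponent 0 reserved for black, and only exact halving and doubling allowed (no rounding). The function names are hypothetical and are not taken from Yu et al. (2011).

```python
def equivalent_representations(r, g, b, e):
    """All exact RGBe encodings of the same pixel value, sorted by exponent."""
    reps = {(r, g, b, e)}
    # Halve the mantissas and raise the exponent while all mantissas are even.
    cr, cg, cb, ce = r, g, b, e
    while (cr | cg | cb) and not ((cr | cg | cb) & 1) and ce < 255:
        cr, cg, cb, ce = cr >> 1, cg >> 1, cb >> 1, ce + 1
        reps.add((cr, cg, cb, ce))
    # Double the mantissas and lower the exponent while 8 bits are not exceeded.
    cr, cg, cb, ce = r, g, b, e
    while (cr | cg | cb) and max(cr, cg, cb) <= 127 and ce > 1:
        cr, cg, cb, ce = cr << 1, cg << 1, cb << 1, ce - 1
        reps.add((cr, cg, cb, ce))
    return sorted(reps, key=lambda p: p[3])

def pixel_value(r, g, b, e):
    """Decoded floating-point value under the assumed convention."""
    scale = 2.0 ** (e - 136)
    return (r * scale, g * scale, b * scale)

def embed(pixel, bits):
    """Pick the representation indexed by the message bits; return it and the bit count used."""
    reps = equivalent_representations(*pixel)
    n = len(reps).bit_length() - 1            # floor(log2(number of representations))
    chunk = (list(bits[:n]) + [0] * n)[:n]    # zero-pad if the message runs out
    index = int("".join(str(b) for b in chunk), 2) if n else 0
    return reps[index], n

def extract(pixel):
    """Recover the bits from a pixel produced by embed()."""
    reps = equivalent_representations(*pixel)
    n = len(reps).bit_length() - 1
    index = reps.index(tuple(pixel))
    return [int(c) for c in format(index, "0{}b".format(n))] if n else []

marked, used = embed((128, 64, 32, 130), [1, 0, 1, 1])
print(marked, used, extract(marked))
```

Because every candidate decodes to the same floating-point value, the embedding introduces no distortion at all, which is why capacity rather than imperceptibility is the quantity of interest for this family of methods.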

The experiments are only concerned with capacity, as imperceptibility is all but guaranteed by this technique. Depending on the intended application, Yu et al. (2011) reported two different average capacities. In the less security critical environment of image annotation, it is in the 0.1 bpp range. When an intelligent attacker is assumed, the message is embedded only in a random, secret key driven subset of the available pixels. In particular, the selected pixels are such that altering them does not alter the statistics of the image with respect to the original image. In this case, the capacity is in the 0.001 bpp range.

The work in Wang et al. (2012) directly builds on that in Yu et al. (2011). The proposed variation is to group pixels for embedding purposes into so-called segments, instead of considering pixels one at a time, and it also uses a more sophisticated approach to scramble the pixel order to construct the segments. This mechanism makes a more effective use of the number of equivalent representations for groups of pixels and improves the capacity by just around 5%, so there is probably not much more room for improvement on the capacity side.

Following the above-mentioned articles, other watermarking methods started to appear in 2011. As already stated, the commonest requirement is robustness against tone-mapping operators, while the other requirements are variably addressed.

The work in Xue et al. (2011) discusses two approaches to implement a blind detectable (one-bit-capacity) watermarking scheme. Both techniques aim at robustness against tone mapping and do not consider security. Also, imperceptibility is not addressed explicitly but relies on the watermark embedding domain to ensure it is achieved. Both proposed techniques are based on the multiplicative spread-spectrum watermarking paradigm. Hence, correlation-based blind recovery is used in both, with a basic modicum of security provided by the secret key acting as the seed of the pseudorandom watermark.
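The multiplicative spread-spectrum paradigm with key-seeded watermark and correlation-based blind detection can be sketched as follows. The bipolar watermark, the embedding strength `alpha`, and the detection statistic are generic textbook choices assumed for illustration; they are not the exact design of Xue et al. (2011).

```python
import random

def make_watermark(key, n):
    """Pseudorandom +/-1 sequence; the secret key acts as the generator seed."""
    rng = random.Random(key)
    return [rng.choice((-1.0, 1.0)) for _ in range(n)]

def embed(coeffs, key, alpha=0.1):
    """Multiplicative rule: each coefficient is scaled by (1 + alpha * w_i)."""
    w = make_watermark(key, len(coeffs))
    return [c * (1.0 + alpha * wi) for c, wi in zip(coeffs, w)]

def detect_score(coeffs, key):
    """Blind detection statistic: correlation of |coefficients| with the watermark."""
    w = make_watermark(key, len(coeffs))
    return sum(abs(c) * wi for c, wi in zip(coeffs, w)) / len(coeffs)

# Synthetic stand-in for detail-subband coefficients (assumed Gaussian here).
rng = random.Random(0)
host = [rng.gauss(0.0, 1.0) for _ in range(20000)]
marked = embed(host, key=1234, alpha=0.1)
# Illustrative threshold: half the expected score, alpha * mean(|coefficient|).
threshold = 0.5 * 0.1 * sum(abs(c) for c in marked) / len(marked)
print(detect_score(marked, 1234) > threshold, detect_score(marked, 9999) > threshold)
```

A detector that does not know the key correlates against an unrelated sequence and obtains a score near zero, which is the basic level of security referred to above.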

The first technique approximates the tone-mapping process with a μ-law function applied on the HDR image, producing an LDR image and a ratio image (the original HDR image divided by its tone-mapped version). Then, the LDR image is wavelet decomposed and the watermark is embedded into the vertical and horizontal detail subbands. Last, the watermarked HDR image is obtained by multiplication of the wavelet-reconstructed watermarked LDR image by the ratio image.
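The split into an LDR image and a ratio image, and the final recombination, can be sketched as follows. The particular μ-law form log(1 + μx)/log(1 + μ), the value of `mu`, the normalization by the peak luminance, and the guard `eps` are all assumptions made for illustration; the wavelet-domain embedding itself is replaced by a placeholder scaling.

```python
import math

def mu_law(lum, mu=255.0):
    """Compress a luminance value normalized to [0, 1] into [0, 1]."""
    return math.log1p(mu * lum) / math.log1p(mu)

def split(hdr, mu=255.0, eps=1e-9):
    """Return the mu-law LDR image and the ratio image HDR / LDR."""
    peak = max(hdr)
    ldr = [mu_law(v / peak, mu) for v in hdr]
    ratio = [v / max(l, eps) for v, l in zip(hdr, ldr)]  # eps guards against division by zero
    return ldr, ratio

def recombine(ldr, ratio):
    """Watermarked HDR image = watermarked LDR image times the ratio image."""
    return [l * r for l, r in zip(ldr, ratio)]

hdr = [0.01, 1.0, 50.0, 4000.0]            # toy luminance samples
ldr, ratio = split(hdr)
marked_ldr = [l * 1.02 for l in ldr]        # placeholder for wavelet-domain embedding
print(recombine(marked_ldr, ratio))
```

Any relative change applied to the LDR image carries over unchanged to the HDR image, which is why a watermark embedded after approximate tone mapping can be expected to survive a real tone-mapping operator of similar shape.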

The second technique applies bilateral filtering to the HDR image to obtain a large-scale part and subsequently a detail part by subtracting the large-scale part from the HDR pixel values, after having taken the logarithm of both. Then, the detail part undergoes the same processing described above: wavelet decomposition and reconstruction with the spread-spectrum watermark embedding in between. Finally, the watermarked detail part is summed to the large-scale part and, recalling we have applied the logarithm, the sum is exponentiated to obtain the watermarked HDR image.
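The log-domain decomposition and reconstruction can be sketched on a one-dimensional signal as follows. Note the substitution: a plain moving average stands in for the bilateral filter, so this sketch is not edge preserving; it only illustrates the arithmetic of splitting and recombining the large-scale and detail parts. All function names are hypothetical.

```python
import math

def box_smooth(values, radius=1):
    """Stand-in for the bilateral filter: a moving average (NOT edge preserving)."""
    out = []
    for i in range(len(values)):
        window = values[max(0, i - radius): i + radius + 1]
        out.append(sum(window) / len(window))
    return out

def decompose(hdr):
    """Return log(large-scale part) and detail = log(HDR) - log(large-scale part)."""
    base = box_smooth(hdr)
    log_base = [math.log(v) for v in base]
    detail = [math.log(v) - lb for v, lb in zip(hdr, log_base)]
    return log_base, detail

def recompose(log_base, detail):
    """Sum the two parts in the log domain, then exponentiate."""
    return [math.exp(lb + d) for lb, d in zip(log_base, detail)]

hdr = [0.5, 2.0, 800.0, 1200.0, 3.0]         # strictly positive toy samples
log_base, detail = decompose(hdr)
marked_detail = [d + 0.01 for d in detail]    # placeholder for the watermark embedding
print(recompose(log_base, marked_detail))
```

An additive change in the detail part becomes a multiplicative change of the HDR value after exponentiation, so the distortion scales with the local luminance.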

The experiments conducted on five HDR images first address imperceptibility, giving PSNRs in excess of 50 dB for the first technique and lower values (32 dB in one image) for the second technique. Not surprisingly, the second technique is more robust against tone mapping according to a limited set of experiments, although some correlation values appear to be under the threshold, hence resulting in missed detection.

The work in Wu (2012) is a blind decodable watermarking technique that aims to embed a 4800-bit logo in HDR images. Aside from tone-mapping operators, robustness against noise addition, cropping, and blurring is sought, while security is completely neglected.

To be robust against tone mapping, a prototype tone-mapping operator is first applied to the HDR image and the watermark embedding is performed on the resulting LDR image. The HDR watermarked image is obtained again by the storing of a ratio image by which to multiply the watermarked LDR image. The LDR watermarking method is a variation of the classical spread-spectrum method in the DCT domain, where instead of direct embedding of the watermark with the additive rule of Eq. (22.1) on the DCT coefficients, the value difference between specific pairs of medium-frequency to low-frequency coefficients is modified. The watermark blind recovery is based on the computation of a correlation coefficient.

The experiments were performed on a single HDR image at a time. Imperceptibility was evaluated through the PSNR, which was reported around 70 dB. Robustness against attacks was given by the correlation coefficient, which, on average, ranges from 0.5 to 0.8 (with BERs approximately ranging from 12% to 23%), giving mostly human-readable recovered logos because the HVS tends to compensate for errors in logo images with those BERs.

The work in Solachidis et al. (2013a) is directly derived in the HDR domain. A just noticeable difference (JND) mask is obtained from the HDR image, and then it is used to embed 128 bits. Again, the target is robustness to tone-mapping operators, and a minimum of security is obtained by use of a secret key to scramble the data.

The imperceptibility is pursued by use of a mask based on a contrast sensitivity function, on bilateral filtering, and on the aforementioned JND. The embedding method is of the multiplicative spread-spectrum type, applied in the wavelet domain, and the blind recovery is based on a threshold. The mask is used to temper the embedded information depending on the mask value. Only the luminance channel is used to embed the watermark in this work as well.

The experiments were performed on three HDR images and used a set of seven tone-mapping operators. The watermark decoding can be performed on both HDR and LDR images obtained through tone mapping. The reported BER is, in the worst case, around 5%. Miss and false alarm probabilities were extrapolated under the assumption of Gaussian distributions for the bit errors, and almost negligible miss probabilities for false alarm probabilities equal to 10⁻¹⁰ are reported. No experimental evaluation on imperceptibility is reported.

The authors of Solachidis et al. (2013a) proposed two other techniques. The first, which is proposed in Solachidis et al. (2013b), is a blind detectable watermarking scheme (so the capacity is one bit) with, again, robustness against tone mapping as the main target.

Here, the original HDR image is decomposed into a set of LDR images, each representing a subset of the original dynamic range. On each LDR image, an LDR image watermarking already proposed in the literature is applied and the watermarked HDR image is then obtained by combination of the set of LDR watermarked images. The LDR watermarking scheme belongs to the additive spread-spectrum family, works in the wavelet domain, and uses an HVS-based mask to achieve imperceptibility. Security resides in the pseudorandom sequence to be added to the original wavelet coefficients, so the watermark cannot be read but it can be disabled by an intelligent attacker. Collusion attacks are also not considered. The recovery can be performed on both the HDR image directly and an LDR image obtained by scaling the dynamic range of the HDR image.

The experiments were performed on six HDR images. The strength of the watermark was adjusted so as to have an HDR-VDP-2 under 5% for 90% of the image pixels, which is assumed as a good value for imperceptibility. The miss and false alarm probabilities are very small, when Gaussian distributions are assumed for the detection scores. The tests considered five different tone-mapping operators.
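How miss and false alarm probabilities follow from a Gaussian assumption on the detection scores can be sketched as follows. The means and standard deviations below are made-up illustrative numbers, not values reported by Solachidis et al. (2013a,b); the function names are hypothetical.

```python
from statistics import NormalDist

def threshold_for_false_alarm(h0, p_fa):
    """Threshold T such that P(score > T | no watermark) equals p_fa."""
    return h0.inv_cdf(1.0 - p_fa)

def miss_probability(h1, threshold):
    """P(score <= T | watermark present)."""
    return h1.cdf(threshold)

# Assumed score distributions (illustrative numbers only).
h0 = NormalDist(mu=0.0, sigma=1.0)    # unwatermarked images
h1 = NormalDist(mu=12.0, sigma=1.5)   # watermarked images
t = threshold_for_false_alarm(h0, 1e-10)
print(t, miss_probability(h1, t))
```

In practice the two distributions would be fitted to scores measured on watermarked and unwatermarked images, and the quality of the extrapolation to very small probabilities depends on how well the Gaussian model holds in the tails.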

The second work (Maiorana et al., 2013) is a blind decodable watermarking method. In this case, the method used is completely different, although it also aims at being robust against tone-mapping operators.

Maiorana et al. (2013) propose considering one of the first-level wavelet subbands of the logarithm of the luminance ($\log L$) component of the HDR image, separating it into blocks, and then applying the Radon transform-DCT (Do and Vetterli, 2003), leaving the chrominance channels unmodified. Then, a QIM watermarking process is applied, hence classifying this method into the direct embedding family. The features to be quantized are the most energetic directions, so as to embed the information into the edges of the image to ensure maximum imperceptibility. The secret key controls the quantizer shift and is therefore the only security mechanism present. The capacity is in the range of tens of kilobits and depends on the number of blocks present in the image. In this case as well, the blind watermark recovery can be done on both the watermarked HDR image and a tone-mapped LDR image.
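The role of the key-controlled quantizer shift in QIM can be sketched on a single scalar feature as follows. The step `delta`, the way the shift is derived from the key, and the function names are assumptions for illustration; the extraction of the actual feature (the most energetic Radon-DCT directions) is not reproduced.

```python
import random

def dither(key, delta):
    """Secret quantizer shift in [0, delta), derived from the key."""
    return random.Random(key).uniform(0.0, delta)

def qim_embed(x, bit, key, delta):
    """Quantize x onto the lattice associated with `bit` (lattices offset by delta / 2)."""
    d = dither(key, delta) + bit * delta / 2.0
    return delta * round((x - d) / delta) + d

def qim_decode(y, key, delta):
    """Blind decoding: return the bit whose lattice lies closest to y."""
    dists = []
    for bit in (0, 1):
        d = dither(key, delta) + bit * delta / 2.0
        q = delta * round((y - d) / delta) + d
        dists.append(abs(y - q))
    return 0 if dists[0] <= dists[1] else 1

y = qim_embed(3.7, 1, key=42, delta=1.0)
print(y, qim_decode(y, 42, 1.0), qim_decode(y + 0.2, 42, 1.0))
```

Decoding stays correct as long as the feature is perturbed by less than delta / 4, so a larger step buys robustness at the cost of a larger embedding distortion (up to delta / 2).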

Experiments were conducted on seven images, and five different tone-mapping operators were considered. The imperceptibility was best when the HH subband was used, where HDR-VDP-2 gives a 5% probability of detecting a modification on around 5% of the image. However, on average for the HH subband, the BER is as high as 22%.

The work in Autrusseau and Goudia (2013) also focuses on robustness against tone mapping. It is a detectable watermarking technique, so its capacity is one bit. Imperceptibility relies once again on the use of wavelet decomposition.

A nonlinear variant of the classical multiplicative spread-spectrum watermarking paradigm is applied on all the first-level subbands of a wavelet decomposition. Thus, security consists only of the seeding for pseudorandom noise, which is the watermark. The blind watermark recovery is, as usual for such watermarking methods, based on correlation between the extracted watermark and the original watermark.

With regard to robustness, considered for eight HDR images and six tone-mapping operators, no experimental values were given, but it was reported that false detection occurs more frequently than true detection for at least some combinations of the original HDR image and the tone-mapping operators. The imperceptibility was in this case also evaluated through HDR-VDP and was reported to be around 95% (except for a single image with 85%).

22.2.3 Summary of HDR Image Watermarking Systems

In Table 22.2 we have reported a summary of the proposed HDR image watermarking systems described above. The point of the table is not to allow experimental comparisons (see the next section). Instead, here we just want to list the various approaches and point out a few characteristics that can be inferred by comparing them.

Table 22.2

Summary of the Current State of the Art in HDR Image Watermarking

			Watermark Requirements
Reference	Embedding Domain and Algorithm	Recovery	Capacity	Robustness	Security	Imperceptibility
Guerrini et al. (2008, 2011)	QIM (direct embedding) on the kurtosis of approximation subband (AS) coefficients in the logLUV domain (2-level wavelet decomposition)	Blind	1 bit (detectable)	7 TMOs, Gaussian noise (masked). Evaluated by ROC	Features computed in blocks of random shape and in random locations, random quantization shift	Use of a perceptual mask, based on brightness, activity, and edges. Experiments with HDR-VDP
Cheng and Wang (2009)	LSB embedding in HDR domain, RGBe format	Blind	3–9 bpp	None (steganographic)	In the clear, MD5 for message authentication	Experiments with PSNR
Li et al. (2011)	LSB embedding in HDR domain, logLUV TIFF format	Blind	26 bpp	None (steganographic)	In the clear	Experiments with HDR-VDP
Yu et al. (2011) and Wang et al. (2012)	Using redundant representation of HDR pixel values in RGBe format	Blind	0.001 bpp	None (steganographic)	Pixel scrambling	Completely imperceptible by design
Xue et al. (2011)	Bilateral filtering of HDR image, multiplicative spread spectrum in wavelet domain	Blind	1 bit (detectable)	4 TMOs. Evaluated by detection scores	Pseudorandom watermark	Experiments with PSNR
Wu (2012)	LDR domain (after TMO), variation of additive spread-spectrum watermarking applied to medium-frequency to low-frequency DCT coefficients	Blind	4800 bits	4 TMOs, noise, blurring, and cropping. Evaluated by BER	None	Experiments with PSNR
Solachidis et al. (2013a)	HDR domain, multiplicative spread spectrum applied to wavelet coefficients	Blind	128 bits	7 TMOs. Evaluated by BER	Pseudorandom watermark	Use of a perceptual mask, based on JND, contrast, and bilateral filtering. No assessment
Solachidis et al. (2013b)	LDR domain, obtained through bracket decomposition, additive spread-spectrum watermarking applied to wavelet coefficients	Blind	1 bit (detectable)	5 TMOs. Evaluated by the equal error rate obtained under the assumption of Gaussian error distributions	Pseudorandom watermark	Experiments with HDR-VDP-2
Maiorana et al. (2013)	QIM (direct embedding) of most energetic Radon-DCT directions, applied on the logL LDR domain	Blind	Tens of kilobits	5 TMOs. Evaluated by BER	Quantizer shift	Experiments with HDR-VDP-2
Autrusseau and Goudia (2013)	Wavelet transform of HDR pixel values, nonlinear variant of multiplicative spread spectrum	Blind	1 bit (detectable)	6 TMOs. Evaluated by BER	Pseudorandom watermark	Experiments with HDR-VDP

The first observation is how uneven the proposed embedding schemes are, including purely steganographic systems. In addition, some systems are of the detectable kind and others embed a variable quantity of information in the host image. This is not necessarily a bad thing, but indicates how a clear favorite application for HDR image watermarking (and a suitable requirements mix for it) has not emerged yet.

The second observation is that, with the exception of a single work, security is mostly neglected and relies on simple strategies, such as the use of a secret key to construct the watermark sequence. While this prevents the unauthorized decoding or detection of the watermark, it does not guarantee that the watermark recovery cannot be impaired by more sophisticated attacks aimed directly at the recovery process. The latter fact, in general, forbids the deployment of the watermarking scheme in security-critical applications such as ones dealing with copyright protection.

Another couple of observations can be made with respect to how imperceptibility is handled. First, there is a discrepancy in the use of metrics to assess the perceptual distortion introduced by the watermark embedding process. Some studies use the PSNR, which is probably unsuitable when applied to the high-quality content at hand: the breadth of work dedicated to the development of suitable perceptual metrics for HDR image quality is clear proof of this fact (see Chapter 17). Also, it is worth noting that no articles have even considered the possibility of implementing visible watermarking; all aim to render the watermark imperceptible to retain the image content quality. In addition, most studies just rely on the watermark domain (eg, using the detail subbands of a wavelet decomposition) to achieve imperceptibility. Perceptual masks, possibly developed ad hoc for the HDR domain, could provide a needed upgrade to guarantee that the watermarked image retains high visual quality.

There are some common points as well that arise from Table 22.2. For example, watermark recovery is always blind, a scenario justified for those applications in which it is assumed that the entities performing watermark embedding and recovery differ. Also, all the proposed techniques that aim to achieve robustness (ie, excluding the steganographic systems) recognize the importance of being robust against the tone-mapping operators, which constitute the most widely diffused form of processing that HDR images are currently expected to undergo.

22.3 Concluding Remarks

In this chapter we briefly introduced the data hiding branch known as digital watermarking, with a particular focus on the still image case. Turning our attention to the HDR domain, we wished to provide a high-level overview of the current state of the art. To conclude our discussion, in this section we offer some remarks on the present status of digital watermarking applied to HDR images.

First, as can be inferred from our discussion in the preceding section, referring in particular to Table 22.2, we avoided comparing the HDR image watermarking techniques described. We did this purposefully: we wish to highlight here how difficult it is to properly conduct critical evaluations of the current state of the art for a number of reasons. First, the number of available studies is still too low to draw conclusions, suggesting that much more work is needed in the field. Second, there is a distinct lack of recognized “standard” HDR image databases, and the few that are readily available consist of a small number of images. In both these aspects, it is easy to conclude that HDR image watermarking is completely out of pace with respect to the huge amount of research conducted in the LDR image field. Instrumental to this detachment is also what we have previously pointed out: that it is, in general, not possible to directly transpose LDR image watermarking to the HDR domain.

These are not the only problems preventing a critical comparison between available work. More importantly, there is a feeling that no research group agrees on the requirements HDR image watermarking should satisfy. This is probably dictated by the complete absence of the deployment of a watermarking system in real-world scenarios. Once the need to explore watermarking in a real application arises, it is likely that all these problems will be dealt with simultaneously and comparisons between proposed techniques will be made possible.

Speaking of watermarking requirements, looking again at Table 22.2, we find it somewhat surprising that all the proposed techniques are based on the blind recovery paradigm. It is quite easy to imagine applications catering to the perceptual quality of HDR content that might make use of a nonblind recovery stage, considering the advantages in terms of robustness achievable by such a framework. For example, some company could sell its HDR images after having properly and imperceptibly watermarked them. Then, an interested party might want to verify the authenticity of the content of a tone-mapped version of one such image. To do that, it could upload that LDR image to a site controlled by the selling party, who would then extract the watermark by means of a nonblind watermark recovery system since it possesses the unwatermarked original image. The only security requirement in such an application would be the impossibility of completely removing the watermark to prevent misappropriation; simply disabling its recovery would damage the image quality and prevent its authentication. This application is perfectly viable, so we suggest that more effort should be put into nonblind watermarking. We also wish to point out that, with these assumptions in place, steganography could still arise as the killer application, and that only time will tell if this is the case.

In addition, as one can see from Table 22.2, imagining applications without critical security requirements, such as the one described above, also matches the poor attention that security has enjoyed until now in the literature. Techniques with tighter security can, of course, also be proposed (eg, letting the buying party in the scenario above perform the authentication itself). Such applications would likely require additional security infrastructure besides the one provided by the watermarking system, such as an asymmetric key cryptography framework in which a public key could be needed to recover the watermark, while a private key could be necessary to embed it.

As a brief note, robustness against tone-mapping operators seems to be the present focus of the proposed methods. However, it is easy to predict that HDR image watermarking will have to be robust against other types of processing as well: transcoding, as HDR image coding solidifies itself, springs to mind.

As a last note, a major absence from this chapter is the HDR video medium. In fact, to the best of our knowledge, no one has proposed an HDR video watermarking technique yet. However, given the infancy of even the still image field at the present time, that should be hardly surprising. We are quite certain that, soon after the still image case reaches an adequate level of maturity, work concerning HDR video watermarking will begin to appear.
