Chapter 15

Future of the Field

Reviewing the state of the art in steganalysis, it is reasonable to ask what future the area has. It may be a future within science and research, or it could be as an applied and practical art. It could be both or neither.

Steganalysis no longer enjoys the broad and popular support it did in the first couple of years of the millennium. As we discussed in early chapters, hard evidence of real steganography users is missing from the public domain. Nevertheless, a number of experts, both in industry and academia, maintain that steganography and steganalysis are important fields with potential for increasing future use. In 2010 we saw the BOSS contest, Break Our Steganographic Scheme (Bas et al., 2011), where a number of groups competed to achieve the highest accuracy for a steganalyser against a particular steganographic system. The subsequent Information Hiding Workshop (Filler et al., 2011a) contained a large number of solid contributions within the area of steganography and steganalysis, not limited to BOSS-related work. Does this signal the return of steganography as a hot topic?

15.1 Image Forensics

A field closely related to image steganalysis is image forensics, and as the steganography hype at the start of the century started to cool, several distinguished individuals and groups in steganalysis changed focus. We can view image steganalysis as a special case of image forensics, which aims to discern as much information as possible about the source and history of an image. While image steganalysis is only interested in a single aspect of the image, namely covert information, image forensics covers a wide range of questions, including the following well-known ones:

Camera source identification: aims to classify images according to the camera or capturing device which produced them.
Printer source identification: aims to identify the device which produced a given hard copy.
Post-processing analysis: aims to classify images according to what kind of image processing they have undergone, including compression, colour correction, brightness adjustment, etc.
Synthetic image classification: aims to distinguish between natural photographs and computer-generated images.
Tamper detection: aims to detect if an image has been modified (‘Photoshopped’) after capture.

All the different areas of image forensics have important practical uses. Tamper detection might be the one with the most obvious and wide-ranging applications. In recent years, we have seen a number of tampered images published in the international press, where the tampered image was intended to tell a more serious or scary story than the original image would have. In several cases, news agencies have been forced to recall imagery or let staff go (e.g. BBC News, 2006). The motivation for image tampering may vary, from a freelance photographer yearning for a reputation, to the propaganda of a national government. Either way, news editors will be looking for assurance that images received are genuine and honest, and image forensics is one possible source of such assurance. A press agency or newspaper may receive thousands of images daily from the general public, so the task is not trivial.

Synthetic image classification has received attention because of a number of US court cases involving child pornography. Real photography displaying child pornography is evidence of child abuse, which is clearly illegal, not only in the USA but in most countries. It is argued, however, that computer graphics displaying pornographic scenes where no real child is actually involved would be legal under free speech. Hence, the prosecutor may have to prove that the images presented as evidence come from a real camera, and have not been synthetically generated. Image forensics using machine learning could help solve this problem.

Illicit photographs or prints can possibly be traced back to the offender by using printer and camera identification. A printout may be evidence of industrial espionage, and identifying the printer is one step towards identifying the user. Similarly, illegal imagery can be traced to a camera as a step towards identifying the photographer. The application is very similar to more traditional forensic techniques, used for instance to identify mechanical typewriters.

Other areas of image forensics concern more general investigation of the history of a given image. Without directly answering any hard and obvious questions, the applications of image forensics may be subtler. It is clear, though, that knowledge of the history of the image establishes a context which can add credibility to the image and the story behind it. It is very difficult to forge a context for a piece of evidence, and the person genuinely presenting an image should be able to give details about its origin. Thus forensic techniques may validate or invalidate the image as evidence. For instance, a source claiming to have taken a given photograph should normally be able to produce the camera used. If we can check whether the photograph matches the alleged camera, we can validate the claim.

There are many successful examples of steganalysers being retrained to solve other forensic problems. Lyu and Farid's 72-D feature vector is a classic, and it was used to distinguish between real and synthetic photographs in Lyu and Farid (2005). Another classic, Avcibas et al. (2004), used image quality measures for tamper detection (Chupeau, 2009). More recently, Wahab (2011) showed that the JPEG features discussed in Chapter 8 can discriminate between different iPhone cameras, and he achieved particularly good results with the Markov-based features and with conditional probabilities. Even if steganalytic feature vectors do not necessarily provide optimised forensic techniques, they make a good background for understanding image forensics in general.

15.2 Conclusions and Notes

Steganalysis has come a long way over the last 10 years, with a good portfolio of attacks. All known stego-systems have been declared broken, typically meaning that steganalysers have been demonstrated with 80–90% accuracy or more.

The fundamental shortcoming of the current state of steganalysis research is the missing understanding of real-world use cases. It remains an open question whether 80–90% accuracy is good or bad for a steganalyser. Similarly, it is an open question what embedding capacity one should expect from a good stego-system. As long as no means to answer these questions is found, the design targets for new systems will remain arbitrary, and quite likely guided by feasibility rather than usefulness.

A second fundamental problem for further research is to find a statistical model for clean images that would allow us to predict systematically the detectability of known steganographic distortion patterns. Again, this would be an enormous task, but there are many related open questions which may be more feasible in the immediate term, such as

Cover selection: improved methods for feature selection to allow Alice to outwit current steganalysers.
Steganalysis for adverse cover sources: some cover sources are already known to be difficult to steganalyse, even without cover source mismatch. The design of steganalysers targeting such sources would be useful, if possible.

Such questions are likely to see increasing attention over the next couple of years.

In spite of all stego-systems being reportedly broken, there is a good case to claim that the steganographer has the upper hand. Steganalysis as we know it is focused entirely on steganography by modification. Yet, it is well known that there are many options for Alice to think outside the box, while steganalysis remains inside it. Steganography by cover selection and cover synthesis is an example of this. Even though we know very little about how to design stego-systems using cover selection or synthesis, even less is known about how to steganalyse them. Just as in classical times, Alice's best shot may very well be to devise a simple ad hoc stego-system relying on secrecy of the algorithm. As long as she keeps a low profile and steganograms remain rare, Wendy does not stand much of a chance of detecting them.

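The theory of Bayesian inference illustrates this point very well. Suppose the prior probability of observing a steganogram is (say) one in a thousand, and the steganalyser has 10% error probability, taken here to apply to both false positives and false negatives. Even after a positive detection, the posterior probability that the image is a steganogram remains below 1%, because the false positives from the many clean images vastly outnumber the true positives from the few steganograms.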
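
The arithmetic behind this point can be checked with a short sketch of Bayes' rule. It assumes, purely for illustration, that the 10% error probability applies equally to false positives and false negatives; the function name `posterior_stego` and its parameters are hypothetical and do not come from any library.

```python
def posterior_stego(prior, false_negative_rate, false_positive_rate):
    """Return P(steganogram | steganalyser says 'stego') by Bayes' rule.

    Illustrative helper only; all three arguments are probabilities in [0, 1].
    """
    # Probability that an image is a steganogram AND is flagged.
    true_positive = (1.0 - false_negative_rate) * prior
    # Probability that an image is clean AND is (wrongly) flagged.
    false_positive = false_positive_rate * (1.0 - prior)
    return true_positive / (true_positive + false_positive)


# Figures from the text: one steganogram per thousand images, 10% error.
# Assumption: the same 10% is used for both kinds of error.
prior = 0.001
error = 0.10
posterior = posterior_stego(prior, error, error)

print(f"prior     = {prior:.4f}")
print(f"posterior = {posterior:.4f}")
```

Under these assumptions, a flagged image is still more than 99% likely to be clean, so Wendy would have to investigate over a hundred false alarms for every real steganogram.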

We can expect very interesting steganalysis research over the next couple of years. It would be arrogant to try to predict exactly what questions will be asked and which will be answered, but it should be safe to say that they will be different from the questions dominating the last 10 years of research.

