Foreword

John Billingsley

University of Southern Queensland

An important focus of advances in mechatronics and robotics is the addition of sensory inputs to systems with increasing “intelligence.” Without doubt, sight is the “sense of choice.” In everyday life, whether driving a car or threading a needle, we depend first on sight. The addition of visual perception to machines promises the greatest improvement and at the same time presents the greatest challenge.

Until relatively recently, the volume of data in the images that make up a video stream has been a serious deterrent to progress. A single frame of very modest resolution might occupy a quarter of a megabyte, so the task of handling thirty or more such frames per second requires substantial computer resources.

Fortunately, the computer and communications industries’ investment in entertainment has helped address this challenge. The transmission and processing of video signals are an easy justification for selling the consumer increased computing speed and bandwidth. A digital camera, capable of video capture, has already become a fashion accessory as part of a mobile phone. As a result, video signals have become more accessible to the serious engineer. But the task of acquiring a visual image is just the tip of the iceberg.

While generating sounds and pictures is a well-defined process (speech generation is a standard “accessibility” feature of Windows), the inverse task of recognizing connected speech is still in an unfinished state, a quarter of a century later, as any user of “dictation” software will attest. Still, analyzing sound is not even in the same league with analyzing images, particularly when they are of real-world situations rather than staged pieces with synthetic backgrounds and artificial lighting.

The task is essentially one of data reduction. From the many megabytes of the image stream, the required output might be a simple “All wheel nuts are in place” or “This tomato is ripe.” But images tend to be noisy, objects that look sharp to the eye can have broken edges, boundaries can be fuzzy, and straight lines can be illusory. The task of image analysis demands a wealth of background know-how and mathematical analytic tools.

Roy Davies has been developing that rich background for well over two decades. At the time of the UK Robotics Initiative, in the 1980s, Roy had formed a relationship with the company United Biscuits. We fellow researchers might well have been amused by the task of ensuring that the blob of jam on a “Jaffacake” had been placed centrally beneath the enrobing chocolate. However, the funding of the then expensive vision acquisition and analysis equipment, together with the spur of a practical target of real economic value, gave Roy a head start that made us all envious.

Grounded in that research is the realization that human image analysis is a many-layered process. It starts with simple graphical processing, of the sort that our eyes perform without our conscious awareness. Contrasts are enhanced; changes are emphasized and brought to our attention. Next we start to code the image in terms of lines, the curves of the horizon or of a face, boundaries between one region and another that a child might draw with a crayon. Then we have to “understand” the shape: is it a broken biscuit, or is one biscuit partly hidden by another? If we are comparing a succession of images, has something moved? What action should we take?

Machine vision must follow a similar, multilayered path. Roy has captured the essentials with clear, well-illustrated examples. He demonstrates filters that by convolution can smooth or sharpen an image. He shows us how to wield the tool of the Hough transform for locating lines and boundaries that are made indistinct by noise. He throws in the third dimension with stereo analysis, structured lighting, and optical flow. At every step, however, he drags us back to the world of reality by considering some practical task. Then, with software examples, he challenges us to try it for ourselves.

The easy access to digital cameras, video cameras, and streaming video in all its forms has promoted a tidal wave of would-be applications. But an evolving and substantial methodology is still essential to underpin the “art” of image processing. With this latest edition of his book, Roy continues to surf the crest of that wave.
