Overview
This chapter describes the Amazon Rekognition service for analyzing the content of images. You will be introduced to Rekognition's computer vision capabilities, through which you will be able to detect objects and scenes in images. You will learn how to analyze faces and recognize celebrities in images. You will also be able to compare faces in different images to see how closely they match each other.
By the end of this chapter, you will be able to apply the Amazon vision and image processing AI services in fields such as biology, astronomy, security, and so on.
In the preceding chapters, you have done lots of interesting exercises and activities with the Amazon Web Services (AWS) Artificial Intelligence (AI) and Machine Learning (ML) services. You combined the serverless computing paradigm and conversational AI to construct chatbots, as well as a fully functional contact center that enables anyone to converse with the chatbots through a voice interface that's available by dialing a local phone number. You also learned about text analysis and topic modeling, all using the AWS services.
In this chapter, you will use the Amazon Rekognition service to perform various image processing tasks. First, you will identify objects and scenes within images. Then, you will test whether images should be flagged as needing content moderation. Next, you will analyze faces using Rekognition. You will also recognize celebrities and well-known people in images. You will compare faces that appear in different images and settings (for example, in groups or isolation) and recognize the same people in different images. Finally, you will extract text from images that might have some text displayed in them.
Amazon Rekognition is a deep learning-based visual analysis service from AWS that allows you to analyze images and videos using machine learning. It is built on the same scalable infrastructure as AWS itself and uses deep learning technology to analyze billions of images daily if required. Amazon also updates the service constantly, so it is continually learning new labels and features.
Some of the use cases for Amazon Rekognition are as follows:
Note
Amazon Rekognition is also a HIPAA-eligible service for healthcare applications. If you wish to protect your data under HIPAA, you will need to contact Amazon customer support and fill out a Business Associate Addendum (BAA). For more information about HIPAA, go to the following link: https://aws.amazon.com/compliance/hipaa-compliance/.
For this book, you will be using the free tier services of Amazon Rekognition. Be aware of the limits of the free tier services and the pricing options. These are the free services you can use for image processing:
Note
You should not need to use more than the free tier limits, but if you do go beyond the limits of the free tier, you will get charged by Amazon at the rates published at this link: https://aws.amazon.com/rekognition/pricing/.
Deep learning is a branch of artificial intelligence and a subfield of machine learning. Deep learning works by inferring high-level abstractions from raw data by using a deep neural network graph with many layers of processing.
Deep learning structures such as Convolutional Neural Networks (CNNs) and Recurrent Neural Networks (RNNs) have been employed in natural language processing, audio recognition, speech recognition, and computer vision to deliver significant results. Neural Machine Translation has largely replaced human-curated, rule-based translation engines, object detection in autonomous cars uses CNN-based architectures extensively, and conversational AI is powering a variety of customer interactions.
The Rekognition service employs deep learning to provide its various features behind the scenes. It uses pre-trained models so that users do not have to train the system. The exact details are proprietary and confidential to Amazon, but we will learn how it works and how to use Rekognition in this chapter. As we mentioned earlier, one interesting aspect of Amazon Rekognition is the fact that the algorithms are monitored and trained periodically to increase their accuracy and capabilities. It can also be extended with custom labels and models trained with your images.
Note
For any questions you have, the Amazon Rekognition FAQ page (https://aws.amazon.com/rekognition/faqs/) is an excellent resource.
Amazon Rekognition provides a feature that can detect objects and scenes in an image and label them. A label may be an object (such as a person, water, or sand), a scene (such as a beach), or a concept (such as the outdoors).
Each label comes with a confidence score, which measures, on a scale from 0 to 100, the probability that the service identified that label correctly. This allows you or your application to choose a threshold above which results are accepted and below which they are discarded.
Rekognition supports thousands of labels from categories such as those shown in the following table:
Additionally, Amazon continuously trains the system to recognize new labels, and you can request labels that are not yet in the system through Amazon customer support.
To create an Amazon Rekognition analysis of a sample image, you can do the following:
The result for the image is as follows:
Nature
Outdoors
Sky
Sun
Dawn
Sunset
You can choose the confidence threshold at which you would like to cut off the results for your application.
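If you would rather script this analysis than use the console, the same feature is exposed through the DetectLabels operation in the boto3 SDK. The following Python sketch is illustrative rather than part of the book's exercises: the boto3 call is commented out so that the example runs offline, the bucket and file names are placeholders, and sample_response merely mimics the shape of a real DetectLabels response with made-up confidence values.

```python
# The real call would look like this (requires AWS credentials):
# import boto3
# rekognition = boto3.client("rekognition")
# response = rekognition.detect_labels(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "sunset.jpg"}},
#     MaxLabels=10,
#     MinConfidence=50,
# )

# A response shaped like Rekognition's output, with illustrative values.
sample_response = {
    "Labels": [
        {"Name": "Nature", "Confidence": 99.1},
        {"Name": "Outdoors", "Confidence": 99.1},
        {"Name": "Sky", "Confidence": 99.0},
        {"Name": "Sun", "Confidence": 97.2},
        {"Name": "Dawn", "Confidence": 88.4},
        {"Name": "Sunset", "Confidence": 88.4},
    ]
}

def labels_above(response, threshold):
    """Keep only the labels whose confidence meets the cutoff."""
    return [
        (label["Name"], label["Confidence"])
        for label in response["Labels"]
        if label["Confidence"] >= threshold
    ]

for name, confidence in labels_above(sample_response, 90.0):
    print(f"{name}: {confidence:.1f}%")
```

Note that MinConfidence filters on the service side, while a helper such as labels_above lets your application apply a stricter cutoff of its own on top of that.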
In this exercise, we will detect objects and scenes of custom images using Amazon Rekognition. The custom images can be either taken from online sources, or you can upload them from your local machine. The following are the steps for detecting objects and scenes:
Note
We have collected images from a stock photo site called https://unsplash.com/, which contains photos that can be downloaded for free and used without restrictions for this book. Always obey copyright laws and be mindful of any restrictions or licensing fees that might apply in the jurisdiction where you reside (if applicable). You may view the license for images from unsplash.com here: https://unsplash.com/license.
The following are the images. This is the second test image:
The following are the results of the second image provided:
This is the third test image:
The following are the results of the third image provided:
The results for the second image did indicate it was a human head, with greater than 83% confidence. For the third image, of the Golden Gate Bridge, Rekognition returned several broader labels before Bridge, which it identified with 96.5% confidence.
In addition to object and scene detection, Rekognition also provides the ability to filter out objectionable content. Moderation labels provide detailed subcategories, enabling you to fine-tune the filters that determine what kinds of images you consider acceptable or objectionable. Amazon Rekognition Image provides the DetectModerationLabels operation to detect unsafe content in images.
You can use this feature to enhance photo-sharing sites, forums, dating apps, content platforms for children, e-commerce sites and marketplaces, and more. In this book, we will not use any adult or nude images, but we can demonstrate this feature with content that may be considered racy or suggestive in some locales, featuring women in revealing clothing such as swimsuits or clubwear.
The images in this section are blurred by default, so you will not see them unless you press the View Content button.
Note
If you find any racy or suggestive images offensive, please skip this section based on your own personal, moral, religious, or cultural norms.
Amazon Rekognition uses a hierarchical taxonomy to label categories of explicit and suggestive content. The two top-level categories are Explicit Nudity and Suggestive. Each top-level category has many second-level categories. The types of content that are detected and flagged using this feature are as follows:
To create an Image Moderation of a sample image, you can do the following:
With this, we've seen how Amazon Rekognition can filter out suggestive content, but let's see how it does when it comes to detecting objectionable content in images.
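For reference, the console workflow above corresponds to the DetectModerationLabels operation mentioned earlier. The sketch below runs offline: the boto3 call is commented out, and sample_response imitates the response shape, in which top-level categories carry an empty ParentName and second-level labels name their parent category. The confidence values are illustrative.

```python
# The real call would look like this (requires AWS credentials):
# import boto3
# response = boto3.client("rekognition").detect_moderation_labels(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "photo.jpg"}},
#     MinConfidence=50,
# )

sample_response = {
    "ModerationLabels": [
        {"Name": "Suggestive", "Confidence": 99.6, "ParentName": ""},
        {"Name": "Female Swimwear Or Underwear", "Confidence": 99.6,
         "ParentName": "Suggestive"},
    ]
}

def group_by_top_level(response):
    """Map each top-level moderation category to its second-level labels."""
    groups = {}
    for label in response["ModerationLabels"]:
        if label["ParentName"] == "":
            groups.setdefault(label["Name"], [])
        else:
            groups.setdefault(label["ParentName"], []).append(label["Name"])
    return groups

def is_flagged(response, threshold=50.0):
    """An empty ModerationLabels list means nothing objectionable was found."""
    return any(label["Confidence"] >= threshold
               for label in response["ModerationLabels"])

print(group_by_top_level(sample_response))
print(is_flagged(sample_response))
```

A helper like is_flagged mirrors what you saw in the console: an image that returns no moderation labels is the "no results" case, while any label above your threshold marks the image for review.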
In this exercise, we will detect objectionable content in images. You can try this service on your images. We have selected three images that we will try out with this feature. Follow these steps to complete this exercise:
You should see that Rekognition correctly returns no results:
This one, once again, should be correctly identified as containing Female Swimwear Or Underwear, with a 99.6% degree of confidence:
As you've seen, Amazon Rekognition has powerful image analysis capabilities – including content moderation. As a suggestion, you can try some more images that may or may not be suggestive and check the results. You might find some gaps in the object detection deep learning algorithms.
Rekognition can perform a more detailed analysis of faces as well. Given an image with a detectable face, it can tell whether the face is male or female, the age range of the face, whether or not the person is smiling and appears to be happy, and whether they are wearing glasses or not.
It can also detect more detailed information, such as whether the eyes and mouth are open or closed, and whether or not the person has a mustache or a beard.
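In the API, these attributes come back from the DetectFaces operation when you request Attributes=["ALL"]; each detected face is described by a FaceDetails entry. The following sketch summarizes one such entry. The call itself is commented out so the example runs offline, and sample_face holds illustrative values shaped like the real response.

```python
# The real call would look like this (requires AWS credentials):
# import boto3
# response = boto3.client("rekognition").detect_faces(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "portrait.jpg"}},
#     Attributes=["ALL"],
# )
# sample_face = response["FaceDetails"][0]

# One FaceDetails entry, with illustrative values.
sample_face = {
    "AgeRange": {"Low": 26, "High": 42},
    "Gender": {"Value": "Male", "Confidence": 99.3},
    "Smile": {"Value": False, "Confidence": 84.2},
    "Eyeglasses": {"Value": True, "Confidence": 97.5},
    "Sunglasses": {"Value": False, "Confidence": 90.2},
}

def summarize_face(face):
    """Build a one-line summary of the main facial attributes."""
    parts = [f"{face['Gender']['Value']}, "
             f"age {face['AgeRange']['Low']}-{face['AgeRange']['High']}"]
    for attr in ("Smile", "Eyeglasses", "Sunglasses"):
        verdict = "yes" if face[attr]["Value"] else "no"
        parts.append(f"{attr}: {verdict} ({face[attr]['Confidence']:.1f}%)")
    return "; ".join(parts)

print(summarize_face(sample_face))
```

Notice that every attribute is a value paired with its own confidence score, which is what lets an application decide, attribute by attribute, which findings to trust.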
To create a facial analysis of a sample image, you can do the following:
Note
Click on the Facial Analysis link in the left toolbar to navigate to the Facial Analysis page.
You can either save the image onto your disk or download this book's GitHub repository, as we covered in Chapter 1, An Introduction to AWS.
You can click the Show more link in order to look at the other attributes that have also been identified.
All these qualities were identified with an extremely high degree of confidence, showing that the service is very sure of its findings.
In this exercise, you have been provided with three images in the book's GitHub repository (https://packt.live/31X6w1Z) so that you can try out the Amazon Rekognition service with sample images. The images are provided courtesy of https://unsplash.com/ and Pinterest. Let's find out if they can identify the prominent facial attributes. Follow these steps to complete this exercise:
Note
Click the Facial Analysis link in the left toolbar to navigate to the Facial Analysis page.
You can either save the image onto your disk or download this book's GitHub repository, as we covered in Chapter 1, An Introduction to AWS.
It is 84.2% confident that the man is not smiling, and we can see from the image that he is not smiling very much, if at all. Finally, the service is 97.5% sure that he is wearing glasses, while also saying with 90.2% confidence that he is not wearing sunglasses; Rekognition reports Eyeglasses and Sunglasses as separate attributes. This shows that we should apply human logic and rules to validate the results from image detection services: where attributes appear to conflict, we can take the one with the larger confidence score, using the relative scores to evaluate the results.
As we all know, humans are born with extremely good object detection and image analysis capabilities, which we enhance as we grow. This is very hard for machines, which cannot reason or perform semantic analysis the way we do. The image analysis domain is relatively new, with the bulk of its advances coming in the last 5 to 6 years; new algorithms are being researched, new ways of training are being explored, and optimizations are being sought. This is what makes Amazon Rekognition so useful: it wraps the algorithms and mechanisms in a set of practical interfaces, masking the underlying algorithmic and computational complexities, and it learns and evolves by leveraging current best practices and research. In this section, you were introduced to the capabilities of Amazon Rekognition's image analysis service. You will see more of this service in the following sections.
Rekognition provides us with the ability to recognize and label celebrities and other famous people in images. This includes well-known individuals from a variety of fields, such as sports, business, politics, media, and entertainment.
It is important to remember that Rekognition can only recognize faces that it has been trained on, and so does not cover a full, exhaustive list of celebrities. However, since Amazon continues to train the system, it is constantly adding new faces to the service.
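Programmatically, this feature is the RecognizeCelebrities operation, which returns both recognized and unrecognized faces. The sketch below runs offline; the celebrity name, URL, and scores in sample_response are made up purely for illustration.

```python
# The real call would look like this (requires AWS credentials):
# import boto3
# response = boto3.client("rekognition").recognize_celebrities(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "red-carpet.jpg"}},
# )

sample_response = {
    "CelebrityFaces": [
        {"Name": "Jane Example",  # hypothetical name for illustration
         "MatchConfidence": 98.0,
         "Urls": ["www.imdb.com/name/nm0000000"]},  # placeholder URL
    ],
    "UnrecognizedFaces": [
        {"BoundingBox": {"Width": 0.2, "Height": 0.3,
                         "Left": 0.1, "Top": 0.1}},
    ],
}

def recognized_names(response, min_confidence=90.0):
    """Return the names of celebrities matched above a confidence cutoff."""
    return [c["Name"] for c in response["CelebrityFaces"]
            if c["MatchConfidence"] >= min_confidence]

print(recognized_names(sample_response))
print(f"{len(sample_response['UnrecognizedFaces'])} unrecognized face(s)")
```

Faces that the service detects but cannot match to a known celebrity are still returned under UnrecognizedFaces, so your application can tell "not a celebrity" apart from "no face found".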
Note
Click the Celebrity recognition link in the left toolbar to navigate to the Celebrity recognition page.
To create a celebrity recognition of a sample image, you can do the following:
In this exercise, we will use another site, which has a larger collection of celebrity images that can also be used for free without restrictions. You can also try out this service on your own images. We have selected three images that we will try out with this feature.
You may view the license for images from pexels.com here: https://www.pexels.com/creative-commons-images/. Follow these steps to complete this exercise:
Note
Click the Celebrity recognition link in the left toolbar to navigate to the Celebrity recognition page.
The following image shows the result of celebrity recognition:
The Learn More links under their names go to their respective IMDb pages. As we have done previously, we can verify this by clicking on them.
Rekognition allows you to compare faces in two images, mainly to identify which faces are the same in both images. As an example use case, this can be used to compare photos of people against their personnel photos.
This section demonstrates how you can use Amazon Rekognition to compare faces within a set of images with multiple faces in them. When you specify a Reference face (source) and a Comparison face (target) image, Rekognition compares the largest face in the source image (the reference) with up to 100 faces detected in the target image (the comparison) and then finds how closely the face in the source image matches the faces in the target image. The similarity score for each comparison is displayed in the Results pane.
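The console's face comparison page is backed by the CompareFaces operation, which takes a source image, a target image, and an optional SimilarityThreshold. The sketch below runs offline: the boto3 call is commented out, the bucket and file names are placeholders, and sample_response mimics the response shape with illustrative similarity values.

```python
# The real call would look like this (requires AWS credentials):
# import boto3
# response = boto3.client("rekognition").compare_faces(
#     SourceImage={"S3Object": {"Bucket": "my-bucket", "Name": "ref.jpg"}},
#     TargetImage={"S3Object": {"Bucket": "my-bucket", "Name": "group.jpg"}},
#     SimilarityThreshold=80,
# )

sample_response = {
    "SourceImageFace": {"Confidence": 99.9},
    "FaceMatches": [
        {"Similarity": 98.2, "Face": {"Confidence": 99.8}},
    ],
    "UnmatchedFaces": [
        {"Confidence": 99.7},
    ],
}

def match_similarities(response, threshold=80.0):
    """List the similarity scores of matches at or above the threshold."""
    return [m["Similarity"] for m in response["FaceMatches"]
            if m["Similarity"] >= threshold]

print(match_similarities(sample_response))
print(f"{len(sample_response['UnmatchedFaces'])} face(s) did not match")
```

Faces in the target image that do not match the reference face are returned under UnmatchedFaces, so you can report both outcomes to the user.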
Some restrictions on the usage of this feature are as follows:
Note
Click the Face comparison link in the left toolbar to navigate to the Face comparison page.
With the face comparison feature, there are two sections with images, side by side. You can choose to compare images in the left-hand section with images in the right-hand section. To create a facial analysis of a sample image, you can do the following:
In this activity, you can try out Rekognition with your own images. For example, we have provided links to two sets of images that display the same people. You can enter the sets of images into the left-hand (reference) and right-hand (comparison) sides by using the Upload button. Remember that there are two upload sections this time, so there are two Go buttons to press as well. Follow these steps to complete this activity:
The images to be analyzed can be found at the following links: https://packt.live/31X6IP6, https://packt.live/3ebuSYz, and https://packt.live/2ZLseUd.
You can either save the images onto your disk or download this book's GitHub repository, as we covered in Chapter 1, An Introduction to AWS:
Additional Challenge
As an additional challenge, you can try the same steps on these two images from Unsplash as well: https://images.unsplash.com/photo-1526510747491-58f928ec870f and https://images.unsplash.com/photo-1529946179074-87642f6204d7:
The expected output is the degree of confidence that the corresponding two images from the image sets are of the same person, even with different angles, lighting, and the position of the face. You will see that in the results section. This activity shows the image analysis capabilities of the Amazon Rekognition service.
Note
The solution for this activity can be found on page 348.
In the previous chapters, you learned how to extract text from scanned documents such as tax returns and company statements. Amazon Rekognition can detect and extract text from images as well—for example, street signs, posters, product names, and license plates. Of course, this feature is made to work with real-world images instead of document images. The Text in image link, which is accessible from the left toolbar, is where this capability resides in Amazon Rekognition.
For each image provided, the service returns a text label and bounding box, along with a confidence score. This can be extremely useful for searching text across a collection of images. Each image can be tagged with the corresponding text metadata based on the results from this and other capabilities of the service.
For now, only Latin characters and numbers (Western script) are supported. Up to 50 sequences of characters can be recognized per image, and the text must be horizontal, with at most +/- 90 degrees of rotation.
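In the API, this feature is the DetectText operation, which returns both LINE and WORD detections; each WORD points back to its enclosing LINE via ParentId. The sketch below runs offline, and the detected strings and scores in sample_response are illustrative (loosely based on the shop sign used later in this section).

```python
# The real call would look like this (requires AWS credentials):
# import boto3
# response = boto3.client("rekognition").detect_text(
#     Image={"S3Object": {"Bucket": "my-bucket", "Name": "sign.jpg"}},
# )

sample_response = {
    "TextDetections": [
        {"Id": 0, "DetectedText": "OCEAN GIFTS", "Type": "LINE",
         "Confidence": 94.5},
        {"Id": 1, "DetectedText": "OCEAN", "Type": "WORD",
         "ParentId": 0, "Confidence": 94.1},
        {"Id": 2, "DetectedText": "GIFTS", "Type": "WORD",
         "ParentId": 0, "Confidence": 94.9},
    ]
}

def detected_lines(response):
    """Return the full lines of text that were detected."""
    return [d["DetectedText"] for d in response["TextDetections"]
            if d["Type"] == "LINE"]

def words_in_line(response, line_id):
    """Return the individual words belonging to one detected line."""
    return [d["DetectedText"] for d in response["TextDetections"]
            if d["Type"] == "WORD" and d.get("ParentId") == line_id]

print(detected_lines(sample_response))
print(words_in_line(sample_response, 0))
```

Working with LINE detections gives you whole phrases for tagging and search, while the WORD detections carry the per-word bounding boxes and confidence scores.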
Note
Click on the Text in image link in the left toolbar to navigate to the Text in image page.
To identify a "text in image" of a sample image, you can do the following:
Rekognition surrounds the detected text with borders so that you can identify which text regions it has recognized. You can see the results of text extraction in the Results section:
Rekognition was able to find text in the image and put a box around it; then, it was able to "read" the text and even understand that there are two words! The Rekognition service has extracted text from the image with separators (|) between words in separate regions. Even though the sign's font is unique, with shadows, it was still able to extract the text.
Next, let's try out this capability with our own images from different real-life situations, such as storefronts and license plates photographed at different angles. As you will see, Amazon Rekognition does very well on photos of store signs taken from below, as well as on photos of license plates taken at an angle from above.
In this exercise, you will extract text from your own images. Let's see how well Amazon Rekognition works with a variety of different text in images. We have provided three royalty-free images for you to use. Follow these steps to complete this exercise:
Note
Your results may not be as precise as the ones that we've got here.
You can see the bounding boxes around the image, which signify that Rekognition has recognized the text in the image. The results can be viewed in the Results panel to the right of the image. It did miss one hyphen between N and OUT, but didn't miss the I in IN, which is barely in the picture and slanted:
You can see that Rekognition has recognized the main text in the window of the shop: OCEAN GIFTS SPORTFISHING WHALE WATCHING and the separation between the words:
Even though the results are extremely good, Rekognition can get confused and return spurious results. This is something you should be aware of and watch out for in your own results.
This is another example of a license plate. The results are as follows:
It has done a good job isolating the number plate.
In this exercise, we learned that Amazon Rekognition can pick out text from images, even with different angles, fonts, shadows, and so forth. You should try this feature out with multiple images with different angles, lighting, and sizes.
In this chapter, you learned how to use various features of the Amazon Rekognition service and applied this to images. First, you used the service to recognize objects and scenes in images. Next, you moderated images that might have objectionable content by using Rekognition to recognize the objectionable content in the images.
You were able to analyze faces with Rekognition and were also able to identify their gender, age range, whether they were smiling, and whether they were wearing glasses.
You also recognized celebrities and famous people with the service and compared faces in different images to see whether they were the same. Finally, you were able to extract text that was displayed in images.
With this, we have come to the end of this chapter and this book. We hope it was an interesting journey discovering the enormous capabilities of serverless computing, Amazon AI and ML services, text analysis, image analysis, and so forth.
These types of features would have seemed unbelievable just a few years ago. The nature of machine learning and artificial intelligence is such that immense strides have been made, and will continue to be made in the foreseeable future, in terms of what computers are going to be able to do—and AWS will be able to provide these services for you and your applications.