You can download the ROS image recognition package from GitHub; it's also available in the section's code bundle. The image_recognition.py program can publish detected results in the /result topic, which is of the std_msgs/String type and is subscribed to image data from the ROS camera driver from the /image (sensor_msgs/Image) topic.
So how does image_recognition.py work?
First take a look at the main modules imported to this node. As you know, rospy has ROS Python APIs. The ROS camera driver publishes ROS image messages, so here we have to import Image messages from sensor_msgs to handle those image messages. To convert a ROS image to the OpenCV data type and vice versa, we need cv_bridge and, of course, the numpy, tensorflow, and tensorflow imagenet modules to classify of images and download the Inception-v3 model from tensorflow.org. Here are the imports:
import rospy from sensor_msgs.msg import Image from std_msgs.msg import String from cv_bridge import CvBridge import cv2 import numpy as np import tensorflow as tf from tensorflow.models.image.imagenet import classify_image
The following code snippet is the constructor for a class called RosTensorFlow():
class RosTensorFlow(): def __init__(self):
The constructor call has the API for downloading the trained Inception-v3 model from tensorflow.org:
classify_image.maybe_download_and_extract()
Now, we are creating a TensorFlow Session() object, then creating a graph from a saved GraphDef file, and returning a handle for it. The GraphDef file is available in the code bundle.
self._session = tf.Session() classify_image.create_graph()
This line creates a cv_bridge object for the ROS-OpenCV image conversion:
self._cv_bridge = CvBridge()
Here are the subscriber and publisher handles of the node:
self._sub = rospy.Subscriber('image', Image, self.callback, queue_size=1) self._pub = rospy.Publisher('result', String, queue_size=1)
Here are some parameters used for recognition thresholding and the number of top predictions:
self.score_threshold = rospy.get_param('~score_threshold', 0.1) self.use_top_k = rospy.get_param('~use_top_k', 5)
Here is the image call back in which a ROS image message is converted to OpenCV data type:
def callback(self, image_msg): cv_image = self._cv_bridge.imgmsg_to_cv2(image_msg, "bgr8") image_data = cv2.imencode('.jpg', cv_image)[1].tostring()
The following code runs the softmax tensor by feeding image_data as input to the graph. The 'softmax:0' part is a tensor containing the normalized prediction across 1,000 labels.
softmax_tensor = self._session.graph.get_tensor_by_name('softmax:0')
The 'DecodeJpeg/contents:0' line is a tensor containing a string providing JPEG encoding of the image:
predictions = self._session.run( softmax_tensor, {'DecodeJpeg/contents:0': image_data}) predictions = np.squeeze(predictions)
The following section of code will look for a matching object string and its probability and publish it through the topic called /result:
node_lookup = classify_image.NodeLookup() top_k = predictions.argsort()[-self.use_top_k:][::-1] for node_id in top_k: human_string = node_lookup.id_to_string(node_id) score = predictions[node_id] if score > self.score_threshold: rospy.loginfo('%s (score = %.5f)' % (human_string, score)) self._pub.publish(human_string)
The following is the main code of this node. It simply initializes the class and calls the main() method inside the RosTensorFlow() object. The main method will spin() the node and execute a callback whenever an image comes into the /image topic.
def main(self): rospy.spin() if __name__ == '__main__': rospy.init_node('rostensorflow') tensor = RosTensorFlow() tensor.main()