We will now look at how to use neural networks to perform optical character recognition (OCR). This refers to the process of identifying handwritten characters in images. We will use the dataset available at http://ai.stanford.edu/~btaskar/ocr. The default file name after downloading is letter.data. To start with, let's see how to interact with the data and visualize it.
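Before opening a window, it can help to see how a single record breaks down. The following is a minimal sketch, assuming each line of letter.data is tab-separated, that the character label is the second field, and that the 128 pixel values (a 16x8 bitmap) start at field index 6 and are followed by a trailing tab. The `parse_record` helper and the synthetic `sample_line` are hypothetical names used only for illustration; they are not part of the provided code.

```python
import numpy as np

# Assumed layout of one record (an assumption, not confirmed by the text):
# 6 tab-separated metadata fields, 128 binary pixels, then a trailing tab.
START_INDEX = 6
END_INDEX = -1
H, W = 16, 8


def parse_record(line):
    """Split one record into (label, 16x8 array of 0/1 pixel values)."""
    fields = line.split('\t')
    label = fields[1]  # assumption: the letter is stored in the second field
    pixels = np.array([float(x) for x in fields[START_INDEX:END_INDEX]])
    return label, pixels.reshape(H, W)


# Synthetic record for illustration only; real lines come from letter.data.
sample_pixels = ['1' if i % 9 == 0 else '0' for i in range(H * W)]
sample_line = '\t'.join(['1', 'o', '2', '1', '1', '0'] + sample_pixels) + '\t\n'

label, img = parse_record(sample_line)
print(label, img.shape)
```

Keeping the label alongside the pixels is a design choice made here for convenience: the visualization only needs the bitmap, but the label is what a classifier would later be trained to predict.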
import os
import sys

import cv2
import numpy as np

# Load input data
input_file = 'letter.data'

# Define visualization parameters
scaling_factor = 10
start_index = 6
end_index = -1
h, w = 16, 8

# Loop until you encounter the Esc key
with open(input_file, 'r') as f:
    for line in f.readlines():
        # Fields are tab-separated; skip the leading metadata fields and the trailing field
        data = np.array([255*float(x) for x in line.split('\t')[start_index:end_index]])

        # Reshape the 128 values into a 16x8 image and enlarge it for display
        img = np.reshape(data, (h,w))
        img_scaled = cv2.resize(img, None, fx=scaling_factor, fy=scaling_factor)
        cv2.imshow('Image', img_scaled)

        # Wait for a key press; 27 is the key code for Esc
        c = cv2.waitKey()
        if c == 27:
            break
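The full code is in the visualize_characters.py file that's already provided to you. If you run this code, you will see a window displaying characters one at a time; press any key to move to the next character, or Esc to exit. For example, o looks like the following:

[Image: the character o]

The character i looks like the following:

[Image: the character i]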
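If no display is available for cv2.imshow, a character can also be inspected as text. The following is a rough sketch rather than part of the provided code: it assumes the same 16x8 layout and uses a hypothetical `to_ascii` helper, with a hand-made bitmap standing in for a real record.

```python
H, W = 16, 8  # bitmap height and width, as in the visualization code


def to_ascii(pixels, h=H, w=W):
    """Render a flat sequence of h*w pixel values as rows of '#' and '.'."""
    if len(pixels) != h * w:
        raise ValueError('expected %d pixel values, got %d' % (h * w, len(pixels)))
    rows = []
    for r in range(h):
        row = pixels[r * w:(r + 1) * w]
        # Any value above zero counts as an "on" pixel
        rows.append(''.join('#' if float(v) > 0 else '.' for v in row))
    return '\n'.join(rows)


# Hand-made stand-in bitmap (a vertical bar), not taken from letter.data.
bar = [1 if c in (3, 4) else 0 for r in range(H) for c in range(W)]
print(to_ascii(bar))
```

The same helper could be fed the sliced fields of a real line from letter.data, since it converts each value with `float` before testing it.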