Visualizing the characters in an optical character recognition database

We will now look at how to use neural networks to perform optical character recognition. This refers to the process of identifying handwritten characters in images. We will use the dataset available at http://ai.stanford.edu/~btaskar/ocr. The default file name after downloading is letter.data. To start with, let's see how to interact with the data and visualize it.

How to do it…

  1. Create a new Python file, and import the following packages:
    import os
    import sys
    
    import cv2
    import numpy as np
  2. Define the input file name:
    # Define the input file
    input_file = 'letter.data'
  3. Define visualization parameters:
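    # Define visualization parameters
    scaling_factor = 10   # enlarge each image 10x for display
    start_index = 6       # pixel values start at the seventh tab-separated field
    end_index = -1        # drop the last field of each line
    h, w = 16, 8          # each character image is 16 pixels high and 8 pixels wide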
  4. Loop through the lines of the file. For each line, split it on tabs, keep the pixel fields, and scale the values by 255:
    # Loop through the file, one character per line
    with open(input_file, 'r') as f:
        for line in f:
            data = np.array([255*float(x) for x in line.split('\t')[start_index:end_index]])
  5. Reshape the array into the required shape, resize it, and display it:
            img = np.reshape(data, (h,w))
            img_scaled = cv2.resize(img, None, fx=scaling_factor, fy=scaling_factor)
            cv2.imshow('Image', img_scaled)
  6. Wait for a key press. If the user presses Esc (key code 27), break out of the loop; any other key moves on to the next character:
            c = cv2.waitKey()
            if c == 27:
                break
  7. The full code is in the visualize_characters.py file that's already provided to you. If you run this code, you will see a window displaying characters. For example, o looks like the following:
    [Image: the character o]

    The character i looks like the following:

    [Image: the character i]
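
If no display is available, the parsing in steps 3 to 5 can be checked without OpenCV. The following is a minimal sketch, not part of visualize_characters.py: the helper names parse_pixels and to_ascii are hypothetical, and the sample record is synthetic rather than taken from letter.data. It assumes the line layout implied by the slicing above, that is, six leading fields, then 16 x 8 = 128 pixel values, then one trailing field.

```python
# Sketch: preview one record as text, without OpenCV or a display.
# Assumes the layout implied by the recipe's slicing: six leading
# fields, then h*w pixel values, then one trailing field.

H, W = 16, 8          # same h, w as in the recipe
START, END = 6, -1    # same start_index, end_index as in the recipe


def parse_pixels(line, h=H, w=W):
    """Return the pixel fields of one tab-separated line as h rows of w ints."""
    fields = line.split('\t')[START:END]
    if len(fields) != h * w:
        raise ValueError('expected %d pixel fields, got %d' % (h * w, len(fields)))
    values = [int(float(x)) for x in fields]
    # Cut the flat list into h rows, matching np.reshape(data, (h, w))
    return [values[r * w:(r + 1) * w] for r in range(h)]


def to_ascii(rows):
    """Render nonzero pixels as '#' and zero pixels as '.'."""
    return '\n'.join(''.join('#' if v else '.' for v in row) for row in rows)


if __name__ == '__main__':
    # Synthetic record (not from letter.data): a vertical bar in column 3
    pixels = ['1' if i % W == 3 else '0' for i in range(H * W)]
    meta = ['1', 'i', '2', '1', '1', '0']  # placeholder values for the six leading fields
    line = '\t'.join(meta + pixels) + '\t\n'
    print(to_ascii(parse_pixels(line)))
```

To try it on the real data, pass a line read from letter.data to parse_pixels instead of the synthetic one. The length check raises an error early if the file does not match the assumed layout.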