Before starting the training process, we need to pre-process the dataset into a simpler format, which we will use later during fine-tuning.
First, we create a Python package named scripts in the project folder. Then, inside it, we create a Python file named convert_oxford_data.py and add the following code:
```python
import os
import tensorflow as tf
from tqdm import tqdm
from scipy.misc import imread, imsave

FLAGS = tf.app.flags.FLAGS

tf.app.flags.DEFINE_string(
    'dataset_dir', 'data/datasets',
    'The location of Oxford IIIT Pet Dataset which contains annotations and images folders'
)

tf.app.flags.DEFINE_string(
    'target_dir', 'data/train_data',
    'The location where all the images will be stored'
)

def ensure_folder_exists(folder_path):
    if not os.path.exists(folder_path):
        os.mkdir(folder_path)
    return folder_path

def read_image(image_path):
    try:
        image = imread(image_path)
        return image
    except IOError:
        print(image_path, "not readable")
        return None
```
In this code, we use tf.app.flags.FLAGS to parse arguments so that we can customize the script easily. We also create two helper methods to make a directory and read images.
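For instance, ensure_folder_exists returns the path it received, so it can be used inline wherever a directory must exist before writing to it. A minimal standalone sketch, using a temporary directory so nothing real is touched:

```python
import os
import tempfile

# Re-stating the helper from the script so this sketch is self-contained
def ensure_folder_exists(folder_path):
    if not os.path.exists(folder_path):
        os.mkdir(folder_path)
    return folder_path

# Using a temporary directory so the example does not touch real data
base = tempfile.mkdtemp()
target = ensure_folder_exists(os.path.join(base, "trainval"))
print(os.path.isdir(target))  # → True
# Calling it again is a no-op and still returns the same path
print(ensure_folder_exists(target) == target)  # → True
```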
Next, we add the following code to convert the Oxford dataset into our preferred format:
```python
def convert_data(split_name, save_label=False):
    if split_name not in ["trainval", "test"]:
        raise ValueError("split_name is not recognized!")

    target_split_path = ensure_folder_exists(os.path.join(FLAGS.target_dir, split_name))
    output_file = open(os.path.join(FLAGS.target_dir, split_name + ".txt"), "w")

    image_folder = os.path.join(FLAGS.dataset_dir, "images")
    anno_folder = os.path.join(FLAGS.dataset_dir, "annotations")

    list_data = [line.strip() for line in open(anno_folder + "/" + split_name + ".txt")]

    class_name_idx_map = dict()
    for data in tqdm(list_data, desc=split_name):
        file_name, class_index, species, breed_id = data.split(" ")
        file_label = int(class_index) - 1
        class_name = "_".join(file_name.split("_")[0:-1])
        class_name_idx_map[class_name] = file_label

        image_path = os.path.join(image_folder, file_name + ".jpg")
        image = read_image(image_path)
        if image is not None:
            target_class_dir = ensure_folder_exists(os.path.join(target_split_path, class_name))
            target_image_path = os.path.join(target_class_dir, file_name + ".jpg")
            imsave(target_image_path, image)
            output_file.write("%s %s\n" % (file_label, target_image_path))

    if save_label:
        label_file = open(os.path.join(FLAGS.target_dir, "labels.txt"), "w")
        for class_name in sorted(class_name_idx_map, key=class_name_idx_map.get):
            label_file.write("%s\n" % class_name)

def main(_):
    if not FLAGS.dataset_dir:
        raise ValueError("You must supply the dataset directory with --dataset_dir")
    ensure_folder_exists(FLAGS.target_dir)
    convert_data("trainval", save_label=True)
    convert_data("test")

if __name__ == "__main__":
    tf.app.run()
```
Now, we can run the script with the following command:

```
python scripts/convert_oxford_data.py --dataset_dir data/datasets/ --target_dir data/train_data
```
The script reads the Oxford-IIIT Pet dataset's ground-truth data and creates a new dataset under data/train_data with the following structure:
- train_data
-- trainval.txt
-- test.txt
-- labels.txt
-- trainval
---- Abyssinian
---- ...
-- test
---- Abyssinian
---- ...
Let's discuss these a bit:
- labels.txt contains a list of the 37 pet breeds (classes) in our dataset.
- trainval.txt contains a list of the images that we will use in the training process, with the format <class_id> <image_path>.
- test.txt contains a list of the images that we will use to check the accuracy of the model. The format of test.txt is the same as trainval.txt.
- The trainval and test folders each contain 37 sub-folders, one named after each class, holding all the images of that class.
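As a quick sanity check, a line of trainval.txt can be split back into its label and image path. The path below is hypothetical, shown only to illustrate the format:

```python
# A hypothetical line from the generated trainval.txt: <class_id> <image_path>
line = "0 data/train_data/trainval/Abyssinian/Abyssinian_100.jpg"

class_id, image_path = line.split(" ", 1)  # split only on the first space
print(int(class_id))  # → 0
print(image_path)     # → data/train_data/trainval/Abyssinian/Abyssinian_100.jpg
```

Splitting on the first space only keeps the parsing robust even if a path were to contain spaces.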