Downloading and unpacking LSUN data

The LSUN dataset is popular in the ML field and is well suited to GAN training. In particular, it contains several classes that are useful for our task. The following diagram illustrates how the data was labeled:

[Figure: How the LSUN dataset is trained]

There are a few things to note when looking over the dataset. First, it has been labeled by a combination of humans and a network, so there may be some label errors. Second, a class such as church_outdoor, which we will use, covers a wide variety of structures. The GAN will learn a general representation of the structure in these images, but clean results will require tuning the network and its underlying parameters.

So, let's go over how to get data from the LSUN dataset. First, we need to use our newly built Docker image to download the data and unpack it. We'll create a create_data.sh shell script in the scripts folder and give it executable privileges (shown after the steps below). Building this file involves the following four steps:

  1. At the top of the script, you'll need the #!/bin/bash shebang line and the xhost command that we learned about in previous recipes, as follows:
#!/bin/bash
xhost +
  2. The first run command simply uses the download script from the LSUN group to download the data into your data folder. The download is about 3 gigabytes, as follows:
# Download the data into our data folder, then unzip it inside the container
docker run -it \
    --runtime=nvidia \
    --rm \
    -v $HOME/DCGAN/data:/data \
    ch4 bash -c "python lsun/download.py -o /data -c church_outdoor && \
        unzip /data/church_outdoor_train_lmdb.zip -d /data && \
        unzip /data/church_outdoor_val_lmdb.zip -d /data && \
        mkdir /data/church_outdoor_train_lmdb/expanded"
  3. Next, the following run command expands the data into a flat directory structure containing about 126,000 images, which could take a while (see the sanity check after these steps):
# Expand the data into our data folder
docker run -it \
    --runtime=nvidia \
    --rm \
    -v $HOME/DCGAN/data:/data \
    ch4 python lsun/data.py export /data/church_outdoor_train_lmdb \
    --out_dir /data/church_outdoor_train_lmdb/expanded --flat

  4. To run the Python script we'll create in the next recipe, we need to add one more run command to our create_data.sh shell script, which is as follows:
# Save to NPY file
docker run -it \
    --runtime=nvidia \
    --rm \
    -v $HOME/DCGAN/data:/data \
    -v $HOME/DCGAN/src:/src \
    ch4 python3 /src/save_to_npy.py
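
With all four steps in place, the script needs executable privileges before it can be run, as mentioned earlier. The following is a minimal sketch, assuming create_data.sh lives in a scripts folder under $HOME/DCGAN (adjust the path to match your layout):

# Make the script executable, then run it
chmod +x $HOME/DCGAN/scripts/create_data.sh
$HOME/DCGAN/scripts/create_data.sh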
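
Once the script finishes, a quick sanity check confirms that the expansion step produced the expected number of images. This check assumes the data folder sits at $HOME/DCGAN/data, as mounted in the preceding commands:

# Count the expanded images; expect roughly 126,000 files
ls $HOME/DCGAN/data/church_outdoor_train_lmdb/expanded | wc -l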

In the next recipe, we'll complete the data processing by reading the expanded images into an NPY file for use in our GAN.
