Visualizing sample images from the dataset

Data cleaning and exploratory data analysis (EDA) are indispensable components of data science. Before we begin analyzing our data, it is important to understand some basic properties of the input. The dataset we are using comprises standardized images with regular shapes and normalized pixel values. The features are simple, thin lines, and our goal is equally straightforward: to recognize digits from images. In real-world practice, however, problems are often more complicated, and the data we collect is raw and usually much more heterogeneous. Before tackling such a problem, it is usually worth the time to sample a small amount of input data for inspection. Imagine training a model to recognize Ramen, just to get you drooling ;). You would probably look at some images first to decide which features make an input sample a good example of the presence of the bowl. Sampling is useful beyond the initial preparatory phase, too: during model building, pulling out some of the mislabeled or misclassified samples for examination may help us devise strategies for optimization.

In case you wonder where the Ramen idea comes from, a data scientist named Kenji Doi created a model to recognize which restaurant branch a bowl of Ramen was made in. You can read more in the Google Cloud Big Data and Machine Learning Blog post at https://cloud.google.com/blog/big-data/2018/03/automl-vision-in-action-from-ramen-to-branded-goods.
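The two kinds of sampling discussed in this section, drawing a few inputs per class before modeling and pulling out wrongly predicted samples afterwards, can be sketched roughly as follows. This is only an illustrative sketch, not the code used in this book: the helper names (pick_samples, misclassified, show_images) are hypothetical, the labels are assumed to be integer class IDs, the images are assumed to be an indexable collection of 2D arrays, and matplotlib is assumed to be installed for the plotting step.

```python
import random
from collections import defaultdict


def pick_samples(labels, per_class=3, seed=0):
    """Return {label: [indices]} with up to `per_class` random indices per label."""
    by_label = defaultdict(list)
    for idx, label in enumerate(labels):
        by_label[int(label)].append(idx)
    rng = random.Random(seed)  # fixed seed so the same samples come back each run
    return {
        label: sorted(rng.sample(idxs, min(per_class, len(idxs))))
        for label, idxs in sorted(by_label.items())
    }


def misclassified(y_true, y_pred):
    """Return the indices where the predicted label differs from the true label."""
    return [i for i, (t, p) in enumerate(zip(y_true, y_pred)) if int(t) != int(p)]


def show_images(images, labels, indices, ncols=5):
    """Plot the chosen images in a grid, titled with their labels."""
    if not indices:
        return
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    nrows = -(-len(indices) // ncols)  # ceiling division
    fig, axes = plt.subplots(
        nrows, ncols, figsize=(ncols * 1.5, nrows * 1.5), squeeze=False
    )
    for ax in axes.ravel():
        ax.axis("off")  # hide ticks, including on unused grid cells
    for ax, idx in zip(axes.ravel(), indices):
        ax.imshow(images[idx], cmap="gray")
        ax.set_title(str(labels[idx]))
    plt.tight_layout()
    plt.show()
```

With a digit dataset loaded into hypothetical variables images and labels, one might call picked = pick_samples(labels) and then show_images(images, labels, picked[7]) to inspect a few sevens. After training, show_images(images, labels, misclassified(labels, predictions)[:10]) would display the first ten errors for a closer look.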