Getting to Know Your Data - Exploring IoT Data

"Yes, I know the three-month average is 45.2, but what does it mean? We have over 2 terabytes of data now. But what does it tell us?"

Your boss is in one of his moods again. He is peppering you with questions, and you are not sure how to answer. Your first instinct was to spit back numbers from the weekly reports, but that obviously did not work.

"Is the data even any good?" he continues. "Are we building up value or just cluttering up the basement with junk?"

You start to shrug but wisely stop yourself. You think the data is very valuable, but you actually do not know much about the dataset beyond the numbers that are included in the reports you have been asked to develop. You realize you are not even sure what the individual data records look like. You know that averages can hide a lot of things, but no one has ever asked you to look any deeper.

You straighten your shoulders and say confidently, "There is value in the data. We are just not looking at all aspects of it. I will show you there is more to it."

You do not feel confident at all. You are not sure how you are going to show where business value is buried just waiting to be discovered. But you hide it well.

Your boss lets out a harrumph-like noise, clearly not really buying what you are selling. He turns around and walks off down the hallway muttering to himself.

Data is the sword of the twenty-first century, those who wield it well, the samurai.
-Eric Schmidt and Jonathan Rosenberg in How Google Works

This chapter focuses on exploratory data analysis for IoT data. You will learn how to ask questions of the data and how to answer them. The first part of the chapter covers understanding data quality. Then, we will move on to getting to know the data better and what it represents. The examples use Tableau and R. You will learn strategies for quickly assessing data and starting the hunt for value.

This chapter covers the following topics:

  • Exploring and visualizing data
  • A quick Tableau overview:
    • Techniques to understand data quality
    • Basic time series analysis
    • Getting to know categories in the data
    • Analyzing with geography
  • Looking for attributes that have predictive value
  • Using R to augment visualization tools
  • Industry-specific examples:
    • Manufacturing
    • Health care
    • Retail