Introduction to data analytics

Data analytics is the process of applying qualitative and quantitative techniques to examine data with the goal of providing valuable insights. Using various techniques and concepts, data analytics provides the means both to explore the data, known as Exploratory Data Analysis (EDA), and to draw conclusions about the data, known as Confirmatory Data Analysis (CDA). EDA and CDA are fundamental concepts of data analytics, and it is important to understand the difference between the two.

EDA involves methodologies, tools, and techniques used to explore data with the intention of finding patterns in the data and relationships between various elements of the data. CDA involves methodologies, tools, and techniques used to provide an insight or conclusion on a specific question based on a hypothesis and statistical techniques or simple observation of the data.

A quick example to understand these ideas is that of a grocery store, which has asked you to give them ways to improve sales and customer satisfaction as well as keep the cost of operations low.

The following is a grocery store with aisles of various products:

Assume that all sales at the grocery store are stored in some database and that you have access to the data for the last 3 months. Typically, businesses store data for years, as you need sufficient data over a period of time to establish any hypothesis or observe any patterns. In this example, our goal is to place products better in the various aisles based on how customers buy them. One hypothesis is that customers often buy products that are both at eye level and close together. For instance, if Milk is in one corner of the store and Yogurt is in the opposite corner, some customers might choose just Milk or just Yogurt and leave the store, causing a loss of business. A more adverse effect is that customers might choose another store where products are better placed, because of the feeling that things are hard to find at this store. Once that feeling sets in, it also percolates to friends and family, eventually giving the store a bad social presence. This phenomenon is not uncommon in the real world: some businesses succeed while others fail, even though both seem very similar in products and prices.

There are many ways to approach this problem, ranging from customer surveys to professional statisticians to machine learning scientists. Our approach will be to understand what we can from the sales transactions alone.

The following is an example of what the transactions might look like:

The following are the steps you could follow as part of EDA:

  1. Calculate the average number of products bought per receipt for a day = total of all products sold in the day / total number of receipts for the day.
  2. Repeat the preceding step for the last week, month, and quarter.
  3. Try to understand if there is a difference between weekends and weekdays, and also between times of the day (morning, noon, and evening).
  4. For each product, create a list of all other products to see which products are usually bought together (same receipt).
  5. Repeat the preceding step for 1 day, 1 week, 1 month, and 1 quarter.
  6. Try to determine which products should be placed closer together by the number of transactions (sorted in descending order).
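The first of the steps above can be sketched in a few lines of Python. This is only a minimal illustration: the row layout (receipt ID, date, product, quantity), the sample values, and the function name `average_products_per_receipt` are assumptions made for the example, not the store's actual schema.

```python
from collections import defaultdict
from datetime import date

# Hypothetical transaction rows: (receipt_id, purchase_date, product, quantity).
transactions = [
    (1, date(2024, 1, 6), "Milk", 2),
    (1, date(2024, 1, 6), "Eggs", 1),
    (2, date(2024, 1, 6), "Bread", 1),
    (3, date(2024, 1, 8), "Milk", 1),
    (3, date(2024, 1, 8), "Yogurt", 3),
]


def average_products_per_receipt(rows):
    """Return {day: total products sold / number of receipts} for each day."""
    products_sold = defaultdict(int)  # day -> total quantity sold
    receipts = defaultdict(set)       # day -> distinct receipt IDs
    for receipt_id, day, _product, quantity in rows:
        products_sold[day] += quantity
        receipts[day].add(receipt_id)
    return {day: products_sold[day] / len(receipts[day]) for day in products_sold}


daily_average = average_products_per_receipt(transactions)
for day, average in sorted(daily_average.items()):
    print(day, average)
```

The same function could be applied to the rows of a week, month, or quarter by filtering `transactions` on the date before calling it.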

Once we have completed the preceding 6 steps, we can try to reach some conclusions for CDA.

Let's assume this is the output we get:

Item     Day Of Week    Quantity
Milk     Sunday         1244
Bread    Monday         245
Milk     Monday         190
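A table of this shape could be derived from raw sales rows by summing quantities per item and day of the week. The sketch below assumes a hypothetical row layout (date, product, quantity) and uses made-up sample values; it does not reproduce the figures in the table.

```python
from collections import Counter
from datetime import date

# date.weekday() returns 0 for Monday through 6 for Sunday.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")

# Hypothetical sales rows: (purchase_date, product, quantity).
sales = [
    (date(2024, 1, 7), "Milk", 3),   # a Sunday
    (date(2024, 1, 14), "Milk", 2),  # a Sunday
    (date(2024, 1, 8), "Bread", 4),  # a Monday
    (date(2024, 1, 8), "Milk", 1),   # a Monday
]


def quantity_by_item_and_weekday(rows):
    """Sum quantities for every (product, weekday name) combination."""
    totals = Counter()
    for day, product, quantity in rows:
        totals[(product, WEEKDAYS[day.weekday()])] += quantity
    return totals


totals = quantity_by_item_and_weekday(sales)
# Print the combinations with the largest quantities first.
for (product, weekday), quantity in totals.most_common():
    print(product, weekday, quantity)
```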

 

In this case, we could state that Milk is bought more on weekends, so it's better to increase the quantity and variety of Milk products over weekends. Take a look at the following table:

Item1     Item2       Quantity
Milk      Eggs        360
Bread     Cheese      335
Onions    Tomatoes    310

 

In this case, we could state that Milk and Eggs are bought together in one purchase by more customers than any other pair, followed by Bread and Cheese. So, we could recommend that the store realign the aisles and shelves to move Milk and Eggs closer to each other.
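Pair counts like these could be obtained by counting, for every receipt, each pair of products that appears on it. The following is a minimal sketch under assumed inputs: the `receipts` mapping and its contents are hypothetical and chosen only to show the technique.

```python
from collections import Counter
from itertools import combinations

# Hypothetical receipts: receipt_id -> set of products on that receipt.
receipts = {
    1: {"Milk", "Eggs", "Bread"},
    2: {"Milk", "Eggs"},
    3: {"Bread", "Cheese"},
    4: {"Milk", "Eggs", "Cheese"},
}


def count_product_pairs(receipts_by_id):
    """Count how many receipts contain each unordered pair of products."""
    pair_counts = Counter()
    for products in receipts_by_id.values():
        # Sorting makes ("Eggs", "Milk") and ("Milk", "Eggs") the same key.
        for pair in combinations(sorted(products), 2):
            pair_counts[pair] += 1
    return pair_counts


pair_counts = count_product_pairs(receipts)
# Pairs sorted in descending order of the number of shared receipts.
top_pairs = pair_counts.most_common(3)
print(top_pairs)
```

Counting receipts rather than item quantities keeps one large purchase from dominating the ranking. Which of the two is more useful for shelf placement is a judgment call.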

The two conclusions we have are:

  • Milk is bought more on weekends, so it's better to increase the quantity and variety of Milk products over weekends.
  • Milk and Eggs are bought together in one purchase by more customers than any other pair, followed by Bread and Cheese. So, we could recommend that the store realign the aisles and shelves to move Milk and Eggs closer to each other.

Conclusions are usually tracked over a period of time to evaluate the gains. If there is no significant impact on sales even after adopting the preceding two recommendations for 6 months, we have simply invested in recommendations that did not give a good Return On Investment (ROI).

Similarly, you can also perform some analysis with respect to profit margins and pricing optimization. This is why you will typically see a single item costing more than the per-item price when several of the same item are bought together: for example, one bottle of Shampoo for $7 or two bottles for $12.
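The shampoo arithmetic can be spelled out as a quick check. The helper name `unit_price` is an assumption introduced for this example only.

```python
def unit_price(total_price, quantity):
    """Price paid per item when `quantity` items cost `total_price`."""
    return total_price / quantity


single = unit_price(7.00, 1)    # one bottle for $7
bundle = unit_price(12.00, 2)   # two bottles for $12
saving_per_bottle = single - bundle
discount_pct = 100 * saving_per_bottle / single

print(f"Per bottle: ${single:.2f} alone, ${bundle:.2f} in the bundle")
print(f"Saving: ${saving_per_bottle:.2f} per bottle ({discount_pct:.1f}%)")
```

With these prices, the bundle works out to $6 per bottle, a saving of $1 (about 14%) on each one.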

Think about other aspects you can explore and recommend for the grocery store. For example, can you guess which products to position near the checkout registers based just on the fact that they have no affinity toward any particular product (chewing gum, magazines, and so on)?

Data analytics initiatives support a wide variety of business uses. For example, banks and credit card companies analyze withdrawal and spending patterns to prevent fraud and identity theft. Advertising companies analyze website traffic to identify prospects with a high likelihood of converting to customers. Department stores analyze customer data to figure out if better discounts will help boost sales. Cell phone operators can figure out pricing strategies. Cable companies are constantly looking for customers who are likely to churn unless given some offer or promotional rate to retain them. Hospitals and pharmaceutical companies analyze data to come up with better products, detect problems with prescription drugs, and measure the performance of prescription drugs.
