© Dr. Umesh R. Hodeghatta and Umesha Nayak 2017

Umesh R. Hodeghatta and Umesh Nayak, Business Analytics Using R - A Practical Approach, 10.1007/978-1-4842-2514-1_1

1. Overview of Business Analytics

Umesh R. Hodeghatta and Umesh Nayak (1)

(1) Bangalore, Karnataka, India

Today’s world is knowledge based. In the earliest days, knowledge was gathered through observation. Later, knowledge not only was gathered through observation, but also confirmed by actually doing and then extended by experimenting further. Knowledge thus gathered was applied to practical fields and extended by analogy to other fields. Today, knowledge is gathered and applied by analyzing, or deep-diving, into the data accumulated through various computer applications, web sites, and more. The advent of computers complemented the knowledge of statistics, mathematics, and programming. The enormous storage and extended computing capabilities of the cloud, especially, have ensured that knowledge can be quickly derived from huge amounts of data and also can be used for further preventive or productive purposes. This chapter provides you with the basic knowledge of where and how business analytics is used.

Imagine the following situations:

  • You visit a hotel in Switzerland and are welcomed with your favorite drink and dish; how delighted you are!

  • You are offered a stay at a significantly discounted rate at your favorite hotel when you travel to your favorite destination.

  • You are forewarned about the high probability of becoming a diabetic. You are convinced about the reasoning behind this warning and take the right steps to avoid it.

  • You are forewarned of a probable riot at your planned travel destination. Based on this warning, you cancel the visit; you later learn from news reports that a riot did happen at that destination!

  • You are forewarned of an incompatibility with the person whom you consider making your life partner, based on both of your personal characteristics; you avoid a possible divorce!

  • You enter a grocery store and you find that your regular monthly purchases are already selected and set aside for you. The only decision you have to make is whether you require all of them or want to remove some from the list. How happy you are!

  • Your preferred airline reserves tickets for you well in advance of your vacation travels and at a lower rate compared to the market rate.

  • You are planning to travel and are forewarned of a possible cyclone in that place. Based on that warning, you postpone your visit. Later, you find that the cyclone created havoc, and you avoided a terrible situation.

We can imagine many similar scenarios that are made possible by analyzing data about you and your activities that is collected through various means—including your Google searches, visits to various web sites, your comments on social media sites, your activities using various computer applications, and more. The use of data analytics in these scenarios has focused on your individual perspective.

Now, let’s look at scenarios from a business perspective. Imagine these situations:

  • You are in the hotel business and are able to provide competitive yet profitable rates to your prospective customers. At the same time, you can ensure that your hotel is completely occupied all the time by providing additional benefits, including discounts on local travel and local sightseeing offers tied into other local vendors.

  • You are in the taxi business and are able to repeatedly attract the same customers based on their earlier travel history and preferences of taxi type and driver.

  • You are in the fast-food business and offer discounted rates to attract customers on slow days. These discounts enable you to ensure full occupancy on those days also.

  • You are in the human resources (HR) department of an organization and are bogged down by high attrition. But now you are able to understand the types of people you should focus on recruiting, based on the characteristics of those who perform well and who are more loyal and committed to the organization.

  • You are in the airline business, and based on data collected by the engine system, you are warned of a potential engine failure in the next three months. You proactively take steps to carry out the necessary corrective actions.

  • You are in the business of designing, manufacturing, and selling medical equipment used by hospitals. You are able to understand the possibility of equipment failure well before the equipment actually fails, by carrying out analysis of the errors or warnings captured in the equipment logs.

All these scenarios are possible by analyzing data that the businesses and others collect from various sources. There are many such possible scenarios. The application of data analytics to the field of business is called business analytics.

You have likely observed the following scenarios:

  • You’ve been searching for the past few days on Google for adventurous places to visit. You’ve also tried to find various travel packages that might be available. You suddenly find that other web sites you visit, or the searches you make, show advertisements for exactly what you are looking for, and at a discounted rate too.

  • You’ve been searching for a specific item to purchase on Amazon (or any other site). Suddenly, on other sites you visit, you find advertisements related to what you are looking for or find customized mail landing in your mailbox, offering discounts along with other items you might be interested in.

  • You’ve also seen recommendations that Amazon makes based on your earlier searches for items, your wish list, or previous Amazon purchases. Many times you’ve also likely observed Amazon offering you discounts or promoting products based on its available data.

All of these possibilities are now a reality because of data analytics specifically used by businesses. This book takes you through the exciting field of business analytics and enables you to step into this field as well.

1.1 Objectives of This Book

Many professionals are becoming interested in learning analytics, but not all of them have rich statistical or mathematical backgrounds. This book is the right place for techies as well as those who are not so techie to get started with business analytics. You’ll start with a hands-on introduction to R for beginners. You’ll also learn about predictive modeling and big data, which form key parts of business analytics. This is an introductory book in the field of business analytics using R.

The following are some of the advantages of this book:

  • This book covers both R programming and analytics using numerous real-life examples.

  • It offers the right mix of theory and hands-on labs. The concepts are explained using business scenarios or case studies where required.

  • It is written by industry professionals who are currently working in the field of analytics on real-life problems for paying customers.

This book provides the following:

  • Practical insights into the use of data that has been collected, collated, purchased, or available for free from government sources or others. These insights are attained via computer programming, statistical and mathematical knowledge, and expertise in relevant fields that enable you to understand the data and arrive at predictive capabilities.

  • Information on the effective use of various techniques related to business analytics.

  • Explanations of how to effectively use the programming platform R for business analytics.

  • Practical cases and examples that enable you to apply what you learn from this book.

  • The dos and don’ts of business analytics.

This book does not do the following:

  • Deliberate on the definitions of various terms related to analytics, which can be confusing

  • Elaborate on the fundamentals behind any statistical or mathematical technique or particular algorithm beyond certain limits

  • Provide a repository of all the techniques or algorithms used in the field of business analytics (but does explore many of them)

1.2 Confusing Terminology

Many terms are used in discussions of this topic, such as data analytics, business analytics, big data analytics, and data science. Most of these are, in a sense, the same. However, the purpose of the analytics, the extent of the data that’s available for analysis, and the difficulty of the data analysis may vary from one to the other. Regardless of the differences in terminology, we need to know how to use the data effectively for our businesses. These differences in terminology should not get in the way of applying techniques to the data (especially in analyzing it and using it for various purposes, including understanding it, deriving models from it, and then using these models for predictive purposes).

In layman’s terms, let’s look at some of this terminology:

  • Data analytics is the analysis of data, whether huge or small, in order to understand it and see how to use the knowledge hidden within it. An example is the analysis of the data related to various classes of travelers (as noted previously).

  • Business analytics is the application of data analytics to business. An example is offering specific discounts to different classes of travelers based on the amount of business they offer or have the potential to offer.

  • Data science is an interdisciplinary field (including disciplines such as statistics, mathematics, and computer programming) that derives knowledge from data and applies it for predictive or other purposes. Expertise about underlying processes, systems, and algorithms is used. An example is the application of t-values and p-values from statistics in identifying significant model parameters in a regression equation.

  • Big data analytics is the analysis of huge amounts of data (for example, trillions of records) or the analysis of difficult-to-crack problems. Usually, this requires a huge amount of storage and/or computing capability. This analysis requires enormous amounts of memory to hold the data, a huge number of processors, and high-speed processing to crunch the data and get its essence. An example is the analysis of geospatial data captured by satellite to identify weather patterns and make related predictions.

1.3 Drivers for Business Analytics

The following are the growth drivers for business analytics:

  • Increasing numbers of relevant computer packages and applications. One example is the R programming environment with its various data sets, documentation on its packages, and ready-made algorithms.

  • Feasibility of consolidating related and relevant data from various sources and of various types (data from flat files, data from relational databases, data from log files, data from Twitter messages, and more). An example is the consolidation of information from data files in a Microsoft SQL Server database with data from a Twitter message stream.

  • Growth of seemingly infinite storage and computing capabilities by clustering multiple computers and extending these capabilities via the cloud. An example is the use of Apache Hadoop clusters to distribute and analyze huge amounts of data.

  • Availability of many easy-to-use programming tools, platforms, and frameworks (such as R and Hadoop).

  • Emergence of many algorithms and tools to effectively use statistical and mathematical concepts for business analysis. One example is the k-means algorithm used for partition clustering analysis.

  • The need for business survival and growth techniques in our highly competitive world. The highly competitive nature of business requires each company to deep-dive into data in order to understand customer behavior patterns and take advantage of them.

  • Business complexity arising from globalization. An economic or political situation in a particular country can affect the sales in that country, for example.

A note of caution here: not all business problems require complicated analytics. Some may be easy to understand and to solve by using simple techniques such as visual depiction of data.

Now let’s discuss each of these drivers for business analytics in more detail.

1.3.1 Growth of Computer Packages and Applications

Computer packages and applications have completely flooded modern life. This is true at both an individual and business level. This is especially true with our extensive use of smartphones, which enable the following:

  • Communication with others through e-mail packages

  • Activities in social media and blogs

  • Business communications through e-mail, instant messaging, and other tools

  • Day-to-day searches for information and news through search engines

  • Recording of individual and business financial transactions through accounting packages

  • Recording of our travel details via online ticket-booking web sites or apps

  • Recording of our various purchases in e-commerce web sites

  • Recording our daily exercise routines, calories burned, and diets through various applications

We are surrounded by many computer packages and applications that collect a lot of data about us. This data is used by businesses to make them more competitive, attract more business, and retain and grow their customer base. With thousands of apps on platforms such as Android, iOS, and Windows, the capture of data encompasses nearly all the activities carried out by individuals across the globe (who are the consumers for most of the products and services). This has been enabled further by the reach of hardware devices such as computers, laptops, mobile phones, and smartphones even to remote places.

1.3.2 Feasibility to Consolidate Data from Various Sources

Technology has grown by leaps and bounds over the last few years. It is now easy for us to convert data from one format to another and to consolidate it into a required format. The growth of technology coupled with almost unlimited storage capability has enabled us to consolidate related or relevant data from various sources—right from flat files, to database data, to data in various formats. This ability to consolidate data from various sources has provided a great deal of momentum to effective business analysis.
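
As a small illustration of such consolidation, the following sketch combines a flat file with a relational table. It is written in Python using only the standard library. The column names, the `orders` table, and the helper `consolidate` are all hypothetical, and an in-memory SQLite database stands in for a production database server.

```python
import csv
import io
import sqlite3

# Hypothetical flat-file export (normally read from disk with open(...)).
flat_file = io.StringIO(
    "customer_id,city\n"
    "1,Bangalore\n"
    "2,Zurich\n"
)

# Hypothetical relational source; an in-memory SQLite database stands in
# for a production database server.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 120.0), (1, 80.0), (2, 45.5)],
)


def consolidate(csv_source, db):
    """Join per-customer order totals from the database onto the flat-file rows."""
    totals = dict(
        db.execute(
            "SELECT customer_id, SUM(amount) FROM orders GROUP BY customer_id"
        )
    )
    combined = []
    for row in csv.DictReader(csv_source):
        customer_id = int(row["customer_id"])
        combined.append(
            {
                "customer_id": customer_id,
                "city": row["city"],
                "total_spent": totals.get(customer_id, 0.0),
            }
        )
    return combined


records = consolidate(flat_file, conn)
for record in records:
    print(record)
```

The result is a single list of records that carries attributes from both sources, which is the shape most analysis techniques expect as input.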

1.3.3 Growth of Infinite Storage and Computing Capability

The memory and storage capacity of individual computers has increased dramatically, and external storage devices have added significantly to that capacity. This has been augmented by cloud-based storage services that can provide a virtually unlimited amount of storage. The growth of cloud platforms has also contributed to virtually unlimited computing capability. Now you can hire the processing power of multiple CPUs, coupled with huge memory and huge storage, to carry out any analysis, however big the data is. This has reduced the need to rely on a sampling of data for analysis. Instead, you can take the entire population of data available to you and analyze it by using the power of cloud storage and computing capabilities.

1.3.4 Easy-to-Use Programming Tools and Platforms

In addition to commercially available data analytics tools, many open source tools or platforms such as R and Hadoop are available. These powerful tools are easy to use and well documented. They do not require high-end programming experience but usually require an understanding of basic programming concepts. Hadoop is especially helpful in effective and efficient analysis of big data.

1.3.5 Survival and Growth in the Highly Competitive World

Businesses have become highly competitive. With the Internet easily available to every business, every consumer has become a target for every business. Each business is targeting the same customer and that customer’s spending capability. Each business also can easily reach other dependent businesses or consumers equally well. Using the Internet and the Web, businesses are fiercely competing with each other; often they offer heavy discounts and cut prices drastically. To survive, businesses have to find the best ways to target other businesses that require their products and services as well as the end consumers who require their products and services. Data or business analytics has enabled this effectively.

1.3.6 Business Complexity Growing out of Globalization

Economic globalization that cuts across the boundaries of the countries where businesses produce goods or provide services has drastically increased the complexities of business. Businesses now have the challenge of catering to cultures that may have been previously unknown to them. With the large amount of data that they can now acquire (or that is already at their disposal), businesses can easily gauge differences between local and regional cultures, demands, and practices, including spending trends and preferences.

1.4 Applications of Business Analytics

Business analytics has been applied effectively to many fields, including retail, e-commerce, travel (including the airline business), hospitality, logistics, and manufacturing. Furthermore, business analytics has been applied to a whole range of other businesses, including predictive failure analysis of machines and equipment.

Business analytics has been successfully applied to the fields of marketing and sales, human resources, finance, manufacturing, product design, service design, and customer service and support. In this section, we discuss some of the areas in which data/business analytics is used effectively to the benefit of the organizations. These examples are only illustrative and not exhaustive.

1.4.1 Marketing and Sales

Marketing and sales teams have used business analytics heavily to identify appropriate approaches to marketing, so that they reach the maximum number of potential customers with optimized or reduced effort. These teams use business analytics to identify which marketing channel would be most effective (for example, e-mails, web sites, or direct telephone contacts). They also use business analytics to determine which offers make sense to which types of customers (in terms of geographical regions, for instance) and to specifically tune their offers.

A marketing and sales team might, for example, determine whether people like adventurous vacations, spiritual vacations, or historical vacations. That data, in turn, can provide the inputs needed to focus marketing and sales efforts according to those specific interests, thus optimizing the time spent by the marketing and sales team. In the retail business, this can enable retail outlets (physical or online) to market products along with other products, as a bundled offer, based on the purchasing pattern of consumers. In logistics, knowing which company offers which mode of service while sticking to its delivery commitments is always an important factor for businesses deciding whom to tie up with. An airline could present exciting offers based on a customer’s travel history, thus encouraging that customer to travel again and again via this airline only, and thereby creating a loyal customer over a period of time.
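
One simple way to look for bundle candidates in purchasing patterns is to count which pairs of products appear together in the same transaction. The sketch below, in Python with only the standard library, is an illustration under assumed data: the `baskets`, the helper `frequent_pairs`, and the `min_count` threshold are all hypothetical, and real market-basket analysis typically uses richer measures than raw counts.

```python
from collections import Counter
from itertools import combinations

# Hypothetical purchase baskets, one set of items per transaction.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "milk"},
    {"butter", "jam"},
    {"bread", "butter", "milk"},
]


def frequent_pairs(transactions, min_count=2):
    """Count how often each pair of items is bought together."""
    pair_counts = Counter()
    for basket in transactions:
        # Sort so that (a, b) and (b, a) are counted as the same pair.
        for pair in combinations(sorted(basket), 2):
            pair_counts[pair] += 1
    return [(pair, n) for pair, n in pair_counts.most_common() if n >= min_count]


bundle_candidates = frequent_pairs(baskets)
for pair, count in bundle_candidates:
    print(pair, count)
```

Pairs that occur together often are natural candidates for a bundled offer, subject to the business judgment discussed later in this chapter.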

1.4.2 Human Resources

Retention is the biggest problem faced by an HR department in any industry, especially in the support industry. An HR department can identify which employees have high potential for retention by processing employee data. Similarly, an HR department can also analyze which competence (qualification, knowledge, skill, or training) has the most influence on the organization’s or team’s capability to deliver quality within committed timelines.

1.4.3 Product Design

Product design is not easy and often involves complicated processes. Risks factored in during product design, subsequent issues faced during manufacturing, and any resultant issues faced by customers or field staff can be a rich source of data that can help you understand potential issues with a future design. This analysis may reveal issues with materials, issues with the processes employed, issues with the design process itself, issues with the manufacturing, or issues with the handling of the equipment installation or later servicing. The results of such an analysis can substantially improve the quality of future designs by any company. Another interesting aspect is that data can help indicate which design aspects (color, sleekness, finish, weight, size, or material) customers like and which ones customers do not like.

1.4.4 Service Design

Like products, services are also carefully designed and priced by organizations. Deciding what is and what is not a component of the service also depends on product design and on cost factors compared to pricing. The length of warranty, coverage during warranty, and pricing for various services can also be determined based on data from earlier experiences and from target market characteristics. Some customer regions may more easily accept “use and throw” products, whereas other regions may prefer “repair and use” kinds of products. Hence, the types of services need to be designed according to the preferences of regions. Again, different service levels (responsiveness) may have different price tags and may be targeted toward a specific segment of customers (for example, big corporations, small businesses, or individuals).

1.4.5 Customer Service and Support Areas

After-sales service and customer service are important aspects that no business can ignore. A lack of effective customer service can lead to negative publicity, impacting future sales of new versions of the product or of new products from the same company. Hence, customer service is an important area in which data analysis is applied significantly. Customer comments on the Web or on social media (for example, Twitter) provide a significant source of understanding about the customer pulse as well as the reasons behind the issues faced by customers. A service strategy can be accordingly drawn up, or necessary changes to the support structure may be carried out, based on the analysis of the data available to the industry.

1.5 Skills Required for a Business Analyst

Having discussed drivers and applications of business analytics, let’s now discuss the skills required by a business analyst. Typically, a business analyst requires substantial knowledge about the following:

  • The business and problems of the business

  • Data analysis techniques and algorithms that can be applied to the business data

  • Computer programming

  • Data structures and data-storage or data-warehousing techniques, including how to query the data effectively

  • Statistical and mathematical concepts used in data analytics (for example, regression, naïve Bayes analysis, matrix algebra, and cost-optimization algorithms such as gradient descent or ascent algorithms)

Now let’s discuss these knowledge areas in more detail.

1.5.1 Understanding the Business and Business Problems

Having a clear understanding of the business and business problems is one of the most important requirements for a business analyst. If the person analyzing the data does not understand the underlying business, the specific characteristics of that business, and the specific problems faced by that business, that person can be led to the wrong conclusions or led in the wrong direction. Having only programming skills along with statistical or mathematical knowledge can sometimes lead to proposing impractical (or even dangerous) suggestions for the business. These suggestions also waste the time of core business personnel.

1.5.2 Understanding Data Analysis Techniques and Algorithms

Data analysis techniques and algorithms must be applied to suitable situations or analyses. For example, linear regression or multiple linear regression (supervised method) may be suitable if you know (based on business characteristics) that there exists a strong relationship between a response variable and various predictors. You know, for example, that geographical location, proximity to the city center, or the size of a plot (among others) has a bearing on the price of the land to be purchased. Clustering (unsupervised method) can allow you to cluster data into various segments. Using and applying business analytics effectively can be difficult without understanding these techniques and algorithms.
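
To give a feel for the unsupervised case, here is a minimal sketch of k-means clustering in Python, using only the standard library. The `customers` data and the function name `k_means` are hypothetical, and the starting centroids are picked by hand so that the run is repeatable; practical implementations usually choose them at random and add convergence checks.

```python
import math

# Hypothetical customers: (visits per month, average spend per visit).
customers = [
    (1, 20), (2, 25), (1, 30),      # occasional visitors, low spend
    (8, 90), (9, 100), (10, 95),    # frequent visitors, high spend
]


def k_means(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then re-average."""
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster
            else centroid
            for cluster, centroid in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hand-picked starting centroids (an assumption made for repeatability).
centroids, clusters = k_means(customers, centroids=[customers[0], customers[3]])
print(centroids)
print(clusters)
```

No labels are supplied here; the segments emerge from the data alone, which is what distinguishes clustering from a supervised method such as regression.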

1.5.3 Having Good Computer Programming Knowledge

Good computer programming knowledge helps a business analyst avoid depending on other programmers, who may not understand the business or the statistics and mathematics behind the techniques or algorithms. It is a bonus capability rather than a mandatory one, because analysts can always employ a computer programmer. Computer programming may be necessary to consolidate data from different sources as well as to program and use the algorithms. Platforms such as R and Hadoop have reduced the pain of learning programming, even though at times we may have to use other complementary programming languages (for example, Python) for effectiveness and efficiency.

1.5.4 Understanding Data Structures and Data Storage/Warehousing Techniques

Knowledge of data structures and of data storage/warehousing techniques eases the life of a business analyst by eliminating dependence on database administrators and database programmers. This enables business analysts to consolidate data from varied sources (including databases and flat files), put them into a proper structure, and store them appropriately in a data repository. The capability to query such a data repository is an additional competence of value to any business analyst. This know-how is not a must, however, because a business analyst can always hire someone else to provide this skill.

1.5.5 Knowing Relevant Statistical and Mathematical Concepts

Data analytics uses many statistical and mathematical concepts on which various algorithms, measures, and computations are based. A business analyst should have good knowledge of statistical and mathematical concepts in order to properly use these concepts to depict, analyze, and present the data and the results of analysis. Otherwise, the business analyst can lead others in the wrong direction by misinterpreting the results because the application of the technique or interpretation of the result itself was wrong.
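
Gradient descent, mentioned earlier among the cost-optimization algorithms, is a good example of a mathematical concept that sits underneath many techniques. The following minimal Python sketch fits a straight line by repeatedly stepping against the gradient of the mean squared error. The data, the function name `gradient_descent`, the learning rate, and the step count are all assumptions chosen for illustration, not recommended settings.

```python
def gradient_descent(xs, ys, learning_rate=0.01, steps=5000):
    """Fit y = w * x + b by minimizing mean squared error with gradient descent."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


# Hypothetical data generated from y = 3x + 1, so the fit should approach w=3, b=1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 4.0, 7.0, 10.0, 13.0]
w, b = gradient_descent(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
```

An analyst who understands what the learning rate does is far less likely to misread a model that failed to converge as a genuine result.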

1.6 Life Cycle of a Business Analytics Project

Figure 1-1 illustrates the typical steps of a business analytics project. These steps are as follows:

  1. Start with a business problem and all the data considered as relevant to the problem.

    or

    Start with the data to understand what patterns you see in the data and what knowledge you can decipher from the data.

  2. Study the data and then clean up the data for missed data elements or errors.

  3. Check for the outliers in the data and remove them from the data set to reduce their adverse impact on the analysis.

  4. Identify the data analysis technique(s) to be used (for example, supervised or unsupervised).

  5. Analyze the results and check whether alternative technique(s) can be used to get better results.

  6. Validate the results and then interpret the results.

  7. Publish the results (learning/model).

  8. Use the learning from the results / model arrived at.

  9. Keep calibrating the learning / model as you keep using it.

Figure 1-1. Life cycle of a business analysis project
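
Steps 2 and 3 of the life cycle can be sketched in a few lines. The Python example below drops missing values and then removes points that fall outside the commonly used 1.5 × IQR fences. The `raw_sales` data and the helper `clean_and_trim` are hypothetical, and the 1.5 multiplier is a convention rather than a rule; whether an extreme value is an error or a genuine business event still needs the domain judgment described earlier in this chapter.

```python
from statistics import quantiles

# Hypothetical daily sales figures; None marks a missing reading.
raw_sales = [102, 98, None, 105, 97, 101, 990, 99, None, 103]


def clean_and_trim(values, k=1.5):
    """Drop missing values, then drop points outside the k * IQR fences."""
    present = [v for v in values if v is not None]
    q1, _, q3 = quantiles(present, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    kept = [v for v in present if low <= v <= high]
    outliers = [v for v in present if not (low <= v <= high)]
    return kept, outliers


kept, outliers = clean_and_trim(raw_sales)
print("kept:", kept)
print("outliers:", outliers)
```

Returning the outliers separately, instead of silently discarding them, makes it easy to review them before deciding that they really should be excluded.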

1.7 The Framework for Business Analytics

As discussed earlier in this chapter, statistics contributes significantly to effective business analysis. Similarly, knowledge discovery enablers such as machine learning have contributed significantly to the application of business analytics. Another area that has given impetus to business analytics is the growth of database systems, from SQL-oriented ones to NoSQL ones. All these combined, along with easy data visualization and reporting capabilities, have led to a clear understanding of what the data tells us. This has led to the vast application of business analytics to solve problems faced by organizations and to drive a competitive edge in business through the application of this understanding.

Numerous tools are available to support each piece of the business analytics framework. Figure 1-2 presents some of these tools, along with details of the framework.

Figure 1-2. Business analytics framework

1.8 Summary

To start with, you went through an introduction to how knowledge has evolved. You also went through many scenarios in which data analytics helps individuals. You then saw many examples of business analytics helping businesses to grow and compete effectively, along with examples of how businesses use the results of business analytics.

Next, you were taken through the objectives of this book. Primarily, this book is intended to be a practical guidebook enabling you to acquire necessary skills. This is an introductory book. You are not going to focus on terminology here but are going to look into the practical aspects. This book will also show how to use R for business analytics.

Next you went through the definitions of data analytics, business analytics, data science, and big data analytics. You were provided with these definitions in layman’s terms in order to remove any possible confusion about the terminology.

Then you explored important drivers for business analytics, including the growth of computer packages and applications, the feasibility of consolidating data from various sources, the growth of infinite storage and computing capabilities, the increasing numbers of easy-to-use programming tools and platforms, the need for companies to survive and grow among their competition, and business complexity in this era of globalization. Next, you were introduced to the applications of business analytics with examples in certain fields.

You briefly went through the skills required for a business analyst. In particular, you understood the importance of the following: understanding the business and business problems, data analysis techniques and algorithms, computer programming, data structures and data storage/warehousing techniques, and statistical and mathematical concepts required for data analytics.

Finally, you briefly went through the life cycle of a business analytics project and the framework for business analytics.
