7

Big Data by Industry

By now, it should be clear that Big Data affects all organizations, small or big, in all industries. Let's dive deeper into 18 important industries and see what Big Data can do for them. The examples will help you understand the possibilities Big Data offers and perhaps will provide you with some out-of-the-box business use cases you can implement in your organization.

AGRICULTURE INDUSTRY

The agriculture industry has seen many changes in the past one hundred years. Since the birth of industrial agriculture in 1900, we have moved into the era of digitally enhanced agriculture, in which everything done before seeding and up to harvesting produces data that is used for analyses.1 Big Data has already transformed the agriculture industry, but in the coming decade, this will become increasingly visible in all areas of agriculture throughout the world. Three areas will be most affected by the opportunities Big Data presents:

  1. Machines: Improved efficiency and reduced operating costs.
  2. Crops and animals: Improved productivity and efficiency.
  3. Weather and pricing: Mitigation of the effect of weather conditions and optimized pricing.

The Machines

The Internet of Things, discussed in Chapter 3, will greatly change agricultural machines, including tractors, soil cultivating equipment, agricultural sprayers, harvesters, and cow-milking machines. The inclusion of sensors will provide the farmer with a lot of information in real time 24/7—without the farmer actually being present. These smart machines will talk to each other, and they will be able to anticipate problems and take appropriate action before actual damage is done. When a problem does occur, the farmer can see it immediately. If the problem is more serious, a service employee can visit the farmer before the equipment breaks down, thereby minimizing the downtime of machines. In addition, the effective use of sensors can increase productivity by streamlining many agricultural processes.

Apart from predicting failures and providing maintenance, sensors can also save farmers money spent on fuel. For example, computers can determine when and where conditions are best for working the land, information that is especially useful for farmers with large acreages. When combined with machine-to-machine communication, this will help the farmer control the growing number of machines. Because machines communicate with each other, they know each other's positions and make necessary adjustments. With smart machines, one person can manage an entire fleet and still save time and money. In addition, diagnostics manage the machines in real time to ensure that the optimal settings are used to maximize productivity. All the data that is collected can be analyzed, so the farmer understands how the machines are operating and how operations can be improved even further.

The Crops and the Animals

Although optical, mechanical, electromagnetic, and radiometric sensors have been tested for almost a decade, Big Data technologies really make precision agriculture interesting. Precision agriculture means recognizing, understanding, and exploiting information that quantifies variations in the soil and crops.2

When ground sensors are combined with a smart irrigation system, farmers can optimize productivity. The irrigation system knows exactly which crops need what supplements, when, and how much. This means providing just the right amount of fertilizer, which saves money and increases output. To improve output even more, sensors can be implanted in the ground to analyze soil conditions in greater detail.3 Algorithms will improve output by telling the farmer when to plant which crops where, as well as when is the best time to plough or harvest.
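The core decision logic of such an irrigation system can be surprisingly small. The sketch below is purely illustrative (the crops, moisture targets, and plot names are invented) and shows how per-plot sensor readings might be turned into watering instructions:

    # Illustrative sketch: threshold-based smart irrigation.
    # Crops, plots, and moisture targets are invented, not from any real system.
    MOISTURE_TARGETS = {"wheat": 0.30, "maize": 0.35, "potato": 0.40}  # fraction of field capacity

    def irrigation_plan(readings):
        """readings: list of (plot_id, crop, soil_moisture) tuples from ground sensors."""
        plan = []
        for plot_id, crop, moisture in readings:
            target = MOISTURE_TARGETS[crop]
            if moisture < target:
                # Irrigate only the shortfall rather than a fixed amount;
                # this is where the savings over uniform irrigation come from.
                deficit = target - moisture
                plan.append((plot_id, round(deficit * 100, 1)))  # illustrative mm of water
        return plan

    readings = [("A1", "wheat", 0.22), ("A2", "maize", 0.36), ("B1", "potato", 0.31)]
    print(irrigation_plan(readings))  # -> [('A1', 8.0), ('B1', 9.0)]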

In addition to improving crops, farm animals will also benefit from Big Data technology. Sensors in the sheds will automatically weigh the animals and adjust feeding based on their real-time condition. The animals will receive the right food in the right amount at the right moment. Chips in the animals will also monitor their health. Sick animals will automatically receive medication in their food. Conditions in the shed can automatically be adjusted. For example, if the animals are stressed, sensors will indicate the need for measures to ease the problem. Special collars will also help farmers with massive plots of land track their herds on their smartphones.4 This might sound far-fetched or scary, but if done correctly it will provide a lot of benefits for farmers.

The Weather and Pricing

Weather conditions can severely affect output. Although local weather conditions are difficult to predict, the right algorithms can warn farmers when to harvest or plough because of upcoming (extreme) weather. This can increase output.

If the data is combined with real-time market information, farmers can control price fluctuations better. The volatility in the agriculture market can be substantial. Speculation can increase or decrease profits on the sale of crops. With predictive analytics, the price of a certain crop can be determined upfront in each specific location. This will help the farmer get the right price for the right crop at the right moment in time at the right location.

Big Data turns the traditional agriculture industry upside down. Although investments for farmers can be substantial, the potential benefits of applying Big Data technologies in the field are enormous.

JOHN DEERE IS REVOLUTIONIZING FARMING WITH BIG DATA

John Deere is using Big Data to step into the future of farming. In 2012, the company released several products that connect John Deere's equipment to owners, operators, dealers, agricultural consultants—and other machines. This connectivity helps farmers improve productivity and increase efficiency.

John Deere added sensors to its machines to help farmers manage their fleets, decrease downtime of tractors, and save on fuel. Information from the sensors is combined with historical and real-time data regarding weather predictions, soil conditions, crop features, and other datasets. The data is then analyzed, and the farmer can retrieve the information on the MyJohnDeere.com platform, as well as on the iPad and iPhone app known as Mobile Farm Manager. This will help farmers figure out which crops to plant where and when, when and where to plough, where the best returns can be made from the crops, and even which path to follow when ploughing. All this will lead to higher production and increased revenue.

Although John Deere claims that it does not yet use as many datasets as Walmart and Amazon, the company is collecting and processing massive amounts of data to truly revolutionize farming.5 To cope with all this data, it decided to use the open-source programming language R, which can be programmed to forecast demand, predict crop yield, define land area and usage, and anticipate the demand for (spare) parts for combines. Employees use Open Database Connectivity (ODBC) to import the multiple data sources and data types. R is then used to export this data to different channels.

One such channel is FarmSight, which was launched in March 2011.6 FarmSight is designed to help farmers increase productivity in three ways.

  1. Machine optimization monitors machine productivity and tries to figure out how to make the machines more efficient. It uses proactive diagnostics on service issues, such as filter changes and other maintenance items, to help reduce downtime and keep machines up and running.
  2. Farming logistics data helps farmers control growing farms and the ever-expanding machine fleet. The objective is to improve machine-to-machine communications.
  3. Decision support helps farmers make better decisions that prevent mistakes and increase efficiency.

Another channel is the MyJohnDeere.com platform, which is a portal through which users can manage their fleets, see weather forecasts, access any application (including third-party applications for third-party machines), and view financial information related to their farming.7 Users have remote display access, so a consultant is able to see what is going on from a distance.

A third channel is the FarmSight Mobile Farm Manager, which provides farmers with all the information they need on the go.8 The Mobile Farm Manager gives users access to historical, as well as real-time, field information, evaluates soil samples, and enables users to share information directly with trusted advisers for live remote advice while in the field. They can even view their operation maps and reports from any year on their iPads or iPhones.

John Deere has additional plans for improving technology further by using Big Data as much as possible. The company wants to help farmers plan, run, and analyze their entire farming operation as efficiently as possible.

AUTOMOTIVE INDUSTRY

Any industry that builds products using many different moving parts can be improved with sensors. Thus, the automotive industry is a natural fit. Sensors can be used in cars, motorcycles, and trucks. Together with information collected by satellite navigation systems, such as TomTom, and traffic conditions, cars are becoming smarter.9 Before long, they will be able to drive independently.

Sensors provide a lot of possibilities. With them, cars have the potential to flag abnormal events in real time and proactively take corrective actions before performance problems arise. The car will inform the owner about an impending problem, and even make an appointment with the nearest car maintenance workshop, adding this to the owner's daily schedule.

Onboard sensors also give manufacturers information about how cars are used. How fast does someone drive or brake, and how does the car respond? The ability to monitor the performance of the car 24/7 will help manufacturers quickly identify the areas for improvement and adjustment. The time to bring new cars to market will be reduced drastically.

The self-driving car from Google is already a true data creator. With sensors that enable the car to operate without a driver, it generates nearly 1 gigabyte of data every second.10 All that data is analyzed to decide where to drive and how fast. It can even detect a freshly discarded cigarette butt on the ground or anticipate when a pedestrian might suddenly appear in the road. Imagine the amount of data that will be created every year. On average, Americans drive 600 hours per year in their cars.11 That equals 2,160,000 seconds or approximately 2 petabytes of data per car per year. With the number of cars worldwide about to surpass one billion, it is almost unimaginable how much data will be created when Google's self-driving car becomes common on the streets.12
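The arithmetic behind that estimate is easy to verify, taking the figures above as given:

    # Back-of-the-envelope check of the self-driving-car data estimate.
    hours_per_year = 600                    # average US driving time per year
    seconds = hours_per_year * 3600         # 2,160,000 seconds behind the wheel
    gb_per_second = 1                       # ~1 gigabyte of sensor data per second
    petabytes = seconds * gb_per_second / 1_000_000   # 1 PB = 1,000,000 GB
    print(seconds, petabytes)               # 2160000 2.16 -> ~2 PB per car per year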

Improve the Supply Chain

Using public data, as well as data from the CRM database, car manufacturers can predict which cars will be needed when and where. If sensors tell them when cars are about to break down and which parts are needed at what location, they can better predict their parts inventory. This will decrease inventory, optimize the supply chain for manufacturers and dealers, and improve customer satisfaction.

Dealer satisfaction can also be monitored and improved using the same data, as well as information from the blogosphere and social networks. When this happens in real time, complaints and possible crises can be avoided, thereby saving money and creating happy customers.

Improve the Customer Experience

Several data sources can be used to understand and monitor driving behavior.13 These (real-time) insights can then be applied to the (re)development of cars to optimize and improve the driving experience. Sensors within the seats can monitor how someone drives, including which radio channels he or she listens to and other information, such as air-conditioning usage. This data can be used to build a personal profile that is automatically loaded when a driver “registers” or “logs in” on a (new) car.

Algorithms can improve driver behavior by providing driving performance recommendations, tailored to the situation at that moment, directly on the dashboard and/or the driver's smartphone app.

All that data can be combined with public and social data to better understand customers. Such data, including geographic locations, housing, and other demographics, can give a 360-degree view of customers.

Save Money

When cars are connected to each other and the Internet, they can talk to each other. Cars can “see” where other vehicles are on the road and take action if needed. The cars know when a traffic jam is coming up, can suggest a different route, and optimize routing to prevent accidents and save fuel.

The same sensors can track when and where cars are stolen and can easily locate them. Algorithms can also detect car theft, notifying the police when the driving pattern unexpectedly changes.

The automotive industry can potentially benefit significantly by using Big Data technologies. Improvements include better driving behavior, improved cars, fewer accidents, and happier customers. Cars are becoming information-driven machines, and it is for good reason that General Motors insourced 10,000 IT staff members to create a completely information-centric organization.14

BIG DATA IS IN THE DRIVER'S SEAT AT HERTZ

How do you keep track of tens of thousands of customer touch points every day, spread across 8,300 locations in 146 countries? Hertz used to do this by manually registering customer satisfaction via local paper surveys that took weeks to analyze. Thousands of surveys were collected daily, including comments from the website, emails, and other messages. All of these valuable customer insights could not be used properly, as location managers had to process them manually, which was a labor-intensive task. Whenever action was required, it was usually discovered too late, and a customer was lost.

Since Hertz implemented a Big Data strategy, it has turned all customer touch points into unique moments. With the instant feedback Hertz receives from around the world, immediate action can be taken to improve service and thereby retain customers.

To really start using that valuable knowledge, Hertz brought in jShare (a team from IBM that leverages the latest technologies) and Mindshare to bring the process under control.15 The plan was to enhance the collection of customer data and enable Hertz to perform all sorts of real-time analyses on the captured unstructured data. With this new software, Hertz can now make real-time adjustments in its operation to improve customer satisfaction levels, based on the real-time Net Promoter Score. In particular, Hertz used Mindshare's sentiment-based tagging solution to understand customer opinions in real time in each of its 8,300 locations worldwide.16

Further, Hertz developed a “Voice of the Customer” program. This analytics system automatically captures customer experiences in real time, transforming the information into actionable intelligence. The system automatically categorizes comments that are received via email or online, as well as flags customers who mention #1 Club Gold or request a call back.

In the competitive market of rental cars, a company that understands customer feedback and can react to it in real time has a competitive advantage. The system allows Hertz to take instant action when signals start pouring in that service is low at a certain location. The Philadelphia location, for example, showed that delays for returns were occurring at specific times during the day on a regular basis. As soon as this was signaled, the company investigated the matter and solved the problems.

By applying advanced analytics solutions to its massive datasets, Hertz has cut information processing time drastically, which has resulted in improved service at all its locations.

CONSUMER GOODS

The consumer goods industry is perfectly suited to the generation of unfathomable amounts of data in the coming years. Apart from the billions of 360-degree customer profiles that will be created by millions of companies, it could produce even more data if all those consumer goods were connected to the Internet of Things.

Improve Customer Satisfaction

For consumer organizations it is vital to know who the customer is.17 Big Data can help create a 360-degree view of customers. By combining information from social networks (Facebook, Twitter, Tumblr, Instagram, LinkedIn, etc.), the blogosphere, (online) surveys, and online click behavior, as well as sales, product sensors, and public and open data, companies can develop detailed personas and microsegments to better target customers, improve conversion, and increase sales. It is important to understand the signals that reveal what consumers are seeing, doing, thinking, and sharing at any point in time. This is valuable to help drive relationships. When a customer gets in contact with an organization, the company knows who the customer is. This will improve customer service and, as such, customer satisfaction.

On the other hand, consumers need to feel that organizations are willing to listen to them when they have a question or comment. These observations are not always addressed directly to an organization via mail or a call center; more often they are found in the blogosphere or on a social network. By using Big Data tools, organizations can identify these questions and comments and respond to them accordingly in a timely manner.

Innovate Faster and Better

Developing a new product or updating an existing one generally takes a lot of time, because it is important to understand the market, to know what customers really want, and to test various possibilities. With Big Data, this information can be available in real time, 24 hours a day, thereby providing a company with insights from the moment a product is used for the first time. Feedback from sensors and analyses of public data streams enable companies to discover mistakes easily and to also identify requirements for future product updates.

Customer interactions with a company also provide valuable feedback that organizations can use to enhance innovation and product development. Whether this feedback is provided via the website or the call center (every conversation recorded by a call center can be turned into text, which could then be data mined for insights), it can be turned into market information for product improvements. This offers very useful information about how consumers feel about the brand and its products, as well as what they deem important. This lessens the need for expensive market research, while simultaneously significantly reducing the time to market of new and/or improved products.

In addition, prototyping can also be done virtually. Companies can test thousands of variations on a product within hours—or even minutes—instead of months and distill the best option for the largest target group, thereby reducing the time to market even more.

In highly competitive markets, it is of course also very important to understand what the competition is or will be doing. If a competitor introduces a new product, it is important to know how that product is perceived in real time. What is the sentiment? What are the complaints? What changes do consumers want to see in future updates? What is the price and how does the competition react to price fluctuations? All this information can be used to improve products and better react to what the competition is doing.

Optimize Sales and the Supply Chain

Real-time insights in point-of-sale (POS) data from all retailers can provide valuable insights to consumer goods manufacturers. In particular, they can learn how much of the product is sold at any time in any location around the world. Algorithms can automatically detect anomalies and warn the head office if action is required. The effect of price changes and the effects of marketing campaigns can also be evaluated in real time, giving the marketing team the opportunity to make changes instantly if the outcome is not as expected.

To anticipate demand for a product, a manufacturer can combine different data sources, such as POS analyses, news reports, market information, competitor assessments, and weather conditions. By using predictive algorithms, the inventory can be optimized for just-in-time (JIT) inventory based on real-time demand forecasts. Cooperation with retailers can help shape demand at the store level to deliver an improved customer experience, which makes retailers happy.
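Such a demand forecast does not have to start out complicated. Below is a minimal sketch, assuming nothing more than a series of weekly point-of-sale counts (the sales figures are invented), that uses an exponentially weighted average; real systems would add weather, promotions, and market signals as extra inputs:

    # Minimal demand-forecast sketch: exponential smoothing of weekly POS sales.
    def forecast_next_week(weekly_sales, alpha=0.5):
        """weekly_sales: unit sales, oldest first; alpha: weight on recent weeks."""
        estimate = weekly_sales[0]
        for actual in weekly_sales[1:]:
            estimate = alpha * actual + (1 - alpha) * estimate
        return estimate

    sales = [120, 130, 125, 160, 170]       # invented weekly unit sales for one store
    demand = forecast_next_week(sales)
    safety_stock = 0.2 * demand             # simple buffer against forecast error
    print(round(demand), round(demand + safety_stock))   # forecast and order-up-to level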

Continuous monitoring of all equipment used in the operations can improve operational efficiencies.18 Sensors added to equipment used in the production process give organizations a better understanding of how the machines are being used and where efficiencies can be achieved.

In addition, by using Big Data analyses, companies can predict upcoming price fluctuations and adjust purchases accordingly. These fluctuations can also be anticipated by following important parameters around the world that have an effect on the price. When these datasets are combined, predictive analytics can forecast price volatilities, demand, or shortages.

Make Knowledge Transparent

Consumer goods organizations tend to be large (international) businesses.19 Most of the time in such large organizations, employees have difficulties knowing whom to ask for the right information. Algorithms can be used to make the knowledge within an organization accessible and searchable for everyone. Key influencers can be easily identified, along with specialists in specific areas. This would greatly improve the efficiency of the organization and reduce costs. Different departments within the same organization should not each reinvent the wheel. With all knowledge indexed, employees can simply perform a search query over the entire organization and find the best information and/or colleague to consult.

Companies producing consumer goods should embrace Big Data, as the benefits are plentiful. This is especially true for large multinational consumer goods companies.

APPLE SWIMS IN MASSIVE AMOUNTS OF BIG DATA

It is no surprise that Apple deals with Big Data, but the specifics of how are not well known. However, with over 60 billion apps downloaded from its app store, Apple is swimming in massive amounts of data that can, and will, be analyzed for additional insights.20

Although Apple is a bit more secretive than other companies, such as Google, that are more willing to share some of their Big Data innovations, Apple does use some of the Big Data technologies, such as Hadoop and large-scale data warehousing. One of the uses of Big Data is to understand how its own applications are used on iPhones, iPads, or MacBooks, as mentioned by Jeff Kelly, Principal Research Contributor at Wikibon.21,22 Using all the data that Apple has collected, it can test new features in its applications relatively easily and do A/B tests to improve the experience. The company uses data to understand how people are applying its applications. If it is a game application, the data can be used to understand where there is a bottleneck or where many people get stuck. That data will then be used to improve the gaming experience. In addition, feedback and reviews provided by users in the App Store are also used to improve its products.
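Apple has not disclosed its testing tooling, so the following is only a generic illustration of the statistics behind an A/B test (a two-proportion z-test, with invented user counts):

    # Generic two-proportion z-test, the textbook way to read an A/B test.
    from math import sqrt

    def ab_test(conversions_a, users_a, conversions_b, users_b):
        p_a, p_b = conversions_a / users_a, conversions_b / users_b
        pooled = (conversions_a + conversions_b) / (users_a + users_b)
        se = sqrt(pooled * (1 - pooled) * (1 / users_a + 1 / users_b))
        return p_a, p_b, (p_b - p_a) / se   # |z| > 1.96 ~ significant at the 5% level

    # Invented example: variant B converts 5.8% of 10,000 users vs. 5.0% for variant A
    print(ab_test(500, 10_000, 580, 10_000))  # z ~ 2.5, so B's lift is unlikely to be chance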

To store all that data, Apple uses Teradata equipment, and it now operates a multiple-petabyte Teradata system that is mostly driven by the launch of iCloud in 2011.23

Although Apple is gearing up for the Big Data era fast, that does not mean the company has it all figured out. An article in Forbes in 2012 reported that one of the likely causes of Apple's failure to enter the Mobile Maps Application market was its struggle to catch up with Big Data.24 While Google had already opened up its mapping functionality in 2005, Apple had to catch up in a much shorter timeframe, which turned out to be too short to develop a successful product.

It is obvious that Apple deals with massive amounts of data. Each of its products communicates back to the Apple data warehouse continuously.25 Just think of the amount of data generated via Siri and stored for two years.26 Then add the data collected via iTunes and iTunes Match, iCloud, and all other software and hardware. Although this is a lot of data, it is nothing spectacular or unexpected.

Clearly, Apple has been working with Big Data for some time. How much and how they do it exactly is difficult to find out. Whether Apple will reinvent Big Data with a new device, such as the iWatch, remains speculative. One thing, however, is for sure. Apple generates and collects vast amounts of data using several Big Data technologies and applies the results of data analyses in the development of products and services to improve the customer experience.

EDUCATION INDUSTRY

New technologies allow schools, colleges, and universities to analyze absolutely everything that happens with students, teachers, and employees, from student behavior to testing results to career development to educational needs based on changing societies. A lot of this data has already been collected and is used for statistical analysis by government agencies, such as the National Center for Educational Statistics.27 With more and more online courses and the rise of Massive Open Online Courses (MOOCs), all that data acquires a completely new meaning. Big Data allows for very exciting changes in education that will revolutionize the way students learn and teachers teach. To fuel this trend, the U.S. Department of Education (DOE) was one of a host of agencies to share a $200 million initiative to begin applying Big Data analytics to their respective functions.28,29

Improve Student Results

The overall goal of Big Data within the educational system should be to improve student results. Better students are good for society, governments, and organizations, as well as educational institutions. Currently, the answers to assignments and exams are the only measurements of student performance. During his or her student life, however, every student creates a unique data trail. This trail can be analyzed in real time to deliver an optimal learning environment for the student, as well as to gain a better understanding of individual student behavior.

It is now possible to monitor all student actions. How long do they need to answer a question? Which sources do they use? Which questions do they skip? How much research did they do? What is the relation between answers to questions? Which tips worked best for which student? Answers to questions can be checked instantaneously and automatically (except perhaps for essays), giving instant feedback to students.

In addition, Big Data can create more productive groups of students. Students often work in groups where the members do not complement each other. By using algorithms, it will be possible to determine the strengths and weaknesses of each student and create stronger groups that will allow students to have a steeper learning curve and deliver better results.

Create Mass-Customized Programs

All this data can help create customized programs for each student, even if a college or university has tens of thousands of enrollees. These will be created with blended learning, which combines online and offline courses. Students will have the opportunity to develop their own personalized program, following classes that interest them and working at their own pace, while having the possibility of (offline) guidance by professors. We already see this happening in the MOOCs that have been developed around the world. When Andrew Ng taught the Machine Learning class at Stanford University, 400 students generally participated.30,31 When it was developed as a MOOC at Coursera in 2011, it attracted 100,000 students, who generated a lot of data.32 It would take Andrew Ng 250 years to teach the same number of students. Being able to cater to 100,000 students at once also requires the right tools to process, store, analyze, and visualize all the data involved in the course. At the moment, these MOOCs are still mass produced, but in the future they can be mass customized.

With so many students participating in a MOOC, universities have the opportunity to find the best students from all over the world when making scholarship decisions. This will increase the overall ranking of a university.

Improve the Learning Experience in Real Time

When students start working independently in their customized blended-learning programs, they will teach themselves and be able to shape their own courses. The professor can monitor students in real time and start more interesting and deeper discussions of topics of choice. This will give students an opportunity to gain a better understanding of the subjects.

When students are monitored in real time, digital textbooks and course outlines can be improved. Algorithms can monitor how the students read the texts, including which parts are difficult, which are easy, and which are unclear. Changes can be based on how often a text is read, how long it takes to read a text, how many questions are asked about a specific topic, how many links are clicked for more information, and how many and which sentences are underlined. If this information is provided in real time, authors can change their textbooks to meet the needs of students, thereby improving the overall results.

Moreover, Big Data can give insights into how each student learns. This is important because it affects the student's final grade. Some students learn very efficiently, while others may be extremely inefficient. When course materials are available online, how a student learns can be monitored. This information can be used to provide a customized program for the student or provide real-time feedback about how to study more efficiently and thus improve results.

Reduce Dropouts, Increase Results

All these analyses will improve student results and perhaps also reduce dropout rates. When students are closely monitored, receive instant feedback, and are coached based on their personal needs, it can help to reduce the number of dropouts,33 which benefits educational institutions and society.

Educational institutions using predictive analytics on all the data that is collected can gain insights into future student outcomes. These predictions can be used to change a program if negative results are predicted or even run scenario analyses on a program before it starts. Universities and colleges will become more efficient in developing programs that will increase results, thereby minimizing trial and error.

After graduation, students can continue to be monitored to see how they perform in the job market. When this information is made public, it will help future students choose the right university. Big Data will revolutionize the learning industry in the coming years. More and more universities and colleges are already turning to Big Data to improve overall student results. Smarter students who study faster will have a positive effect on organizations and society.

PURDUE UNIVERSITY ACHIEVES REMARKABLE RESULTS WITH BIG DATA

Purdue University, located in West Lafayette, Indiana, has over 40,000 students and 6,600 staff members.34 Founded in 1869, it was honored as having America's most innovative campus retention program in 2012.35 Purdue University has prepared for the future by adopting Big Data, which has already achieved significant results.

Purdue University developed Course Signals, a system that helps predict academic and behavioral issues and notifies teachers as well as students when action is required.36 The system ensures that each student achieves maximum potential, while decreasing the dropout rate and failing grades. The platform has been very successful and even won the Lee Noel and Randi Levitz Retention Excellence Award in 2012.37 Course Signals is commonly viewed as the best example of how analytics can be applied to higher education to help improve student results so that they graduate in a timely manner.

Course Signals combines predictive modeling with data mining on Blackboard.38 It uses various sources of data, such as course management and student information systems. As of week two in a semester, the data-mining tool is able to interpret a student's academic preparation, engagement in, and effort within a course, and academic performance at a given point in time.39 To achieve this, it uses student characteristics and academic preparation, the effort students put into the course (sessions, quizzes, and discussions, as well as time required to perform a task), and (past) performance (grades received so far and book data).

The algorithm predicts a risk profile for each student based on an easy-to-understand system: green (there is a high likelihood of success in a particular course), yellow (there are potential problems), and red (there is a risk for failure). The fact that this prediction can be provided in only the second week of a semester gives the students ample opportunity to improve results. The system immediately provides the students with various resources that can help them improve in the course. The risk profile can be adjusted per course.
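Purdue has not published the internals of Course Signals, so the following is only a toy illustration of a traffic-light score built from the kinds of inputs described above; the weights and thresholds are invented:

    # Toy traffic-light risk score in the spirit of Course Signals.
    # Weights and cutoffs are invented; Purdue's actual model is proprietary.
    def risk_signal(grade_pct, logins_per_week, quizzes_done_pct, prep_score):
        score = (0.5 * grade_pct                            # performance so far
                 + 0.2 * min(logins_per_week / 5, 1) * 100  # effort in the course system
                 + 0.2 * quizzes_done_pct                   # engagement with assessments
                 + 0.1 * prep_score)                        # academic preparation
        if score >= 75:
            return "green"    # high likelihood of success
        if score >= 55:
            return "yellow"   # potential problems
        return "red"          # at risk of failure

    print(risk_signal(82, 6, 90, 70))   # -> green
    print(risk_signal(48, 1, 30, 60))   # -> red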

Course Signals provides the teachers with feedback as well. When they run the software, teachers can follow up with the students instantly as problems arise. Teachers can run the software as often as they want, but the prediction is only updated when the program is accessed. The system has been in use since 2007, and the results are remarkable: improved grades for students and higher retention rates. As the website of Course Signals states, “As and Bs have increased by as much as 28 percent in some courses. In most cases, the greatest improvement is seen in students who were initially receiving Cs and Ds in early assignments, and pull up half a letter grade or more to a B or C.”40

Purdue University has also partnered with EMC to solve its Big Data storage problems.41,42 All 40,000 students will receive 100 gigabytes of space (that is, four petabytes of storage in total), and they will work together to develop new ways to process, analyze, transfer, and manage the massive research datasets in the field of bioinformatics, among others.

It may be obvious that Purdue University has identified Big Data as very important for research and education. The institution is currently recruiting for several faculty positions to stimulate and further develop its Big Data agenda.43 Purdue is far ahead of other educational institutions in adopting and implementing a Big Data strategy.

ENERGY INDUSTRY

Since the invention of the steam engine in the seventeenth century, we have come a long way in developing and supplying the world's energy. We have created networks that deliver electricity to 75 percent of the world.44 As a result of Big Data, we can now take the next step in the evolution of energy. Big Data can turn the existing old energy networks into smart networks that understand individual energy consumption. This will increase efficiencies, lower prices, and reduce our global carbon footprint.

The Smart Energy Grid

In the (near) future, more and more appliances will have sensors; they will become part of the Internet of Things. These sensors will enable bidirectional communication with energy companies, smart meters, and other in-home appliances.45 As a result, the energy consumption of individual devices can be monitored and adjusted, if desired. Energy organizations are already developing smart meters that record consumption of electric energy in intervals and send that information back to the energy company, which can then understand and predict demand.

When more devices have sensors, products will be able to talk to each other, as well as to the different networks. This will help energy companies manage the utilization across the network. This is especially useful and important for the future of electric cars. Energy grids will not be able to cope with peak demand when consumers all plug in their electric cars at the same time as they get home from work.46 The more devices that have sensors and can talk to the energy network, the better energy companies can try to adjust to demand. A true smart grid, however, is still far away.

Such a smart grid will prevent energy losses and power outages across the network. Sensor systems called synchrophasors can monitor in real time the condition of power lines, collecting multiple data streams per second.47 The sensors can also detect how the energy travels across the network and where and when energy is lost. This information can detect blackouts and provide energy companies with the possibility of responding faster when an outage occurs.48

Battelle's Pacific Northwest Smart Grid Demonstration Project is such a pilot smart grid, with 60,000 participants across five states.49 This project aims to determine whether smart grids are as valuable as we think and whether they even make sense economically, as a smart grid requires substantial investments in hardware and software. Such a grid will also increase data tremendously, as meter reading will go from once a month to once every 15 minutes. This works out to 96 million reads per day for every million meters.50 The result is a 3,000-fold increase in data that can be overwhelming if not properly managed.
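Those data-growth figures are simple to reproduce:

    # Reproducing the smart-meter data-growth arithmetic from the text.
    meters = 1_000_000
    reads_per_day = 24 * 60 // 15          # one read every 15 minutes = 96 per meter
    print(meters * reads_per_day)          # 96,000,000 reads per day per million meters
    print(reads_per_day * 30)              # ~2,880 reads per meter per month, versus 1
                                           # before: roughly a 3,000-fold increase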

Change Consumer Behavior

Consumers who can manage their own energy consumption based on real-time data and energy prices will probably change their behavior. A smart meter can advise consumers when to use a device to take advantage of times when energy costs are lower, based on predicted demand. This will help energy companies better manage energy demand. If appliances (for example, washing machines) can determine for themselves the best time to start working, based on set price ranges and energy demand on the network, even better results can be achieved.
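In code, an appliance's decision rule can be remarkably small. Here is a sketch of a washing machine picking the cheapest acceptable start hour; the day-ahead prices, allowed hours, and price cap are all invented:

    # Sketch: an appliance picks its start hour from a day-ahead price forecast.
    price_forecast = {2: 0.08, 7: 0.28, 9: 0.25, 13: 0.18, 15: 0.16, 22: 0.11}  # EUR/kWh

    def best_start(price_forecast, allowed_hours, max_price):
        candidates = {hour: price for hour, price in price_forecast.items()
                      if hour in allowed_hours and price <= max_price}
        if not candidates:
            return None                    # no acceptable window; wait for better prices
        return min(candidates, key=candidates.get)

    # The owner allows daytime-only runs at no more than EUR 0.20 per kWh
    print(best_start(price_forecast, allowed_hours=range(8, 18), max_price=0.20))  # -> 15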

Forecast Demand and Prices

A smart grid that has connections with millions of appliances can estimate energy consumption for large regions. Monitoring how devices are using energy provides valuable data that can be analyzed to predict energy needs and possible shortages. This information can be used to deliver the right amount of energy to the right place at the right time. It can help flatten the peaks of energy across time and place. Energy distribution organizations can thereby improve both customer satisfaction and regulatory compliance by reducing the number and duration of power outages.51 If energy companies can start making connections between network failures and events, they can also begin to understand the patterns that could indicate network problems, as well as isolate the locations and identify solutions in real time.52

When the smart grid flattens peaks in energy demand, the network will become more reliable. The problem with current networks is not so much capacity, but rather the ability to cope with peak demands. Smart grids can help minimize those extreme peaks that could cause power outages.

Big Data will also help optimize energy trading and thereby better anticipate price volatilities by performing sophisticated, almost real-time analysis of the market based on thousands of different datasets.53 Predicting energy supply and demand will help organizations sell energy profitably and hedge if needed. By understanding the market, they can protect themselves against the fluctuating pricing of energy. In the end, they will be able to deliver energy cheaper and increase customer satisfaction.

Future Investment and Maintenance

Insights generated by analyzing the vast amounts of sensor data that come from the network can provide extra information about the quality of the network itself.54 The data can help determine where future investments are necessary and where maintenance is needed. Instead of checking the network at regular intervals, Big Data tools can monitor equipment across the network in real time and take action only when it is required. This will save organizations money, as unnecessary investigations of possible problems will be prevented. The same information will inform companies which investments will yield the greatest returns.

Vattenfall, a Swedish power company, for example, has installed sensors in its wind turbines to predict when maintenance is needed.55 This saves the company money on needless helicopter flights to the turbines, unnecessary maintenance checks, and costly consulting.

Big Data can also be used to improve wind turbine placement for optimal energy output. The constantly changing weather data on micro and macro levels can help organizations forecast the best spots for their wind turbines or solar systems by informing them where, on an annual basis, the most wind or sun is foreseen. When combined with structured and unstructured data, such as tidal phases, geospatial and sensor data, satellite images, deforestation maps, and weather modeling, it can help pinpoint the best place for installation.56

The Danish energy company Vestas Wind Systems, for example, uses IBM Big Data analytics to analyze many different datasets to determine the best place for each wind turbine. Placing wind turbines at the wrong spot can result in insufficient electricity to justify wind energy investments, as well as an increase in electricity costs.57

Therefore, Big Data's most important effect on the energy sector is the development of smart energy grids that make existing networks more efficient. This will reduce energy consumption and prices. Smarter energy management can keep overloaded grids running and prevent the need for building new and expensive power plants.58 Fewer power plants delivering more efficient energy at lower prices will affect our carbon footprint. So, in the end, it might turn out that Big Data is the most sustainable technology, reducing our impact on the environment even more than renewable energy sources.

FINANCIAL SERVICES INDUSTRY

If any industry can benefit from Big Data, it is the financial services industry. Of course, the first use that comes to mind is the ability to reduce risks, whether it be credit risk, default risk, or liquidity risk. But, many more possibilities exist. This industry, which has lost the trust of its customers since the crisis of 2008, can turn to Big Data to understand customers better and improve customer satisfaction. However, Big Data also offers other possibilities.

Reduce the Risk Companies Are Facing

Financial services firms have the ability to create an individual risk profile for each customer based on many variables, such as past purchasing behavior, online and offline social networking, way of life, and information from public datasets. The more data that is used, the more accurate the risk profile, thereby decreasing credit default risks. The insurance company Insurethebox is a pioneer in using Big Data to reduce risk.59 Customers install a device in their cars that measures exactly how, when, and where each insured vehicle is driven. Based on this information, an algorithm determines the driver's behavior (including acceleration and deceleration behavior, among other factors), and a corresponding risk profile results. Each customer then receives a tailor-made insurance offer. The better a customer drives, the better the offer.
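Insurethebox does not publish its scoring model, so the sketch below only illustrates the general shape of telematics scoring; every weight, event rate, and band is invented:

    # Invented telematics driving score, not Insurethebox's actual model.
    def driving_score(km, harsh_brakes, harsh_accels, night_km, speeding_km):
        per_100km = 100.0 / km
        penalty = (4 * harsh_brakes * per_100km      # harsh deceleration events
                   + 3 * harsh_accels * per_100km    # harsh acceleration events
                   + 0.2 * night_km * per_100km      # share of riskier night driving
                   + 0.5 * speeding_km * per_100km)  # km driven above the limit
        return max(0, round(100 - penalty))

    score = driving_score(km=1200, harsh_brakes=6, harsh_accels=4,
                          night_km=90, speeding_km=30)
    print(score, "-> discount" if score >= 80 else "-> standard premium")  # 94 -> discount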

Algorithms can also analyze trades and conduct high-volume transactions in nanoseconds to optimize returns and reduce trading risks, while taking into account different market conditions, such as the pricing of products or future demand.60 Risky trades can be blocked automatically, or the algorithm can flag high exposure in a changing market.

Big Data technologies also improve enterprise risk management. It is possible to add and use different data sets to determine the risk profile of a client who is requesting a loan. Factors such as claims, new business, investment management, or lifestyle of managers will provide a better picture regarding an organization's appetite for risk than a business plan based on many different and unknown future variables. The result will be more sophisticated and accurate predictive models that will reduce enterprise risks and help companies.

Fraud can also easily be detected with Big Data. For example, analyses could show when a customer suddenly deviates from a standard and long-standing pattern. Outlier detection is a powerful tool for discovering anomalies. Algorithms can instantly detect when a credit card is used in distant locations within a short time frame, which can indicate possible fraud. Even better is the ability to analyze a transaction based on different datasets while the transaction is taking place, allowing organizations to block a transaction before it has taken place rather than checking it afterward. Visa has implemented a system that can analyze 500 aspects of a transaction at once.61 With an annual incremental fraud opportunity of $2 billion, Visa has every reason to pay a lot of attention to Big Data.62
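The distant-locations check just mentioned is often called an "impossible travel" rule and fits in a few lines. The coordinates, time stamps, and speed threshold below are illustrative:

    # Sketch of an "impossible travel" check on consecutive card transactions.
    from math import radians, sin, cos, asin, sqrt

    def km_between(a, b):
        """Great-circle distance between two (lat, lon) points in km (haversine)."""
        lat1, lon1, lat2, lon2 = map(radians, (*a, *b))
        h = sin((lat2 - lat1) / 2) ** 2 + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
        return 2 * 6371 * asin(sqrt(h))

    def impossible_travel(prev, curr, max_kmh=900):
        """prev, curr: ((lat, lon), unix_seconds). Flags speeds beyond an airliner's."""
        hours = max((curr[1] - prev[1]) / 3600, 1e-9)
        return km_between(prev[0], curr[0]) / hours > max_kmh

    amsterdam, new_york = (52.37, 4.90), (40.71, -74.01)
    print(impossible_travel((amsterdam, 0), (new_york, 1800)))  # 30 minutes apart -> True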

Big Data can also stop criminals who use the “old-fashioned” technique of robbing a bank. Big Data enables banks to understand which ATMs are the most likely to be targeted by criminals and how often. This is based on geographic location and many other datasets. Banks can then take appropriate measures to reduce the risk of a robbery or install smart cameras that can detect criminal activities before they happen.

With a 360-degree customer view, it is possible to understand the individual behavior of the customers and how that will impact future demand. This can be based on historical data (for example, monitoring how someone drives to determine car insurance) and risk models (for example, based on where someone lives in relation to the online presence of that person). This will tell a great deal about the potential risk of an individual and, as such, determine an appropriate price.

Regain Trust of Customers and Improve Customer Satisfaction

As with any industry, financial services firms also want to develop 360-degree profiles and microsegmentations to better understand and approach their customers. These companies offer many different products, from insurance to credit cards to regular banking accounts. Analyzing the use of these products explains a lot about customer behavior. Although banks do not do this (or at least they say they do not), they have the capability to understand customers better than customers understand themselves based on payment information. For that reason, when Dutch payment provider Equens, the largest pan-European payment processor, which takes care of all debit and credit transactions, decided to sell its transaction data, a lot of complaints appeared, and Equens had to withdraw the plan.63

Customer satisfaction can be improved in many different ways. Online tools can be made faster, additional services can be offered (such as instant search results when entering a bank account number), and customer service can be improved by ensuring that representatives who answer phone calls have access to all necessary details because internal systems are aligned and connected.

Develop Products Customers Need

Social media algorithms make it possible to understand the sentiment of customers in real time, providing information regarding how they think about or use new products and services or react to commercials. In addition, algorithms can be used to identify the most important influencers and how they think about products or services. An analysis of how products are used can give insights into how they need to be improved. For example, a bank can analyze how a mobile banking application is used based on location, time of day, where people click, how they move through the app, and how long they use the app or search for items within the app. This can indicate areas that need improvement. Instead of asking customers for feedback using long and expensive surveys, the feedback here is instantaneous and the customer is not bothered. This will help optimize the product.

Increase Sales and Reduce Costs

Humans live pretty predictable lives. As so many products are bought with a debit or credit card, it is possible to find patterns in consumer behavior based on where the cards are used, how much money was involved, and the purpose. When this behavior is monitored, financial services organizations can take action based on future events, such as selling additional products at the right time to the right customer, thereby increasing the conversion rate. An example would be a consumer who suddenly buys more groceries because a significant other moved in.

The financial services industry is also known for large legacy systems that are expensive to maintain. With a Big Data platform, it is possible to migrate legacy data to new platforms, while, in turn, adding valuable data to the analysis. This data can deliver new insights, which could then lead to new revenue opportunities or a reduction in operation costs. Operational efficiencies can further be improved when transaction and unstructured data, such as that collected from voice recognition, social comments, and emails, are monitored and analyzed to anticipate future workloads and change staffing needs accordingly in call centers and branches. In addition, when all customer contact points are collected and shown via one platform, staff will be able to help customers faster and better.

Big Data makes it possible to monitor the activities of clients that could indicate churn. If a customer suddenly starts engaging in fewer bank activities, it could be the result of dissatisfaction and indicate a customer who is about to churn. In addition, if the financial services industry knows who the influencers are within market segments, it can ensure that those influencers do not churn, because if they do, it is possible that others will follow. If these activities can be identified, financial services companies can take preventive actions to ensure customer loyalty.
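A first-pass churn signal of the kind described can be as simple as comparing a customer's recent activity with his or her own baseline. In the sketch below, the window sizes, the 50 percent threshold, and the transaction history are all invented:

    # Sketch: flag customers whose recent activity drops well below their baseline.
    def churn_risk(monthly_transactions, recent_months=3, drop=0.5):
        baseline = monthly_transactions[:-recent_months]
        recent = monthly_transactions[-recent_months:]
        baseline_avg = sum(baseline) / len(baseline)
        recent_avg = sum(recent) / len(recent)
        return recent_avg < drop * baseline_avg   # True = churn risk, worth a call

    history = [42, 38, 45, 40, 41, 39, 18, 12, 9]   # invented transactions per month
    print(churn_risk(history))   # -> True: activity fell by more than half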

The possibilities for the financial services industry are almost endless, but they come with significant privacy issues. As the example of Equens in The Netherlands showed, consumers are very sensitive regarding personal financial information that banks use to make more money, particularly since consumers have lost so much trust in financial institutions in past years. Therefore, even more than for the other industries, the four ethical guidelines are extremely important when implementing Big Data strategy.

MORGAN STANLEY UNDERSTANDS HOW TO LEVERAGE ITS BIG DATA

Morgan Stanley is an American global financial services firm headquartered in New York City. Through its subsidiaries and affiliates, it offers its products and services in 42 countries with more than 1,300 offices. Customers include corporations, governments, financial institutions, and individuals. Morgan Stanley has over $300 billion in assets under its care and employs over 60,000 people globally. For such a large company, traditional databases and grid computing are insufficient to deal with the vast amounts of data created, so the company started using Hadoop back in 2010.64 In the past years, Morgan Stanley has come a long way and is fully up to speed with Big Data.

In an article in Forbes, Gary Bhattacharjee, Executive Director of Enterprise Information Management at Morgan Stanley, describes how the firm benefits from using Hadoop.65,66 Although the nature of the company means that much cannot be revealed, he does share some insights into how Hadoop helped Morgan Stanley create a scalable and vast solution for its portfolio analysis. Information that used to take months to amass can now be collected in real time as events happen. An example Bhattacharjee shares is that Morgan Stanley uses Hadoop to look at its entire web and database logs to discover problems. For example, when a market event occurs now, the company is capable of understanding the impact in real time. Problems are discovered in real time, and every event is completely traceable as to who did what, how, and when, and what caused the issue.

When Morgan Stanley started with Hadoop, the company used 15-year-old commodity servers, on which Hadoop was installed. Nowadays, Hadoop helps with its mission-critical investment projects.67 With its Big Data projects, Morgan Stanley bets heavily on open-source tools.68 According to Bhattacharjee, open-source tools allow Morgan Stanley's ecosystem to be extremely agile, with short product cycles and innovations happening a lot faster than when using products from HP or IBM, both of which have long product life cycles.

But of course, that's not all of the ways in which Morgan Stanley uses Big Data. For Morgan Stanley Smith Barney (MSSB), a joint venture between Morgan Stanley and Citigroup formed in 2009 that manages $1.7 trillion in assets for four million clients, predictive analytics are used to make better recommendations for investments in stocks, municipal bonds, and fixed income.69 For proper outcomes, predictive software requires vast amounts of data, and MSSB has no shortage of it. Apart from the 450 reports the firm's equity analysts produce daily, employees use large amounts of public and social data to perform their analyses. All the information is used to make recommendations about whether to buy or sell stock based on real-time positions and market conditions. The system is constantly being improved, and advisers can teach the program by deleting unnecessary or incorrect information.

In addition, Morgan Stanley decided to start using wire data to find errors within its applications.70 Wire data is all the data flowing in systems between all physical and logical layers. Real-time wire data analytics can help detect and prioritize problems across the firm's applications by analyzing how those applications behave and then mining that data for useful information. To do this successfully, Morgan Stanley uses software from ExtraHop, a company that helps IT organizations harness the massive amounts of wire data flowing through their environments for real-time operational intelligence.71

Although Morgan Stanley is a global financial services firm, it understands that its data is one of its greatest assets and that it can be used to improve services and drive additional revenue across many different departments within the organization.

GAMING INDUSTRY

There are more than two billion videogame players worldwide. Electronic Arts (EA) has 275 million active users, who generate approximately 50 terabytes of data every day.72 The gaming industry does $20 billion in annual revenue in the United States alone, of which $2 billion is in the subcategory of social games.73 In the United States and Canada, the gaming industry is bigger than the movie industry, which sees annual ticket sales of $10.8 billion.74 The world of gaming is big, growing rapidly, and taking full advantage of Big Data technologies. Gaming companies can drive customer engagement, make more money on advertising, and optimize the gaming experience with Big Data.

An Improved Customer Experience

As with any organization, the 360-degree customer view is important for the gaming industry. Fortunately, gamers leave a massive data trail. Whether it is an online social game via Facebook, a game played on an offline PlayStation, or a multiplayer game via Xbox, gamers create a lot of data in different formats. They create massive datastreams with everything they do, including how they interact, how long they play, when they play, with whom they play, how much they spend on virtual products, with whom they chat, and so on. If the gaming profile is linked to social networks or a gamer is asked to enter demographic data, the information can be enriched with an understanding of what the gamer likes in real life, and gaming companies can adapt the game to the profile of the gamer.

Based on all that data, targeted in-game products can be offered that have a high conversion rate. These recommendations work just like those on ecommerce websites, where products are suggested based on what other customers bought. Gaming companies can recommend features that other players also bought or virtual products that match the gamer's level. This can result in an increased up-sell or cross-sell ratio and additional revenue.
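The "players who bought this also bought" logic can be sketched with simple co-occurrence counting; the items and purchase histories below are invented:

    # Sketch of item-to-item recommendations via co-occurrence counts,
    # the basic idea behind "players who bought this also bought".
    from collections import Counter
    from itertools import combinations

    purchases = [                        # invented purchase histories, one set per player
        {"sword", "shield", "potion"},
        {"sword", "potion"},
        {"shield", "armor"},
        {"sword", "shield", "armor"},
    ]

    co_counts = Counter()
    for basket in purchases:
        for a, b in combinations(sorted(basket), 2):
            co_counts[(a, b)] += 1       # count the pair in both directions so that
            co_counts[(b, a)] += 1       # lookups by either item work

    def recommend(item, owned, top=2):
        scored = [(other, n) for (i, other), n in co_counts.items()
                  if i == item and other not in owned]
        return [other for other, _ in sorted(scored, key=lambda x: -x[1])[:top]]

    print(recommend("sword", owned={"sword", "potion"}))   # -> ['shield', 'armor']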

Engagement can also be increased if analytics show that a player will abandon the game if the first levels are too difficult or if later levels are too easy. Data can be used to find bottlenecks within the game, where many players fail in performing the tasks at hand. Or, it can be used to find the areas that are too easy and need to be improved. Analyzing data from millions of players gives insight into which elements of the game are the most popular. It can also show which elements are unpopular and require action to improve the game. Constant engagement is vital. With the right tools, the right reward can be provided at the right moment for the right person within the game to keep a player engaged.

Big Data technologies also help optimize in-game performance and the end-user experience. When, for example, the game's databases and servers have to cope with a steep increase in online players, it is important to have sufficient capacity. With Big Data, it is possible to predict peaks in demand, anticipate the required capacity, and scale accordingly. This will improve the gaming experience (who likes a slow game?) and thus the end-user experience.
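A minimal sketch of such capacity planning, assuming hypothetical hourly peak player counts and an assumed per-server capacity; a real system would use a proper time-series model fed by the platform's own metrics:

```python
# Hypothetical hourly peak concurrent-player counts for two recent weeks.
last_week = [10_000, 8_000, 25_000, 60_000]
week_before = [9_000, 7_500, 22_000, 50_000]

# Naive forecast: same hour last week, scaled by average week-over-week growth.
growth = sum(l / w for l, w in zip(last_week, week_before)) / len(last_week)

HEADROOM = 1.2            # provision 20% above the forecast
PLAYERS_PER_SERVER = 500  # assumed capacity of a single game server

for hour, peak in enumerate(last_week):
    forecast = peak * growth * HEADROOM
    servers = int(-(-forecast // PLAYERS_PER_SERVER))  # ceiling division
    print(f"hour {hour}: expect ~{forecast:,.0f} players -> {servers} servers")
```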

Deliver a Tailored Gaming Experience

Games that are developed for different consoles or devices (tablets versus smartphones or Xbox versus PlayStations) result in different playing experiences. When all the data is analyzed, it can provide insights into how games are played on different devices and whether the differences among devices present any problems that need to be resolved. When gamers switch between devices, the game should automatically be optimized for the new device based on the player's history.

Big Data also allows the tailoring of advertisements to the needs and wishes of the player. With all the data created by gamers, a 360-degree in-game profile can be created that, when combined with open and social data, gives insights into the likes and dislikes of that gamer. This information can be used to show within the game only those advertisements that match the profile of the gamer, resulting in a higher stickiness factor, more value for the advertiser, and, subsequently, more revenue for the game developer.

Ample opportunities exist for game developers to improve the gaming experience with Big Data, drive more revenue, and make the game faster and better. Game developers should not miss out on Big Data, because the benefits are too big to ignore.

ZYNGA IS AN ANALYTICS COMPANY MASQUERADING AS A GAMING COMPANY

How much data would an online developer like Zynga create and use on a daily basis? The answer is, not surprisingly, a lot. In fact, the company operates on such a large scale that on a regular day Zynga delivers one petabyte of content.75 To cope with these extremely high data demands, Zynga has built a flexible cloud server center that can add up to 1,000 servers in just 24 hours. Its private and public cloud server park is one of the biggest hybrid clouds.76

Zynga is built on top of major platforms, such as Facebook, Google+, and Android/iOS, and offers its own Zynga API. Data at Zynga is divided into two types:

  • Game data, which is Vertica driven and generates approximately 60 billion rows of data and 10 terabytes of semistructured data daily.
  • Server data, which generates over 13 terabytes of raw log data from the server and app logs. This is stored in Vertica or Hadoop.

Interestingly, Zynga's database keeps growing, as it never deletes data because the process required is too complex.77

Metrics-Driven Culture

At Zynga everything revolves around metrics, and for the management at Zynga, metrics are a discipline.78 Management has a strong desire to track progress using metrics. To support this, reports are freely accessible to everyone, and integrating external services is easy. Brian Reynolds, a game designer, explains that at Zynga, the designers are separated from those who analyze the metrics.79 Analysts figure out what questions should be asked, and the designers develop or adjust the game to fit the answers.80

A great example of this data-driven decision-making is how the company pivoted the use of animals in Farmville 2.0. In the original version, animals were merely decorative.81 However, data showed that more and more people started interacting with the animals and even used real money to buy additional virtual animals. So, in Farmville 2.0, animals became much more central, which subsequently drove more revenue.

Due to this metrics-driven culture, Zynga combines art with science.82 Art is needed to create, develop, and implement an idea into a game. With the science behind it, the company listens to customers and determines whether the game is fun or not. Afterward, it can adjust and pivot games if necessary.

Zynga uses a large number of different databases for different tasks.83 For example, it uses Splunk for its primary log analytics.84 A streaming event database built on a 70-node MySQL cluster stores 650 million rows of data daily. The company has sharded transactional databases and uses a Vertica data warehouse.85

The statistics that Zynga generates are gigantic as well. It generates over 6,000 different report types (3,000 every day) and receives 15,000 ad hoc queries from users on a daily basis. Analysts, product managers, engineers, and business intelligence teams use all these insights to optimize and improve operations and products.

HEALTHCARE INDUSTRY

Healthcare is rapidly becoming another digitized industry that will generate vast amounts of data that can be analyzed. If a single fully sequenced human genome accounts for 100 gigabytes of raw data, it is clear that the healthcare industry will generate massive amounts of data in the coming years. All that data can be analyzed to create tailored medicines, improve treatments, and reduce fraudulent behaviors. According to PricewaterhouseCoopers (PwC), fraudulent, wasteful, and abusive behaviors account for an estimated one-third of the $2.2 trillion spent on healthcare in the United States each year, so there is a lot to gain by using Big Data.86,87 The potential of Big Data is enormous, but it will also require substantial investments, time, and energy in the coming decades.

Improve Patient Care

Many countries around the world are implementing Electronic Health Records (EHRs) programs that will optimize and centralize patient information. These EHRs will create a lot of data that, when deidentified, aggregated, and analyzed, will provide a lot of valuable information.88 Such programs combine clinical data from labs and electronic medical records with patient data, historical background information, and social factors into one platform that enhances predictive accuracy and improves treatments. Centralizing the available data will also enhance communication among all patient-care team members, with the overall goal of improving the patient experience and quality of care.89 Algorithms can analyze which treatment (or no treatment) will have the best outcome based on all the data, as well as the personal DNA of the patient.

Big Data technology can also be used to enhance the patient's experience. Hospitals that embed RFID-enabled chips in the cards of doctors, nurses, and patients will be able to manage the healthcare experience effectively.90 These sensors can provide insights into the relationship between patient satisfaction and the amount of time a doctor or nurse spends with the patient. They can also show the distance nurses and doctors have to walk across the hospital to care for their patients and whether it is wise to relocate various departments to minimize travel time and optimize the use of expensive equipment.

Apart from monitoring the behavior of doctors and nurses, it is also possible to monitor patients anywhere, at any time, in real time. The data streams collected from bedside monitors when a patient is in a hospital can be analyzed in real time to detect subtle, but harmful, changes in vital signs and alert medical personnel when a change becomes dangerous. In the future, it will be possible to have sensors inside or attached to your body that will measure vital conditions and alert a doctor whenever necessary.91 Such a telemedicine platform can be especially useful for patients in remote areas or for patients who have difficulties getting to a hospital.92 By using predictive analytics in real time to analyze the data from these sensors, it will become possible to predict a stroke or heart attack before a patient notices anything is wrong.

Out-of-home healthcare can also be improved using quantified-self applications and medical applications on smartphones next to in-body sensors.93 These applications can measure vital signs at regular intervals and help the patient determine the reason for a change. If required, the patient can be sent to a hospital in time, where on arrival the doctors will already know what is going on with the patient.

The objective for doctors and nurses is to eventually improve treatments and reduce hospital (re)admission rates. Algorithms can assist doctors in recommending and determining the best treatment for each situation faced by a patient by taking into account the EHRs, social factors, demographic information, and geographic data. With predictive analytics, the outcome of certain treatments can be analyzed before a patient receives treatment. As such, the best treatment is provided, thereby reducing the need for readmission. This will save hospitals, patients, and insurance companies a lot of money.

Personalized Medicines and Treatments

Sequencing human genomes and DNA has become inexpensive and fast. Between 2003 and 2013, the cost of sequencing a human genome, including analyzing and interpreting it, went from $2.7 billion to only $5,000.94 A sequenced genome will provide doctors with a lot of information about a patient's health, including how the patient will react to certain medicines and the patient's risk for certain diseases. With Big Data, it is possible to improve the sequencing of DNA as well as lower the price to a point where it can be used in treatments regularly.95 In the future, it can be used to tailor medicine to the human genome of a patient to obtain the best results. Combining all the patient's EHRs, diet information, and social factors with the sequenced DNA will enable doctors to recommend a tailored treatment as well as personalized medicine.

Algorithms can also analyze the effects certain drugs will have on different types of patients when combined with other medicines. During simulations, minor changes can be made to medicines and scenario analysis can be performed to determine what will happen when a certain medicine is adjusted. With Big Data and the right algorithms, this will replace the need to perform research on real patients, thereby saving valuable time and money and reducing the time to market for new medicines.

Prevent Fraudulent Behavior

Fraudulent actions can be detected using Big Data when the health insurance company's datasets are linked to public and social data.96 This can reveal whether people are telling the truth, for example, when someone claims illness during a certain period but appears in holiday pictures on Facebook from that same period. In addition, doctors who submit insurance claims for treatments that were never performed can be more easily detected if aggregated claims over the entire population are analyzed, taking into account patient demographics, past procedures and treatments, diagnoses, and the ordering physician's utilization patterns.97 With outlier detection analysis, doctors who claim too many treatments will easily be discovered and can be investigated more closely.
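A minimal sketch of the outlier-detection idea, assuming hypothetical annual claim counts per physician; a real system would control for specialty, patient mix, and region before flagging anyone.

```python
import statistics

# Hypothetical annual claim counts per physician.
claims_per_doctor = {
    "dr_a": 420, "dr_b": 390, "dr_c": 410, "dr_d": 405, "dr_e": 398,
    "dr_f": 415, "dr_g": 402, "dr_h": 388, "dr_i": 417, "dr_j": 1900,
}

values = list(claims_per_doctor.values())
mean = statistics.mean(values)
stdev = statistics.stdev(values)

# Flag physicians far above the population mean for closer investigation.
for doctor, n_claims in claims_per_doctor.items():
    z = (n_claims - mean) / stdev
    if z > 2:  # more than two standard deviations above the mean
        print(f"{doctor}: {n_claims} claims (z = {z:.1f}) -> investigate")
```

A z-score is the crudest possible outlier test; its value here is that it turns "claims too many treatments" into a number that can be monitored automatically across an entire population of providers.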

In addition, Big Data can determine overutilization of treatments, services, or medicines in a short timeframe by comparing historical data. Data about patients traveling great distances to obtain controlled medicine, the billing of “unlikely” services by doctors, or data about treatments related to their geographic area can provide a lot of insight into fraudulent behavior by doctors, hospitals, or patients.

By connecting various (open) datasets, insurance companies and hospitals will have a wealth of information that can reduce the massive amounts of money lost because of fraudulent, wasteful, and abusive behaviors. The same approach that helps prevent health insurance fraud can also be used to target health campaigns more specifically, particularly if certain conditions, treatments, or medicines are seen more often in certain geographic and demographic areas. This will lead to tailored and effective campaigns that will save organizations and society a lot of money.

The opportunities to use Big Data are enormous in the healthcare industry, although it will require a lot of investment. Many challenges still remain to be overcome.

BIG DATA ENSURES A HEALTHY FINANCIAL POSITION AT AURORA HEALTH CARE

Aurora Health Care has 1.2 million customers, 15 hospitals, 185 clinics, more than 80 community pharmacies, and over 30,000 employees, including over 6,300 registered nurses and nearly 1,500 physicians. This creates massive amounts of data. The not-for-profit Aurora Health Care system has decided to put that wealth of data to use to improve decision making and make the organization more information-centric.

In 2012, Aurora finalized Smart Chart, a $200 million records system that gathers all data collected in the past ten years into a single data warehouse. It began as an effort to “get nationally recognized measures of our clinical performance by scoring them against national standards,” explains Phil Loftus, Chief Information Officer of the health group, in Forbes.98 In other words, Aurora sought to combine nationwide data with its own data to benchmark results and create a national reputation for quality. The company used clinical data and data-mining tools to analyze the large amounts of data to achieve better insights.

Aurora created a hybrid Business Intelligence (BI) ecosystem blending a message-centric Extract-Transform-Load (ETL) approach, which uses a Relational Database Management System to do all dimension and fact table processing, with a Big Data platform.99 The SQL-MapReduce and nPath-enabled analytics platform handles the traditional business intelligence reporting, as well as the next-generation Big Data analytics.

Aurora processes 18 different main streams of data in near real time. These include financial, pharmacy, laboratory, and procedure data. The goal is to use all data in a highly secure and effective manner. The tasks are computed using a massively parallel processing system with multiple low-cost microprocessors, giving it 20 to 30 times more computing power than a traditional data warehouse.100 This allows Aurora to look at the data differently, as well as to shift the analytics from individual patients to groups of patients who have the same diseases, such as diabetes or heart failure. This discloses new trends and insights and helps researchers more easily find the right patients for testing new medications. In addition, the Aurora system keeps thorough records of each patient's history. This data is available to doctors, nurses, and caregivers throughout the system, which ensures that patients get accurate diagnoses and the best treatment based on their personal information.

Aurora also treats more patients at home. In 2013, nurses equipped with laptops visited approximately 2,300 patients at home. Secure wireless datacards enabled these nurses to access the computer system and review all relevant patient information.

Using all available data and near real-time data analytics, Aurora can predict and improve patient treatments and outcomes.101 Using the different data streams, Aurora has decreased patient readmissions by 10 percent, which translates into a total saving of $6 million. Data helps doctors lower the cost of care by analyzing outcomes and recommending different procedures.

Aurora also decided to join and participate in the development of the Oracle Health Sciences Network, an information-sharing platform in the cloud that enables cooperation among life sciences institutes, researchers, and healthcare providers.102,103 Aurora wanted to join this network as it will help improve the healthcare patients receive through participation in drug trials.

The results of the Big Data endeavor by Aurora Health Care are interesting. In addition to the savings achieved by reducing patient readmissions, the organization reduced query time, improved data insights, and saved 42 percent on treatment costs.

LEGAL PROFESSION

How Big Data Can Improve the Practice of Law

Many industries are starting to see the benefits that can be reaped from analyzing and visualizing the vast amounts of data created nowadays. The more conservative industries have been slower to adopt the new technology, but some are slowly waking up and looking in the direction of Big Data. One of these is the legal profession, including the judicial system and law firms.

However, many questions remain, such as: What are the benefits of Big Data for courts or law firms? How can Big Data help overcome common court procedural issues, such as overburdened dockets, delays, and rising costs? How should the legal system deal with sensitive data from trials? What are the implications of Big Data for legal practitioners? Although Big Data is new to this industry, there are some great examples of its application. Let's dive deeper into this issue.

InformationWeek described the case of the boutique law firm Thomas Horstemeyer in Atlanta.104,105 It has approximately 60 employees and specializes in the field of intellectual property. Instead of maintaining archives of the documentation about its different cases, the firm has moved everything to a private cloud. It has several storage area networks (over a dozen terabytes of data) in the office, and it performs several types of analyses on that data. To create a purely virtual environment, the law firm upgraded its firewalls, added load balancing, virtualized its servers, and replaced its phone system with VoIP.106 In addition, it saved on capital expenses by no longer requiring vast amounts of storage space for old case files.

Although this does not seem to have much to do with Big Data, it is a beginning, as law firms traditionally tend to hold on to paper documents. However, digitization of paper documents allows for faster analysis of the available data and less time spent searching for information in old case files.

Other applications show how Big Data can contribute to the legal profession. First, it can greatly reduce costs and speed up court procedures, particularly when vast numbers of files and other relevant data can be analyzed instantly and correlated. To do this, law firms need to correctly collect, store, catalogue, and organize all their data. Computing power is now strong enough and cheap enough to store all that data. In the future, this could lead to completely new insights related to cases and could enable lawyers and public prosecutors to answer questions that are currently unanswerable.

For example, law firms could use algorithms that offer predictions on certain cases based on how similar cases fared in the same jurisdiction in the past. The small Californian law firm of Dummit, Buchholz & Trapp uses such technology, developed by LexisNexis, to determine in 20 minutes whether a case is worth taking on or not.107 In the past, this took 20 days.

Second, Big Data will drive transparency into the legal profession, which will benefit both lawyers and corporate clients. The tool TyMetrix LegalView Analytics, for example, collects vast numbers of invoices totaling tens of billions of dollars in legal spending on an ongoing basis.108,109 This helps law firms benchmark themselves against the industry and determine the right price for certain cases. There are also tools, such as Sky Analytics, that help law firms reduce and benchmark legal spending and control costs.110 These tools give law firms an unparalleled macroview into the costs of services, as well as provide advice for clients on how to cut the best deal on legal services in any given location.

Consumers can also take advantage of the democratization of data in the legal profession. The app RateDriver enables users to quickly determine the appropriate rate they should expect to pay for attorney's fees in 51 U.S. markets.111,112

Finally, Big Data can find new evidence that can be used in court. Several American case examples indicate that Big Data collected and analyzed from public datasets can be admitted as evidence.113 In addition, the legal profession has always been data driven, but until recently all that data was on paper. Now that legal firms are slowly moving to the digital world, vast new opportunities are being created to improve research. Digital data at law firms can easily be connected to open and public datasets, thereby providing additional clarification and new insights. As LexisNexis Chief Architect Ian Koenig explains, “it is allowing us to find the right needle in the stack of needles.”114

More and more Big Data startups that focus specifically on the legal profession are appearing in the market. One is Juristat, a St. Louis–based startup that applies the moneyball approach to court jurisdictions in America.115 Juristat provides actionable analytics to lawyers and law firms, allowing them to optimize litigation strategies, marketing, and internal operations.116 The tool can go even further: it can predict how a flu outbreak might affect a jury verdict.

Big Data is still just beginning to affect the legal profession, and there is still a long way to go. Among other reasons, law firms are reluctant to digitize data because so much of their data consists of confidential information that might raise privacy and security issues. For many firms, Big Data therefore poses many risks and challenges, but, in the end, the only way forward is the digital way.

MANUFACTURING INDUSTRY

In 2013, General Electric (GE) announced the development of the “predictivity” platform in cooperation with Amazon Web Services.117 The goal was to develop an Industrial Internet ready for the Big Data era. The Industrial Internet can be seen as the integration of complex physical machinery with networked sensors and software. It is a Hadoop-based software platform for high-volume machine data management, and it will provide industrial companies with a collective architecture, merging intelligent machines, sensors, and advanced analytics. The development of such a platform shows the massive opportunities for using Big Data in the manufacturing and industrial sectors.

The Industrial Internet will need to overcome some serious challenges before it is common in the manufacturing industry. First of all, turning a factory into a smart factory requires a large investment and a new way of working. It also requires Big Data standards and an ecosystem to ensure smooth operation between different companies, which has yet to be developed. Of course, the large number of sensors in the machinery will create a lot of data, probably driving us into the brontobytes era fairly soon. This will require powerful analytics that can handle vast amounts of data. In addition, providing security for all this data is a vital issue, as malware in the Industrial Internet could do more than “just” affect sensitive information; it could trigger direct physical destruction.118 The Industrial Internet is just one of the benefits of applying Big Data in the manufacturing industry.

Optimize Operational Efficiencies

Using data from the production process and sensors attached to machines, the entire operation can be analyzed to understand how each section is performing. The moment a process deviates from the standard, an alert can inform the factory of the problem. Mistakes can be found much faster, errors can be solved, and bottlenecks can be eliminated. The sensors can identify the problem for the engineer, who will then know how to fix it expeditiously. With Big Data technologies, it is also possible to model the production of industrial products virtually, thereby optimizing the production process.
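A minimal sketch of such deviation alerting, assuming hypothetical historical temperature readings from one machine; a real factory would maintain control limits per sensor and per process step.

```python
import statistics

# Hypothetical historical temperature readings (degrees C) from one machine.
historical_temps = [71.8, 72.1, 71.9, 72.0, 72.2, 71.7, 72.0, 71.9]

# Derive control limits: mean plus or minus three standard deviations.
mean = statistics.mean(historical_temps)
stdev = statistics.stdev(historical_temps)
lower, upper = mean - 3 * stdev, mean + 3 * stdev

def check(reading):
    """Alert when a new reading deviates from the historical standard."""
    if not lower <= reading <= upper:
        print(f"ALERT: reading {reading} outside [{lower:.2f}, {upper:.2f}]")

check(72.1)  # within limits: no alert
check(74.5)  # deviates from the standard: triggers the alert
```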

When all this information is visible in one central dashboard, the transparency created can help manufacturers improve their production processes. In addition, many organizations keep their data in silos across the company. In large multinational organizations, this information can be especially difficult to retrieve. Big Data can help organizations centralize all information onto one platform (in the cloud), giving all employees access to relevant information based on their roles. Creating a product lifecycle management (PLM) platform that integrates datasets from multiple systems will significantly increase effective and consistent collaboration across the organization.

When information is accessible from a centralized platform in the cloud, it ensures that all departments throughout the organization will work with the same data. This will decrease the number of errors and, as such, increase operational efficiency. In addition, operational efficiency will be further increased when data sources from relevant suppliers are taken into consideration. As a result, suppliers will have more accurate information about when to deliver what materials.

Optimize the Supply Chain

Large Original Equipment Manufacturers (OEMs) can have thousands of suppliers delivering tens of thousands of different products to manufacturers. Each will trade at its own price, depending on market forecasts and other variables, such as sales data, market information, events happening in the world, competitor data, and weather conditions. Using sales data, product sensor data, and data from the supplier's database, industrial manufacturers will be able to predict demand accurately in different regions around the world.

The ability to track and predict inventory and prices and to buy when prices are low can significantly reduce costs for manufacturers. If manufacturers also know, using sensors in the products, when products are about to break and which part is needed where, they can forecast inventory needs and optimize the supply chain even more. Collaboration with different players within the supply chain can help shape demand at the factories to deliver a better B2B experience.

Save Money

Centralized monitoring of all processes during the manufacturing of the equipment can be accomplished using sensors. This can show anomalies or peaks in energy consumption, which can be used to optimize energy usage during the production process. For example, heat can be adjusted in buildings based on the number of people present.

The Industrial Internet enables manufacturing companies to use Big Data technologies to improve operational efficiencies, reduce costs, and create better products. It offers a lot of possibilities. In the coming years, more organizations within the manufacturing industry will see the need to optimize the supply chain and operational processes using Big Data, helping the Industrial Internet as envisioned by GE to become reality.119

FORD DRIVES IN THE RIGHT DIRECTION WITH BIG DATA

Ford uses Big Data to learn what its customers want and to develop better cars in less time. To develop a product that requires 20,000 to 25,000 different parts, Ford is betting heavily on Big Data. Ford actually opened a lab in Silicon Valley just for this purpose.120 To improve its cars in terms of quality, fuel consumption, safety, and emissions, Ford aggregates data from over four million cars that have in-car sensors and remote app management software. All data is analyzed in real time, allowing engineers to notice issues immediately and to understand how the cars respond in different road and weather conditions and to any other forces that affect them.

Ford is already installing over 74 sensors in cars, including sonar, cameras, radar, accelerometers, temperature sensors, and rain sensors.121 As a result, its Energi line of plug-in hybrid cars generates approximately 25 gigabytes of data every hour.122 This data is processed in the factory in real time, and data is fed to the driver through a mobile application. The cars in Ford's testing facility generate up to 250 gigabytes of data per hour from high-resolution cameras and an array of sensors.123

Big Data at Ford is nothing new. As far back as the 1990s, the company began using in-car analytics. Then, in 2004, it developed a self-learning neural network system for its Aston Martin DB9.124 This system was capable of keeping the engine functioning correctly, optimizing conditions to match the driver's behavior, and adjusting alerts and performance accordingly.125 Since then, the culture of Ford has become data driven, although selling internal Big Data opportunities was more difficult than selling external Big Data opportunities.

Externally, for example, Ford used Big Data to find out what improvements people wanted in their cars. Nowadays, Ford listens carefully to what customers are saying online, on social networks, or in the blogosphere. The company wants to learn whether, for example, the Ford Escape sport-utility vehicle should receive a standard liftgate or a power liftgate.126 In addition, Ford performs sentiment analysis on all sorts of content online and uses Google Trends to predict future sales.127

Internally, Ford uses Big Data to optimize its supply chain and to increase operational efficiency. From before parts even reach the factory to the car in the showroom, Big Data has infiltrated every part of the supply chain, thereby generating vast amounts of data.128 With so many different parts coming from so many different suppliers, it is vital for Ford to get a complete and detailed overview of where all parts are located within the supply chain at any moment in time.

Information from websites, call centers, and the company's credit-processing arm, as well as sensors within cars, is used to improve products and services to better match customer demand and needs. In addition, Ford uses assembly sensors to optimize the production of its cars.

To collect and process all that data requires the right Big Data tools. Ford relies mainly on open-source tools, such as Hadoop, to manage the data and the programming language R to perform the statistical analysis. A range of other open-source applications related to text mining and data mining are also used.

The car manufacturer can no longer operate without understanding every aspect of its production. It also needs to know how cars are being used by drivers. Competition is fierce, and those companies that obtain valuable insights from Big Data will outperform their peers. Ford is driving in the right direction with its Big Data strategy to stay ahead of its competitors.

NOT-FOR-PROFIT SECTOR

How Big Data Can Help the Developing World Beat Poverty

The amount of data created is not only growing in the developed world, but also in the developing world. However, a large part of the data created in the developing world has a different origin than elsewhere; the developing world is skipping the desktop and wired era and progressing rapidly to the mobile era. This requires a completely new approach, but it also offers a wide range of possibilities for overcoming poverty.

The United Nations also sees the possibilities of Big Data. In 2009, U.N. Secretary-General Ban Ki-moon launched the Global Pulse initiative.129 Global Pulse serves as an innovation lab created to raise awareness of the opportunities of Big Data and to bring together different stakeholders, such as data scientists, data providers, governments, and development sector practitioners. The objective is to help catalyze the adoption of Big Data tools and technologies and to help policymakers understand human well-being and emerging vulnerabilities in real time to better protect populations from shocks.

In addition to the United Nations, the World Economic Forum (WEF) is discovering the possibilities of Big Data for the developing world. The WEF prepared a white paper discussing the possibilities of Big Data and the new opportunities it offers for international development.130 The World Bank is researching Big Data and has developed a map that visualizes the locations of World Bank-financed projects to better monitor development impact, improve aid effectiveness and coordination, and enhance transparency and social accountability.131,132 Finally, the International Aid Transparency Initiative makes information about aid spending easier to access, use, and understand.133 Of course, these are just a few of the many new initiatives.

Anoush Rima Tatevossian, who leads the global strategic partnerships and communications for the United Nations Global Pulse, notes that Big Data “offers a new tool in the development toolkit, and must be approached with a nuanced appreciation of its power, and also of its limitations.”134,135

Mobile Data

For vast numbers of poor people, a simple or basic mobile phone is the only interaction with the web. Although smartphones are the common device in the developed world, they still account for only 10.44 percent of global mobile website traffic.136 On the other hand, traditional mobile phones represent 78.98 percent of worldwide mobile website traffic (with tablets taking 10.58 percent). Luckily, vast opportunities exist for the developing world to use data created by basic mobile devices to identify needs, provide services, and predict and prevent crises to the benefit of the poor.

For example, Cignifi, a Brazilian startup, developed technology to recognize patterns in the usage of mobile devices.137 The system recognizes phone calls, text messages, and data usage. Based on this information, it can recognize someone's lifestyle and his or her corresponding credit risk profile.

Cignifi uses Cell-phone Call Detail Records (CDRs) to determine a person's credit risk profile and thereby captures vast amounts of data that can be analyzed, such as time, location, recipient's location, duration of call, and so on. These all provide extremely valuable information when analyzed correctly. As Emmanuel Letouzé describes on his blog, CDRs from a city in Latin America could predict socioeconomic levels.138
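A minimal sketch of the kind of feature extraction this implies, assuming hypothetical CDRs; the records and feature choices are invented for illustration and are not Cignifi's actual method.

```python
from collections import defaultdict

# Hypothetical call detail records: (caller, hour_of_day, duration_sec, tower).
cdrs = [
    ("sub_1", 9, 120, "tower_a"), ("sub_1", 14, 300, "tower_b"),
    ("sub_1", 21, 60, "tower_a"), ("sub_2", 2, 45, "tower_c"),
]

# Aggregate raw records into per-subscriber features a scoring model could use.
features = defaultdict(lambda: {"calls": 0, "total_sec": 0,
                                "towers": set(), "night_calls": 0})
for caller, hour, duration, tower in cdrs:
    f = features[caller]
    f["calls"] += 1
    f["total_sec"] += duration
    f["towers"].add(tower)        # mobility: how many places the caller visits
    if hour < 6 or hour >= 22:    # crude proxy for nocturnal activity
        f["night_calls"] += 1

for caller, f in features.items():
    print(caller, "calls:", f["calls"],
          "avg_sec:", f["total_sec"] // f["calls"],
          "towers:", len(f["towers"]), "night:", f["night_calls"])
```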

But CDRs are not the only mobile data that can be used. What about data from the 100 million users who use the app Facebook for Every Phone?139 Most of the Facebook users in the developed countries have probably never heard of this app, but every month it has 100 million active users who connect with each other via their mobile phones (not smartphones). All this valuable mobile data can be used. When combined with other datasets, it can help citizens in developing countries.

Business Use Cases

The Engineering Social Systems department (ESS) of Harvard has collected several inspiring use cases.140 Big Data offers, for example, the possibility of predicting food shortages by combining variables such as drought, weather, migration, market prices, seasonal variations, and previous productions.

Or, what about the possibility of better understanding the dynamics of residents in poor neighborhoods by using mobile data to develop predictive models to serve them better? For example, CDR information can be used to map changes in a neighborhood's population and direct water pipe building efforts for the benefit of residents in poorer neighborhoods.141 Time-series analyses performed on CDRs, combined with random surveys, can lead to better insights into the dynamics of rural economies and show how governments should respond to economic shocks in rural and poor environments.

The World Bank offers an example in which Big Data is used to ensure the distribution of the right medicines to the right location at the right moment in time.142 A pilot program called SMS for Life improved the distribution of malaria drugs at the health facility level in rural Tanzania, reducing the share of facilities without stock from 78 percent to 26 percent.143

Big Data as a Catalyst

Big Data can act as a catalyst for long-lasting improvements, but we will have to look further ahead to see that. Mobile data alone is not sufficient to create opportunities that could impact developing countries in the long run. Therefore, additional data sources are required, ranging from data from nongovernmental organizations (NGOs) to public and social data.

Many different NGOs are active in the developing world; they all do very valuable work to overcome poverty and reduce disease and hunger. What if all those NGOs used one standardized mobile app (smartphone or tablet) to collect data (a predefined set of metrics) in the same consistent manner across villages, countries, and continents? It could create a fantastic high-level overview of what's going on in developing countries.

When the data collected by the NGOs is combined with data from the mobile devices carried by citizens, social data from applications like Facebook for Every Phone, world food market data, and public data from (local) governments, Big Data can truly have a long-term impact on poverty by providing important insights.

The question, of course, remains: Why should NGOs cooperate in creating such a tool? Well, the answer to that is simple: If you share data, you can use the data. It will enable the NGOs to do a better job. Moreover, the same data can be shared with the private sector, such as FMCG companies or manufacturers that want to obtain a better view of emerging markets. Of course, this only works when private companies also share data. This is already happening; it is called “data philanthropy.” The WEF refers to it as “corporations that are encouraged to share anonymized data for use by the public sector to protect vulnerable populations.”144 Nike is one of the pioneers of this approach, as it shares data about the 57,000 different materials it uses with the entire supply chain.145

Of course, governments should also open up their data to the public, private organizations, NGOs, journalists, and entrepreneurs. Kenya is one of the pioneers in Africa when it comes to opening up data. As the WEF report notes: “In 2009, Kenya opened the Open Data Portal where the government shares 12 years of detailed information regarding expenditures and household income surveys, as well as health facilities and school locations.”146,147 The portal can be accessed by anyone via the web or mobile devices.

As is the case in the developed world, governments in the developing countries should take the lead in creating the legal framework for sharing and using open data to protect privacy and ensure transparency, simplicity, compatibility, and security. In addition, the governments should stimulate the development of the required technical infrastructure and creation of an environment where smart individuals and organizations can use data to create new tools and applications. Governments can organize hackathons to develop, together with entrepreneurs, new solutions for poor people. This will ensure that data can be updated continuously through different organizations and becomes available and useful to citizens.

Big Data does, however, also require a cultural and policy change, as a blog post by the World Bank shows.148 Data in a study of 18 developing countries by Esther Duflo and Abhijit V. Banerjee showed that the people in those countries were not literally starving (the study showed that they received enough food), but rather that their diets were not sufficiently nutritious.149 This means that governments should not provide or subsidize more basic foods, such as rice or noodles; instead, they should provide or subsidize more nutritious food. This very important difference was made visible with Big Data.

Big Data offers many opportunities for the developing world to overcome poverty, but it requires that different organizations work together to achieve lasting results. In addition, the joining organizations should ensure the transparency and availability of the data. Transparency will stimulate awareness of the possibilities, ensure data governance, and reduce bureaucracy and corruption.150 Availability of the data will ensure that multiple data sources can be fused, such as CDR, open, social, government, NGO, and corporate data, to create valuable and relevant new insights that will truly have a long-term impact.

MEDIA AND ENTERTAINMENT INDUSTRY

A Paradigm Shift Awaits the Media and Entertainment Industry

The media and entertainment industry is awaiting a paradigm shift. For many years, this industry focused on sending information and entertainment to viewers and users at a moment the industry thought appropriate (think TV channels). Broadcasting schedules were based on historical analyses and what the chief editor deemed best for the audience. With Big Data, this is all changing. Not only will advertising fundamentally change within this industry, but also what shows, series, or movies are developed, when they are developed, and for whom.

To achieve this, media and entertainment organizations should start building detailed 360-degree profiles of their audiences. They can use a vast number of different datasets to accomplish this goal. Behavioral analytics can help discover patterns in (un)structured data across consumer touch points, giving organizations better insights into the different types of customers they have. Consumer patterns derived from analyses of demographic, geographic, psychographic, and economic attributes will help organizations better understand and approach their customers. Accurate information about customers can be derived from sales and marketing data, including campaign information, point-of-sale data, and conversion data. All of this can be used to increase customer retention and acquisition, grow upselling and cross-selling opportunities, increase online conversion, and improve the entertainment experience for the end user. When users also connect their social profiles, the media and entertainment organization can obtain a true 360-degree view of viewers and users.

These profiles can be used in several ways. First, organizations with such detailed customer profiles can increase advertising revenue by offering very targeted advertising that is more relevant and therefore more expensive for the advertiser. Advertising can become multi-platform (think second-screen advertising via tablets during television programs). When they start mining all the data, these organizations will gain valuable and nuanced insights into what the audience really wants.151 This information can be used to create new products around existing shows or develop new programming or movies.

Big Data can also be used to predict whether a movie or a series will be a hit before shooting begins. Historical data about different shows, such as when a user pauses, forwards, rewinds, replays, or stops a television series, provides valuable information.

A combination of the detailed (social) profile of the person watching and the many different tags related to a series or movie created by viewers creates an extremely valuable data stream that provides insights into whether a series or movie will be successful. The best example here is Netflix's purchase of House of Cards based on a thorough data analysis of its 33 million users.152 All that data even allowed Netflix to outbid other major players, including HBO and AMC.153 In addition, the data showed Netflix that a significant number of its users were watching marathon style, so Netflix decided to go against tradition and release the entire season of House of Cards all at once.154
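A minimal sketch of the engagement signals such an analysis might start from, assuming hypothetical per-viewer logs; the metrics and weights are invented for illustration and are not Netflix's actual model.

```python
# Hypothetical per-viewer logs for one season of a series.
viewing_events = [
    {"viewer": "v1", "episodes_watched": 13, "episodes_total": 13,
     "rewinds": 4, "sessions": 2},   # binge-watched the season
    {"viewer": "v2", "episodes_watched": 3, "episodes_total": 13,
     "rewinds": 0, "sessions": 3},   # dropped off early
]

for e in viewing_events:
    completion = e["episodes_watched"] / e["episodes_total"]
    binge = e["episodes_watched"] / e["sessions"]  # episodes per sitting
    # Illustrative weighting of completion, binge behavior, and rewatching.
    score = (0.6 * completion
             + 0.3 * min(binge / 5, 1)
             + 0.1 * min(e["rewinds"] / 5, 1))
    print(f'{e["viewer"]}: completion {completion:.0%}, '
          f'{binge:.1f} eps/session, engagement score {score:.2f}')
```

Aggregated over millions of viewers, signals like these are what let an analysis say, before committing to a new season, whether an audience is merely sampling a show or genuinely hooked on it.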

Finding the secret messages in the data that the audience is unknowingly broadcasting is key if companies in the media and entertainment industry are going to outperform their peers. They need to know: What phase of their lives are viewers in now? What do they think is important? What are they looking for? What do they recommend to whom, and why? What motivates and inspires them? What are they watching, and how do they respond to it via social networks?155 This kind of data, which is generated at every point of contact through all channels, enables media and entertainment organizations to understand viewers’ sentiment in real time. All the collected information can be used to deliver a personalized television experience via set-top boxes or smart televisions, where content is recommended based on the user's profile.

In addition, Big Data can be used to optimize multichannel advertising campaigns. Consumers are watching multiple devices at the same time. Optimizing advertising campaigns across multiple devices will strengthen the message of the commercial. With Big Data, it is possible to understand which consumers use a second screen, as well as where and when. The right message can then be delivered via the right channel.

The sports industry is, of course, a great example of how the entertainment industry leverages Big Data and second screens to improve the experience for viewers. During three tennis grand slams in 2013, IBM enabled viewers to access every available statistic about their favorite player, vote for players, and interact in other ways. IBM's program, SlamTracker, which was also used during the Australian Open, is a real-time statistics and data visualization platform that leverages predictive analytics technology.156 The mobile application provided detailed statistics to the viewer while watching the game on television. This was a great step forward for the viewer, and it is only a matter of time before such tools are also available for television series or movies.

HOW TIME WARNER CABLE USES BIG DATA TO OPTIMIZE THE VIEWERS’ EXPERIENCE

Time Warner Cable is an American cable telecommunications company that operates in 29 states. It was formed in 1989 after a merger between Time Inc.'s cable television company, American Television and Communications Corp., and Warner Cable. It has over 34,000 employees and over 14 million customers.157 The company offers a vast range of services using the latest technology, including video-on-demand, HDTV, digital video recording, Internet access, and premium services such as HBO. In addition, consumers are now using many different streaming services, such as Netflix or Hulu (also called Over-the-Top content [OTT]), all running over the cable network.

As such, Time Warner Cable deals with a lot of data and uses Big Data tools to navigate the changing media landscape and adjust its infrastructure to the changing needs of customers. The audience metrics received from customers provide a lot of insight into what customers want, as well as help create detailed customer profiles for personalized advertising. This can lead to many new revenue streams, which can help cable companies like Time Warner Cable that are seeing customers move to the Internet.

Personalized Advertising

Time Warner Cable uses a vast array of datasets to create detailed customer profiles. In an interview with Fast Company, Joan Gillman, President of Time Warner Cable Media, explained that TWC combines public datasets, such as real estate records, demographics, and voter registration information with local viewing habits.158 This enables Time Warner Cable Media, the company's advertising arm, to create and deliver advertising campaigns that are highly targeted.

But the company does not focus only on personalized advertising; it also focuses on multichannel advertising. With millions of users downloading iPad apps and receiving data from its networks on different devices, a consistent experience is important.159 In a pilot project in Texas, Time Warner allowed clients to create campaigns that simultaneously targeted the same customers via cable television, mobile apps, social media, the Internet, and other platforms. Then, the company used Big Data techniques to measure the engagement of the users on each platform and adjust the advertising campaign on each platform as necessary. For users, this means a consistent advertising experience across all platforms, which is extremely valuable to Time Warner Cable Media's clients.

Detailed Metrics Lead to Detailed Information

For cable companies like Time Warner Cable, aggregated user data is very important.160 With this data, Time Warner can optimize its network and programming. Although it does not track what viewers are watching, it does know how often customers use which services (ranging, for example, from OTT services to interactive television to mobile television via iPads). This data tells the company which customers are affected by bandwidth constraints and how to deal with peaks in network demand.

To understand all this data, Time Warner Cable uses Alteryx.161 It enables the company to understand how viewers watch programming, as well as how advertising clients performed. With interactive campaigns, Time Warner has been able to map the locations of responding viewers to the locations of relevant stores. In addition, thanks to data analysis, it was able to perform cross-platform analysis to predict which homes would be interested in which movies via its Movie-on-Demand platform. This allowed the release of the right movies at the right time to the right homes, thereby increasing sales.

Growing Amount of Data

Time Warner Cable Media operates in 15 different markets and reaches over 7.9 million subscribing customers. All these customers provide a vast amount of data. To store all that data, Time Warner built its own warehouse. Currently, its database grows by 0.6 terabyte every day.162 This might not seem like a lot in terms of Big Data, but it is sufficient for Time Warner Cable Media to create tailored advertising campaigns for its customers.

Running a cable company is an expensive business. With so many new services and Internet television initiatives, cable companies are no longer guaranteed success. Therefore, they will have to innovate to create a tailored experience for viewers as well as for advertisers. Time Warner Cable understands that generating data is an inevitable part of the process and has already successfully used it to find new revenue streams, thereby improving marketing efforts and network infrastructure.

OIL AND GAS INDUSTRY

The multibillion-dollar oil and gas industry uses Big Data technologies to optimize processes and to collect data that provides additional insights, better monitoring, and more revenue. Ample opportunities also exist to use Big Data to increase oil and gas production, as well as to increase safety and mitigate environmental risks.

Exploration and Discovery

Seismic monitors with 2D, 3D, and 4D capabilities can generate vast amounts of data during oil and gas explorations.163 This data can help find new oil and gas fields, as well as help identify potentially productive seismic trace signatures previously overlooked. With multiple parallel processing platforms, the data can be analyzed quickly and accurately, taking into account the different variables that affect the profitability of a new oil well, including production costs, transport of oil or employees, weather-related uptime or downtime, and so on.

Massive amounts of drilling data can be analyzed and monitored in real time, alerting company employees to anomalies based on variables such as weather, soil, and equipment sensor data. This helps predict the success of drilling operations in real time.164

Seismic data can also be used to determine the amount of oil or gas in new or previously overlooked oil wells.165 Combining various datasets, such as historical production and drilling data from local sites, can give additional insights into future production volumes. This is especially useful when environmental restrictions prevent new surveys. When added to public data about weather, ocean currents, and ice flows, an accurate prediction can be made regarding future production volumes.

An additional advantage of placing sensors within oil wells and across the earth's surface is that it provides even more information about how drilling affects seismic activities. The closer the sensors are to the activity, the earlier the seismic activity is detected. Warnings can then be sent to citizens who would be affected by a possible earthquake.

Optimization of Production

Data can be collected from various sources. These include sensor data from equipment about pressure, temperature, volume, shock, and vibration; geological data, such as scientific models for understanding the earth's subsurface; and weather data, such as the impact of storms on rigs. This information can be used to detect errors or upcoming failures during drilling.

Sensors attached to drillheads and other equipment can be monitored to determine how the equipment is operating and to predict when a machine is about to fail or when maintenance is required. When combined with historical data about equipment failure, this makes it possible to monitor all equipment around the world in real time and minimize the impact of failures.
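A minimal sketch of this kind of predictive maintenance, assuming hypothetical daily vibration readings and an assumed failure threshold learned from historical breakdowns.

```python
# Hypothetical daily vibration readings (mm/s) from a drillhead sensor.
readings = [2.1, 2.3, 2.2, 2.6, 2.8, 3.1, 3.4]
FAILURE_THRESHOLD = 5.0  # assumed level at which past breakdowns occurred

# Fit a simple linear trend (least squares) to the recent readings.
n = len(readings)
xs = range(n)
x_mean, y_mean = (n - 1) / 2, sum(readings) / n
slope = (
    sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, readings))
    / sum((x - x_mean) ** 2 for x in xs)
)

# Estimate when the trend will cross the failure threshold.
if slope > 0:
    days_left = (FAILURE_THRESHOLD - readings[-1]) / slope
    print(f"vibration rising {slope:.2f} mm/s per day; "
          f"schedule maintenance within ~{days_left:.0f} days")
```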

All data can be collected and analyzed centrally to better understand what equipment works best in what environment. This will enable organizations to optimize how the equipment is used and reduce latency. Sensor data on equipment can predict failures and indicate required repairs before they affect operations. When this data is added to the ERP of the organization, new spare parts can be ordered before a machine fails and arrive on time for the engineer to use when the data indicates the need. Maintenance planning can be adjusted accordingly, thereby reducing downtime and inventory levels.

Mitigate Risk and Ensure Safety

Data from various sources can detect anomalies in drilling in real time so that decisions to shut down, if necessary, can be made faster to prevent major environmental risks. In addition, video data from smart cameras can show what is happening in real time. Algorithms can identify patterns or outliers that may indicate security breaches, online and offline. Security personnel can then be alerted to take action wherever security is compromised around the world.

Big Data and the oil and gas industry are a powerful combination. Apart from the massive benefits for the oil and gas companies, there are also side benefits for other industries due to the massive numbers of sensors used by this industry, as will be shown in the vignette about Shell.

SHELL DRILLS DEEP WITH BIG DATA

Increasingly, Shell uses Big Data to improve its operations and increase the output from its oil and gas wells. For the last two years, the company has been lowering optical fiber cables fitted with sensors into its wells. With this data, Shell can improve its analysis of wells and determine how much oil or gas is still left. These supersensitive sensors help Shell find additional oil in wells that were thought to have run dry. The sensors, which were developed by HP, generate massive amounts of data that is stored in a private section of Amazon Web Services.166 In two years, Shell has already collected 46 petabytes of data; the first test in one oil well alone resulted in 1 petabyte of information. Knowing that Shell wants to deploy these sensors in approximately 10,000 oil wells, we are talking about 10 exabytes of data. This is the same amount of data that was created daily on the Internet in 2013. Because of these huge datasets, Shell started piloting Hadoop in the Amazon Virtual Private Cloud.167 All data received from the seismic sensors is analyzed by artificial intelligence developed by Shell and rendered into 3D and 4D maps of the oil reservoirs.168 Although the analyses are done in the cloud, the visualizations are immediately available to the crew working at the local facility.

Currently, Shell has a large team of 70 people working full time in its data analysis department, plus hundreds more scattered over the world who participate on an ad hoc basis. The department consists of a mix of specialists in IT, oil and gas technology, mathematics, and physics. They are all working toward the same goal of getting more oil and gas out of the same or new wells.

Although Shell refuses to reveal how exactly it uses what data, it is doing something remarkable with the data that it does want to share. Shell and a few environmental organizations agreed to share data. As Gerald Schotman, Chief Technology Officer at Shell, explains, “during a brainstorm with environmental organizations we noticed that we can help migratory birds across the Sahara in their search for water. Or we share data with organizations concerning the migration of whales.”169

PUBLIC SECTOR

The public sector creates massive amounts of data. It therefore offers ample opportunities for governments to save a lot of public money by using Big Data. According to the United Kingdom free-market think tank Policy Exchange, the government could save up to £33 billion a year by using public Big Data more effectively.170 McKinsey has estimated that the potential annual value to Europe's public sector from Big Data is €250 billion.171 There are many uses available to the public sector, from reducing tax fraud to improving services for citizens.

Improve Transparency and Decision Making While Reducing Costs

Governments can become more transparent and reduce the time officials and citizens require to comply with tax regulations.172 Many government tax agencies store personal data, which is copied all over the public sector. Over and over again, citizens complete new forms with information most governments already possess. Prefilling forms reduces errors and speeds up processing time. The Swedish Tax Agency prefills forms with personal data in a manner that reduces processing times.173 The Dutch government also prefills annual tax forms with information from employers, as well as from banks.

When all data is stored in one central location, government officials can be given access to all the information they need. This will reduce errors and inefficiencies within the government and ensure that the correct information is used, while giving all government officials access to the most up-to-date information on citizens. Governments can use open-source tools like Accumulo, which was developed by the National Security Agency (NSA), to restrict search queries and results based on a government employee's authorization.174
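To illustrate the idea of authorization-based access, here is a toy Python sketch of Accumulo-style cell-level visibility: each record carries a set of required labels, and a query returns only the records whose labels are covered by the caller's authorizations. The records, labels, and simple subset rule are illustrative assumptions; Accumulo itself is a Java system with richer boolean visibility expressions.

    # Toy model of cell-level visibility, inspired by Accumulo; not its real API.
    RECORDS = [
        {"citizen": "J. Smith", "field": "address",    "visibility": {"municipal"}},
        {"citizen": "J. Smith", "field": "tax_return", "visibility": {"tax", "audit"}},
        {"citizen": "J. Smith", "field": "criminal",   "visibility": {"justice"}},
    ]

    def query(records, authorizations):
        """Return only the records the caller is authorized to see."""
        return [r for r in records if r["visibility"] <= authorizations]

    print(query(RECORDS, {"municipal"}))                  # a clerk sees the address only
    print(query(RECORDS, {"municipal", "tax", "audit"}))  # an auditor sees more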

Governments that open up their massive Big Data sets and stimulate the free flow of information contribute to transparency and build trust with citizens.175 Open data enables citizens to understand what data governments collect and what they do with the information. Sharing these datasets will also help governments develop new and innovative services, as citizens will help build solutions and may even make money doing so.176 The transparency will enable citizens to monitor and understand how governments spend public money; this will encourage government officials to spend public money wisely or risk not being reelected.177

A great example is provided by government officials who travel abroad frequently. With an intelligent travel-and-expense management system powered by Big Data technologies, governments could gain a complete overview of which officials have traveled where, when, and for what purpose.178 Government officials who want to arrange a trip can use the system to determine whether travel is really required, so that only the required number of people are sent to the same place. This could save a lot of money.

Personalize Citizens’ Experiences

By analyzing unstructured and structured social/public data, governments can respond to changing events and react quickly if citizens are not satisfied or are in need.179 Segmentation and personalization could help identify citizens who need help because they have become unemployed or otherwise vulnerable. Algorithms can automatically determine the help they require.

Governments can use the unstructured and structured online data as well as voice recognition data from citizen phone calls to understand (national) sentiment, learn what citizens are looking for (from the local neighborhood level to the national level), and help policymakers develop and prioritize new public services.180 In addition, sentiment analysis can be used to discover potential areas of civil unrest so that preventive action can be taken, if required.

Such a personalized approach can also be used during elections to gain a complete understanding of what the voters are looking for and how they can best be targeted. The Obama campaign of 2012 is a great example of how Big Data can help politicians win elections.

Reduce Tax and Social Security Fraud

Taxes mean big numbers and a lot of data. With Big Data tools, governments can minimize tax and social security fraud. Algorithms can use pattern detection to find suspicious transactions as they occur in real time. Combining different local and national datasets will provide insights into the tax-paying behavior of citizens.181

Abnormal behavior patterns can be spotted that indicate fraudulent actions.182 Patterns can be used to create profiles, and statistical parameters can identify suspicious transactions, which can then be monitored more closely.183 Governments can also use demographic or social data to validate information and determine whether suspected outliers are really performing fraudulent actions, such as applying fraudulently for social security.
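As a minimal illustration of such statistical screening, the following Python sketch flags tax filings whose declared income deviates sharply from comparable filings. The three-standard-deviation threshold and the single income field are illustrative assumptions, not an official method; real systems combine many more variables.

    from statistics import mean, stdev

    def flag_suspicious(filings, threshold=3.0):
        """Flag filings more than `threshold` standard deviations from the mean."""
        incomes = [f["declared_income"] for f in filings]
        mu, sigma = mean(incomes), stdev(incomes)
        return [f for f in filings
                if sigma and abs(f["declared_income"] - mu) / sigma > threshold]

    filings = [{"id": i, "declared_income": 42_000 + i * 500} for i in range(200)]
    filings.append({"id": 999, "declared_income": 500_000})  # extreme outlier
    print(flag_suspicious(filings))  # only filing 999 is flagged for closer review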

Keep a Country Safe and Healthy

Data can reveal trends in criminal prosecution and can create profiles of the prison population to determine whether the majority are low-level, nonviolent offenders or highly violent people. Officials in the justice system need to understand when and where violent crime is happening. The Las Vegas police use algorithms, based on historical datasets and a broad range of other datasets, to detect the blocks in town where crime is likely to occur. By applying an information-centric approach based on many different datasets, the efficiency and effectiveness of a criminal justice system will improve.184 Of course, Big Data tools also enable governments to monitor what is happening within their country and to uncover malicious activities that could indicate an upcoming (digital) terrorist attack. Governments can collect, process, and analyze data from their own networks, as well as public data sources, to protect their countries from attack. This is exactly what PRISM (see Chapter 5) does, and it immediately raises the privacy issues involved in securing a nation against (digital) terrorist attacks.
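In the spirit of the Las Vegas example, the Python sketch below ranks city blocks by incident history, weighting recent incidents more heavily. The exponential decay, half-life, and incident data are invented for illustration; real predictive-policing models draw on far more datasets.

    from collections import Counter

    incidents = [  # (block, days_ago) pairs from a historical dataset
        ("5th & Main", 2), ("5th & Main", 10), ("Dock Rd", 3),
        ("5th & Main", 40), ("Dock Rd", 90), ("Elm Sq", 200),
    ]

    def hotspot_scores(incidents, half_life_days=30.0):
        """Score each block; an incident's weight halves every `half_life_days`."""
        scores = Counter()
        for block, days_ago in incidents:
            scores[block] += 0.5 ** (days_ago / half_life_days)
        return scores.most_common()

    print(hotspot_scores(incidents))  # most at-risk blocks first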

Big Data can also protect the environment when there are risks of flooding or other weather-related disasters. Weather, water, and climate data can be combined with data from sensors in dikes, lakes, and rivers to gain a real-time understanding of the current environmental status. For example, these sensors could register when dikes are about to break and villages will be flooded, enabling governments to take preventive action. Instead of replacing or fixing protection barriers that do not yet require repair, sensors can help target maintenance efforts where they are actually needed. This will increase safety levels and reduce costs.

The smart dike (“IJkdijk”) in The Netherlands is full of sensors that help researchers understand complex information flows in water management systems.185 The IJkdijk is a unique international testing facility that led in 2013 to the installation of four LiveDikes in The Netherlands.186 These LiveDikes have sensors that are centrally monitored to provide a real-time picture of the condition of those dikes.

In addition, Big Data can help governments predict epidemics. Google Flu Trends is better at predicting how many people have the flu in certain countries or regions than government officials are.187 Certain search terms are very good indicators of flu activity. Google Flu Trends combines and analyzes those search terms to estimate flu activity around the world in near real time. Like Google Flu Trends, governments can analyze EHRs, as well as social and search data, to understand and predict epidemics of other diseases. This will enable governments to respond faster, thereby improving healthcare while reducing costs.188
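The principle can be shown in a few lines of Python: fit a simple linear relationship between weekly flu-related search volume and officially reported cases, then use this week's searches, which are available immediately, to estimate activity before official figures arrive. The numbers and the single-predictor model are illustrative; Google's actual model weighted many query terms.

    searches = [120, 150, 210, 340, 500, 620]  # weekly flu-related query counts
    cases    = [ 60,  80, 115, 180, 260, 330]  # officially reported flu cases

    # Ordinary least squares fit of cases against searches.
    n = len(searches)
    mean_x, mean_y = sum(searches) / n, sum(cases) / n
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(searches, cases))
             / sum((x - mean_x) ** 2 for x in searches))
    intercept = mean_y - slope * mean_x

    # Search data has no reporting lag, so the estimate is near real time.
    this_week = 710
    print(f"Estimated cases this week: {intercept + slope * this_week:.0f}")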

When governments embrace the possibilities of Big Data, they can make a country more effective and efficient, decrease costs associated with bureaucracy, and improve citizen services.

BIG DATA DURING THE OBAMA CAMPAIGN

During the 18 months before Election Day in the United States in November 2012, the Obama campaign collected and spent more than $1.5 billion. In addition, over 1,000 paid staff worked on the campaign, along with tens of thousands of volunteers. More than 100 data analysts ran more than 66,000 computer simulations every day.189 The objective of the campaign set out by Jim Messina was to “measure everything.” The idea was to collect data on everything that happened during the campaign to ensure that organizers were being smart about everything. According to Chris Wegrzyn, the Director of Data Architecture of the Democratic National Committee (DNC), they had defined three major ways to influence the outcome of the campaign190:

  • Registration: Increase the number of eligible voters.
  • Persuasion: Convince voters to support Obama.
  • Turnout: Increase the turnout on Election Day.

Each potential swing-state voter was assigned a number ranging from 0 to 100. There were four different scores based on the three different ways to influence voters (a toy scoring sketch follows the list):

  • The likelihood that they would support Obama.
  • The possibility that they would show up at the poll.
  • The odds that an Obama supporter who was an inconsistent voter could be nudged to the polls.
  • The degree to which someone could be persuaded to vote for Obama by a conversation on a particular issue.
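A toy Python sketch of such per-voter scoring on a 0-to-100 scale follows; the fields, weights, and linear form are invented for illustration, and the campaign's real models were statistical and far richer.

    def support_score(voter):
        """Crude 0-100 likelihood-of-support score from assumed voter fields."""
        score = 50.0
        score += 20.0 if voter["registered_democrat"] else -20.0
        score += 10.0 * voter["past_dem_votes"]      # 0..4 recent elections
        score += 15.0 if voter["donated_before"] else 0.0
        return max(0.0, min(100.0, score))

    voter = {"registered_democrat": True, "past_dem_votes": 1, "donated_before": False}
    print(support_score(voter))  # -> 80.0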

These scores were at the heart of the campaign and influenced the message sent to swing-state voters. To manage this effectively during the campaign, the team was divided into different channels:

  • Field channel (actively approach voters in the field).
  • Digital channel (focus on recruitment of staff and volunteers and fundraising).
  • Communication/press (focus on the persuasion aspect).
  • Media (focus on persuasion by buying media time).
  • Finance (focus on fundraising).

The problem with these different channels, however, was that data management was fragmented and an overview was difficult to achieve. That was when Big Data made its appearance. During the previous campaign, the team had already learned a lot about new technologies and the use of social media. Now it was time to move forward. In 2008, the new technologies used and the analytics captured during the campaign had allowed Obama workers to build an unprecedentedly massive and efficient program that evaluated data entered by field staff. They then progressed to data modeling and deep analytics. In 2012, they built an analyst-driven organization and an environment in which smart people freely pursued their (data-driven) ideas.

The DNC focused on three dimensions:

  1. Volume: By Big Data standards, the amount of raw data collected was small. The committee had less than 10 TB to start with, but because they allowed their analysts to pursue data-driven ideas, this number increased tenfold in a short period of time.
  2. Variety: Many sources of data were new to the DNC. Because of the short timespan, members did not have time to build ETL processes to bring it all together nicely.
  3. Velocity: The data analysts, staffers, and volunteers created new data at high speed, an issue that needed to be addressed.

To resolve all this, the DNC decided to use a SQL MPP (Massively Parallel Processing) database.191 It offered high-speed performance, stability, and scalability, which meant it could easily grow with the needs of the DNC.

In addition, the DNC built a positive feedback loop, so that the engineers could build on top of each other's work. This proved to be a powerful tool, and it led to unexpected innovations. For example, potential voters could receive tailored news on a topic they had told a volunteer was of interest to them. This information was included in the database and resulted in personalized, relevant emails that could be sent to these potential voters.

RETAIL INDUSTRY

Retailers that implement a Big Data strategy can achieve a 60 percent increase in their margins, as well as boost employee productivity by one percent, meaning there is every reason to move forward.192 The retail industry collects vast amounts of data because any product purchased in a retail store or online generates data that can be analyzed for additional insights. The volume of that data will grow exponentially in the coming years, due in part to emerging new data sources such as RFID tags. Whether the purpose is to provide a smarter shopping experience that influences the purchasing decisions of customers to drive additional revenue or to deliver tailor-made relevant real-time offers to customers, Big Data offers opportunities for retailers to stay ahead of their competition.

Personalized Shopping Experiences

As in any industry, available data can be used to create detailed customer profiles for microsegmentation and offerings, such as a personalized shopping experience. A 360-degree customer view will inform retailers how to best contact customers to achieve the best results, while geographic location data (for example, using the geo tag in tweets, Facebook posts, or smartphones that are Bluetooth enabled) will allow retailers to know just the right moment to make a personalized and relevant real-time offer. Such optimized customer contact moments can also be used to recommend suitable and relevant products.

Retailers can recommend products by cross-analyzing in-store interactions with online behavior and then combining them with demographic, geographic, and transaction data collected online and offline (e.g., through loyalty programs). In addition, the combination of different datasets will enable retailers to pinpoint the most likely purchasers for certain products. This is done using social media data, purchasing history, online and offline browsing patterns, the blogosphere or forums, customer loyalty, and demographic data.

Sentiment analysis will tell retailers how customers perceive their actions, commercials, and available products. What customers are saying online will provide retailers with additional insights into what they are really looking for, and it will enable retailers to optimize their assortments to suit local needs and wishes.

Accurate Demand Forecasts

Retailers can anticipate future demand by combining various datasets, such as web browsing patterns, industry advertising buying information, enterprise data, social media sentiment, and news and event information, to predict the next hot items.193 Using such data as customer transactions, demographics, shopping patterns, research, and local buzz, demand in local areas and different channels can be predicted. When combined, this information will enable retailers to stock and deliver the right products in the right amounts to the right channels and regions. In addition, retailers can improve shipments by evaluating top-selling products, making markdown decisions based on seasonal sell-through, stopping shipments of bottom-selling products, and communicating more effectively with their supply-chain partners to optimize inventory.194 Such accurate demand forecasting will help retailers optimize inventory, improve just-in-time delivery, and reduce related costs.
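One of the simplest building blocks for such forecasts is exponential smoothing, sketched below in Python. The weekly sales figures and the smoothing factor are illustrative assumptions; production systems blend many more signals, such as sentiment and events.

    def exponential_smoothing(sales, alpha=0.4):
        """Return a one-step-ahead forecast; higher alpha favors recent weeks."""
        forecast = sales[0]
        for actual in sales[1:]:
            forecast = alpha * actual + (1 - alpha) * forecast
        return forecast

    weekly_sales = [120, 135, 128, 160, 175, 170, 190]  # units sold per week
    print(f"Next week's expected demand: {exponential_smoothing(weekly_sales):.0f}")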

Based on the forecasted demand, retailers can start using models that incorporate weather, seasonal, and news data to have the right number of staff present in stores.195

Innovative Optimization Possibilities

Customer demand, competitor activity, and shareholder value data can feed models that automatically synchronize pricing with inventory levels, demand, and competition. In addition, market information about events or stories in the news, as well as relevant supply-chain information, can be used to adjust pricing.
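A toy repricing rule in that spirit might track the market price and adjust for inventory pressure, as in the Python sketch below. All coefficients, thresholds, and the linear blend are illustrative assumptions, not any retailer's actual model.

    def reprice(base_price, competitor_price, weeks_of_stock):
        """Blend own and competitor price, then adjust for inventory pressure."""
        price = 0.7 * base_price + 0.3 * competitor_price
        if weeks_of_stock > 8:      # overstocked: mark down to move units
            price *= 0.90
        elif weeks_of_stock < 2:    # scarce: protect the margin
            price *= 1.05
        return round(price, 2)

    print(reprice(base_price=49.99, competitor_price=44.99, weeks_of_stock=10))
    # -> 43.64: slightly undercuts the market to clear excess inventory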

Big Data technology will also enable retailers to optimize floor plans. Using smart cameras within stores, retailers can gain insights into customer movements and behavior. In addition to cameras, retailers can also use passive information, such as Wi-Fi or Bluetooth data, to monitor movements throughout the stores and determine the most efficient traffic patterns. This can indicate which locations are more or less attractive and can be used to improve displays and product placement. Layouts can be adjusted accordingly to create a revenue-optimized floor plan.

Using financial, sales, and inventory data across all points of sale worldwide—department stores, specialty stores, and catalogs/Internet—retailers can also discover patterns that identify new revenue optimization opportunities or cost reduction possibilities.196

In this highly competitive market, retailers need to do what it takes to stay ahead of the competition. Big Data technology can help them outperform their peers financially while improving customer satisfaction.

WALMART IS MAKING BIG DATA PART OF ITS DNA

Walmart started using Big Data even before the term became known in the industry. In 2012, the company moved from an experimental 10-node Hadoop cluster to a 250-node one.197 At the same time, it developed new tools to migrate existing data on Oracle, Netezza, and Greenplum hardware to its own Big Data systems. The goal was to consolidate ten different websites into one and store all incoming data in the new Hadoop cluster.

Many of the Big Data tools were developed at Walmart Labs, which was created after Walmart took over Kosmix in 2011.198 Some products developed at Walmart Labs are Social Genome, Shoppycat, and Get on the Shelf.199,200

Walmart uses Social Genome to offer a discount to customers, or their friends, who have mentioned a specific product online. To do this, the organization combines public data from the web and social networks with proprietary data, such as customer purchasing data and contact information. This has resulted in a vast, constantly changing, up-to-date knowledge base with hundreds of millions of entities and relationships. It helps Walmart better understand the context of what customers are saying online. An example mentioned by Walmart Labs concerns a woman who tweeted regularly about movies. When she tweeted “I love Salt,” Walmart was able to understand that she was talking about the movie Salt, not the condiment.

Walmart encountered several technical difficulties when developing the Social Genome, among them the quantity and velocity of data that pours into its Hadoop clusters. As the regular MapReduce/Hadoop framework was not able to cope with the amount and velocity of the data, Walmart developed its own tool, called Muppet.201 This now open-source tool processes data in real time across all clusters and can perform parallel analysis.
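The map-update idea behind such stream processors can be shown in miniature: each incoming event is mapped to (key, value) pairs, which are immediately folded into long-lived per-key state instead of waiting for a batch job to finish. This single-process Python sketch only illustrates the programming model, not Muppet's distributed implementation.

    from collections import defaultdict

    state = defaultdict(int)  # long-lived per-key state, queryable at any time

    def map_event(event):
        yield event["entity"], 1          # map: emit (key, value) pairs

    def update(key, value):
        state[key] += value               # update: fold into per-key state

    stream = [{"entity": "Salt (movie)"}, {"entity": "salt (food)"},
              {"entity": "Salt (movie)"}]
    for event in stream:
        for key, value in map_event(event):
            update(key, value)

    print(dict(state))  # counts are current mid-stream, not after a batch run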

The Shoppycat product developed by Walmart could recommend suitable products to Facebook users based on the hobbies and interests of their friends. It uses the Social Genome technology, among others, to help customers purchase gifts for their friends. An interesting aspect of this Facebook App is that Walmart will direct Facebook users to a different store if the desired product is sold out at a nearby Walmart store.

Get on the Shelf was a crowdsourcing solution that gave everyone the chance to pitch his or her product in front of a large online audience.202 The best products would be sold at Walmart, with the potential to suddenly reach millions of consumers. More than a million votes were cast; in the end, three of the products are now carried at Walmart.

Mobile Big Data Solutions

With over 200 million customers visiting its stores every week, Walmart obviously is focusing on mobile development. It has developed several iOS and Android apps that use the latest technology to give customers the best shopping experience. The company created and subsequently open-sourced two tools: Thorax and Lumbar.203,204 Thorax is a framework for building large-scale web applications; Lumbar is a JavaScript build tool that can generate modular, platform-specific applications.

Walmart's Big Data ecosystem processes multiple terabytes of new data and petabytes of historical data on a daily basis, covering millions of products and hundreds of millions of users from internal and external sources. It analyzes over 100 million keywords daily to optimize the bidding on each keyword.

Walmart's use of Big Data impressively illustrates what can be accomplished if Big Data is truly incorporated into a company's DNA. To date, Walmart has been able to optimize the local assortments of merchandise available in Walmart stores based on what customers in the neighborhood are saying on social media. When Walmart combines all its Big Data efforts with its mobile efforts, truly exciting solutions can be created. Walmart is also developing in-store mobile navigation using personal smartphones, with which it can steer customers through aisles of products they have been talking about on social media and are therefore more willing to buy. Of course, this will result in increased revenue for what is already the largest retailer in the world.

TELECOM INDUSTRY

If they choose, telecom organizations can know everything about their customers, including where they are at any given moment, with whom they connect frequently, what their daily habits are, and so on. This is all thanks to the growing volume of call detail records, location data, social media data, and network traffic data.205 The global telecom industry has experienced massive growth in data because of the rise of smartphones and tablets, next-generation mobile networks, and a world increasingly connected to the mobile Internet. Telecom companies that use this vast amount of data efficiently will outperform their peers, grow their market shares, and improve their bottom lines.

Improve the Customer Experience

Telecom organizations collect such vast amounts of data partly because European data-retention rules require it. As a result, they can relatively easily generate a 360-degree view of their customers using their own data (call data, geo data, Internet usage data, and so on) and public data from social networks. With such a detailed view of their customers, they can start to create highly customized experiences with targeted promotional offerings. These intelligent, mass-personalized, multichannel marketing campaigns can target the customer at the right moment, at the right location, with the right message. The integration of customer intelligence, behavior segmentation, and real-time promotion execution can increase sales, promotional effectiveness, and market share, while reducing costs.

When all the relevant data is centrally stored in a platform that is accessible by call center representatives, it will be possible to modify subscriber calling plans when necessary, thereby driving customer satisfaction and improving customer profitability.206 New personalized products or services can be offered to customers based on real-time usage patterns, which will reduce costs for the customer and increase customer satisfaction.

Innovate and Build Smarter Networks

Network traffic is growing at double-digit rates because of the rollout of 4G worldwide.207 Understanding how, when, and where customers use networks can lead to better networks that automatically adapt to high demand. Algorithms could monitor and analyze network traffic data in real time, thereby optimizing routing and quality of service while decreasing outages and increasing customer satisfaction. The analyses can also be used to optimize average network quality, coverage, and deployment over time.

Data from tracking all connected devices on the network in real time can be combined with public datasets about events happening in real time. If an event drives up Internet or cell phone usage, a telecom organization can learn this in real time and take preventive action as needed. Moreover, sensors in the network, for example at antennas, can monitor the equipment and notify the office if action or maintenance is necessary.

In addition, Big Data tools can be used to easily identify problems, perform real-time troubleshooting, and quickly fix network performance issues, which will improve network quality and lower operating costs. For example, when sensors in the network suddenly notice a high rate of dropped calls, immediate action can be taken to decrease downtime and optimize the network.

Although real-time, deep-packet inspection could be used to optimize traffic routing and drive network quality of service even further, it is forbidden in many countries, including The Netherlands.208 It caused a stir when consumers found out that telecom organizations were monitoring which applications they were using and which websites they were visiting from their mobile devices.

Decrease Churn and Reduce Risks

To decrease churn rates, telecom organizations can start to better understand which customers are influencers and what their (latent) needs are. This understanding can provide valuable information; for example, if one of these influencers switches companies, it could have a domino effect. The ability to combine billing, drop-call, and sentiment analysis gives telecom organizations the ability to reduce churn rates by knowing upfront what is going to happen. Predictive analytics can automatically signal when action is required to prevent a customer from defecting to a competitor. As a result, the telecom company can offer a tailor-made deal just in time.

Big Data tools can also be used to reduce losses from customer or dealer commission fraud. Calls from the same number at nearly the same time from two distant locations could indicate a cloned subscriber identity module (SIM) card, which means fraud. Preventive measures can be taken immediately and automatically, if needed. In addition, historical payment data or call detail records can be used to detect and identify fraudulent behavior in real time.209
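One classic test is “impossible travel”: if two calls from the same number imply a speed no traveler could reach, the SIM is probably cloned. The Python sketch below applies that test using the great-circle (haversine) distance; the 900 km/h threshold and the call records are illustrative assumptions.

    from math import asin, cos, radians, sin, sqrt

    def haversine_km(lat1, lon1, lat2, lon2):
        """Great-circle distance between two points, in kilometers."""
        dlat, dlon = radians(lat2 - lat1), radians(lon2 - lon1)
        a = (sin(dlat / 2) ** 2
             + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
        return 2 * 6371 * asin(sqrt(a))

    def is_suspicious(call_a, call_b, max_kmh=900.0):
        """Flag call pairs implying a travel speed above airliner pace."""
        hours = abs(call_b["t"] - call_a["t"]) / 3600.0
        km = haversine_km(call_a["lat"], call_a["lon"], call_b["lat"], call_b["lon"])
        return km > 1.0 and (hours == 0 or km / hours > max_kmh)

    amsterdam = {"t": 0,    "lat": 52.37, "lon": 4.90}
    madrid    = {"t": 1800, "lat": 40.42, "lon": -3.70}  # same SIM, 30 min later
    print(is_suspicious(amsterdam, madrid))  # True: ~1,500 km in half an hour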

Telecom organizations generate vast amounts of customer data that can also be used to deliver relevant and timely location-based promotional offers and other services to third parties. Telecom organizations’ ability to sell their data, with its anonymous customer insights, to third parties or local governments could be a welcome source of additional revenue. There are a lot of benefits for telecom organizations, so operators should start experimenting with the data to gain an understanding of its vast possibilities and opportunities.210

T-MOBILE USA REDUCES ITS CHURN RATE BY 50 PERCENT IN ONE QUARTER

Some of the metrics captured by telecom organizations include when people call, how long they talk, direct-messaging peaks, Internet usage, and so on. If you have 33 million customers, as T-Mobile USA does, we are talking serious Big Data.211 Strangely enough, not many telecom organizations are putting all this Big Data to use. T-Mobile USA, however, does, and with its Big Data strategy, it reduced its churn rate by 50 percent in just one quarter.

To fully use all of this data, T-Mobile USA combines subscriber and network data from multiple databases and source systems. It uses several tools to store, analyze, search, and visualize all its data. Data integration is based on Informatica PowerCenter.212 The company uses Splunk to search through log files and Tableau Software to visualize all data.213,214 Backed by these technologies, it started using six “data zones” that are connected to business objectives:

  1. Customer data zone: Provides a 360-degree view of each customer used to minimize customer dissatisfaction.
  2. Product and services zone: Determines which products and services are used by whom and when in order to drive innovation.
  3. Customer experience zone: Identifies the channels that interact with the customer and when. Used to regain and optimize service levels.
  4. Business operations zone: Contains all billing and accounting information, as well as the finance and risk management data. Used to define the best areas for optimization and performance.
  5. Supply chain zone: Identifies how purchase orders, shipments, and logistics operate. Used to drive innovation within the supply chain and to cut costs.
  6. Network zone: Stores all (raw) data that supports management. Used to drive innovation and grow quality customers.

These zones place physical data stores and networks in a virtualized environment. The virtualized data zones help T-Mobile USA identify complex systems, differences in data definitions, and incompatible data. They also prevent duplicate content and incorrect business rules, and they decentralize rule management.

But how did T-Mobile USA tackle the churn rate? By using a “tribal” customer model. This is based on the fact that some people have greater influence over others because of their extensive social networks and connections to different (online) groups. If one of these customers switches telecom providers, it could cause a domino effect by leading others in his or her network to follow. For each of these influential customers, an additional Customer Lifetime Value (CLV) is calculated. This new CLV allows T-Mobile USA to determine its most valuable customers.

Next, the churn expectancy of a customer is based on three different analyses (a toy blend of the three is sketched after the list):

  1. Billing analysis: This includes customer product usage, such as how often, where, for how long, and with whom a user calls, how many text messages are sent to whom, and Internet usage. If more and more calls are going to a different provider, this could indicate that the customer's social network is switching, resulting in a higher chance that the customer will also switch.
  2. Drop call analysis: If a user relocates to a different area and the data shows that the customer receives limited coverage in the new area, an alert is sounded and a customer representative can offer a new phone or a free femtocell to prevent the customer from switching.
  3. Sentiment analysis: This includes predicting triggers and indicators of what customers' actions are going to be and how they think about T-Mobile USA, allowing the company to respond proactively.
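The toy Python blend below combines the three analyses into one churn-risk figure and multiplies it by CLV, so high-value influencers surface first. The 0-to-1 inputs, weights, and linear form are illustrative assumptions, not T-Mobile USA's actual model.

    def churn_risk(billing, drop_calls, sentiment):
        """Blend three 0..1 risk estimates, one per analysis above."""
        return 0.5 * billing + 0.3 * drop_calls + 0.2 * sentiment

    def retention_priority(risk, clv):
        """Rank customers for offers: churn risk scaled by lifetime value."""
        return risk * clv

    customers = [
        ("influencer", churn_risk(0.8, 0.6, 0.7), 5_000),   # high "tribal" CLV
        ("average",    churn_risk(0.9, 0.9, 0.9),   800),
    ]
    for name, risk, clv in customers:
        print(name, round(retention_priority(risk, clv)))
    # The influencer outranks the average customer despite a lower raw risk.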

These different analyses are combined into an integrated single view for customer care. This system, called “Quick View,” offers agents and retail store associates multiple key indicators, including the CLV, on one screen in a split second. Additional information regarding high-value subscribers is sent automatically to agents, along with customer-specific offers, such as a new service plan.

This tailor-made, customer-centric approach caused a drop in the number of customers leaving T-Mobile USA each month. After losing almost 100,000 customers in the first quarter of 2011, the company reduced that number to 50,000 in the second quarter. Since then, T-Mobile USA has focused on retaining its loyal, high-CLV subscribers, as well as on upgrading its customers to higher-quality products, leading to higher customer satisfaction and increased revenue.

TRANSPORTATION INDUSTRY

The transportation sector is on the brink of a paradigm shift thanks to Big Data.215 Smarter transportation will result in operational efficiency, improved end-to-end customer experiences, reduced fuel consumption, and increased flexibility. Logistics companies are already working hard to use sensor data from trucks to optimize routing and decrease fuel consumption. American logistics company US Xpress has installed almost 1,000 sensors in each truck to monitor where drivers are going, how fast they drive, how often they brake, when maintenance is required, and the drivers' capabilities. But many more opportunities exist for the transportation industry.

Optimize Freight Movements and Routing

Consolidating shipments and optimizing freight movement can enable same-day regional delivery. Knowing exactly which products are in which warehouses can help companies like Amazon deliver the right product at the right time to the right customer within 24 hours. Removing supply-chain waste and analyzing transaction-level product details will ensure efficient transportation of freight.216

Satellite navigation and sensors can track trucks, trains, airplanes, or ships in real time. Routing can be optimized using public data about road conditions, traffic jams, weather predictions, delivery addresses, locations of gas stations (in the case of trucks), and so on. Whenever a change of address comes in from the head office, it can be pushed to the driver or captain in real time. The system will automatically calculate the ideal and least expensive route to the new destination.
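Under the hood, least-cost routing of this kind can be computed with a shortest-path algorithm such as Dijkstra's, sketched below in Python. The edge weights could encode fuel cost, tolls, or predicted delay; the small road graph is an illustrative assumption.

    import heapq

    def cheapest_route(graph, start, goal):
        """graph: {node: [(neighbor, cost), ...]} -> (total_cost, path)."""
        queue, seen = [(0.0, start, [start])], set()
        while queue:
            cost, node, path = heapq.heappop(queue)
            if node == goal:
                return cost, path
            if node in seen:
                continue
            seen.add(node)
            for neighbor, edge_cost in graph.get(node, []):
                if neighbor not in seen:
                    heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
        return float("inf"), []

    roads = {"depot": [("A", 4.0), ("B", 2.5)],
             "A": [("customer", 3.0)],
             "B": [("A", 1.0), ("customer", 6.0)]}
    print(cheapest_route(roads, "depot", "customer"))
    # -> (6.5, ['depot', 'B', 'A', 'customer'])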

Sensors in trucks, trains, ships, and airplanes can also give real-time information about how the vehicle is performing, how fast it is going, how long it has been moving, how long it has been standing still, and more. All this data, combined with data from sensors that monitor the health of the engine and equipment, can predict errors and arrange for necessary maintenance without losing too much time. It is even possible to automatically book maintenance at the most efficient location, while the engineer instantly knows what the problem is and how it can be solved.

Large logistics organizations can have hundreds or thousands of trucks, trains, airplanes, or ships. If their usage is not optimized, a company can lose a lot of money. With sensor data, companies can locate all their trucks at any moment, as well as know their inventory and destination. This information can help a transportation company optimize its fleet and increase efficiency.

Determine Inventory on Hand

In-transit stock is still part of an organization's inventory, even though it has physically left the warehouse. It is important to know the exact inventory at all times, especially if last-minute changes need to be made. When all products contain sensors, they can be tracked in real time, and adjustments and/or inventory counting become very simple.

Inventory management analytics can be used to create a centralized platform that offers organizations a detailed overview of departure and arrival times, order cuts, and the ability to provide customers with detailed information on their freight.217

Improve the End-to-End Customer Experience

Customers want to know exactly where their freight is and when it will be delivered. With a smart transportation system, freight shippers and customers are given the information and tools to decide the best way to get their product from origin to destination across different modes of transport, considering cost, time, and convenience.218 A package can use several modes of transport; with a smart transportation system, customers can determine how their freight goes from A to B. This will enable customers to better manage their supply chain, as well as costs.

Transportation organizations will be able to develop 360-degree profiles of their customers to create a single enterprise view.219 Using various open, public, and social datastreams, as well as company information about customers, shippers will be able to improve their marketing effectiveness and increase customer loyalty and revenue.

Reduce Environmental Impact and Increase Safety

Fuel consumption can be reduced in several ways. First, sensors can monitor the engine and optimize fuel input. When combined with optimized routing, which is created by taking into account weather conditions, driving behavior, road conditions, and location, a lot of fuel can be saved.

Sensors can also monitor how fast a driver is going and whether the driver is sticking to the rules of the road. They can monitor whether the driver has been behind the wheel too long or whether breaks last too long. This helps keep drivers alert, and thus prevents accidents, while keeping drivers accountable. More and more cities around the world are experimenting with smart transportation systems that reduce pollution and increase road safety. The city of Brisbane, Australia, has developed a complete, real-time overview of the city's transportation network, which provides a platform to develop and test new strategies in a stable, real-time virtual environment.220 This platform enables the city to predict and reduce traffic congestion, resulting in happier commuters and shippers, while reducing emissions. The city also uses variable speed limits and roadway queue management algorithms to improve highway safety. With customers increasingly demanding that their freight be delivered as fast and as inexpensively as possible, transportation companies face a challenge that fortunately can be tackled with Big Data.

US XPRESS DRIVES EFFICIENCY WITH BIG DATA

US Xpress combines 900 different data elements from a vast number of trucking systems, such as sensors for fuel use, tires, brakes, and engine operations, geospatial data, and driver comments, across a fleet of 8,000 tractors and 22,000 trailers, into one Hadoop database and analyzes them in real time to optimize operations. US Xpress, based in Chattanooga, Tennessee, uses hundreds of billions of data records to save over $6 million a year.221

When the company started this effort in 2009, its IT environment was distributed over 130 different databases.222 Employees had managed to enter 178 different spellings of Walmart. There were 90 mainframe screens to monitor, and performing a query on the data took weeks or even months. The data was a mess and, therefore, useless. It needed to be cleaned and combined before it could help improve efficiency and save money. The company developed a one-stop solution that combines all the different data streams into one interface: the DriverTech system.

The DriverTech system processes, analyzes, and reports back data from tens of thousands of sensors in real time. With real-time data, US Xpress can interpret how drivers are driving, why some trucks are standing still with engines running, how they could reduce fuel consumption, and even where all this happens. The geospatial analysis that US Xpress performs allows the company to monitor what is going on in real time and minimize any downtime. This happens because employees know when a truck arrives at a depot for maintenance or reloading.

US Xpress combines truck data with unstructured data gathered when drivers talk on trucker blogs about the DriverTech interface used inside the cabin. This allows US Xpress to quickly address issues with a new release, especially when combined with records of how drivers actually use the interface. For example, when the new touchscreen buttons appeared to be too small, the company learned about it through social media and quickly made the necessary adjustments.223

An important aspect of managing the fleet is mobility. Fleet managers each control dozens of trucks, and each manager now has an iPad that provides the real-time data they need about their trucks. US Xpress even operates a private app store, where drivers can download different apps to meet their needs. The most important aspect is that managers and drivers all have the necessary information at their fingertips in real time.

To move from a data-poor to an information-centric organization, the company came up with a 36-month information management strategy, including 13 strategic projects that were implemented one after another. Each new project builds on the success of its predecessor, so that development keeps flowing. It is clear that US Xpress developed and executed a Big Data strategy. Analyzing all the data collected and optimizing routing led to $20 million in fuel savings in the first year. In addition, US Xpress can now take corrective action when drivers idle for too long.224 As a result, the company has won several technology awards, including the Ventana Leadership Award for “Overall IT Leadership.”225,226 Not a bad achievement for a trucking company.

TRAVEL AND LEISURE INDUSTRY

The global travel industry is expected to grow to 10 percent of global GDP by 2022; this means an annual revenue of around $10 trillion.227 This massive industry will become a lot more efficient when Big Data is thoroughly implemented in every aspect. In the past years, large steps forward have been made. For many consumers, travel without Big Data would be impossible—or at least tedious and annoying.228

Travel companies are known for capturing and storing massive amounts of data. During every step of a traveler's journey, they collect data, including flight paths, transaction data, customer data, yielding, check-ins, and more. Almost every hotel has a Customer Relationship Management (CRM) program, and let's not forget that yield revenue management was invented in the travel industry years ago. Until recently, all this data was simply stored, and travel companies had difficulty actually putting it to use by combining the various datasets. With Big Data tools, however, this information can be used to make customers feel more appreciated and better served, resulting in additional revenue and higher profits.

Exceed Customer Expectations

Particularly in the travel industry, a personal approach is vitally important, and the opportunities for Big Data are tremendous. If we look at the conversion rate on travel websites, 92 percent of customers do not convert and 60 percent never return after a first visit.229 A 360-degree customer profile includes data collected from social networks (reviews via Yelp or TripAdvisor), the blogosphere, (online) surveys, click behavior, reservation systems, and CRM loyalty programs. Detailed purchasing history and timing, browsing history, service history, revealed price sensitivity, and known or inferred demographics, as well as other interests available online, can determine what drives a customer, whether booking a rental car, a hotel, or a flight.230

The ability to analyze all that data instantly and determine a customer profile will enable travel companies to deliver the right message at the right time to the right person via the right channel. With that, their conversion rate should increase.

Tailored messages during a booking process can help a consumer decide which channel to choose. When, for example, the data shows that a customer appreciates a hotel room overlooking the sea, that room is automatically suggested during the booking process.

It is also possible to predict and understand future demand patterns for hotels, conference centers, or holiday destinations, such as theme parks or casinos. By combining financial data, events data, and news information, room rates can be adjusted to reflect anticipated demand.

The challenge in the travel industry will therefore be to connect all these different platforms, websites, and products during a traveler's journey. Would it not be great if a traveler whose plane is delayed received a message with the new gate, allowing him or her to leave work later and still fit in a meeting? The hotel he or she booked would also receive a message about the delay, know the new expected time of arrival, and have refreshments ready to minimize the impact. That would truly be exceeding expectations.

Speed Is Key in Online Travel

In online travel, speed is everything. Consumers generally move on within seconds if an online answer takes too long. After all, many other websites offer exactly the same service. Each website needs to sift through millions of records from various sources, such as airline agencies or global distribution companies, and deliver a result. The one that does this fastest will see a positive impact on revenue. By building its own Big Data system, a German travel company is now able to process 1,000 queries per second, searching through 18 billion offers across 20 parameters, and deliver an answer within a second.231

Big Data is also increasingly applied at airports, where, for example, it is used to count the number of people present in real time, to develop heatmaps for expected noise pollution in the surrounding areas, or to visualize retail sales at departure gates to see how far travelers wander. The potential of Big Data in the passenger industry is tremendous.

Airlines can also better serve their customers using Big Data. British Airways used Big Data to develop a personalized service program called Know Me, which tracks as much information as possible during a customer journey and acts accordingly.232 If a customer's bag is accidentally lost, he or she might receive a free upgrade on the next flight.

With Big Data, such customized offers can be suggested in real time before, during, or after booking or checking in at a counter.

In the data-intensive travel industry, Big Data offers a lot of opportunities, including the development of new and innovative (online) products based on insights found by combining different datasets. Organizations in the travel industry should start investigating and experimenting with the massive possibilities of Big Data.

FOR CAESARS ENTERTAINMENT, BIG DATA IS MORE IMPORTANT THAN GAMBLING

Caesars Entertainment, known for its numerous luxury hotels and casinos in, among other places, Las Vegas, has become an information-centric organization, in which data drives decision making.233 As a result of the collection of massive amounts of data, Caesars can cultivate customer loyalty and surprise guests with free gifts after, for example, a bad day at the casino.

Its data-driven strategy is based around the Total Rewards program, which has more than 45 million members.234 All members are tracked throughout their entire travel journey, from the moment they book until the moment they leave the hotel or casino. The entire data trail is tracked and analyzed and used to provide superior services to guests.

Because of its data-driven strategy, Caesars has been able to recoup 85 percent of all costs spent on customers, up from 58 percent in 2004. It has also given Caesars valuable insights into guest behavior. As Joshua Kanter, Vice President of Total Rewards for Caesars Entertainment, Las Vegas, says, “Big Data is even more important than a gaming license.”235,236

With all this data, Caesars gives loyal visitors very targeted benefits, while avoiding spending too much money without results. The goal is to define the right profile for each guest who arrives at a casino. Cameras record everyone's actions. A player who is guessing is more likely to lose money than a disciplined player. Caesars combines this kind of data with data collected during the customer's trip, including where guests book their stays or travel arrangements, dining choices, gambling preferences, and other activities at the company's properties. All this information is stored, analyzed, and used to provide personal benefits to guests. A guest might, for example, receive a free dinner or hotel room to keep him or her satisfied. On the other hand, the tracking software is also used to prevent any of the 75,000 employees from being too big-hearted in giving away freebies.

Caesars Entertainment uses the same type of analytical program to analyze the insurance claims of all its employees and their family members. Managers at Caesars are able to track many different variables about how employees use medical services. This aggregated and anonymous data can help the organization identify differences. One property, Harrah's in Philadelphia, showed higher use of the emergency room compared to the organization overall. Managers brought this to the attention of the employees, and the rate dropped significantly.237

What will the future look like? Kanter expects that the security cameras that are now everywhere in the casinos will be able to predict traffic flows in the casinos and identify bottlenecks.238 This information can then be used to inform guests via their smartphones which restaurant or table is busy and what the waiting time will be.

TAKEAWAYS

This chapter examined 18 different industries to show the possibilities Big Data offers your organization. Many more industries could benefit from Big Data, from fishing to publishing. Each will require its own data and tools and will have its own applications. With Big Data, almost anything is possible. Organizations need to use their creativity to define the best use for their situation.

Although different industries and organizations may have varying uses for Big Data, looking beyond your own organization or industry will broaden your view and provide new insights and ideas. Best practices from other organizations and industries are valuable in providing a better understanding of the endless possibilities of Big Data and how you can achieve a winning strategy that will let you outperform your competitors.
