Chapter 6
Final Thoughts and Conclusion

Up to now, you have read and learned what some of the latest technologies can do in the area of data management and analytics. In-database processing can be applied throughout the analytical data life cycle to explore and prepare the data and to develop and deploy the analytical data model in a streamlined process. In-memory capabilities can be applied at the data exploration and model development stages to analyze massive amounts of data quickly and efficiently. These technologies are complementary and can augment traditional processes to optimize data management and analytics. Organizations with Hadoop are integrating it with their data warehouses and/or data marts and leveraging in-database and in-memory processing to analyze big data in a collaborative data architecture. Organizations are adopting and implementing these innovative technologies together to enable data-driven insights for effective, analytics-driven decisions. Various customers in banking, e-commerce, government, retail, telecommunications, transportation, and other industries have provided their insights on the IT and business benefits gained by solving complex problems with in-database, in-memory, and/or Hadoop.

So, what's next on the horizon for data management and analytics? What are some key focus areas in the next five years that will take data management and analytics to the next level? Which industries are leaping into the next generation of data management and analytics? Many of these questions will be answered in the next few sections. In this chapter, I will discuss the following topics:

  • Top five focus areas in data management and analytics
  • Future for data management and analytics
  • Final thoughts and conclusion

FIVE FOCUS AREAS

The pace of change is constant, and it is even faster for technology advancements. I am very excited for the future of data management and analytics since every business in every industry is in a position to generate, collect, and explore outrageous amounts of data. There are dozens of interesting data statistics about the explosion of data sources from industry analysts, thought leaders, and IT influencers. They all expect the amount of data to grow exponentially, beyond what we can imagine. In one of those statistics, I was astonished to learn that at least 2.5 exabytes of data are produced daily (that's 2.5 followed by a staggering 18 zeros, in bytes!). This massive data production includes everything from data collected by robots and satellites in outer space to the social media photos from your latest family vacation adventure. Some experts have estimated that 90 percent of all the data in the world was produced within the last two years, mainly from digital devices. Looking forward, experts now predict that 40 zettabytes of data will be in existence by 2020. And only three years ago, the entire World Wide Web was estimated to contain approximately 500 exabytes. Thus, 40 zettabytes is almost beyond conception, and IT vendors in the software and hardware business are working diligently to capitalize on this massive growth in data.

We (you and me), the users of technology, have partly contributed to this explosion of data. According to DOMO's Data Never Sleeps 2.0 study, enormous volumes of structured and semi-structured data are generated every single minute, as outlined below:

  • Facebook users share nearly 2.5 million pieces of content.
  • Twitter users tweet nearly 300,000 times.
  • Instagram users post nearly 220,000 new photos.
  • YouTube users upload 72 hours of new video content.
  • Apple users download nearly 50,000 apps.
  • Email users send over 200 million messages.
  • Amazon generates over $80,000 in online sales.

There is no doubt that the increase in digital data is due to the popularity of the Internet and mobile devices and the growing population that wants to join the digital world. As the digital world and the number of Internet users expand, new technologies are emerging that allow people to create and share information in ways previously impossible. Surprisingly, the bulk of data between now and 2020 will not be produced by humans but by machines: systems that intelligently communicate with each other over data networks, leading to autonomous data-driven decision making. For example, machine sensors and smart devices talk with other digital devices to transmit data or make machine-driven decisions so that we (humans) do not have to intervene. With all of these data points (clicks, likes, tweets, photos, blog posts, online transactions), the digital and nondigital data tell a compelling and detailed story about you and me, who we are and what we do. Hidden in the never-ending tidal wave of exabytes of data are new ideas, insights, answers, and innovations that will redefine the business landscape, drive opportunities for profits, and change business models beyond big data. The challenge, of course, is being able to capture, sort, and use that information in real time and at scale. This is one of the biggest technological challenges of our generation, and the most obvious part of it is the ongoing task of making sense of all that data. So far, however, only a tiny fraction of the data being produced has been explored for its value through the use of analytics. IDC estimates that by 2020, as much as 33 percent of all data will contain information that might be valuable if analyzed properly with the right technology and infrastructure. And 33 percent of 40 zettabytes is a lot of data!

With data volumes, variety, and velocity expected to keep increasing, adopting and implementing integrated data management and analytics is even more critical to support the needs of tomorrow. Customers across industries around the world have shared with me what is important to them and what they will focus on in the next five years. They are planning to explore, adopt, or implement one or more of the data management and analytics technologies. All of these customers believe that they can do more with their data and leverage analytics for competitive advantage. They strongly believe that investment in one or more of these technologies will help them maintain their leadership in the marketplace and drive innovation to meet changing internal and external business and IT requirements. These five areas are:

  1. Cloud computing
  2. Security (cyber, data breach)
  3. Automating prescriptive analytics
  4. Cognitive analytics
  5. Anything as a Service (XaaS)

Most of these five areas are also aligned with the industry trends identified by Gartner, Forrester, IT influencers, business leaders, and vendors in the data management and analytics landscape. As shown in Figure 6.1, Anything as a Service (XaaS) sits at the center of the focus areas that will help organizations achieve their vision for the near future. The other four focus areas (cloud computing, security, prescriptive analytics, and cognitive analytics) are initiatives that the customers I interact with want to embark on in the next few years. Let's dive into what each focus area is and what it can do to drive innovation for data-driven organizations.

Illustration depicting the top five focus areas: cloud computing, security, prescriptive analytics, cognitive analytics, and Anything as a Service (XaaS).

Figure 6.1 Top five focus areas

CLOUD COMPUTING

Cloud computing has been a buzzword in the IT industry for a few years now. It is talked about everywhere and by every software and hardware vendor. The global cloud computing market is growing at a fast pace and is expected to reach US$270 billion by 2020. In a recent study, 94 percent of organizations reported that they are either already using cloud computing or plan to make use of it as part of their operations. It is an ideal technology for data management and analytics due to the big data phenomenon (see Figure 6.2).

Illustration depicting Cloud computing.

Figure 6.2 Cloud computing

Because cloud computing has been such a trendy topic, there is a lot of confusion about what it is and whether it is anything new that we are not already doing. In the simplest terms, cloud computing is a technology that allows you to store and access databases, servers, programs, and a broad set of applications over the Internet instead of on your own computer. Cloud computing providers own and maintain the network-connected hardware and software for applications ranging from business intelligence and analytics to data management. Cloud computing allows consumers (like you and me) and businesses to use programs and applications without a large upfront investment in hardware and without spending a lot on the heavy lifting of managing and maintaining that hardware. Instead, customers can provision exactly the right type and size of computing resources they need to enable analytical capabilities that support a new, innovative idea or to operate their IT departments. You can access the cloud around the clock, anywhere, year-round, and pay only for what you use.

When discussing cloud computing with customers, I always try to illustrate it with an applicable example. One that is relatable and easy to understand is email, regardless of whether you use Gmail (Google), Hotmail (Microsoft), or Yahoo! Mail. All you need is an Internet connection; you type your login ID and password into the application and start crafting and sending emails. The server and email management software are all in the cloud (on the Internet) and are entirely managed by the cloud service provider: Microsoft, Google, or Yahoo!. The consumers (you and I) get to use the software and enjoy the many benefits.

Traditional on-premises deployments of data warehousing and analytics remain a key strategy for many organizations, and the move to cloud computing offers an alternative, modern approach. Cloud computing gives developers and IT departments the ability to focus on what matters most and avoid tedious tasks such as procurement, maintenance, and capacity planning. As cloud computing has grown in popularity, several different models and deployment strategies have emerged to meet the specific needs of different users.

Growth in the cloud computing services market is influenced by global demand for technology-based services, which, in turn, depends on the state of the global economy. Currently, growth is driven by demand in developed Western markets such as North America and Europe. Developing nations have been slower to adopt the concept but are expected to drive growth toward the latter part of the decade; limited infrastructure and technical know-how in emerging economies still restrict the uptake of cloud computing services. Selecting the right type of cloud computing for your needs can help you strike the right balance of control, flexibility, and management.

Types of Cloud Computing

Businesses have several options in the type of cloud computing and the deployment of the cloud. There are three main types, commonly referred to as Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). Deployment options include public or private clouds. Each type of cloud and deployment method provides you with different levels of control, flexibility, and management. It is important to understand the differences among IaaS, PaaS, and SaaS, as well as which deployment strategies can provide the right infrastructure for your data management and analytics needs (see Figure 6.3).

Overview of Typical cloud computing services.

Figure 6.3 Typical cloud computing services

IaaS is for network architects, PaaS is for application developers, and SaaS is for end users of all types.

Let's examine the types of cloud computing in detail.

Infrastructure as a Service (IaaS)

IaaS consists of the fundamental building blocks for cloud IT. Sometimes this is referred to as “Hardware as a Service” or HaaS. The service provider owns, houses, runs, and maintains all the equipment used to support operations, including storage centers, hardware, data servers, and networking components. IaaS provides businesses with the highest level of flexibility and management control over their IT systems and is most comparable to the existing IT resources that many IT departments and teams are familiar with today. Users of IaaS can run their own software and applications and retain control over the operating systems, while the IaaS provider manages the underlying infrastructure such as servers, networking, and databases. Instead of having to purchase the hardware outright, users can purchase IaaS based on consumption, similar to electricity, water, or other utility services.

IaaS is ideal for customers who do not have the in-house IT resources or expertise to manage networking and data management capabilities. For example, an IaaS provider can host a data center for you that consists of hardware for data storage (i.e., a database or data warehouse), a firewall, and a network for access. Users can access the data server through a secured web interface. In this approach, there is no need to invest in on-premises hardware or its maintenance. In addition, there is no need to retain a team dedicated to data management.
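
To make the consumption model concrete, here is a minimal sketch of how a team might provision a data-warehouse server programmatically and release it when finished, assuming a hypothetical IaaS provider with a simple REST API. The endpoint, token, and payload fields are illustrative only and do not correspond to any specific vendor's API.

```python
# Minimal sketch of pay-per-use IaaS provisioning, assuming a hypothetical
# provider REST API. The endpoint, token, and payload fields are illustrative
# and do not correspond to any specific vendor.
import requests

API_BASE = "https://api.example-iaas.com/v1"      # hypothetical endpoint
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

def provision_database_server(cpu_cores: int, ram_gb: int, storage_gb: int) -> str:
    """Request a server sized for a data warehouse workload; pay only while it runs."""
    spec = {"cpu": cpu_cores, "ram_gb": ram_gb, "storage_gb": storage_gb,
            "firewall": "default-secure", "purpose": "data-warehouse"}
    resp = requests.post(f"{API_BASE}/servers", json=spec, headers=HEADERS, timeout=30)
    resp.raise_for_status()
    return resp.json()["server_id"]

def decommission(server_id: str) -> None:
    """Release the resources when no longer needed, which stops the metering."""
    resp = requests.delete(f"{API_BASE}/servers/{server_id}", headers=HEADERS, timeout=30)
    resp.raise_for_status()

if __name__ == "__main__":
    sid = provision_database_server(cpu_cores=8, ram_gb=64, storage_gb=2000)
    print("Provisioned server:", sid)
```

The point of the sketch is the workflow, not the particular calls: resources are requested when needed, metered while in use, and released when the work is done, which is exactly the utility-style consumption described above.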

Platform as a Service (PaaS)

PaaS provides the underlying infrastructure and eliminates the need for organizations to manage the hardware and operating systems. This means that you will be able to rent the hardware, operating systems, storage, and network capacity over the Internet. PaaS enables you to focus on the deployment and management of your applications. This helps you be more efficient as you do not need to worry about hardware procurement, capacity planning, software maintenance, upgrades, or any of the other heavy lifting tasks involved in running your applications.

These cloud platform services are used for applications and other development, while providing cloud components to software. What programmers and developers gain with PaaS is a structure they can manage and build on to develop or customize their applications. PaaS makes the development, testing, and deployment of applications easy, fast, and cost-effective for the following reasons:

  • Operating system features can be changed and upgraded frequently: Developers and programmers have access to more than one operating system and are not limited to a single type.
  • Developers around the world can collaborate on the same project: Because everyone works on the same hosted platform rather than on separately maintained environments, compatibility and update issues are minimized. This allows the customer to benefit from resources obtained virtually anywhere, anytime, and anyplace.
  • Common infrastructure costs less: Organizations don't need to own individual storage centers, servers, or other hardware; they can consolidate infrastructure and decrease their expenses.

In addition, applications developed on PaaS inherit cloud characteristics such as high availability, high-performance scalability, and more. Enterprises benefit from PaaS because it minimizes the amount of programming or coding necessary and automates business policy for accountability and auditability.

Software as a Service (SaaS)

SaaS provides you with a complete software product as a service. Because of the web delivery model, SaaS eliminates the need to install and run applications on individual computers. With SaaS, it's easy for enterprises to streamline their maintenance and support, because everything can be managed by the vendor: applications, runtime, data, middleware, operating system, virtualization, servers, storage, and networking. In this model, service providers host applications that customers can access over a network (usually the Internet). It is one of the most widely used models, as more and more companies rely on software services for their business. There are two slightly different SaaS models:

  1. The hosted application management: The provider hosts commercially available software and delivers it over the Internet. For example, email providers like Google or Yahoo! use this model to host their email services and distribute them over the Internet to all web browsers.
  2. Software on demand: The provider hosts unique software and delivers it to a particular network. For example, a design firm might use this model to host the latest Adobe Suite and make it available to all the designers linked to their network.

This service enables consumers to leverage end-user applications running on a cloud infrastructure. With a SaaS offering, you do not have to think about how the service is maintained or how the underlying infrastructure is managed; you only need to think about how you will use that particular piece of software. The applications can be accessed from various client devices through a thin-client interface such as a Web browser (e.g., Web-based email). The user does not manage or control the underlying cloud infrastructure or individual application capabilities.

Another example is leveraging SaaS for analytics. Users can access the analytics tool to run linear regression, market basket analysis, customer relationship management, and industry-specific (healthcare, retail, pharmaceutical) applications. Some large enterprises that are not traditionally thought of as software vendors have started building SaaS offerings as an additional source of revenue and a way to gain competitive advantage.
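
As a point of reference for the analytics example above, the short sketch below shows the kind of linear regression such a SaaS analytics tool might run behind its web interface. It uses scikit-learn with made-up numbers and is purely illustrative; no particular vendor's service or API is implied.

```python
# Illustrative only: the kind of linear regression an analytics SaaS offering
# might run behind its web interface. Uses scikit-learn locally with toy data.
import numpy as np
from sklearn.linear_model import LinearRegression

# Toy data: monthly ad spend (in $1,000s) versus units sold
ad_spend = np.array([[10], [15], [20], [25], [30], [35]])
units_sold = np.array([120, 150, 210, 240, 310, 330])

model = LinearRegression().fit(ad_spend, units_sold)
print("Estimated units per extra $1,000 of ad spend:", round(model.coef_[0], 1))
print("Forecast at $40,000 spend:", round(float(model.predict(np.array([[40]]))[0])))
```

The appeal of the SaaS model is that the user sees only the interface and the results; the provider handles the runtime, libraries, and infrastructure that make a computation like this possible.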

SaaS represents the largest cloud market and is still expanding quickly (followed by IaaS). SaaS uses the web to deliver applications that are managed by a third-party vendor. Most SaaS applications can be run directly from a web browser without any downloads or installations required. In the near future, all three segments are projected to experience strong demand and growth.

Deployment of the Cloud

There are various deployment options to choose from to support the type of cloud you select. Of all the deployment options, private and public cloud computing are the most popular. Recent research indicates that 7 percent of businesses have adopted a purely private cloud, while an overwhelming 58 percent embrace the hybrid cloud, a combination of private and public clouds. While there are three basic cloud service types, there are four deployment options: public, private, hybrid, and community.

  1. Public Cloud
    Public clouds are open to the general public, with cloud infrastructures that are owned and managed by the cloud service provider. A public cloud deployment makes sense where several businesses have similar requirements and seek to share infrastructure such as hardware or servers. It can also be economically attractive, allowing users to access software, applications, and/or stored data. In addition to the economics, the public cloud can empower employees to be more productive even when away from the office.
  2. Private Cloud
    Private clouds are accessible only to the users or partners of a single organization, with dedicated resources. Private clouds can be built and managed within an enterprise data center, including the software and hardware. They can also be hosted by a third-party provider and managed either on- or off-premises (on-premises meaning at the company's own location). Customers seeking the benefits of cloud computing while maintaining control over their environment are attracted to deploying a private cloud. Private clouds simplify IT operations and provide on-demand access to applications and more efficient use of existing, underutilized hardware.
  3. Hybrid Cloud
    Hybrid clouds are a blend of private and public clouds. Customers can choose to create a bridge between public and private clouds to address increased demand for computing resources during specific periods, such as the end of the month or peak shopping times like the holiday season. A hybrid deployment is a way to connect infrastructure and applications between cloud-based resources and existing resources that are not located in the cloud. The most common form of hybrid deployment is between the cloud and existing on-premises infrastructure, connecting cloud resources to internal systems for additional performance and flexibility.
  4. Community Cloud
    Community clouds are comparable to private clouds but are targeted at communities that have similar cloud requirements and whose ultimate goal is to collaborate and achieve similar business objectives. They are often intended for academia, businesses, and organizations working on joint projects, applications, or research that require a central cloud computing facility for developing, managing, and executing cooperative projects. A community cloud shares infrastructure among several organizations from a specific community with common concerns; it can be managed internally or by a third party and hosted internally or externally. The costs are spread across fewer users than in a public cloud but more than in a private cloud, so some of the savings potential is shared.

Benefits of Cloud Computing

Businesses can reap the many benefits that cloud computing has to offer. These benefits include cost savings, high reliability, simplified manageability, and strategic focus.

  1. Cost Savings
    The most significant cloud computing benefit is cost savings. Regardless of the type and size of the business, keeping capital and operational expenses at a minimum is always a challenge to endure and a goal to achieve. Cloud computing can offer substantial capital cost savings, with no in-house servers, storage, or supporting staff required. The off-premises infrastructure removes the associated costs of power, utilities, and administration. Instead of having to invest heavily in data centers and servers before you know how you are going to use them, you pay only for what you use, with no initial IT capital investment. It is a common perception that only large businesses can afford to use the cloud. On the contrary, cloud services are extremely affordable for small and medium-sized businesses.
  2. Simplified Manageability
    Cloud computing offers enhanced and simplified IT management and maintenance through central administration of resources, vendor-managed infrastructure, and guaranteed service level agreements. The service provider handles all IT infrastructure updates and maintenance and supplies the resources to support the architecture. You, as the end user, can access software, applications, and services through a simple web-based user interface for your data management and analytical needs. There are no more infrastructure investments or time spent adding new servers or partitioning silos. With the cloud, you have access to virtually unlimited storage and scalability.
  3. High Reliability
    With a managed service platform, cloud computing offers highly reliable, high-performance infrastructure. Many cloud computing providers offer a Service Level Agreement that guarantees 24/7/365 availability. Businesses benefit from a massive pool of IT resources as well as a quick failover structure. If a server fails or a disaster occurs, hosted applications and services can easily be relocated to any of the available servers within the cloud as backup.
  4. Strategic Focus
    If your company is not in the IT business, cloud computing can alleviate technological headaches. There is no IT procurement to deal with, and high-performance computing resources give you an edge in being first to market. Cloud computing allows you to forget about technology and focus on your key business activities and objectives. It can also help you reduce the time needed to bring new applications and services to market and become more strategic in focus. Focus on the projects that differentiate your business, not the infrastructure. Cloud computing lets you focus on your own customers, rather than on the heavy lifting of racking, stacking, and powering servers.

Disadvantages of Cloud Computing

However, there are some drawbacks and disadvantages to cloud computing, including security and cyber-attacks, possible downtime, limited control, and interoperability of technology.

  1. Security and Cyber-attacks
    Customers of the cloud are most concerned about security. Although cloud service providers implement the best security standards and industry certifications, storing data and important files with external service providers always opens up risks of exploitation and exposure. Using cloud technologies means you need to give your service provider access to important business data that is sensitive and needs to be protected at all times. Public clouds, in particular, heighten these security challenges. The ease of procuring and accessing cloud services can also give malicious users the ability to scan, identify, and exploit loopholes and vulnerabilities within a system. For example, in a public cloud architecture where multiple users are hosted on the same server, a hacker might try to break into the data of other users hosted and stored on that server.
  2. Possible Outages and Downtime
    Because cloud providers serve a large number of customers each day, they can become overwhelmed and may even encounter technical outages, which can lead to your business processes being temporarily interrupted. And since you access cloud services via the Internet, you will not be able to reach any of your applications, servers, or data in the cloud if your Internet connectivity is offline.
  3. Limited Control
    Since the cloud infrastructure is entirely owned, managed, and monitored by the service provider, the customer has little or minimal control of the systems. The customer can only control and manage the applications, data, and services but not the backend infrastructure itself. You use what is negotiated and provided.
  4. Interoperability and Compatibility
    Although cloud service providers claim that the cloud is flexible to use and integrate, switching cloud services has not yet fully matured. Organizations may find it difficult to migrate their services from one vendor to another if they find a better cloud service package elsewhere. Hosting and integrating current cloud applications on another platform may also raise interoperability and support issues. For example, applications developed on the Microsoft Development Framework (.NET) might not work properly on another platform such as Linux.

There is no doubt that businesses can reap huge benefits from cloud computing. However, with the many advantages, there are also some disadvantages. I personally recommend that you take the time to examine the advantages and disadvantages of cloud computing, and select the cloud provider that suits your business needs. Cloud computing may not be appropriate for every business but should be considered in the near future for data management and analytics.

SECURITY: CYBER, DATA BREACH

Information security that is data-driven has been in practice for many years and is not a new concept. Regardless of the business and industry that you are in, data breaches can happen, and they have. In the entertainment sector, the cyber-attack on Sony breached internal servers, exposing internal financial reports, top executives' embarrassing emails, private employee health data, and even new, yet-to-be-released movies and scripts. This information was leaked and placed on the Internet for public consumption. In the government sector, the U.S. Office of Personnel Management (OPM) was hacked, exposing personal records, including names, addresses, Social Security numbers, and background checks, on millions of people. Among the details of this data breach were copies of millions of sets of fingerprints. In the financial industry, a cyber-attack seized data from millions of customers of Wall Street giant JPMorgan Chase, revealing the names, addresses, and phone numbers of millions of household and small-business accounts. Finally, in the healthcare industry, Anthem, Inc. experienced a very sophisticated external cyber-attack, which resulted in the theft of personal information such as medical ID numbers, Social Security numbers, income information, and contact details from millions of customers and subscribers. All told, an estimated 2.5 billion personal records were exposed by cyber-attacks over the past five years.

Data is a strategic asset, as mentioned in the previous chapter, and should be protected by all means necessary. Using analytics to detect anomalies for IT security is a common practice. For example, financial services companies such as Visa, MasterCard, and American Express have used data and analytics to detect potentially fraudulent transactions based on pattern recognition across millions of transactions. In the public sector, government agencies have been using data and analytics to uncover terrorist threats, detect fraud in social programs, and identify insider threats, as well as for other intelligence applications. Although big data brings opportunities for businesses, it also exposes them to malicious use by hackers. With additional data sources, hackers are finding new ways to strike corporate IT systems, abusing the big data opportunity for data breaches and cyber-attacks.

The topic of security is especially relevant given recent massive data breaches and cyber-attacks, and it has become a main concern of many executives, particularly the chief information officer (CIO) and chief security officer (CSO). In many of these cases, the cyber-attacks have led to very costly data breaches for Anthem, Inc., JPMorgan Chase, Sony Pictures, OPM (the Office of Personnel Management), and Target. These types of cyber-attacks are costing businesses US$400 billion to US$500 billion annually. With costs this high, IT spending on security is on the rise. The analyst firm Gartner predicts that the world will spend over US$100 billion on information security by 2018. Another report, from Markets and Markets, indicates that the cyber security market will grow to US$170 billion by 2020. As expected, industries such as aerospace, defense, and intelligence continue to be the biggest contributors to cyber security spending since they hold highly classified and sensitive data.

Not only do these data breaches and cyber-attacks affect the businesses, but they also affect you and me as consumers, customers, and subscribers to these businesses. If our personal information is exposed, we are at high risk for fraudulent activities such as identity theft, credit card misuse, and false insurance claims. Unfortunately, many organizations are simply not prepared and lack the right analytics-based approach, strategy, process, technologies, and skillset to deter and prevent cyber-attacks. Since companies and other organizations can't stop every attack and often rely on fundamentally insecure networks and technologies, they need to adopt smarter defensive strategies. Compared with traditional security methods for preventing cyber-attacks, big data and its new data sources bring additional complexity to the analysis. New approaches and new ways of thinking about cyber security are beginning to take hold, approaches that better capture the various data points and leverage advanced analytics. Organizations are aggressively looking at data management and analytics for detecting fraud and other security issues by using advanced algorithms to mine historical information in real time. They are responding far more quickly, using platforms that alert security staff to what is happening or what is likely to happen and help them take action informed by predictive analytics. In addition to technology enablement with tools and solutions, all businesses, big and small, must acknowledge the new cyber security realities and requirements by adapting their practices to embrace early detection with a rapid response and coordination strategy.
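
As a simple illustration of mining historical information to alert security staff, the sketch below flags users whose current login activity deviates sharply from their own historical baseline. It assumes hourly login counts per user are already being collected; the three-sigma threshold and the field names are illustrative, not any vendor's method.

```python
# Minimal sketch of baseline-and-deviation alerting on login activity.
# Assumes hourly login counts per user are already collected; the 3-sigma
# threshold and field names are illustrative.
from statistics import mean, stdev

def flag_anomalies(history: dict, current: dict, sigma: float = 3.0) -> list:
    """Return users whose current hourly login count deviates sharply from their baseline."""
    alerts = []
    for user, counts in history.items():
        if len(counts) < 2:
            continue  # not enough history to establish a baseline
        mu, sd = mean(counts), stdev(counts)
        observed = current.get(user, 0)
        if sd > 0 and (observed - mu) / sd > sigma:
            alerts.append(user)
    return alerts

history = {"alice": [3, 4, 2, 5, 3, 4], "bob": [1, 0, 2, 1, 1, 2]}
current = {"alice": 4, "bob": 40}   # bob's spike could indicate credential abuse
print(flag_anomalies(history, current))   # ['bob']
```

A production platform would of course use far richer features and models, but the principle is the same: establish what normal looks like from historical data, then surface deviations early enough for the security team to act.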

Security Gaps in the Traditional Approach

The historical dump-and-analyze approach has proven ineffective because the needed data history is not typically stored or analyzed in a timely fashion. New approaches are required to leverage and evaluate data in a way that can determine what data is and isn't important to support cyber security–based behavioral analytics. In the current approach, information security is really based on correlating data and events designed for monitoring and detecting known or past attack patterns—“what is happening” or “what has happened.” This strategy and tactic have some gaps.

Restricted Analysis on Data Type

Most systems leverage a relational database and therefore require that all of the data be aggregated, integrated, and predefined prior to loading. The requirement to predefine the data and control how the data is sent for analysis can restrict the volume and variety of data that can be collected and analyzed. End users must also spend significant time managing the data system to keep up with updates and revisions. All of this places unwanted limitations on the types of security data analysts can see and the analytics they can apply once the data becomes available.

Delayed Detection of Incidents

As mentioned earlier in Chapter 4 on Hadoop, the database is ideal for certain data types and structures. Combating security threats and preventing cyber-attacks requires constantly streaming data such as location, physical security information, role, and identity. Leveraging the database for these additional data sources requires predefining the data before analysis can take place, which may delay analysis because of the integration and customization involved. These data points and contextual details are essential and can make the difference between offensive and defensive detection of incidents.

Lack of Ad-Hoc or Real-Time Capabilities

The system is designed to collect, filter, and correlate log events from security devices to detect issues from all the logs and alerts generated. By analyzing the logs with multiple data points simultaneously, the system can identify the most critical alerts to investigate. In order to analyze events across multiple devices, the data must be normalized and stored in a database. However, while this design is optimized for generating alerts, it is less effective for ad-hoc queries that examine attacks using multiple tactics spanning various touch points and systems.

Rigid Application and Solution

Regardless of the type of company and industry, no two organizations are identical. Every business has different technologies and unique environments, security processes, databases, and analytical applications. Certain out-of-the-box capabilities and reports are adequate, but most organizations need to customize their security solution to fit their environment and business needs. This includes adjusting existing correlation and business rules, developing enterprise dashboards and reports, or generating new ones for added business value. Additionally, nontraditional data sources in the form of semi-structured data (video, blogs, etc.) are often necessary to tackle advanced threats and the ever-changing world of cyber-security.

In a recent survey conducted by ESG (Enterprise Strategy Group), 27 percent of organizations said they are weak when it comes to analyzing intelligence to detect security incidents. Looking only at the current status or at what has already happened is no longer adequate, as multidimensional cyber-attacks are dynamic and can combine different tactics and techniques to work their way into and out of an organization's networks and data systems. In addition, the traditional set of security devices is designed to look for particular aspects of attacks, such as a network perspective, an attack perspective, a malware perspective, a host perspective, or a web traffic perspective. These different technologies see isolated characteristics of an attack and lack the holistic picture of security, which makes cyber-attacks extremely difficult to distinguish or investigate. Until all of the event data is combined, it's extremely hard to determine what an attacker is trying to accomplish.

In order to combat security issues, building a strong data management foundation for advanced analytics is critical. This entails getting insights into all activities across networks, hosts (e.g., endpoints and servers), applications, and databases. It also includes monitoring, alerting, and analyzing for incidents, and then coordinating, containing, remediating, and sharing threat intelligence that is incorporated back into the monitoring, alerting, and response process. The IT and security teams also need the ability to detect attack activities by leveraging the breadcrumbs of evidence found lying across the entire technology stack (e.g., firewalls, IPS, antivirus, and servers). The universal problem is how to be on the offense instead of the defense and to quickly determine the root cause of incidents so that they can be contained before they spread throughout the organization. Of course, the obvious intent is to feed the insight and intelligence from the analysis back into the data system for continuous security improvement.

The French military and political leader Napoleon Bonaparte said, “War is 90 percent information,” as he waged war across Europe. With regard to security and the fight against cyber-attacks, this statement remains remarkably accurate in today's world. Information that enables proactive, data-driven decisions is critical. Tackling new types of cyber-threats requires a commitment to data gathering and processing and a greater emphasis on analytics to analyze security data. If your organization is looking to embark on or enhance a data security initiative, consider the following shortcomings of, and requirements for, your IT systems.

Not Collecting and Analyzing All Relevant Data

Today, the main limitation in analyzing cyber security is the inability of organizations and cyber software solutions to leverage all of their data assets. Since multidimensional cyber-attacks most likely navigate a variety of systems, networks, protocols, files, and behaviors, companies need to analyze data across all of these areas. This means collecting from a wide variety of data sources, including logs, flows, network packets, videos, identity systems, physical security, and so on, and making them available to all members of the IT and security team. Since multidimensional attacks can occur over an extended period of time, historical analysis is vital and must be incorporated with new analysis so that analysts can determine the root cause and the breadth of a possible cyber-attack or data breach. With the appropriate analytical cyber solution, context is provided so that patterns and anomalous behaviors that are indicators of fraud, theft, or other security breaches can be proactively identified.

Enable a Flexible Data Management System

While original data formats and context should be preserved for integrity and governance, the security team must also have the ability to tag, index, enrich, and query any data element or group of data elements collectively to get a wider perspective for threat detection and response. This allows analysts to add context to the raw data, making it richer and more informative for proactive action. In addition, enriching the data can eliminate some steps in cyber investigations and make the process more productive.
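
A minimal sketch of what tagging and enriching raw events can look like in practice appears below; the lookup tables, field names, and escalation rule are hypothetical and stand in for feeds from identity and asset systems.

```python
# Minimal sketch of enriching raw security events with business context so
# analysts can query them with richer filters. Fields and lookups are hypothetical.
from datetime import datetime, timezone

USER_ROLES = {"jsmith": "database_admin", "mlee": "contractor"}      # from an identity system
SENSITIVE_HOSTS = {"db-prod-01", "hr-files-02"}                      # from an asset inventory

def enrich(raw_event: dict) -> dict:
    """Preserve the original event fields and add tags useful for threat detection."""
    enriched = dict(raw_event)                      # keep the raw fields intact
    enriched["user_role"] = USER_ROLES.get(raw_event.get("user"), "unknown")
    enriched["sensitive_target"] = raw_event.get("host") in SENSITIVE_HOSTS
    enriched["ingested_at"] = datetime.now(timezone.utc).isoformat()
    return enriched

raw = {"user": "mlee", "host": "db-prod-01", "action": "bulk_export", "bytes": 8_200_000_000}
event = enrich(raw)

# A simple contextual query: contractors exporting large volumes from sensitive hosts
if event["user_role"] == "contractor" and event["sensitive_target"] and event["bytes"] > 1_000_000_000:
    print("Escalate for investigation:", event)
```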

Complicated Data Access and Use of Application

Asking questions, and the ability to get responses to those questions in a timely manner, is critical to any operation, especially in security. As the renowned quality control expert, operations specialist, and statistician Dr. W. Edwards Deming put it, “If you do not know how to ask the right questions, you discover nothing.” This quote can apply to any industry but is especially appropriate for detecting cyber threats. Any data, but security data in particular, will remain a black hole if it cannot be easily accessed, analyzed, and understood by the security teams. To accomplish this, applications must provide a simple, easy-to-use interface to access the data and apply advanced analytics to it. This empowers analysts at all levels to quickly investigate threats and gain valuable insights. Applications should also allow for uncomplicated ways to develop dashboards, queries, and reports to convey security operations to executives and the leadership teams. Thus, applications with data exploration, data visualization, and advanced analytics provide an in-depth understanding of the data and track historical trends across data elements.

Businesses face a stark reality when it comes to protecting their most important asset, their data. The processes and technologies they have employed for the last few years are no longer adequate in a world of change and complexity. Sometimes it is better to take a step back and evaluate the landscape of cyber security. This exercise will reveal the complex nature of modern multidimensional cyber-attacks and will likely convince organizations to adopt a more offensive, proactive, and comprehensive strategy. A vast improvement in security around cyber-attacks therefore requires a robust analytics solution to transform data into intelligence. This means collecting, processing, and analyzing all data and focusing on the people, process, and technology needed to detect and address security activities. It also includes responding to an incident in a coordinated manner and investigating and determining the root cause by scoping, containing, and analyzing the problem. The results of the investigation are then brought back into the application for proactive prevention and mitigation.

This new approach to cyber security prevention can be viewed as an end-to-end relationship between data management and big data analytics technologies along with some consulting services. The technology must be scalable, manageable, and easy to use. Having the right technology and infrastructure is only half of the equation. At the same time, a process is needed for an organization to respond by asking the right questions, knowing how to navigate through the data, and leveraging analytics to stop the hackers in their tracks.

Hackers

Hackers have varying degrees of skills, expertise, and knowledge. According to IDC, there are six categories by which hackers are classified. They are accidental, insider, opportunist, hacktivist, professional criminal, and state-level actors. We will look at these in more detail.

Types of Hackers

A short description of each is given below:

  • Accidental: An employee or contractor who accidentally leaves data exposed because of a lack of experience
  • Insider: A more highly skilled employee (compared to an accidental) who is familiar with the internal network and uses known corporate vulnerabilities for self-gain
  • Opportunist: A third party or external person who, in spite of lacking significant skills, uses basic tactics such as worms, viruses, and other tools (this person likes to brag about their effort)
  • Hacktivist: A true hacker with more experience and sophisticated skills who uses malware to penetrate IT systems; attacks are often executed to make a political statement or with a specific motive
  • Professional criminal: An organized crime group, sometimes including terrorist units, that uses extremely sophisticated techniques to gain financial benefit or damaging information about a business
  • State-level actor: A person employed by a national government who uses state-of-the-art techniques to gain strategic or economic information

Harris Interactive, a polling company, recently conducted a survey on the causes of data breaches and how those breaches were discovered. The results are shown in Table 6.1. The survey concluded that the primary cause is a lost or stolen computing device at 46 percent, followed by employee mistakes or unintentional actions at 42 percent. Ranked third and fourth are third-party negligence at 42 percent and criminal attacks by hackers at 33 percent. The note below Table 6.1 shows how these breaches were discovered; surprisingly, over a third were discovered through a customer or patient complaint, meaning the company was not even aware of the attack or breach until someone on the outside reported it.

Table 6.1 Causes of Data Breaches

Cause of Data Breach Percentage
A stolen or lost computing or mobile device 46%
Employee mistake or accidental actions 42%
Third-party negligence 42%
Criminal attack 33%
Technical system snafu 31%
Malicious act from an insider 14%
Intentional nonmalicious act from employee 8%

Note: Breaches were discovered by audit/assessment (52%), employee detection (47%), and customer complaint (36%).

New Targets for Heightened Risks

The database or data warehouse remains the primary target for hackers to penetrate, access, and attack. Once hackers enter the system, sensitive and critical information about the company and its customers is exposed. In addition to misusing the data, cyber-attacks can seriously disrupt the day-to-day operations of your company. Hackers often demand a ransom payment to restore access and to refrain from distributing the data publicly. Ransomware is not new, but it is on the rise: hackers use software that locks people out of systems until they make a payment, often in bitcoin. For those unfamiliar with bitcoin, it is a digital currency built on open-source, peer-to-peer networks that operates without financial institutions to manage transactions.

As already mentioned, it can happen to any company across industries. However, a number of cyber-attack cases occurred in the healthcare industry, particularly hospitals, in the spring of 2016. Hospitals are especially vulnerable because they traditionally spend a very small fraction of their budget on cyber security. Aside from deploying anti-malware and other cyber security software, it is also important to teach a large network of doctors and nurses not to open or click on suspicious links in email. Hackers generally get into a system through a phishing attack (persuading a random employee to click on a link or an attachment in an email) or by finding a network loophole; either route leaves technical systems vulnerable to hackers armed with a cutting-edge, ever-evolving set of tools. Most of these doctors and nurses are basic users of technology and are not IT savvy enough to detect suspicious emails or know what not to click. As hospitals have become dependent on electronic systems to coordinate care, communicate critical health data, and avoid medication errors, patients' well-being may also be at stake when hackers strike. In some ways, healthcare is an easy target since its security systems tend to be less mature than those of other industries, such as banking, retail, and technology. Where a financial-services or technology firm might spend up to a third of its budget on information technology and security, hospitals typically spend less than 5 percent. Figure 6.4 shows cyber-attack incidents by industry.

Graphical illustration of Cyber-attacks by industry.

Figure 6.4 Cyber-attacks by industry

Hospitals are used to chasing the latest medical innovations, but they are rapidly learning that caring for sick people also means protecting their medical records and technology systems against hackers, because it is doctors and nurses who depend on data to perform time-sensitive, life-saving work. Hospitals' electronic systems are often in place to help prevent errors. Without IT systems, pharmacists cannot easily review patients' information, look up what other medications the patients are on, or figure out what allergies they might have before dispensing medications. Nurses administering drugs cannot scan the medicines and the patients' wristbands to confirm they are giving the correct treatments. When lab results exist only on a piece of paper in a patient's file, it's possible they could be accidentally removed by a busy doctor or nurse and this critical information could simply disappear.

In several U.S. hospital cases, a virus infiltrated computer systems and forced the healthcare organization to shut down its entire network, turn away patients, and postpone surgeries. Staff resorted to paper records, where information was scattered and potentially out of date. Hackers demanded payments in bitcoin to restore the operation of the IT systems.

In addition to the traditional database, hackers are evolving and going after digital sources such as websites and social media. According to IDC, the premier global market intelligence firm, cyber attackers are aiming at modern technologies such as social media (Facebook, Instagram, Twitter), mobile devices (cellular phones, tablets, PDAs), clouds (private, public, and hybrid), and the Internet of Things (IoT), where a variety of digital devices are connected to the Internet (more on IoT in the next section of this chapter).

Vendors offering cyber security are developing advanced solutions, specifically cyber analytics. At the same time, hackers are evolving and targeting new sources and areas, looking for the lowest-hanging fruit (e.g., healthcare and hospitals) as targets for cyber-attacks. The data landscape for cyber security is becoming much more complex, and CIOs/CSOs are dealing with the challenging tasks of protecting all data, preventing attacks, and proactively mitigating these threats. Having the right data management and analytic capabilities for cyber security is only half of the equation; the education and cultural perspective may be harder to solve and maintain. Training staff and employees to not leave sensitive data unattended, to not click on links in emails they didn't expect to receive, and to report any suspicious phishing activity is a daunting and enduring task. It takes a coordinated effort of people, process, and technology to be successful at addressing security.

AUTOMATING PRESCRIPTIVE ANALYTICS: IOT, EVENTS, AND DATA STREAMS

In the last five years, the term and topic of analytics has been trendy across all industries and globally in all regions (Americas, Europe, Middle East, and Asia/Pacific). Companies are making hefty investments in analytics to drive business and operational decisions. Many vendors are spending millions of dollars in research and development to go to market with the next best thing for their customers. Companies whose domain is not analytics have jumped on the bandwagon to get a piece of the lucrative pie. Analytics has become so big that it has surpassed big data as a search term and now warrants sub-categories of its own. You may be familiar with descriptive and predictive analytics, which have been around for over 10 years. The latest trend, however, is prescriptive analytics, which builds on both. Let's examine the differences among them.

In Chapter 5, we discussed the characteristics of descriptive and predictive analytics. Descriptive analytics assesses past performance by mining historical data to look for the reasons behind the past success or failure of a decision. Predictive analytics attempts to forecast an outcome based on data analysis; it uses historical performance from descriptive analytics along with rules, algorithms, and various external data sources to determine the probable outcome of an event or the likelihood of a situation occurring. Prescriptive analytics builds on descriptive and predictive analytics and takes analytics to the next level, beyond predicting future outcomes: it prescribes a course of action and can forecast what will happen, when it will happen, and why it will happen. It delivers data-driven decision options on how to take advantage of a future opportunity or alleviate a future risk and reveals the implications of each option. Unlike descriptive analytics, prescriptive analytics relies on all data sources, a combination of structured (numbers, categories) and semi-structured (videos, images, sounds, text) data, along with well-defined business rules, to enable business analysts to identify the actions needed and predict what lies ahead. Prescriptive analytics also utilizes advanced, automated data-driven decision-making techniques (e.g., optimization and simulation models) to evaluate the alternatives and deliver the recommended decisions in a timely manner.

Prescriptive analytics requires a well-defined process, highly skilled people, and scalable technology. The technology needs to be scalable in order to analyze the multitude of data sources; it not only has to analyze all of your data sources, both structured and semi-structured, but also needs to adapt to the dynamics of big data: high data volume, variety, and velocity. The business process consists of business rules that define constraints, preferences, policies, and best practices. Highly skilled people are then needed to develop the mathematical applications and computational models, applying statistics, operations research, pattern recognition, and machine learning, for effective use of prescriptive analytics. Prescriptive analytics is used in scenarios where there are too many options, variables, constraints, and data points for the human mind to efficiently evaluate without assistance from technology. Figure 6.5 illustrates the relationship among descriptive, predictive, and prescriptive analytics.

Illustration of Prescriptive analytics.

Figure 6.5 Prescriptive analytics
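
To illustrate the optimization step that sits at the heart of prescriptive analytics, the sketch below takes predicted response rates (the predictive piece) and business rules (a budget and per-channel caps) and computes a recommended spend allocation. It uses SciPy's linear programming solver with invented numbers and is a simplified example under those assumptions, not a full prescriptive solution.

```python
# Minimal sketch of the optimization step in prescriptive analytics: given
# predicted response rates (from a model) and business rules (budget, caps),
# choose how to allocate spend. The numbers are invented for illustration.
from scipy.optimize import linprog

channels = ["email", "display_ads", "direct_mail"]
predicted_response = [0.030, 0.012, 0.045]     # expected responses per dollar (model output)
budget = 100_000                                # total spend allowed (business rule)
per_channel_cap = [60_000, 80_000, 40_000]      # policy limits per channel (business rule)

# linprog minimizes, so negate the objective to maximize expected responses.
c = [-r for r in predicted_response]
A_ub = [[1, 1, 1]]            # total spend across channels
b_ub = [budget]
bounds = [(0, cap) for cap in per_channel_cap]

result = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=bounds, method="highs")
for name, spend in zip(channels, result.x):
    print(f"{name}: ${spend:,.0f}")
print(f"Expected responses: {-result.fun:,.0f}")
```

The recommendation (spend the cap on direct mail, the remainder on email, nothing on display ads) is exactly the kind of decision option, constrained by policy and informed by predicted likelihoods, that prescriptive analytics delivers to the business.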

According to a recent Gartner report, only 3 percent of surveyed companies are currently leveraging prescriptive analytics software, compared to 30 percent that are active users of predictive analytics technology. But with the continued explosion of data volume, variety, and velocity, combined with vast improvements in technology, prescriptive analytics adoption is expected to grow substantially in the next five years. Table 6.2 shows a summary of the differences in descriptive, predictive, and prescriptive analytics.

Table 6.2 Different Types of Analytics (Descriptive, Predictive, and Prescriptive)

Type Description
Descriptive
  1. Used widely by businesses
  2. Looks at historical data
  3. Transparent use of analytics
  4. Easy to understand
Predictive
  1. Expanding use by businesses
  2. Complementary to descriptive analytics
  3. Test hypotheses for decisions
  4. Provide direction for future
Prescriptive
  1. Combine descriptive and predictive analytics
  2. Provide results of decision based on scenarios
  3. Holistic view from all data sources

In the past few years, analytics has become more automated, both as a business requirement and as a practical necessity. Whereas prescriptive analytics provides a recommendation for a human to act on, automated prescriptive analytics takes action on the results of the analysis without human intervention. For example:

  • Prices and packages for vacations change online automatically based on demand or destination.
  • Promotional emails are selected and sent to a customer automatically based on preference or profile.
  • Packages are delivered automatically based on a subscription or membership to a product.
  • Drones are dispatched to deliver an order within an hour of purchase.
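
Building on the examples above, here is a minimal sketch of how such an automated decision might combine a model's predicted propensities with business rules and act without human review; the scores, policy limits, and send_email stub are hypothetical.

```python
# Minimal sketch of an automated prescriptive decision: combine predicted
# propensities with business rules and act without human review.
# Scores, rules, and the send_email stub are hypothetical.
from typing import Optional

PROMOTIONS = ["beach_package", "city_break", "ski_weekend"]

def send_email(customer_id: str, promotion: str) -> None:
    print(f"Queued '{promotion}' email for customer {customer_id}")   # stand-in for a mail service

def choose_promotion(customer: dict) -> Optional[str]:
    """Pick the highest-propensity offer that the business rules allow."""
    if customer["emails_sent_this_week"] >= 2:        # contact-frequency policy
        return None
    allowed = [p for p in PROMOTIONS if p not in customer["recently_purchased"]]
    if not allowed:
        return None
    return max(allowed, key=lambda p: customer["propensity"][p])   # model output

customer = {
    "id": "C-1042",
    "emails_sent_this_week": 1,
    "recently_purchased": ["city_break"],
    "propensity": {"beach_package": 0.31, "city_break": 0.08, "ski_weekend": 0.22},
}

promo = choose_promotion(customer)
if promo:
    send_email(customer["id"], promo)
```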

To its proponents, prescriptive analytics is the next evolution in business intelligence and business analytics, an automated system that combines big data, business rules, mathematical models, and machine learning to deliver sound advice in a timely fashion. The need to automate prescriptive analytics stems from companies that demand real-time responses from data-driven decisions. It is obvious that every company is inundated with data and that data must be analyzed. The reality is that organizations do not have enough people to analyze all the data and make all the decisions in a timely manner. Thus, automating prescriptive analytics makes sense.

The systems into which automated prescriptive analytics is embedded, often part of the event stream processing portfolio, are designed to take action in real time. At the same time, organizations are increasingly enabling prescriptive analytics inside their data warehouses, integrated with their Hadoop environments, which helps with managing their many data sources (structured and semi-structured). An integrated infrastructure that automates prescriptive analytics needs to be closely connected to IT and the CIO organization; this type of advanced analytics should not be separated or siloed within the organization. Let's take prescriptive analytics to another dimension by automating it so that decisions are made without any human intervention.
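
The sketch below shows, in miniature, what taking action on an event stream in real time can look like; the events, fraud scores, and threshold are invented, and a production deployment would use an event stream processing engine rather than a simple Python loop.

```python
# Minimal sketch of acting on an event stream in (near) real time.
# The event source, scores, and threshold are stand-ins for a real
# event stream processing deployment.
import time

def event_stream():
    """Stand-in for a live feed of scored transactions (e.g., from web clicks or sensors)."""
    sample_events = [
        {"id": 1, "amount": 25.00, "fraud_score": 0.05},
        {"id": 2, "amount": 4999.99, "fraud_score": 0.93},
        {"id": 3, "amount": 120.50, "fraud_score": 0.12},
    ]
    for event in sample_events:
        yield event
        time.sleep(0.1)   # simulate arrival spacing

FRAUD_THRESHOLD = 0.90    # business rule; illustrative value

for event in event_stream():
    # The automated prescriptive step: act immediately, without waiting for human review.
    if event["fraud_score"] >= FRAUD_THRESHOLD:
        print(f"Transaction {event['id']} held for review (score {event['fraud_score']:.2f})")
    else:
        print(f"Transaction {event['id']} approved")
```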

Value of Prescriptive Analytics

Prescriptive analytics provides the instruction of what to do and, just as importantly, what not to do when analytical models are deployed into production environments. Codified as decisions, these instructions are applied to scenarios where there are too many options, variables, constraints, and data for a person to evaluate without assistance from technology. The prescriptive decisions are presented to the front-line worker, providing the answer they seek and accounting for the detailed aspects of the scenario they find themselves in. For example, call center personnel often rely on prescriptive analytics to know what options to offer, for what amounts, and under what conditions a prospective customer can be extended varying levels of credit.

Prescriptive analytics also provides organizations with the ability to automate actions based on these codified decisions. Every organization has simple, day-to-day decisions that occur hundreds to thousands of times (or more) and that do not require human intervention. For example, the identification and placement of a targeted advertisement based on a web shopper's session activity is popular in the retail industry. In such cases, prescriptive analytics is used to ingest, define, and take the optimal action (e.g., place the most relevant ad) based on scenario conditions (in our example, what has been viewed and clicked on during the session). What is optimal, for the purposes of this discussion, is defined as the action that best meets the business rule definitions and associated predicted likelihoods.
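
A toy Python sketch of that ad-placement decision follows. The candidate ads, predicted click likelihoods, and the in-stock rule are all invented for illustration; in practice the likelihoods would come from a deployed model scored against the live session.

  # A minimal sketch (hypothetical ads and scores) of choosing the "optimal" action:
  # the ad that satisfies the business rules and carries the highest predicted likelihood.
  session = {"viewed": ["running shoes", "gps watch"], "clicked": ["gps watch"]}

  candidates = [
      {"ad": "gps watch sale", "predicted_click": 0.31, "in_stock": True},
      {"ad": "trail shoes",    "predicted_click": 0.22, "in_stock": True},
      {"ad": "winter jackets", "predicted_click": 0.05, "in_stock": False},
  ]

  # Business rule: never advertise out-of-stock items; then take the top likelihood.
  eligible = [c for c in candidates if c["in_stock"]]
  best = max(eligible, key=lambda c: c["predicted_click"])
  print(f"Shopper clicked {session['clicked']} -> place ad: {best['ad']}")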

Scoring data with a model typically involves IT. The model arrives in an email or some other notification: IT is presented with an equation and the data inputs it needs. What is often lacking is the business rationale, the context, and a translation of the terminology into IT terms. As such, IT will ask all the necessary questions, often recode the model, run tests and validate output, apply any specific business policies and/or regulatory rules, and then put the model into production, that is, operationalize the model so it can generate results.
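
In code, that hand-off often ends up looking something like the sketch below: the handed-over equation is recoded as a scoring function, and the business and regulatory rules are layered on top before the score drives a decision. The coefficients, threshold, and rules here are purely hypothetical.

  # A minimal sketch (hypothetical coefficients and rules) of operationalizing a model.
  import math

  COEF = {"intercept": -2.1, "income_k": 0.015, "delinquencies": -0.6}

  def score(record):
      """Logistic-regression-style probability from the handed-over equation."""
      z = (COEF["intercept"]
           + COEF["income_k"] * record["income_k"]
           + COEF["delinquencies"] * record["delinquencies"])
      return 1 / (1 + math.exp(-z))

  def decide(record):
      """Apply the business policies and regulatory rules on top of the raw score."""
      if record["age"] < 18:          # regulatory rule: never extend credit to minors
          return "decline"
      return "approve" if score(record) >= 0.5 else "refer to analyst"

  print(decide({"income_k": 85, "delinquencies": 0, "age": 34}))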

While in some organizations these steps may not all be done by IT, they still happen. Each step of the data analytical life cycle (as referenced in Chapter 1) adds time to developing the model, implementing the model, and cashing in on the business benefits. In many organizations, the latency associated with moving from model deployment to business action is weeks, if not months. As a result, by the time a model is ready to generate results in a production context, it is often too late: either the opportunity to make an impact is gone or conditions have changed to the point where the model is no longer relevant.

Prescriptive analytics has the benefit of automating instructions and best-suggested options that are acted on by a person. It is also used to directly automate actions for more mundane tasks, doing so consistently and accurately. In both cases, relevancy to the current scenario is assured in this managed environment and is the product of a vetted, tested, and detailed decision flow. As data volume, variety, and velocity are only set to increase, and as technology continues to develop to process more data faster, the trend of automating actions taken from analytics will correspondingly rise.

As noted earlier, the business need to automate prescriptive analytics stems from companies that demand real-time responses from data-driven decisions, and organizations simply don't have enough people to analyze all the data, comprehend every scenario detail, and make all decisions in a timely manner. Prescriptive analytics technologies offer the following:

  • Relevant, concise, and accurate decisions
  • Easy automation for human instructions and downstream application/system actions
  • Explicit use of the business context
  • Tested, vetted, and documented decisions
  • Adjustability to changing scenarios
  • Timely deployed actions
  • An unequivocal source of truth, governed in a single environment

Process for Prescriptive Analytics

Being an analytically driven organization means basing decisions and actions on data rather than gut instinct. As more organizations recognize the competitive advantages of using analytics, the impact can fade as competitors build the same capability. To break through and sustain the competitive advantage that comes from analytical adoption, organizations must continually test and expand data sources, improve algorithms, and evolve the application of analytics to everyday activity in order to deliver predictive analytics.

Predictive algorithms describe a specific scenario and use historic knowledge to increase awareness of what comes next. But knowing what is most likely to happen and knowing what needs to be done about it are two different things. That is where prescriptive analytics comes in. Prescriptive analytics answers the question of what to do, providing decision options based on predicted future scenarios.

Events seldom happen in isolation. It is through their interconnections that we develop the detailed understanding of what needs to be done to alter future paths. The richness of this understanding, in turn, also determines the usefulness of the predictive models. Just as the best medicine is prescribed based on thorough examination of patient history, existing symptoms, and the like, so are the best prescriptive actions founded in well-understood scenario context. And just as some medicines can react with one another, with one medicine not being as effective when it is taken with another, so can decisions and corresponding actions taken from analytics affect the outcome of future scenarios.

As you may expect, under different scenarios there are different predictions. When conditions change, the associated prediction for that same data event can also change. When you apply one treatment, you affect another, changing the scenario. Actions that are taken not only create a new basis for historical context but also create new data that may not have been considered by the original model specification. In fact, the point of building predictive models is to understand future conditions in order to change them. Once you modify the conditions and associated event behavior, you change the nature of the data. As a result, models tend to degrade over time, requiring updates to ensure accuracy to the current data, scenario, and new predicted future context.
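
One practical consequence of this degradation is that deployed models need routine health checks. The sketch below is a minimal, assumed approach: compare a model's recent accuracy against its accuracy at deployment and flag it for retraining when the gap exceeds a tolerance. The numbers are hypothetical.

  # A minimal sketch (hypothetical tolerance) of catching model degradation:
  # flag the model for retraining when recent accuracy drifts below the baseline.
  def needs_retraining(recent_correct, recent_total, baseline_accuracy, tolerance=0.05):
      recent_accuracy = recent_correct / recent_total
      return (baseline_accuracy - recent_accuracy) > tolerance

  # 720 of the last 1,000 predictions were right; the model launched at 81% accuracy.
  print(needs_retraining(720, 1000, baseline_accuracy=0.81))   # prints True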

Well-understood scenarios are fed by data, structured and semi-structured. The more data you have to draw from to examine dependencies and relationships that impact the event being predicted, the better the prediction will likely be. This is where the value of big data comes in—as big data is more data with finer detail and greater context richness. Big data offers details not historically available that explain the conditions under which events happen, or in other words, the context of events, activities, and behaviors. Big data analytics allows us, like never before, to assess context—from a variety of data, and in detail. And when that big data is also fast data (on the order of thousands of events per second), it's a stream of events. When we bridge big data analytics with event streams, as generated in the Internet of Things (IoT)—we have the power to write more timely and relevant business prescriptions that are much harder for competitors to imitate. IoT is a good fit for autonomous prescriptive analytics.
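
Bridging analytics with event streams usually starts with windowing: rather than landing every reading in storage first, events are aggregated in small time windows as they arrive. The Python sketch below simulates that with a handful of made-up sensor readings and one-second tumbling windows; a production system would run this inside an event stream processing engine.

  # A minimal sketch (simulated readings): aggregate IoT sensor events in
  # one-second tumbling windows as they stream in, per sensor.
  from collections import defaultdict

  events = [  # (timestamp_seconds, sensor_id, reading) - stand-ins for live telemetry
      (0.2, "pump-1", 71.0), (0.7, "pump-1", 74.5), (1.1, "pump-2", 65.2),
      (1.6, "pump-1", 90.3), (2.3, "pump-2", 66.0),
  ]

  windows = defaultdict(list)
  for ts, sensor, value in events:
      windows[(int(ts), sensor)].append(value)   # bucket by whole second and sensor

  for (second, sensor), values in sorted(windows.items()):
      print(f"t={second}s {sensor}: mean reading {sum(values) / len(values):.1f}")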

Leveraging the Internet of Things (IoT)

The Internet of Things (IoT) can mean different things to different people, and it works in conjunction with big data. It is a system of physical objects—devices, vehicles, buildings, machines, and others—that are embedded with electronics, software, sensors, and network connectivity so that these objects can communicate through the exchange of data via the Internet. The term was coined by a British entrepreneur, Kevin Ashton, back in 1999. IoT generates, and will continue to generate, a lot of data as it connects a fast-growing set of physical devices and systems. Data transmitted by objects provide entirely new opportunities to measure, collect, and act upon an ever-increasing variety of event activity. According to Gartner, approximately 21 billion connected things will be used globally by 2020. Another staggering statistic: more than 5.5 million new things are connected every day, from sensors on industrial machines to alarm monitoring systems in your home to the GPS locations of intelligent vehicles and fitness devices.

The IoT spans a broad range of mature and early-stage technology, from RFID tags and remote monitoring to autonomous robots and microscopic sensors dubbed “smart dust.” For the record, Gartner predicts that there will be 6.4 billion connected IoT devices in use worldwide in 2016, and 21 billion units by 2020. That means the number of Internet-connected things could more than triple over the next five years.

A respected U.S. president, Theodore Roosevelt, once said, “In any moment of decision, the best thing you can do is the right thing, the next best thing is the wrong thing, and the worst thing you can do is nothing.” Decisions with limited information are a thing of the past. The IoT is enabling data-driven decisions with a wealth of information that has often been overlooked. Connected devices, coupled with advances in data collection and analytics, are giving business managers at all levels more relevant and timely information when they need it than they've ever had before. How that affects the decisions they're making is having a deep and lasting impact on operational and business performance.

The Internet is now embedded into houses, vending machines, factory equipment, cars, security systems, and more, as illustrated in Figure 6.6. The connected world can be smarter and has potential to change our personal life and how we conduct our daily business operations. Table 6.3 shows some examples of how industries can benefit from IoT with autonomous prescriptive analytics.


Figure 6.6 Internet of Things connectivity

Table 6.3 Industry Uses of IoT

Industry Use of IoT
Consumers U.S. households collectively have over 500 million connected devices, including Internet service, mobile devices, tablets, and alarm monitoring systems, with an average of 5 smart applications per family.
Healthcare Smart monitoring systems for patients can alert family, doctors, and nurses when a critical situation occurs; insulin injection trackers and prescription drugs adjusted based on real-time analysis of patient's health—all of which can improve patient care and health management.
Manufacturing This sector leads the way with IoT. There is a 30% projected increase in connected machine-to-machine devices over the next 5 years, driving the need for real-time information to optimize productivity.
Retail Automated order fulfillment for grocery replenishment and curbside pickup, plus prescribing other goods and products to enhance the in-store shopping experience and build customer loyalty and satisfaction.
Transportation Approximately 24 million cars have navigation systems and Internet access to locate nearby attractions for customers on the road.

On the downside, as IoT grows, the growing use of detectors and sensors will attract hackers and cybercriminals, who can leverage these devices to break into systems. Many traditional fraud detection techniques do not apply because detection is no longer a matter of seeking one rare event or anomaly but requires understanding an accumulation of events in context. One challenge of cybersecurity for IoT involves constant data analysis, and streaming data events are managed and analyzed differently. I expect advanced analytics to shed new light on detection and prevention with event stream processing. Another challenge is the plumbing of the data generated by IoT. An even bigger challenge for IoT will be to prove its value: there are still limited implementations of IoT in full production at the enterprise level.
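
To illustrate detection by accumulation rather than by a single rare event, here is a minimal Python sketch. The device, window length, and threshold are invented; a real deployment would run this logic inside an event stream processing engine across millions of devices.

  # A minimal sketch (synthetic events): alert when one device accumulates too many
  # failed logins inside a sliding time window, rather than reacting to a single event.
  from collections import defaultdict, deque

  WINDOW_SECONDS, THRESHOLD = 60, 5
  recent_failures = defaultdict(deque)      # device_id -> timestamps of recent failures

  def observe(device_id, timestamp, failed):
      if not failed:
          return None
      q = recent_failures[device_id]
      q.append(timestamp)
      while q and timestamp - q[0] > WINDOW_SECONDS:   # drop events outside the window
          q.popleft()
      if len(q) >= THRESHOLD:
          return f"ALERT: {device_id} had {len(q)} failed logins in {WINDOW_SECONDS}s"
      return None

  for t in (0, 10, 20, 30, 40):             # five failures in 40 seconds
      alert = observe("thermostat-7", t, failed=True)
  print(alert)                              # fires on the fifth failure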

Use cases leveraging prescriptive analytics in IoT applications abound: everything from social media watch, collecting tweets, blogs, and posts to determine what consumers are recommending as a service or product, to security and surveillance of login sessions and data access for data security breaches, and everything in between. Going beyond collecting data for exploration, and even analysis, prescriptive analytics will not only uncover patterns in events as they occur but will also be used to take automated actions to prevent unnecessary outages and costs. Whether by sending alerts and notifications, updating situational war room dashboards, or even providing instructive action to other objects, the need for real-time actions has never been greater.

COGNITIVE ANALYTICS

In the previous section, we discussed descriptive, predictive, and prescriptive analytics. Beyond prescriptive is cognitive analytics (see Figure 6.7). Cognitive analytics combines cognitive computing and advanced analytics into a field that imitates human intelligence. Let's take a moment to better understand cognitive computing. Cognitive computing is the ability to access a vast store of historical data (structured and semi-structured), apply machine learning algorithms to discover the connections and correlations across all of those pieces of information and human interactions, and then leverage the resulting “knowledge base” as the engine for discovery, decision support, and deep learning. Cognitive analytics is the field of analytics that leverages the power of cognitive computing to mimic human intelligence with a self-learning feedback loop of knowledge. As expected, data is at the heart of cognitive analytics, along with natural language processing, probabilistic reasoning, machine learning, and other technologies, to efficiently analyze context and uncover near real-time answers hidden within colossal amounts of information.


Figure 6.7 Cognitive, prescriptive, predictive, and descriptive analytics

Cognitive analytics pulls from all data sources for intelligence. This includes the traditional and semi-structured data from the digital world (emails and videos, images and sensor readings), plus the vast array of information available on the Internet, such as social media posts and academic research articles. This intelligence is augmented to make sense of all the data that are beyond the capacity of the human brain to process. The significance of the cognitive system is that it can adapt and get “smarter” over time by learning through its interactions with data and humans via the feedback loop mechanism. In addition, cognitive computing takes advantage of technological advancements in processing power and massively parallel and distributed computing capabilities, which make applying the analytics more scalable and feasible for quickly answering complex questions and helping us make even smarter decisions. A cognitive system can provide real-time responses to complex questions posed in natural language by searching through massive amounts of information that have been entered into its knowledge base, making sense of context, and computing the most likely answer. As developers and users “train” the cognitive system, answers become more reliable and increasingly precise over time. Possible use cases will be illustrated later in this chapter.
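
As a toy illustration of that question-answering pattern, the Python sketch below searches a tiny, invented knowledge base and attaches a confidence to the most likely answer. It uses simple word overlap only; a real cognitive system relies on natural language processing, machine learning, and a vastly larger knowledge base.

  # A toy sketch: rank candidate answers from a small "knowledge base" and return
  # the most likely one with a confidence score (word overlap stands in for NLP).
  KNOWLEDGE_BASE = {
      "In-database processing runs analytics inside the database engine.":
          {"in-database", "database", "engine", "analytics"},
      "In-memory analytics lifts data into memory for fast exploration.":
          {"in-memory", "memory", "fast", "exploration"},
      "Hadoop stores and processes massive data across a distributed cluster.":
          {"hadoop", "distributed", "cluster", "massive"},
  }

  def answer(question):
      words = set(question.lower().replace("?", "").split())
      scored = [(len(words & keywords) / len(keywords), text)
                for text, keywords in KNOWLEDGE_BASE.items()]
      confidence, best = max(scored)
      return best, confidence

  text, confidence = answer("Where does in-database analytics run?")
  print(f"{text} (confidence {confidence:.2f})")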

Figure 6.7 illustrates the analytic maturity continuum. As expected, value and complexity go hand in hand. Descriptive analytics provides the minimum value with the least complexity. Predictive analytics provides additional value along with increasing complexity. When descriptive and predictive are combined into prescriptive, both complexity and value intensify for mature organizations with a data-driven focus. Finally, cognitive involves the highest level of complexity and delivers the most value in the continuum.

With the advancement in computing power and capabilities, cognitive analytics is ideal for data-driven discovery and analytic-driven decision making. This is made possible with machine learning algorithms, which can be applied to all data for mining historical trends, real-time behaviors, predicted outcomes, and optimal responses. The cognitive algorithms can be deployed to operate in a self-automated way by integrating and leveraging information from the Internet of Things. Of course, not all applications should be automated to the point where a machine makes the decisions, but it is not unreasonable to allow a machine to mine and analyze your massive data collections autonomously for new, unpredicted, unforeseen, and influential discoveries.

Evaluating Cognitive Analytics

The ability to automatically take actions based on data insights is becoming an increasingly important aspect of today's modern business environment to gain the distinct competitive advantage. Here are a few things that IT influencers, analysts, vendors, and system integrators in the industry are saying about cognitive analytics and its future.

Driving Innovation

Cognitive analytics delivers innovation with people, process, and technology. The collection of participants includes computer programmers/developers, data scientists, and analysts who develop algorithms and know how to apply the analytics for the intelligence that can be consumed by the cognitive system. The cognitive system can then deliver analytical, data-driven information for innovation.

Transforming Our Businesses, Work, and Life

The presence of mobile devices and the expansion of the IoT are already changing the way we conduct business and how we work and live. Business processes can be infused with cognitive analytics to capitalize on the big data phenomenon, from internal and external sources. This offers customers a heightened awareness of workflows, context, and environment, which leads to continuous learning, better forecasting, and increased operational effectiveness. For example, I always have my mobile phone in my possession to remind me of appointments, join video conferences and meetings, check on the stock market, access the intranet for company updates, and so on. You may even use your mobile device or tablet for distributing critical alerts to colleagues and customers about a business opportunity or product advice. Cognitive analytics takes data points from mobile devices and the IoT to enrich algorithms and intelligent machines.

The results of cognitive applications include Alexa from Amazon, Siri from Apple, Watson from IBM, and Cortana from Microsoft. Before long, there will be a marketplace of millions of cognitive agents, driven in part by the explosive adoption of mobile devices, the IoT, and the upsurge of machine-to-machine interaction. Examples of such agents are personal virtual assistants that accompany people and help in many different facets of life. They will be a foundation interwoven into technology (e.g., social media) that touches our daily lives.

Uniting and Ingesting All Data

Cognitive analytics converges and unifies all types of data, human interactions and digital, for analysis. It ingests transaction history, web interactions, geospatial data, customer patterns from loyalty programs, and data from wearable technologies (such as Fitbit, smart watches, etc.) at a more granular level. These data points add tone, sentiment, emotional state, environmental conditions, and the person's relationships, details that provide additional value to what has been difficult or impossible to analyze. By continuously learning from the data collected, these engagements deliver greater value and become more relatable. Cognitive computing processes constantly run on fresh, new feeds from an ever-growing pool of disparate data sources, including media streams, IoT sensor data, and other nontraditional data sources. One type of data that is integral to cognitive analytics is open source data. In addition, online community data can provide critical data points for cognitive algorithms to perform with enhanced precision and agility. As data continue to grow with rich context, cognitive analytics becomes smarter and more intelligent in delivering answers to many questions.

Automating Data Analysis

Cognitive analytics enables machine learning to understand and adopt new theories and concepts by automating the detection and sensing of deep patterns and correlations. This automation capability, backed by powerful computing and advanced analytics, is fundamental to machine learning's value in a world where data keeps getting bigger and grows into higher volumes, more heterogeneous varieties, and faster velocities than ever.

Enhancing Governance

As cognitive analytics drives more business processes and automates more operational decisions, compliance and governance become a key initiative. Organizations are beginning to implement more comprehensive legal, regulatory, and policy frameworks to manage compliance, risks, and consequences, with traceability and auditability capabilities. We are seeing an increased demand for more coherent frameworks for cognitive data sharing, decision lineage tracking, and computer algorithm accountability.

In addition, governance must include safeguards for security, privacy, and intellectual property protection in cognitive systems.

Emerging Skillset for Data Scientists

According to a Harvard Business Review report, the job of data scientist has been labeled the sexiest job of the twenty-first century. Emerging cognitive application development creates a similarly high demand for data scientists with these types of skills. These professionals combine statistical modeling, computer programming skills, and domain knowledge in open source technology such as Hadoop, R, and Spark. As cognitive analytics matures, it will require industry subject matter experts to help translate the requirements and needs of the business.

Expanding the Need for Services

Every industry's and profession's knowledge is growing at a rate faster than anyone can keep up with. Open source technologies, new protocols, best practices, and new regulations and policies all demand the continuous improvement, adaptation, and augmentation of capabilities to deliver uses not previously possible. Some businesses that want to explore and enter cognitive computing and analytics may need to outsource expertise that they lack in-house in these areas. We discussed the services offered through the cloud, and the same services can be applied here for cognitive analytics. Later in this chapter, we will discuss “Anything as a Service” (XaaS), which can be used to outsource data scientists, computer programmers, industry consultants, and project administrators on demand to assist with the internal resources that may not be available at your company.

Personalizing Our Needs

Everyone likes personalization to their likings and needs. For example, when you are sending a marketing email, you often personalize it with a salutation, first name, last name, and a link for more information. Cognitive applications can take personalization to the next step with more natural interactions such as voice and visual recognition. In addition, our personal systems will interact with each other and develop a collective intelligence based on mutual communities and tastes. Analysts and IT influencers anticipate an increase in analyzing the geospatial and temporal context of everything we do and delivering responses based on those contexts.

Cognitive analytics has many possible use cases being explored in theory, and many of them will likely turn into real use cases.

Possible Use Cases

As I research and learn about cognitive analytics, I must admit I am fascinated with the possibilities and the innovative use cases that customers across industries can consider. I covered a healthcare use case under the security section. On the opportunity side, healthcare organizations can leverage cognitive analytics to uncover patterns and make inferences based on aggregated patient records, insurance claims, social media trends, travel patterns, and news feeds. In another example related to healthcare, a doctor or a nurse can leverage cognitive analytics to quickly scan through medical journals, clinician notes, patient history, and other documents to find highly relevant information to improve a diagnosis or treatment plan. Each person can possibly generate a petabyte of data in a lifetime. Cognitive analytics is designed to help industries such as healthcare keep pace and to serve as a companion for doctors and nurses in their professional performance. Because these systems master the language of specific professions (for example, the language of medicine, medical procedures, and medical practices), they reduce the time required for a professional or a new medical school graduate to become an expert.

In the oil industry, cognitive analytics can automate operations where there is a lot of streaming data arriving at very fast velocity from remotely monitored oil fields. This approach can cut operation and maintenance costs, freeing the company to focus on more strategic tasks. In the digital oil field, a single system captures data from well-head flow monitors, seismic sensors, and satellite telemetry systems, which are part of the Internet of Things. The data are transmitted, streamed, and relayed to a real-time operations center that monitors and analyzes them to detect anomalies. As the analyses occur, the system can automatically adjust operating parameters to tame the anomalies, predict downtime, and act on that information to optimize production and minimize downtime. Furthermore, the feedback loop mechanism can train the system to know what to do in the future when a similar incident occurs.

In the manufacturing sector, innovative companies conduct complex analyses to determine how much customers are willing to pay for certain features and to understand which features are most important for success in the market. For example, one manufacturer used customer insights data gathered through sensors and detectors, which are also part of the IoT, to eliminate unnecessary, costly features. Cognitive analytics can determine which features have higher value to the customer and which customers are willing to pay a premium for them.

Finally, any retail business with a call center can also leverage cognitive analytics. A customer representative can quickly respond to a customer's inquiry about baby gear and accessories by using a cognitive system that pulls information from product descriptions, customer reviews, sales histories, community blogs, and parenting magazines. Like myself, anyone who has a new baby has many questions, concerns, and inquiries about product safety, popularity, and adoption so that you have the best items for your loved ones. Other examples using cognitive analytics include property management where smart buildings are constructed with sensors and detectors via IoT to conserve energy consumption, enhance security for property owners, and perform preventative maintenance on a building or complex.

It is an exciting field. With advancements in technology such as analytics, data management, machine learning, and natural language processing, it is with utmost enthusiasm that I closely monitor and watch the progress of cognitive analytics.

Expectations and Looking Ahead

Cognitive analytics is still in the early stages of maturity and mainstream adoption, and it is by no means a replacement for traditional information and analytics programs. Traditional analytics and data management systems are based on rules that shepherd data through a series of predetermined processes to arrive at outcomes. While they are powerful and mature, they thrive on structured data but are incapable of processing qualitative or unpredictable input. This inflexibility limits their usefulness in addressing many aspects of a complex, emergent world, where ambiguity and uncertainty abound. Cognitive systems augment traditional information and analytics programs.

Rather than ignore unwieldy, diverse data formats, organizations can use cognitive analytics to quickly exploit traditional and semi-structured data—text documents, images, emails, social posts, and more—for useful insights. Cognitive systems are probabilistic, meaning they are designed to adapt and make sense of the complexity and unpredictability of semi-structured data. They can “read” text, “see” images, and “hear” natural speech, which is expected in the modern world. They interpret that information, organize it, and offer explanations of what it means, along with the rationale for their conclusions. They do not offer definitive answers but information that can lead to the right answers. They can be designed to weigh information and ideas from multiple sources, to reason, and then offer hypotheses for consideration. A cognitive system assigns a confidence level to each potential insight or answer. For businesses that need to find real-time answers hidden within massive amounts of diverse data sources, getting a head start on building cognitive analytics capabilities could be a strategic and smart move.

Cognitive analytics will provide additional personalized services to you and me, the consumers of information technology. Based on the data, the results of cognitive analytics reveal our preferences and historical patterns, since we are creatures of habit. As humans, we can be inconsistent and unpredictable in our reasoning and decision making. Cognitive analytics can improve the quality and consistency of business and personal decisions by tracing how decisions are made and measuring the resulting outcomes, allowing leading and best practices to be shared across the organization and in our personal lives. Finally, it can enhance knowledge sharing, providing fast, on-demand access to answers to highly relevant and important questions. Analytics is about asking, and answering, smarter questions to get higher-quality results at a lower cost. These questions are often about driving more value in your organization: data-driven information that leads to analytical-driven decisions.

ANYTHING AS A SERVICE (XAAS)

We touched on big data and its presence in the marketplace, and the growing popularity of the cloud has brought with it a collection of service offerings (SaaS, IaaS, PaaS). Beyond the cloud, Anything as a Service, known as XaaS, is in particularly high demand to support end-to-end data management and analytics projects. The demand for XaaS plays a critical part in the growing market of big data. According to Accenture, in a study with HfS Research, 53 percent of senior vice presidents and above see XaaS as critical or absolutely critical for their organization. In another report, from International Data Corporation (IDC), the big data technology and services market represents a multibillion-dollar opportunity in excess of US$48 billion in 2019. With this in perspective, service providers of XaaS are incented to offer software, infrastructure, and platform services in a bundled solution or a single package. Customers can take advantage of XaaS to replace traditional services that were provided by one or more IT departments or companies.

With a XaaS solution, the customer receives many services from the same provider for one convenient cost, usually on a subscription basis and often without having to spend any capital investment. Not only is this method far easier for the customer to keep track of, but it also provides one-stop shopping. If problems arise with any one service, the customer needs only to contact the company that provides them all instead of having to deal with individual providers.

Companies that take advantage of the XaaS business model will discover that the services are convenient and largely plug-and-play. This approach is especially beneficial for start-up companies that want to begin immediately without needing to spend time procuring services, managing multiple contracts, and moving through the purchasing process. No longer do start-ups have to deal with high, prohibitive costs to begin their businesses. XaaS allows for far lower start-up costs, as well as the ability to develop and test new business models at a faster rate. With XaaS, businesses can concentrate on the value that comes from helping their customers rather than on acquiring infrastructure and capital. The managed-service nature of XaaS keeps customers up to date with the very latest technologies and product developments.

In addition, companies can scale up or down depending on their needs at any given moment, which is another important flexibility driving XaaS adoption. XaaS providers bring an ongoing relationship between customer and supplier, with constant communication, status updates, and real-time exchange of information. This benefit can save a business weeks or months.

Another benefit is cutting out the middleman. The XaaS model is changing everything in that it is taking over both applications and service delivery channels, and thus cutting out the traditional middleman. With the Internet and mobility becoming the new norm and the standard way of doing things, people can access the services and applications anytime and anywhere.

XaaS can help to accelerate time to market and innovation. No customer likes deploying something and then discovering that a new or better version of the software or hardware has come along a few months later, leaving them feeling already behind the technology curve. With the XaaS approach, innovation can occur in near real time, where customer feedback can be gathered and acted on immediately. Organizations (and their customers) are able to stay at the cutting edge with state-of-the-art data management and analytics technologies with minimal effort. This is an area where XaaS distinguishes itself from the traditional IT approach and from practitioners who still believe that it is better to build and develop things themselves. There are times when building it in-house makes sense for larger enterprises, but they may end up spending a lot more money to be locked into something that could soon be out of date. Open source and integration environments that encourage application development are thriving. Through this kind of service, leaders and innovators can be pioneers in their respective markets.

While the benefits and reduced risks of the XaaS model are clear and tangible, it requires users to have access to a network with Internet connectivity. The network backbone is what powers the XaaS value proposition forward. These services all rely on a robust network to provide the reliability that the services need and that end users expect and deserve. As companies make the shift to the XaaS paradigm, they must always think about their networks. If reliable, stable, high-speed connectivity is not available, then the user experience declines and the service proposition weakens. Another risk lies in hiring a XaaS provider without the right skillsets and expertise. A report published by McKinsey, titled “Big Data: The Next Frontier for Innovation, Competition, and Productivity,” cautioned on the challenges companies could face, such as a shortage of well-trained analysts who can efficiently analyze all the information generated by big data. The report cautioned that the United States alone faces a shortage of 140,000 to 190,000 people with deep analytical skills, as well as 1.5 million managers and analysts able to analyze big data and make decisions based on their findings.

In today's world of convenience, XaaS can alleviate the guesswork around time and resources for data management and analytics projects. Not only do organizations need to put the right talent and technology in place, but they also need structured processes and workflows to optimize the use of big data. Access to data is critical: companies will increasingly need to integrate information from multiple data sources, often from third parties, and bring them into a common architecture to enable the value of data and analytics. XaaS can provide the people, process, and technology in many areas. Some of the common XaaS offerings are described as follows.

CaaS

The C in CaaS stands for “Communications” as a Service. You can outsource all your communication needs to a single vendor, including voice over Internet protocol (VoIP), instant messaging, collaboration, and video conferencing, among others related to communications. In this case, the provider is responsible for all the hardware and software management. Service providers usually charge on an on-demand basis, so you pay only for what you need. Like other services, it is a flexible model and can grow as your needs for communication expand.

For example, businesses have designed specific video conferencing products in which users can sign in via the Internet and participate as necessary. Vendors can then bill the business according to its participation. The convenience and utility of CaaS and similar services are rapidly expanding in the business world. It is part of a greater trend toward cloud computing services and other remote services used by businesses to reduce overhead or optimize business processes.

DBaaS

The DB in DBaaS stands for “Database” as a Service. It provides users with some form of access to a database, most likely via the Internet, without the need for setting up physical hardware, installing software, or configuring for performance. The service provider manages all of the administrative tasks and maintenance related to the database so that all the users or application owners need to do is use the database. Of course, if the customer opts for more control over the database, this option is available and may vary depending on the provider.

In essence, DBaaS is a managed service offering access to a database to be used with applications and their related data. This is a more structured approach compared to storage-as-a-service, and at its core it's really a software offering. In this model, payment may be charged according to the capacity used as well as the features and use of the database administration tools.

DRaaS

The DR in DRaaS stands for “Disaster Recovery” as a Service. It is a backup service model that provides resources to protect a company's applications and data from disruptions caused by disasters. This service can also be offered on-premises. It gives an organization a complete system backup that allows for business continuity in the event of system failure. Figure 6.8 shows the types of possible disasters that can interrupt your business operations. Human error accounts for 60 percent of disaster recovery incidents, followed by unexpected updates and patches at 56 percent and server room issues at 44 percent.


Figure 6.8 Types of disaster incidents

After the DRaaS provider develops and implements a disaster recovery plan that meets your needs, the provider can help you to test and manage the disaster recovery procedures to make sure they are effective. Should disaster strike, the DRaaS provider also performs recovery services. DRaaS enables the full replication of all cloud data and applications and can also serve as a secondary infrastructure. While the primary undergoes restoration, the secondary infrastructure becomes the new environment and allows an organization's users to continue with their daily business processes.

MaaS

The M in MaaS stands for “Monitoring” as a Service. It is a framework that facilitates the deployment of monitoring functionality for various other services and applications. The most common application for MaaS, and its most widely used feature, is online state monitoring, which continuously tracks certain states of applications, networks, systems, instances, or any element that may be deployable within the cloud. State monitoring is the overall monitoring of a component in relation to a set metric or standard: a certain aspect of a component is constantly evaluated, and results are usually displayed in real time or periodically updated in a report. For example, the number of timeout requests measured over a period of time might be evaluated to see whether it deviates from what is considered an acceptable value, and administrators can then take action to rectify faults or even respond in real time. State monitoring is very powerful because notifications now come in almost every form, from emails and text messages to various social media alerts like a tweet or a status update on Facebook.
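
A minimal Python sketch of that timeout check is below. The service name, acceptable value, and notification text are hypothetical; a MaaS provider would run checks like this continuously and fan the notification out to email, SMS, or social channels.

  # A minimal sketch (hypothetical threshold): evaluate timeout requests over a period
  # and raise a notification when the count deviates from the acceptable value.
  ACCEPTABLE_TIMEOUTS_PER_MINUTE = 20

  def check_state(service, timeouts_last_minute):
      if timeouts_last_minute > ACCEPTABLE_TIMEOUTS_PER_MINUTE:
          return (f"NOTIFY: {service} had {timeouts_last_minute} timeouts in the last "
                  f"minute (acceptable value is {ACCEPTABLE_TIMEOUTS_PER_MINUTE})")
      return f"OK: {service} is within the acceptable value"

  print(check_state("checkout-api", 37))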

AaaS

There is one service that I highly recommend customers consider before embarking on any project that involves in-database, in-memory, Hadoop, and big data analytics. It is Assessment as a Service, which consultants from the service providers can deliver on site by evaluating your data management and analytics processes. Consultants who conduct the assessment will meet with your IT and business departments to analyze your data warehousing infrastructure and assess the analytics practice. This assessment can range from two to five days. The objectives of this assessment are to review:

  • Business requirements, time frames, and critical success factors
  • Current and planned interoperability between data management and analytics environment, including areas of concern
  • Operational data sources to support business priorities
  • Analytics and business intelligence priorities, strategy, process, and gaps
  • Technologies that are being used for data management and analytics
  • Best practices to optimize the analytics and data management ecosystems
  • Training gaps and opportunities for improvement in software, hardware, and processes

Before the assessment starts, there is some prework for the customer to provide information to the consultants. The consultants have a list of questions for the IT and business departments on efficiency, productivity, precision, accuracy, empowerment, compliance, timeliness, and cost reduction. Each response provides a metric to analyze the company's current environment and also to determine the value within the IT and business departments. It is a well-balanced effort between the customer and the service provider.

During the assessment, the consultants will meet many people from your organization, who can include database administrators, security administrators, enterprise architects, business process modelers, data modelers, data scientists, statisticians, business analysts, and end users. Depending on the number of days for the assessment, each customer will have a tailored agenda. For example, if the customer commits to a three-day assessment, which is the most typical length of time, a sample agenda would be

  • Day 1—consultants will meet with IT
  • Day 2—consultants will meet with business
  • Day 3—consultants will meet with IT and business, share results from analysis, and provide guidance

At the end of the assessment period, the customer will have a technology roadmap document outlining short-, medium-, and long-term actions and strategies for adopting and implementing technologies such as in-database, in-memory, and/or Hadoop to their current architecture. Many customers have conducted the assessment and have found it invaluable to drive innovation, change, and improvement in their data business and IT environments.

Customers have stated that the consultants who conduct the assessment are like marriage counselors between the IT and business departments. They close the gap as an independent voice and provide guidance from an external perspective on issues that many internal staff would have overlooked or not even considered. Their analysis brings fresh, new insights and approaches to IT and business from an agnostic angle. These consultants also bring many best practices from industry-specific applications to help integrate and solve complex analytics and data management issues. Customers often ask these consultants to return after the assessment to conduct hands-on training and even to conduct another assessment exercise in another department.

Future of XaaS

As the Internet of Things continues to evolve, every business can become a technology company to some extent. Innovative businesses will seek to disrupt their own industries with new and exciting technology products, delivered as a service. XaaS makes it possible for companies outside the information technology industry to deliver these exciting new solutions. With XaaS, businesses partner with specialized firms to develop functional areas, and even conduct training, that fall outside their primary expertise and focus. Businesses are able to develop new services and products more quickly and bring them to market before their competitors. The “Anything as a Service” approach is really at the center of so much potential business transformation, and it is anticipated that it will become a strategic initiative in its own right. It is creating a whole new paradigm for customers and service providers alike.

Organizations can achieve immediate total cost of ownership benefits by outsourcing these services to a qualified and skillful vendor, compared to traditional, on-premises solutions. Overall, businesses are considering and beginning to adopt the XaaS model because it transforms total cost of ownership from a concern into something that is more controllable and attainable. Traditionally, IT initiatives such as data warehousing, business analytics, or business intelligence projects were known for suffering from delays and possible overruns. Companies did not know what they would get at the end of a project, which took longer than intended and which was, of course, over budget. The XaaS approach mitigates these risks. While there may be a concern about having less control over the whole project or scope of the initiative, businesses have come to realize that the benefits outweigh any concerns.

CONCLUSION

In a global economy and complex society where value increasingly comes from data, information, knowledge, and services, it is essential but challenging to make sense of data for data-driven decisions. And until now, we have not had the means to analyze data effectively throughout its life cycle: data exploration, data preparation, model development, and model deployment.

In-database processing delivers on the promise of analyzing the data where they reside, in the database and enterprise data warehouse. It is the process of moving the complex analytical calculations into the database engine and utilizing the resources of the database management system. Data preparation and analytics can be applied directly to the data throughout the data analytical life cycle. Benefits include eliminating data duplication and movement, thus streamlining the decision process to gain efficiencies, reducing processing time from hours to minutes, and ultimately getting faster results through a scalable, high-performance platform. In-memory analytics is another innovative approach to tackling big data, using an in-memory analytics engine to deliver super-fast responses to complicated analytical problems. In-memory analytics is ideal for the data exploration and model development processes. Data are lifted into memory for analysis and flushed when the process is completed. Specific in-memory algorithms and applications are designed to be massively threaded to process a high volume of models on large data sets. The two technologies are complementary in nature, and not every function can be enabled in-database or in-memory.

Hadoop is an emerging technology to manage your traditional data sources as well as new types of data in the semi-structured landscape. Hadoop is an open source technology to store and process massive volumes of data quickly in a distributed environment. It is becoming a prominent element in the modern architecture of big data for its benefits, including flexibility with data structures and a lower cost of investment. Many misconceptions around Hadoop have created false expectations of the technology and its implementation. However, Hadoop offers a platform to support new data sources for data management and analytic processing.

Integrating in-database, in-memory, and Hadoop delivers a collaborative and harmonious data architecture for customers with structured and semi-structured data. Customers in various industries implement many variations to take advantage of each technology for their business requirements. From public to private sectors, organizations are leveraging new technologies to better manage the data, innovate with analytics, and create/maintain competitive advantage with data-driven decisions from analytical-driven information. The collaborative architecture integrates data management and analytics into a cohesive environment to improve performance, economics, and governance. It allows you to now crawl, walk, sprint, and run (in a relay) toward success. “The whole is greater than the sum of its parts,” as stated by the Greek philosopher Aristotle.

If there is one thing that I highly suggest, it is to review the customer successes and use cases. Not only do they provide information that many of you can relate to, but they also provide some best practices to consider for any and all of these technologies (in-database, in-memory, Hadoop). These use cases are the ultimate proof that the technologies are effective, add strategic value to organizations, and provide data-driven analytical insights for innovation. Whether you are an executive, line-of-business manager, business analyst, developer/programmer, data scientist, or IT professional, these use cases can enlighten the way you do things and help you explore the many options that may be applicable to your needs. Customer successes and use cases are the most popular requests when it comes to introducing new concepts and technologies at conferences and any speaking engagement. Even when I talk to internal sales folks, they always ask for use cases to share with their prospects and customers, showing how other clients are using these technologies and what tangible and intangible benefits they have achieved.

We are barely scratching the surface when it comes to analyzing data. The future of data management and analytics is exciting and rapidly evolving. Customers are looking forward to refocusing their efforts on some existing initiatives as well as embracing new ones. Some customers may already have a security application in place, but with newer sources of threats it may be wise to upgrade or explore an enhanced solution to prevent fraud and cyber-attacks. For others, new focus areas are cloud computing and services. The two are complementary if you want to consider remote data centers, virtual applications, and outsourcing services to fill in the gaps that your organization may have. Finally, prescriptive and cognitive analytics are two focus areas that apply automation and machine learning.

I am personally excited for the maturation of prescriptive and cognitive analytics as the Internet of Things evolves. These two technology advancements are complex in nature, but they also provide the most value and are the most captivating. Ultimately, businesses will possess foresight into an increasingly volatile and complex future with prescriptive and cognitive analytics. Such insight and foresight are important to business leaders who want to innovate in their respective industries, whether on complex financial modeling, drug development, new scientific discovery to help cure disease, or launching a new product or start-up company. Prescriptive and cognitive analytics can reveal hidden and complex patterns in data, uncover opportunities, and prescribe actionable hypotheses that would be nearly impossible to discover using traditional approaches or human intelligence alone. Both require an underlying architecture that is flexible and scalable regardless of the industry you may be in. This architecture must tie people, process, and technology together to support a diverse set of data management and analytics applications from an ecosystem of data scientists, programmers, and developers. Specifically for cognitive analytics, it must encompass machine learning, reasoning, natural language processing, speech and vision, human-computer interaction, dialog and narrative generation, and more. Many of these capabilities require specialized infrastructure that leverages high-performance computing, specialized hardware platforms, and particular resources with specific technical expertise. People, process, and technology must be developed in concert, with hardware, software, and applications that are constructed expressly to work together in support of your initiative and your organization's livelihood. For me, the journey ends for now, and I look forward to our next adventure: helping you improve performance, economics, and governance in data management and analytics.
