Chapter 9 – Anomaly Detection in Banking With AI

Chapter 9 considered anomaly detection from an IBM Watson Analytics perspective, walking through an example project from the banking industry in which transactions were evaluated to identify potentially fraudulent activity.

Key takeaways:

  • In data mining, anomaly detection is defined as the identification of items, events, or observations that do not conform to an expected pattern in a dataset. Such items are sometimes referred to as rare events. They raise suspicion and typically point to some kind of problem that requires deeper attention.
  • Anomaly detection is a technique used to identify unusual patterns that do not conform to anticipated behavior. You will routinely see anomaly detection techniques used in areas such as intrusion detection, system health monitoring, and fraud detection.
  • Anomalies are generally categorized as point (a single data point is too different from all others), contextual (the data is anomalous only in a specific situation or context), or collective (a group of data points is a problem when taken together as a set).
  • Anomaly detection has a wide range of use cases in the banking industry, the most prominent being fraud detection. Fraud is typically categorized as corruption, cash, billing, check tampering, skimming, larceny, and financial statement deception.
  • As with all projects, the work starts with a review of the data. Using some knowledge of how the various types of banking fraud are defined, we review a file of transactions, looking for those that fall outside what is understood to be normal.
  • Even with a tool such as MS Excel, it can be seen that the data contains many types of bank transactions and that each transaction is assigned a Business Transaction Code (BTC) that identifies its purpose.
  • The project protocol is again to obtain the data, perform an initial review, and then load the file into Watson Analytics so that we can proceed to the next step in the project, which is usually to begin using Watson Analytics Explore.
  • As we've seen in earlier chapters of this book, with Explore you can use language and keywords that you feel are most correlated to the objective(s) of the project to create questions that then help explore and visualize the data in your project.
  • IBM Watson Analytics uses worded questions, not programming code or queries, to generate starting points that you can read through and then use to create visualizations that meet your project's requirements. With Watson Analytics, you develop questions rather than write structured query syntax.
  • The chapter's project constructed meaningful questions based on the project's objectives and compared them to similarly focused structured queries that you might run on a relational database, which yield results in table format. Another approach considered was sorting and filtering within MS Excel. Even with the power of MS Excel, the data required more analysis: after adding filters on certain columns, there were still no clear insights.
  • Conversely, Watson Analytics automatically created visualizations based on the originally constructed questions. These visualizations made it easier to see larger variances in the data.
  • Adding filters to visualizations makes it easy to see suspicious data events, such as amounts during the period that are much larger than the normal amount for a particular Bank ID.
  • Reviewing the details of a visualization can confirm that the data points highlighted earlier are in fact outliers relative to previously established baselines.
  • Using different visualization types, you can better reveal outliers and other red flags or audit points in the data. Additionally, you can use the Visualization Content section to select or deselect specific data points of interest.
  • Using a Grid type of visualization, we can see data in a table or row and column format, which sometimes makes it easier to see data as you scroll through it in various ways.
  • You can always explore a visualization further by clicking on the Visualization Content section and adding or removing any data column or row.
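
The idea of a point anomaly measured against a per-Bank-ID baseline can also be sketched in a few lines of code. The following is only an illustrative sketch in Python, not how Watson Analytics works internally: the field names (bank_id, amount), the sample values, and the choice of the modified z-score (median and median absolute deviation) with a cutoff of 3.5 are all assumptions made for this example.

```python
from collections import defaultdict
from statistics import median


def flag_point_anomalies(transactions, threshold=3.5):
    """Return transactions whose amount is far from the median for their bank.

    Assumption: uses the modified z-score (median and median absolute
    deviation, MAD) as the outlier rule; 3.5 is a commonly used cutoff.
    """
    # Group amounts by bank so each bank gets its own baseline.
    amounts_by_bank = defaultdict(list)
    for tx in transactions:
        amounts_by_bank[tx["bank_id"]].append(tx["amount"])

    # Baseline per bank: (median amount, median absolute deviation).
    baselines = {}
    for bank_id, amounts in amounts_by_bank.items():
        med = median(amounts)
        mad = median([abs(a - med) for a in amounts])
        baselines[bank_id] = (med, mad)

    flagged = []
    for tx in transactions:
        med, mad = baselines[tx["bank_id"]]
        if mad == 0:
            continue  # no spread in this bank's amounts, so nothing to compare
        score = 0.6745 * abs(tx["amount"] - med) / mad
        if score > threshold:
            flagged.append(tx)
    return flagged


# Hypothetical sample data; bank IDs and amounts are invented for illustration.
transactions = [
    {"bank_id": "B001", "amount": 120.0},
    {"bank_id": "B001", "amount": 95.0},
    {"bank_id": "B001", "amount": 110.0},
    {"bank_id": "B001", "amount": 105.0},
    {"bank_id": "B001", "amount": 9800.0},  # much larger than normal for B001
    {"bank_id": "B002", "amount": 5000.0},
    {"bank_id": "B002", "amount": 5200.0},
    {"bank_id": "B002", "amount": 4900.0},
    {"bank_id": "B002", "amount": 5100.0},
]

flagged = flag_point_anomalies(transactions)
for tx in flagged:
    print(tx["bank_id"], tx["amount"])
```

Because the baseline is computed per bank, an amount around 5,000 is treated as normal for the hypothetical B002 while 9,800 stands out for B001. The median-based rule is used here because a single very large amount would distort a mean-based baseline.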