In data mining, anomaly detection (or outlier detection) is defined as the identification of items, events, or observations that do not conform to an expected pattern (or other items) in a dataset, and that are sometimes referred to as rare events. These events raise suspicion and, typically, the anomalous items will translate to some kind of problem that requires deeper attention and needs to be addressed. Common events include bank fraud, structural defects, medical conditions, or simply mistakes in a text.
Anomaly detection is a technique or method that is used to identify unusual patterns that don't seem to conform to the accepted behavior. You will routinely see anomaly detection techniques used in many areas such as intrusion detection, system-health monitoring, and fraud detection.
No matter what the application might be, it is critical to first establish and understand certain baselines and boundaries that will define what an anomaly is (for that application).
Anomalies are generally categorized as point (which is when a single data point is too different from then most others), contextual (contextual anomalies are only a problem in specific situations/contexts), or collective (when data as part of a set becomes an issue anomalies).