
Anomaly Detection in Spotfire

Overview, techniques and use cases



    This article provides an overview of anomaly detection capabilities in Spotfire, including techniques, use cases, and more.

    Overview

    What are Anomalies?

    Anomaly detection is a way of detecting abnormal behavior. One definition of anomalies is data points that do not conform to an expected pattern when compared to the other items in the data set; they can be thought of as coming from a different distribution than the rest of the data. Anomalies in data translate to significant (and often critical) actionable information in a wide variety of application domains. The autoencoder technique described in this article first uses machine learning models to specify expected behavior and then monitors new data to match and highlight unexpected behavior. The figure below shows a simple example of anomalies (o1, o2, o3) in a 2D dataset:

    [Figure: example anomalies in a 2D dataset]

    Anomalies are similar, but not identical, to outliers. Outliers are points with a low probability of occurrence within a given data set. They are observation points that are distant from other observations. However, they don't necessarily represent abnormal behavior. Outliers in data warrant attention because they can distort predictions and affect model accuracy if you don't detect and handle them. For more information on detecting outliers in Spotfire, see this article: Top 11 methods for Outlier Detection.

    For more information on anomaly detection, check out the resources section at the end of this article.

    Root Cause Analysis

    Anomaly detection is closely related to root cause analysis, a critical step across several industries to identify the core causes of issues and prevent them from occurring again. While anomalies are deviations from normal behavior, a recurring pattern of anomalies can point to a deeper issue or a fault in a system that would need to be addressed via root cause analysis. By not only detecting anomalies early, but also understanding potential patterns behind anomalies early, organizations can reduce the amount of damage and disruption caused by an underlying failure.

    Anomalies can essentially serve as warnings of a potential problem. One method of bridging anomaly detection to root cause analysis is through a clustering analysis of anomalies. Clustering can aid in categorizing anomalies that are similar to each other, which makes it easier to identify patterns and other commonalities among them. When a group of anomalies has one or several factors in common, this can point to a common cause and potential reason for failures or defects. Clustering analysis also allows anomalies to be treated as indicators of systemic issues, rather than each anomaly needing to be viewed and analyzed as an isolated event. In turn, this saves organizations time and energy in understanding the underlying causes of those anomalies.

    When performing clustering analysis on anomalies, a user can set a threshold for how many clustered and/or consecutive anomalies warrant further investigation. Let's say this threshold is set to 5; once 5 consecutive anomalies occur or 5 anomalies are clustered together, the group is called an incident. Once incidents are identified, they can be saved into a separate data structure suitable for root cause analysis. In this case study, we describe this process in more detail.
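
    The consecutive-anomaly part of this rule can be sketched in a few lines of Python. This is only an illustration, not the logic used in the Spotfire template: the function name find_incidents, the flags list, and the default threshold of 5 are assumptions made for the example.

```python
from typing import List, Tuple


def find_incidents(flags: List[bool], threshold: int = 5) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs for runs of at least `threshold` consecutive anomalies."""
    incidents = []
    run_start = None
    for i, is_anomaly in enumerate(flags):
        if is_anomaly and run_start is None:
            run_start = i  # a new run of anomalies begins here
        elif not is_anomaly and run_start is not None:
            if i - run_start >= threshold:  # the run was long enough to count
                incidents.append((run_start, i - 1))
            run_start = None
    # Close a run that extends to the end of the series.
    if run_start is not None and len(flags) - run_start >= threshold:
        incidents.append((run_start, len(flags) - 1))
    return incidents


# Hypothetical anomaly flags, one per observation in time order.
flags = [False, True, True, False] + [True] * 6 + [False, False] + [True] * 5
incidents = find_incidents(flags, threshold=5)
print(incidents)
```

    Each returned pair marks the first and last observation of an incident, so the matching rows can be copied into a separate table for root cause analysis. The short run of two anomalies at the start is ignored because it stays below the threshold.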

    In the Cyber Threat Detection use case and demo, this workflow of performing anomaly detection and root cause analysis is demonstrated on computer log data to prevent cybersecurity attacks and identify security risks.

    Industry Use Cases

    Here are a few examples from our practice:

    Case Study: Fault Detection and Classification

    Fault detection and classification is the process of monitoring data in real time to identify anomalies that could cause issues. A common dataset for testing fault classification algorithms is the CWRU Bearing Dataset, which contains vibration signal data from normal and anomalous bearings. It has been widely studied; see the references here, for example: Lite and Efficient Deep Learning Model for Bearing Fault Diagnosis Using the CWRU Dataset - PMC - Yoo et al.

    Using the Spotfire anomaly detection template, we have repurposed this dataset, which is typically used for classification of anomalies, to perform unsupervised anomaly detection and clustering.

    To read about this case study in depth, check out this article.


    Identifying Abnormal Product

    Many manufactured products undergo some form of testing to determine suitability for use.  Univariate and linear multivariate Statistical Process Control methods can be used to detect anomalous products based on this data. However, with increasing component and system complexity, multivariate anomalies that also involve significant interactions and nonlinearities may be missed by these more traditional methods. These anomalies can be implicated in reliability and system failures. AI-based algorithms, such as autoencoders, can often be used to identify these complex anomalies. Once the anomalies are detected, their fingerprints can be generated so they can be classified and clustered, enabling investigation of the causes of the clusters. As new data streams in, it can be scored in real-time to identify new anomalies, assign them to clusters, and respond to mitigate potential problems.

    Preventing Machine Breakdowns with Connected Sensor Data

    Many different types of equipment, vehicles, and machines are now instrumented with sensors. Monitoring these sensor outputs can be crucial to detecting and preventing breakdowns and disruptions. Unsupervised learning algorithms like autoencoders can be used to detect anomalous data signatures that may predict impending problems. When sensor time series traces exhibit repeating patterns, special techniques can be used, such as MASS (download this document for more information) or the one used in the Sensor Anomaly Detection at the Edge solution on this page (shown in the image below).

    [Figure: time series anomaly detection on sensor data]

    Defects and Abnormalities in Images

    Connected digital cameras today capture large amounts of raw image data. People are very good at rapidly identifying abnormalities in images. However, it is expensive and time-consuming for humans to extract critical information from large numbers of images; they often remain unprocessed. AI algorithms are increasingly used to automate this process. These use cases often involve some combination of unsupervised learning (where similar images are clustered together), human verification that images contain abnormalities, and supervised learning, to train models that automate the identification of abnormalities of interest. Examples include the identification of cancer cells and manufacturing defects in images. An example of how this is done with semiconductor wafermap spatial test and fail patterns can be found here.  

    Cyber Threat Detection

    Networked computers today are under constant threat of ransomware and other forms of cyber-attack. System threats can be detected through analysis of computer log data, using unsupervised learning models such as LSTM autoencoders for anomaly detection. LSTM autoencoders identify anomalies in the sequence of log events.

    For more information on this use case, the demo is available here.

    [Figure: cyber threat detection, root cause analysis]

    Baseball

    Baseball is one of the oldest sports in the United States, with a history dating back to the 19th century. Since 1880, there have been 101 different teams that have played a grand total of 2,829 different seasons. By looking at the data, we wanted to statistically uncover which of these 2,829 seasons were anomalies, and which teams had seasons unlike any other. To accomplish this, we utilized a method called SAX (Symbolic Aggregate Approximation) encoding. SAX acts as a dimensionality reduction tool, tolerates time series of different lengths, and makes trends easier to find. For details, see this blog: Using Time Series Encodings to Discover Baseball History's Most Interesting Seasons.

    [Figure: baseball seasons analysis]
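
    To make the idea of SAX encoding concrete, here is a minimal sketch using only the Python standard library. It is not the code behind the baseball analysis; the function name sax_encode, the four-letter alphabet, and the choice of four segments are assumptions made for illustration. The series is z-normalized, averaged over equal-width segments (piecewise aggregate approximation), and each segment mean is mapped to a letter using breakpoints that split the standard normal distribution into equally likely regions.

```python
from statistics import NormalDist, mean, pstdev
from typing import List, Sequence


def sax_encode(series: Sequence[float], n_segments: int = 4, alphabet: str = "abcd") -> str:
    """Encode a numeric series as a short SAX word."""
    values = list(series)
    if len(values) < n_segments:
        raise ValueError("series must have at least n_segments points")

    # 1. Z-normalize so that series on different scales become comparable.
    mu, sigma = mean(values), pstdev(values)
    z = [(x - mu) / sigma if sigma > 0 else 0.0 for x in values]

    # 2. Piecewise aggregate approximation: one mean per segment.
    n = len(z)
    paa: List[float] = []
    for k in range(n_segments):
        lo = k * n // n_segments
        hi = (k + 1) * n // n_segments
        paa.append(mean(z[lo:hi]))

    # 3. Breakpoints cut the standard normal into equally probable regions.
    a = len(alphabet)
    breakpoints = [NormalDist().inv_cdf(j / a) for j in range(1, a)]

    # 4. Each segment mean becomes the letter of the region it falls in.
    return "".join(alphabet[sum(v > b for b in breakpoints)] for v in paa)


rising_short = sax_encode([1, 2, 3, 4, 5, 6, 7, 8])
rising_long = sax_encode(list(range(1, 17)))
falling = sax_encode([8, 8, 8, 6, 6, 6, 4, 4, 4, 2, 2, 2])
print(rising_short, rising_long, falling)
```

    The two rising series have different lengths but produce the same word, which is what makes the encoding convenient for comparing seasons of unequal length. Rare words, or words far from all others, are then candidates for anomalous seasons.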

    Listening for Abnormalities in the Sounds Machines Make

    A good mechanic can tell whether your car is OK - or not - by listening to the sounds it makes.  A really good one can tell you what is wrong with it. 

    Abnormal sounds can be an indicator that a machine needs maintenance. The video below shows an example of an application that uses audio data from any device and learns to identify anomalous sounds made by machines. Datasets of known abnormalities can then be created and the models can be deployed for real-time scoring.  

       

    Bank Stress Test

    Economic and performance data can be used for "stress testing" the capital reserves of bank holding companies to identify data anomalies.  Details of one implementation can be found here:  

    Data Quality Management and Anomaly Detection - A Bank Stress Test Use Case

    Fighting Financial Crime

    In the financial world, trillions of dollars worth of transactions happen every day. Identifying suspicious ones in real time can provide organizations with the necessary competitive edge in the market. Over the last few years, leading financial companies have increasingly adopted big data analytics to identify abnormal transactions, clients, suppliers, or other players. Machine learning models are used extensively to make predictions that are more accurate. Learn about and download the Risk Management Accelerator.

    [Figure: live transactions view for fraud detection]

    Healthcare Claims Fraud

    Insurance fraud is a common occurrence in the healthcare industry. It is vital for insurance companies to identify claims that are fraudulent and ensure that no payout is made for those claims. An economist recently published an article that estimated $98 billion as the cost of insurance fraud and the expenses involved in fighting it. This amount would account for around 10% of annual Medicare & Medicaid spending. In the past few years, many companies have invested heavily in big data analytics to build supervised, unsupervised, and semi-supervised models to predict insurance fraud.

    Techniques for Anomaly Detection

    Companies around the world have used many different techniques to fight fraud in their markets. While the list below is not comprehensive, these three anomaly detection techniques have been popular.

    Visual Discovery

    Anomaly detection can be accomplished through visual discovery. In this process, a team of data analysts, business analysts, and others builds bar charts, scatter plots, and similar visualizations to find unexpected behavior in their business. This technique often requires prior business knowledge of the industry and a lot of creative thinking to choose the right visualizations to find the answers.

    Supervised Learning

    Supervised learning is an improvement over visual discovery. In this technique, persons with business knowledge in a particular industry label a set of data points as normal or anomalous. An analyst then uses this labeled data to build machine learning models that will be able to predict anomalies on unlabeled new data.

    Unsupervised Learning

    Another technique that is very effective is unsupervised learning. In this technique, unlabeled data is used to build unsupervised machine learning models. These models are then used to score new data. Since the model is tailored to fit normal data, the small number of data points that are anomalies stand out. Some examples of unsupervised learning algorithms are:

    Autoencoders

    Unsupervised neural networks, or autoencoders, are trained to replicate the input dataset at their output while the number of hidden units in the network is restricted. A reconstruction error is generated upon prediction. The higher the reconstruction error, the higher the possibility of that data point being an anomaly.
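
    The following sketch illustrates reconstruction error with a deliberately tiny model. It is a linear autoencoder (two inputs, one hidden unit, two outputs) trained with plain gradient descent in NumPy, not the TensorFlow model used in the Spotfire template. The synthetic data, the learning rate, and the number of iterations are all assumptions chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "normal" data: two correlated columns that lie close to a line.
t = rng.uniform(-2.0, 2.0, size=200)
direction = np.array([1.0, 2.0]) / np.sqrt(5.0)
X_train = np.outer(t, direction) + rng.normal(0.0, 0.05, size=(200, 2))

# Linear autoencoder: 2 inputs -> 1 hidden unit (the bottleneck) -> 2 outputs.
W_enc = rng.normal(0.0, 0.1, size=(2, 1))
W_dec = W_enc.T.copy()  # start with decoder weights mirroring the encoder

learning_rate = 0.05
for _ in range(2000):
    hidden = X_train @ W_enc                 # encode
    residual = hidden @ W_dec - X_train      # decode and compare to the input
    # Gradients of the mean squared reconstruction error.
    grad_dec = 2.0 * hidden.T @ residual / len(X_train)
    grad_enc = 2.0 * X_train.T @ (residual @ W_dec.T) / len(X_train)
    W_enc -= learning_rate * grad_enc
    W_dec -= learning_rate * grad_dec


def reconstruction_error(X: np.ndarray) -> np.ndarray:
    """Squared reconstruction error per row."""
    reconstructed = (X @ W_enc) @ W_dec
    return np.sum((reconstructed - X) ** 2, axis=1)


# Two points that follow the learned pattern and one that breaks it.
X_new = np.array([[0.5, 1.0], [-1.0, -2.0], [1.0, -2.0]])
errors = reconstruction_error(X_new)
print(errors)
```

    The first two rows follow the pattern the model learned, so they are reconstructed almost exactly. The last row has the same magnitude but breaks the relationship between the columns, so it cannot pass through the single hidden unit intact and receives a much larger error.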

    Clustering

    In this technique, the analyst groups the data points into a pre-defined number of clusters by minimizing the within-cluster variance. Models such as K-means are used for this purpose; distance-based methods such as K-nearest neighbors (KNN), although not clustering algorithms themselves, are often used in a similar way. Data points that lie far from every cluster centroid, or far from their nearest neighbors, do not look similar to normal data and can be flagged as anomalies.
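
    As a small illustration of the distance-to-centroid idea, here is a sketch of K-means in plain Python. The data points, the starting centroids, and the helper names are assumptions made for the example; a production analysis would normally use a library implementation.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point`."""
    return min(range(len(centroids)), key=lambda c: math.dist(point, centroids[c]))


def kmeans(points: Sequence[Point], centroids: List[Point], n_iter: int = 20) -> List[Point]:
    """Basic K-means: alternate between assigning points and moving centroids."""
    for _ in range(n_iter):
        labels = [nearest(p, centroids) for p in points]
        updated = []
        for c, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == c]
            # Keep the old centroid if a cluster ends up empty.
            updated.append(tuple(sum(v) / len(members) for v in zip(*members)) if members else old)
        centroids = updated
    return centroids


# Two tight groups of "normal" points plus one point that fits neither group.
points = [
    (0.0, 0.0), (0.2, 0.1), (-0.1, 0.2), (0.1, -0.2),
    (5.0, 5.0), (5.2, 4.9), (4.8, 5.1), (5.1, 5.2),
    (2.5, 9.0),
]
centroids = kmeans(points, centroids=[points[0], points[4]])

# Anomaly score: distance from each point to its nearest centroid.
scores = [math.dist(p, centroids[nearest(p, centroids)]) for p in points]
most_anomalous = max(range(len(points)), key=lambda i: scores[i])
print(most_anomalous, round(scores[most_anomalous], 2))
```

    Note that K-means does not reserve a cluster for anomalies: the unusual point is absorbed into the nearest cluster. It is the distance to that cluster's centroid that makes the point stand out.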

    One-class support vector machine

    In a support vector machine, the aim is to find a hyperplane that best divides a set of labeled data into two classes. For this purpose, the distance between the hyperplane and the nearest data points on either side of it is maximized. For anomaly detection, a one-class support vector machine is used: it learns a boundary around the bulk of the data, and those data points that fall outside this boundary are considered anomalies.

    Time Series techniques

    Anomalies can also be detected through time series analytics by building models that capture the trend, repeated patterns (such as seasonality, machine cycles), and levels in time series data. Here is an introduction to the Detection of Anomalies in Repeating Time Series using the MASS algorithm, which includes a Spotfire example. 

    Multivariate time series analysis is another important application of time series for anomaly detection. Sometimes, a data point may not be flagged as anomalous by one variable, but when considered together with another variable or set of variables, can clearly be an anomaly. For example, if a technician is tracking the temperature, pressure, and vibration from a piece of machinery, it's likely that some slight variation in the data points would not be considered anomalous when using univariate anomaly detection. However, with multivariate anomaly detection, the technician could see that a rise in pressure is not normal when certain changes in temperature and vibration occur at the same time. With a multivariate approach, these points would be flagged as anomalous, and could be further analyzed to determine likely root causes for the observed behavior.
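
    The difference between the univariate and multivariate views can be shown with a small NumPy sketch. It uses the Mahalanobis distance as a simple stand-in for a multivariate detector; this is a swapped-in technique chosen for brevity, not the LSTM or autoencoder approach discussed elsewhere in this article. The two standardized sensor columns and the candidate reading are synthetic assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Synthetic, standardized sensor history: pressure normally follows temperature.
temperature = rng.normal(0.0, 1.0, size=500)
pressure = 0.9 * temperature + rng.normal(0.0, 0.3, size=500)
X = np.column_stack([temperature, pressure])

mean = X.mean(axis=0)
std = X.std(axis=0)
cov_inv = np.linalg.inv(np.cov(X, rowvar=False))


def univariate_z(point: np.ndarray) -> np.ndarray:
    """Absolute z-score of each variable taken on its own."""
    return np.abs((point - mean) / std)


def mahalanobis(point: np.ndarray) -> float:
    """Distance from the center that accounts for correlation between variables."""
    diff = point - mean
    return float(np.sqrt(diff @ cov_inv @ diff))


# Temperature is high while pressure is low: each value alone is unremarkable.
candidate = np.array([1.5, -1.5])
z_scores = univariate_z(candidate)
distance = mahalanobis(candidate)
print(z_scores, distance)
```

    Both z-scores stay well below a typical limit of 3, so neither variable raises an alarm on its own. The Mahalanobis distance is large because the combination contradicts the usual relationship between the two sensors.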

    For more information on the multivariate approach to anomaly detection, check out this video from TAF 2023 on Multivariate Anomaly Detection and Classification.

    A Design Pattern for Human-Centered Anomaly Detection and Classification

    For many applications, it is not enough to determine that an item is an anomaly, but it is also important to know how it is anomalous.  It is important to enable the subject matter expert (SME) to remain in control throughout this process. Aided by AI, they use their knowledge of the business to help determine how anomalies will be classified and how accurate the models will be. Human Centered AI (HCAI) provides a framework for balancing computer automation and human control.  Here is a Design Pattern that we use for generating anomaly detection models consistent with HCAI principles.  It achieves this by using a combination of Visual Discovery, Supervised, and Unsupervised learning techniques.  

    [Figure: anomaly detection and classification workflow]

    1. Detect anomalies
    2. Determine a unique 'fingerprint' for each anomaly 
    3. Cluster anomalies together with similar fingerprints
      •  SME refines the assignment of items to clusters to determine the Classes of practical significance for the use case

    4.  Train supervised learning model for each Class of interest

      •  SME reviews false positives and false negatives and refines the model until it achieves the desired accuracy

    5. Deploy supervised learning models to Classify new items that belong to each class of interest.
    6. Monitor model health and re-train if accuracy degrades or new classes of anomalies are detected.  This process can be automated or guided by the SME.  

    This design pattern is used in the Spotfire Anomaly Detection template and our Wafermap Pattern Recognition solution.

    Autoencoders Explained

    Autoencoders are unsupervised neural networks that are both similar to and different from a traditional feed-forward neural network. They are similar in that they use the same principles (i.e., backpropagation) to build a model. They are different in that they do not use a labeled dataset containing a target variable for building the model. Instead, an autoencoder takes the training dataset as input and attempts to replicate that same input at its output while the hidden layers/nodes are restricted.

    [Figure: autoencoder network diagram with input layer L1 and hidden layer L2]

    The focus of this model is to learn an identity function, or an approximation of it, that allows it to predict an output that is similar to the input. The model is kept from simply copying the input by restrictions placed on the number of hidden units in the network. For example, if we have 10 columns in a dataset (L1 in the above diagram) and only five hidden units (L2 above), the neural network is forced to learn a more restricted representation of the input. By limiting the hidden units, we can force the model to learn a pattern in the data if there indeed exists one.

    Not restricting the number of hidden units and instead specifying a 'sparsity' constraint on the neural network can also find an interesting structure.

    Each of the hidden units can be either active or inactive and an activation function such as 'tanh' or 'Rectifier' can be applied to the input at these hidden units to change their state.

    [Figure: autoencoder diagram]

    Some forms of autoencoders are as follows:

    • Undercomplete Autoencoders
    • Regularized Autoencoders
    • Representational Power, Layer Size, and Depth
    • Stochastic Encoders and Decoders
    • Denoising Autoencoders

    A detailed explanation of each of these types of autoencoders is available here.

    LSTM Autoencoders

    An example of an autoencoder is an LSTM autoencoder. In this model, LSTMs are used in both the encoder and decoder parts to learn representations of sequential data and then reconstruct it. In the latest version of the Spotfire Anomaly Detection Template, LSTM is one of the primary candidate models utilized when performing multivariate anomaly detection.

    To learn more about LSTMs in general, check out the Spotfire Energy Consumption and Generation Forecasting application. In this application, an LSTM is used to create a time series forecast, demonstrating another use case for these types of models. You can find the application on our demo gallery here, with the associated community article here.

    Spotfire Solutions

    Spotfire Anomaly Detection Template - Autoencoders using TensorFlow

    This template uses an autoencoder machine learning model to specify expected behavior and then monitors new data to match and highlight unexpected behavior. It features automated machine learning to optimize model-tuning parameters. The Time Series release includes time series analysis, so it can be used as a form of 'control chart'. It also has an input component drill-down to find the most important features influencing a reconstruction error, and a clustering analysis to group and analyze similar anomalies. Download the template from the Spotfire Exchange. See the documentation in the download distribution for details on how to use this template.

     

     

    The deep learning autoencoder method is deployed using a Python data function. See this page for more information on how to build a good autoencoder model that will generalize to new datasets.

    Spotfire Python Data Function - Autoencoder using TensorFlow

    Spotfire supports built-in Python and R data functions. An autoencoder is a versatile deep learning model that is used in multivariate regression, anomaly detection, and dimension reduction. This implementation uses TensorFlow with the Keras API; both are popular Python deep learning libraries. The data function allows a user to configure different datasets, configure different neural network architectures, train and save the neural network model, and score new data using the trained models. The Spotfire DXP includes further analysis of model features contributing toward reconstruction errors and uses reconstruction errors to find a statistical golden batch of data. More information on this asset is available here. It can be downloaded from the Spotfire Exchange here.

    [Figure: autoencoder data function in Spotfire]

    Isolation Forest Python Data Function for Spotfire

    Isolation Forests are known to be powerful, cost-efficient models for anomaly detection.  They isolate anomalies using binary trees and work well in high-dimensional problems that have a large number of irrelevant attributes, and in situations where the training set does not contain any anomalies.  This data function will train and execute an Isolation Forest machine learning model on a given input dataset. It can be downloaded from the Spotfire Community Exchange here.   

    Local Outlier Factor Python Data Function for Spotfire

    This data function uses the unsupervised local outlier factor method to perform anomaly detection on a dataset. The local outlier factor is based on the concept of local density. By comparing the local density of an object to the local densities of its neighbors, one can identify regions of similar density. Points that have a substantially lower density than their neighbors are considered to be outliers. The data function can be downloaded from the Spotfire Community Exchange here.
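
    To show what "comparing local densities" means in practice, here is a compact NumPy sketch of the local outlier factor calculation. It is written for illustration only and is not the code inside the Spotfire data function; the function name, the toy grid of points, and the choice of k = 3 neighbors are assumptions. It also assumes there are no duplicate points, which would lead to division by zero.

```python
import numpy as np


def local_outlier_factor(X: np.ndarray, k: int = 3) -> np.ndarray:
    """Local outlier factor of every row in X, using k nearest neighbors."""
    n = len(X)
    # Pairwise Euclidean distances; a point is never its own neighbor.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)

    neighbors = np.argsort(d, axis=1)[:, :k]            # indices of the k nearest points
    k_distance = d[np.arange(n), neighbors[:, -1]]      # distance to the k-th neighbor

    # Reachability distance from each point to each of its neighbors.
    reach = np.maximum(d[np.arange(n)[:, None], neighbors], k_distance[neighbors])

    # Local reachability density: inverse of the mean reachability distance.
    lrd = 1.0 / reach.mean(axis=1)

    # LOF: average density of the neighbors relative to the point's own density.
    return lrd[neighbors].mean(axis=1) / lrd


# A 3 x 3 grid of evenly spaced points plus one point far away from the grid.
grid = [(float(x), float(y)) for x in range(3) for y in range(3)]
X = np.array(grid + [(6.0, 6.0)])
lof = local_outlier_factor(X, k=3)
print(np.round(lof, 2))
```

    Values close to 1 mean a point is about as dense as its neighbors. The isolated point gets a value several times larger, because its neighbors sit in a much denser region than it does.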

    Risk Management Accelerator

    The Spotfire Risk Management Accelerator identifies potentially risky activities, such as financial crime or insurance fraud, in a high-frequency event stream using machine learning. Supervised and/or unsupervised models can be built and hot deployed to the streaming event processing platform, where events are scored in real time. Alerts are then raised when potentially risky behavior is detected.

    Statistical Process Control

    Control charts are widely used in Manufacturing, Energy, Telco, Technology, and many other sectors. They are a form of anomaly detection used to monitor key metrics, detect deviations from the baseline, and generate automated alerts. Spotfire supports many types of Shewhart (univariate) and multivariate charts; integrated limits generation, storage and deployment; selection of rules to detect out-of-control points; tagging and annotation; management and operations dashboards; periodic or real-time alerts; process capability studies and root cause drill-downs. More details about Spotfire SPC solutions can be found in the Process Control & Anomaly Detection section on the Manufacturing Solutions page.
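
    As a minimal illustration of how a univariate control chart flags deviations, the sketch below computes limits for an individuals chart from a baseline period and then checks new measurements against them. It is not Spotfire's SPC implementation: the data values and function names are made up for the example, and only the simplest rule (a point beyond the three-sigma limits) is applied. Sigma is estimated from the average moving range divided by the standard constant 1.128.

```python
from statistics import mean
from typing import List, Sequence, Tuple

D2_FOR_N2 = 1.128  # standard control chart constant for moving ranges of size 2


def individuals_limits(baseline: Sequence[float]) -> Tuple[float, float, float]:
    """Return (lower limit, center line, upper limit) for an individuals chart."""
    center = mean(baseline)
    moving_ranges = [abs(b - a) for a, b in zip(baseline, baseline[1:])]
    sigma_hat = mean(moving_ranges) / D2_FOR_N2
    return center - 3 * sigma_hat, center, center + 3 * sigma_hat


def out_of_control(values: Sequence[float], lcl: float, ucl: float) -> List[int]:
    """Indices of points that fall outside the control limits."""
    return [i for i, v in enumerate(values) if v < lcl or v > ucl]


# Hypothetical baseline measurements from a stable process.
baseline = [10.0, 10.2, 9.9, 10.1, 9.8, 10.0, 10.3, 9.9, 10.1, 10.0]
lcl, center, ucl = individuals_limits(baseline)

# New measurements to monitor against the stored limits.
new_values = [10.1, 9.9, 11.2, 10.0, 9.2]
flagged = out_of_control(new_values, lcl, ucl)
print(round(lcl, 2), round(center, 2), round(ucl, 2), flagged)
```

    Keeping limit generation separate from monitoring mirrors the idea of generating, storing, and deploying limits: the limits are computed once from a trusted baseline and then reused to score new data as it arrives.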

    Resources

    General (non-Spotfire) references

    Spotfire Corporate Assets on Anomaly Detection

    Spotfire Overview Webinars

    Spotfire Community pages on Anomaly Detection

    Spotfire Community Exchange and AWS Marketplace software downloads

