    Anomaly Detection using TensorFlow with Spotfire®



    Overview

    Spotfire®'s Python Data Function enables users to install and readily use packages available on PyPi to build custom functionality into their dashboards. Users can execute custom Python code and use the resulting document properties or tables to update visualizations on a Spotfire dashboard. We present an overview of the autoencoder implementation leveraged in Autoencoder TensorFlow Python Data Function for Spotfire® and the Anomaly Detection Template for Spotfire®.

    Prerequisites

    The following requirements must be met to enable running the Autoencoder data functions:

    • Spotfire® 10.7 (or later) client and server

    • The Python packages pandas, NumPy, SciPy, scikit-learn, and TensorFlow must be installed for the Python data function to work. Both assets use TensorFlow version 2.5.0

    Manufacturing Use Case

    The dataset used in both assets contains manufacturing equipment data captured over a few weeks across five plant locations with three different products (disclaimer: the data is fictitious and has been created for the purpose of the demo). Various metrics from this period can help us identify abnormal behavior in our machines. Overall, the autoencoder and subsequent analyses can be used in real-time applications to proactively identify risks and mitigate them.

    TensorFlow

    TensorFlow is an open-source software library widely used in industry for machine learning and deep learning. Keras is a deep learning Python API/interface for TensorFlow. More information on TensorFlow is available here, and more information on Keras is available here.

    TensorFlow has a rich ecosystem of APIs in many programming languages, ancillary products for serving models, visualization frameworks (e.g., TensorBoard), and deployment packages for edge and hosted products.

    Autoencoders

    Unsupervised feed-forward neural networks, also known as autoencoders, are an important deep learning technique that is used for a variety of use cases, including anomaly detection, multivariate regression, and dimensionality reduction.

    Anomaly detection is a way of detecting abnormal behavior. This technique uses past data to learn a pattern of expected behavior. This pattern is compared across new and real-time events to highlight any abnormal or unexplained activity at a specific moment.

    Some use cases for anomaly detection include:

    • Monitoring sensors on edge devices

    • Financial or healthcare fraud

    • Manufacturing equipment early failure detection

    Autoencoders are similar to normal feed-forward neural networks in that they can have multiple layers of neurons that attempt to learn a pattern in the dataset. However, unlike traditional feed-forward networks, autoencoders do not require a target (i.e., a dependent column). Instead, autoencoders have a set of layers for encoding the dataset and then replicate these layers in reverse order for decoding the encoded dataset. The output from the final decoding layer is the reconstructed data. Reconstruction error refers to the difference between the original data and the reconstructed data. Data points that lie within the expected pattern space have low reconstruction error, while data points exhibiting abnormal behavior (falling away from the expected pattern space) have higher reconstruction error. Assessing the reconstruction errors helps us identify the data points that might be anomalous or require further examination.
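    The relationship between original data, reconstructed data, and reconstruction error can be sketched with NumPy alone. Here the reconstructed array is an illustrative stand-in for the decoder's output, not output from an actual trained model:

```python
import numpy as np

# Original data: 4 points, 3 features; the last row is an "anomaly".
original = np.array([
    [1.0, 2.0, 3.0],
    [1.1, 2.1, 2.9],
    [0.9, 1.9, 3.1],
    [5.0, 9.0, 0.0],   # falls away from the expected pattern space
])

# Stand-in for the autoencoder's output: normal points are reproduced
# well, but the anomalous one is "pulled" back toward the learned pattern.
reconstructed = np.array([
    [1.0, 2.0, 3.0],
    [1.1, 2.0, 3.0],
    [1.0, 2.0, 3.0],
    [1.0, 2.0, 3.0],
])

# Per-point reconstruction error: mean squared error across features.
reconstruction_mse = np.mean((original - reconstructed) ** 2, axis=1)
print(reconstruction_mse)  # the last value is far larger than the rest
```

    The anomalous row cannot be reconstructed from the learned pattern, so its error stands out.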


    In addition, the overall reconstruction error for a particular data point can be decomposed into separate reconstruction errors for every input variable. This makes autoencoders useful for root cause analysis as well: you can review which variables contribute the most to the overall reconstruction error for a particular anomaly or, in other words, which variables are responsible for the anomalous behavior. In the Spotfire Anomaly Detection community page, we dive further into the relationship between anomaly detection and root cause analysis, as well as explore other use cases.
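    This decomposition can be sketched for a single data point as follows (the values and feature names are hypothetical, purely for illustration):

```python
import numpy as np

# One anomalous data point and its reconstruction (illustrative values).
original      = np.array([1.2, 7.5, 3.3, 0.4])
reconstructed = np.array([1.1, 2.0, 3.2, 0.5])
feature_names = ["temperature", "pressure", "vibration", "flow"]  # hypothetical

# Per-variable squared error; its mean is the overall reconstruction MSE.
per_feature_error = (original - reconstructed) ** 2
overall_mse = per_feature_error.mean()

# Rank variables by their contribution to the overall error.
order = np.argsort(per_feature_error)[::-1]
for i in order:
    share = per_feature_error[i] / per_feature_error.sum()
    print(f"{feature_names[i]}: {share:.1%} of the reconstruction error")
```

    Whichever variable dominates this ranking is the natural starting point for a root cause investigation.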

    More information on autoencoders and their variants is available here.

    Python Data Function

    The Spotfire® Exchange component for Autoencoder TensorFlow Python Data Function for Spotfire® includes an .sfd file (the exported data function) and a Spotfire® analysis file (DXP).

    The 'Overview' page in the DXP covers the required libraries and the library and parameter documentation, and provides tips for running the function. To adjust the parameters or configure a different dataset, edit the "[Modeling] Autoencoder TensorFlow" data function (Data -> Data function properties -> Edit Parameters). Change the Input Data to the desired data table and include the predictor columns (and optionally ID or Data Usage columns).


    The 'Build and Evaluate Model' page in the DXP provides a UI on the left to tune the neural network parameters (although a user can also run with the predefined parameters). The two charts, 'Histogram of Reconstruction Errors' and 'Loss per Epoch', help us assess model training. See this blog for more information on how to build an autoencoder anomaly detection model that will generalize to new datasets using these visuals.


    The 'Postprocessing' page in the DXP analyzes the reconstruction errors from the autoencoder and provides an explainability component (or root cause analysis aspect) to the model. 'Reconstruction Mean Squared Error over Time' shows the reconstruction MSE for each data point. We want to study points with high reconstruction MSE. If we mark some of these points, 'Top Features contributing to Reconstruction Error' updates to show the top predictor columns contributing to this high error. When we select one feature, the trellised visual on the right updates to compare across time: the overall reconstruction MSE, the reconstruction MSE for the selected feature, and the original data for that feature. The idea is to assess whether higher reconstruction errors correspond with abnormal data points in the selected feature/dimension.


    Autoencoder Implementation using TensorFlow

    The autoencoder data function is meant for both beginner and advanced users. As seen in the input parameters (Data Function -> Edit Script), the only required parameter is the input data! The rest of the parameters are optional. Some of them relate to data preparation logistics (file_path for saving the model, id_column for attaching a unique identifier to the data, and data_usage_column so the user can specify their own train/test/validation splits), while the majority relate to the neural network. These neural network parameters have standard defaults: Huber loss, Adam optimizer, tanh activation, etc. Take a look at the readme documentation attached in the Exchange release or the data function parameter descriptions for more information.


    We want to highlight a snippet of the TensorFlow code for creating autoencoder architectures. The default architecture is: dimensions of model data [original input] -> 200 neurons [encoder] -> 50 neurons [bottleneck] -> 200 neurons [decoder] -> dimensions of model data [reconstructed output]. Note that the term bottleneck refers to the compressed middle hidden layer (often the smallest layer). If users want to specify their own architecture, they can give the encoder hidden layer sizes plus the bottleneck size as a comma-separated list. For example, on the 'Build and Evaluate Model' page, the given list '64, 32, 5' tells us that the encoder sizes are 64 -> 32 and the bottleneck size is 5, from which we create the decoder sizes 32 -> 64. The overall architecture is then: dimensions of model data [original input] -> 64 neurons [encoder] -> 32 neurons [encoder] -> 5 neurons [bottleneck] -> 32 neurons [decoder] -> 64 neurons [decoder] -> dimensions of model data [reconstructed output]. Dropout layers can optionally be added after each hidden layer. Line 291 can also be uncommented to enforce that the encoder and bottleneck sizes are strictly decreasing, as often seen in standard autoencoders.
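    The mirroring logic described above can be sketched in plain Python. This mirrors the behavior described, not the data function's actual code; in the Keras model, each hidden size would become a Dense layer (e.g. with tanh activation), optionally followed by a Dropout layer:

```python
def autoencoder_layer_sizes(hidden_spec, n_features):
    """Expand a comma-separated encoder+bottleneck spec into the full
    symmetric layer plan: input -> encoder -> bottleneck -> mirrored
    decoder -> reconstructed output."""
    sizes = [int(s) for s in hidden_spec.split(",")]
    encoder, bottleneck = sizes[:-1], sizes[-1]
    decoder = encoder[::-1]  # replicate the encoder layers in reverse order
    return [n_features] + encoder + [bottleneck] + decoder + [n_features]

# The '64, 32, 5' example from the page, for 20-dimensional model data:
print(autoencoder_layer_sizes("64, 32, 5", n_features=20))
# -> [20, 64, 32, 5, 32, 64, 20]
```

    The default '200, 50' spec expands the same way to input -> 200 -> 50 -> 200 -> output.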


    Lastly, the autoencoder serves multiple purposes. We save the bottleneck output as a NumPy array; this is a reduced-dimension representation of the data that the model has learned. On the 'Find Golden Batch' page, we use quantile cutoffs on the reconstruction errors to filter for a 'golden batch' (the 'best' data) under nominal conditions.
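    A quantile cutoff of this kind can be sketched as follows (the reconstruction errors here are randomly generated for illustration; the 10% cutoff is an assumed, not prescribed, value):

```python
import numpy as np

# Simulated per-point reconstruction MSE values (illustrative only).
rng = np.random.default_rng(0)
reconstruction_mse = rng.exponential(scale=1.0, size=1000)

# Keep only the points whose reconstruction error falls below the chosen
# quantile cutoff -- the 'golden batch' of best-reconstructed data.
cutoff = np.quantile(reconstruction_mse, 0.10)
golden_batch = reconstruction_mse[reconstruction_mse <= cutoff]

print(len(golden_batch), round(cutoff, 4))
```

    The surviving rows represent the data the model reconstructs best, i.e. the most typical nominal operating conditions.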

    Anomaly Detection Template

    The Anomaly Detection Template for Spotfire® provides full-scale data preparation; autoencoder, LSTM, and K-means modeling; and in-depth postprocessing analysis on the same dataset, and it can be used for any time series anomaly detection use case. It uses a variant of the same data function as Autoencoder TensorFlow Python Data Function for Spotfire® and includes other R and Python data functions. For full instructions on using this template, reference the user guide within the Exchange release or the Dr. Spotfire video at the top of this wiki page.

    LSTM

    Before showing the look and feel of the template, it is worth mentioning that the Anomaly Detection Template (version 5.0.0 and above) also uses an LSTM (Long Short-Term Memory) method for the anomaly detection task, in parallel with the autoencoder method. LSTMs have had great success in sequence modeling tasks such as language modeling. Here we use the LSTM in sequence-to-value mode, which means it tries to predict the next vector of measurements from a lookback window; this is similar to creating lag variables. Note that, as with an autoencoder, no external labels are used. Once the model has been fit, it does its best to predict the value of all columns at the currently targeted time instance. The reconstruction error measures the residual of this prediction, so the results can be used for root cause analysis or, for example, clustering of anomalies in the same way as the autoencoder output.
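    The lookback windowing behind sequence-to-value prediction can be sketched with NumPy (this is an illustrative helper, not the template's actual code; a Keras LSTM layer would then consume X with shape (samples, lookback, features) and be trained to predict y):

```python
import numpy as np

def make_windows(data, lookback):
    """Turn a multivariate time series into (window, next-vector) pairs:
    the model sees `lookback` consecutive rows and predicts the row
    that immediately follows them."""
    X = np.stack([data[i:i + lookback] for i in range(len(data) - lookback)])
    y = data[lookback:]
    return X, y

# 10 timestamps, 2 sensor columns (illustrative values).
series = np.arange(20, dtype=float).reshape(10, 2)
X, y = make_windows(series, lookback=3)
print(X.shape, y.shape)  # -> (7, 3, 2) (7, 2)
```

    Each target row plays the role that the reconstructed point plays for the autoencoder: the prediction residual per column is the analogue of the per-feature reconstruction error.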

    Template in action

    The '1-Explore' and '2-Model' pages explore the columns of the user's input data table, calculate summary statistics for these columns, facilitate choosing data to use in modeling, split the time series data into train/test/validation sets, and set predictor and variable types.


    On the '2-Model' page, you can also define how many models should be built and how the models are parameterized.


    The results of the modeling can be reviewed on the '3-Compare Models' and '4-Results' pages. The '3-Compare Models' page helps you evaluate which model is more successful and gives the most relevant results.


    These visuals assess model training, postprocessing of the reconstruction errors, and root cause analysis, similar to the ones in the Python data function DXP. Again, these visuals are specific to autoencoder/LSTM training and model evaluation.


    Lastly, this template post-processes the reconstruction errors using outlier cutoffs and K-means clustering to identify incidents and clusters of incidents over time. An incident is defined as a collection of at least 5 timestamps with reconstruction errors over a user-defined outlier cutoff. Incidents can be clustered into similar groups and analyzed over time. Clusters provide a clear profile of each incident, allow comparison between incidents, and give insights into the root causes of these anomalies. Additionally, a time series view of clusters shows when each type of incident was a recurring problem. This can be done either retrospectively or in conjunction with new data.
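    The incident definition above (runs of at least 5 consecutive over-cutoff timestamps) can be sketched as a simple run-length scan. This is an illustrative implementation of the stated definition, not the template's actual code:

```python
import numpy as np

def find_incidents(errors, cutoff, min_length=5):
    """Group consecutive timestamps whose reconstruction error exceeds
    the cutoff; keep only runs of at least `min_length` timestamps,
    returned as (start_index, end_index) pairs."""
    over = errors > cutoff
    incidents, start = [], None
    for i, flag in enumerate(over):
        if flag and start is None:
            start = i                      # a run of outliers begins
        elif not flag and start is not None:
            if i - start >= min_length:    # run is long enough to count
                incidents.append((start, i - 1))
            start = None
    if start is not None and len(over) - start >= min_length:
        incidents.append((start, len(over) - 1))  # run reaches the end
    return incidents

errors = np.array([0.1, 0.2, 2.0, 2.1, 2.2, 2.3, 2.4, 0.1, 2.0, 2.0, 0.1])
print(find_incidents(errors, cutoff=1.0))  # -> [(2, 6)]
```

    The short two-timestamp spike near the end is discarded, while the five-timestamp run qualifies as an incident; feature-wise error profiles of such incidents are what K-means then clusters into similar groups.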


