Hideo Sato Posted September 29, 2023

I am currently verifying the Anomaly Detection template version 5.0.0 downloaded from the following URL, but I am encountering an error. I apologize for the inconvenience and would appreciate your help.

(1) Regarding the Data Prep Options on the "2-MODEL" page, I see that I can set up preprocessing such as "dropna" and "robust", but to which columns are these applied? Also, what do the "impute", "minmax", and "robust" preprocessing options in the Data Prep Options do?

(2) I was able to create a model by inputting new data, but the results from "Incidents per Cluster" onward on the "4-RESULTS" page did not update to reflect the new data. For example, as shown in the attached image, the input data has no columns "dp1" through "dp6", yet the results for these columns from the default data are still displayed. Is there a solution to this problem?

Sorry for the number of questions. Thank you in advance.
David Katz Posted October 4, 2023

Thanks for your questions. It is always helpful to get feedback from users.

(1) Regarding the Data Prep Options on the "2-MODEL" page, I see that I can set up preprocessing such as "dropna" and "robust", but to which columns are these applied? Also, what are the "impute", "minmax", and "robust" preprocessing options in the Data Prep Option?

Preprocessing options are applied to all columns that are marked "Predictor" in the Variable Classification visualization.

Robust is the default method for scaling. It uses the median and interquartile range to rescale each predictor input.

Minmax is not yet implemented; it will default to robust with a warning. It will be implemented in an upcoming version.

Impute is currently a very simple function that fills in missing values with the overall mean.

Other preprocessing may be performed outside the anomaly detection dxp if desired. You might find this package of interest: https://community.spotfire.com/s/article/Time-Series-Analytics-for-Spotfire. The smoothing options can be used to create a new series that has no missing values. Best practice is to use the corresponding new values to fill in only the missing values, leaving the other values unchanged (so that anomalies remain). Future iterations of the anomaly detection dxp will incorporate these algorithms.

(2) I was able to create a model by inputting new data, but the results after "Incidents per Cluster" on the "4-RESULTS" page were not adapted to the new data. As an example, as shown in the attached image, there are no columns "dp1" through "dp6" in the input data, but for some reason the results for these columns in the default data are still being displayed. Is there a solution to this problem?

This can happen when too few incidents are identified for the clustering algorithm to produce usable results, in which case the results from the default data remain on display. We plan to improve this behavior with a helpful error message when it happens.

Note that this event should not affect the remaining dxp functions on succeeding pages. It only means that the clustering step has failed, and clustering is not needed for the time series portion of the analysis.

Let us know if you have any further questions or need more help.
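For readers who want to see numerically what the "impute" and "robust" options described in the answer to (1) do, here is a minimal Python sketch. It is not the template's actual code: the function names `impute_mean` and `robust_scale` are hypothetical, the quartile method (`"inclusive"`) is an assumption, and the handling of a zero interquartile range is just one reasonable choice. It only illustrates mean imputation followed by scaling with the median and interquartile range.

```python
import math
from statistics import mean, median, quantiles


def impute_mean(values):
    """Replace missing entries (None or NaN) with the mean of the observed values."""
    observed = [v for v in values if v is not None and not math.isnan(v)]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    fill = mean(observed)
    return [fill if v is None or math.isnan(v) else v for v in values]


def robust_scale(values):
    """Center on the median and divide by the interquartile range (IQR).

    Expects a column with no missing values, so impute (or drop) first.
    """
    center = median(values)
    # Assumption: quartiles computed with the "inclusive" method.
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        # Assumption: for a zero IQR, only center to avoid dividing by zero.
        return [v - center for v in values]
    return [(v - center) / iqr for v in values]


# Hypothetical predictor column with one missing value and one outlier.
column = [1.0, 2.0, None, 4.0, 100.0]
filled = impute_mean(column)  # None is replaced by the mean of 1, 2, 4, 100
scaled = robust_scale(filled)
print(filled)
print(scaled)
```

Because the median and interquartile range are barely affected by a single extreme value, the outlier stays far from the rest of the column after scaling, which is the property you want before anomaly detection. Note that a mean-based fill is itself pulled toward outliers, which is one reason the reply suggests smoothing-based fills as an alternative.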