  • Automated Data Cleaning with Spotfire Statistica®


    This article outlines the steps for automatically cleaning data with Spotfire Statistica®.

    The following example helps a user get started with the Data Health Check node.

    After starting Statistica, open the Home menu, click the drop-down arrow on the New menu, and select Workspace. The list of templates is displayed. Select the Get Data template to locate a data connector. Select the node and press Ctrl+C to copy it.

    Note: Some of the nodes may be disabled if the user does not own the associated product. If data needs to be retrieved from an OSIsoft PI database, look for a template named PI Asset Framework or PI Asset Framework and Event Frames.

    Next, open the Home menu again, click the drop-down arrow on the New menu, and select Workspace. Select the Automated Data Cleaning template.

    Connect the data (see Get Data above) to the Data Health Check node. This node can be found by searching for it by name in Feature Finder (upper right corner).

    [Image: Feature Finder]

    The Data Health Check node looks for common data issues (missing values, invariant variables, etc.) in each variable and generates a report. This report can be used to decide how to clean the data. This is especially useful for "big data", when the user does not want to explore 5,000 variables by hand.

    [Image: Data Health Check]
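
    To make the idea of a per-variable health report concrete, here is a minimal Python sketch of the kind of checks described above (missing values and invariant variables). It is not Statistica's implementation or API. The function names `is_missing` and `health_check`, the `max_missing` threshold, and the sample data are all hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List


def is_missing(value: Any) -> bool:
    """Treat None, blank strings, and NaN as missing (an assumption of this sketch)."""
    if value is None:
        return True
    if isinstance(value, str) and value.strip() == "":
        return True
    return isinstance(value, float) and value != value  # NaN is not equal to itself


def health_check(rows: List[Dict[str, Any]], max_missing: float = 0.5) -> Dict[str, Dict[str, Any]]:
    """Build a per-variable report: fraction missing, too sparse, invariant.

    `max_missing` is a hypothetical threshold, not a Statistica setting.
    """
    # Collect variable names in first-seen order.
    names: List[str] = []
    for row in rows:
        for name in row:
            if name not in names:
                names.append(name)

    report: Dict[str, Dict[str, Any]] = {}
    for name in names:
        values = [row.get(name) for row in rows]
        present = [v for v in values if not is_missing(v)]
        missing_fraction = 1 - len(present) / len(values)
        report[name] = {
            "missing_fraction": missing_fraction,
            "too_sparse": missing_fraction > max_missing,
            # A variable with at most one distinct non-missing value carries no information.
            "invariant": len(set(present)) <= 1,
        }
    return report


# Hypothetical sample data: "site" never changes, "flow" is mostly missing.
rows = [
    {"temp": 20.5, "site": "A", "flow": None},
    {"temp": 21.0, "site": "A", "flow": None},
    {"temp": None, "site": "A", "flow": 3.2},
    {"temp": 19.8, "site": "A", "flow": None},
]
report = health_check(rows)
for name, stats in report.items():
    print(name, stats)
```

    In this sketch, `site` is flagged as invariant and `flow` as too sparse, which mirrors the kind of decision the report is meant to support: variables flagged this way are candidates for removal or imputation before modeling.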

     

