Summary
Data preparation is one of the most important and time-consuming tasks in data analysis. This quick-start workflow helps users define data cleaning and data preparation steps faster, without building them from scratch. It is also useful for new Statistica users.
Unfortunately, no universal set of steps, or even settings for those steps, can automate data cleaning and data preparation across different use cases. Users can go through the workflow node by node and configure each node according to the available data and the task at hand. The node settings and the descriptions of their functionality, demonstrated on an example data set, also serve as learning material.
Data cleaning steps include:
- handling outliers
- handling missing data in various ways
- invariance check
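For readers who want to see the logic behind these steps outside the workflow, the sketch below illustrates them in plain Python. It is not how the Statistica nodes are implemented. The column names, the median imputation, the 1.5 x IQR clipping rule, and the reading of "invariance check" as "flag variables that take only one value" are all assumptions made for illustration.

```python
import statistics

def impute_median(values):
    """Replace missing entries (None) with the median of the observed values."""
    observed = [v for v in values if v is not None]
    median = statistics.median(observed)
    return [median if v is None else v for v in values]

def clip_outliers_iqr(values, k=1.5):
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] to the nearest fence."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [min(max(v, low), high) for v in values]

def is_invariant(values):
    """Return True when a column holds at most one distinct value."""
    return len(set(values)) <= 1

# Hypothetical example data: column name -> list of values.
data = {
    "age": [34, 29, None, 41, 38, 120],
    "site": ["A", "A", "A", "A", "A", "A"],
}

# Fill the missing value first so the quartiles can be computed,
# then pull the extreme value back to the upper fence.
age = clip_outliers_iqr(impute_median(data["age"]))

# Columns without any variation carry no information for modelling.
dropped = [name for name, column in data.items() if is_invariant(column)]
```

The order of the steps is itself a choice: imputing before clipping, as done here, lets the imputed value take part in the quartile estimate, which may or may not be appropriate for a given data set.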
After data cleaning, the user might want to define, recompute, or create new variables. This can be done with the built-in data transformations, recoding of categorical variables, and similar functionality.
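As a rough illustration of what such transformations and recodings do, the following plain-Python sketch applies a log transformation to a numeric variable and recodes a categorical variable through a lookup table. The variable names, the choice of log(1 + x), and the code mapping are hypothetical and do not correspond to specific Statistica node settings.

```python
import math

def log_transform(values):
    """Apply log(1 + x) to compress a right-skewed, non-negative variable."""
    return [math.log1p(v) for v in values]

def recode(values, mapping, default=None):
    """Recode categorical values through an explicit lookup table.

    Categories missing from the mapping receive `default`.
    """
    return [mapping.get(v, default) for v in values]

# Hypothetical example columns.
income = [0, 1200, 45000, 310000]
education = ["primary", "secondary", "tertiary", "secondary"]

# New variable derived from an existing numeric variable.
log_income = log_transform(income)

# New ordinal variable derived from a categorical variable.
education_level = recode(
    education, {"primary": 1, "secondary": 2, "tertiary": 3}
)
```

Keeping the recoding table explicit makes the mapping easy to review and to adjust when new categories appear in the data.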
The template shows the usage of the most important and most frequently used functionalities in the data preparation process.
Release P1.0
Published: June 2018
Initial release