Spotfire Statistica® - Statistica Glossary

This article explains commonly used terminology in Spotfire Statistica® (or simply Statistica).

File types

Workspace - A visual workflow user interface for creating no-code data analysis pipelines. The filename extension for this file is .sdm.

Workbook - A container that organizes objects in a folder-like tree structure (in a workspace, the objects are named according to the node name). Objects are results of any kind created by a workspace or an interactive analysis. The filename extension for this file is .stw.

Spreadsheet - A table holding the actual data values together with variable definitions and other table metadata. Each table retrieved from a database or analyzed by Statistica tools is first converted into the Statistica spreadsheet format. The filename extension for this file is .sta.

Macro - A script written (or recorded) in the Statistica Visual Basic language. Users can record or code certain actions, choices, separate analyses, etc., and then play them back as an automated way to reproduce the same results (e.g., customizations, analyses). The filename extension for this file is .svb.

Report - A word processor-style document inside Statistica in which you can build reports from analyses, either manually or in an automated way. The filename extension for this file is .str.

Project - A file type that stores all the work open in the Statistica application at the time the project is saved. It lets the user resume work at the point where they left off, even after closing the application. The filename extension for this file is .spf.

More about these objects can be found in the Statistica help/documentation.

Spreadsheet

Text Labels - Each numeric variable can have two identities: text and numeric. Text labels are entries (strings of characters) that label specific numeric values within a variable in the spreadsheet. A text label always corresponds to a numeric value, and the variable's data type can be Double, Integer, or Byte.
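The relationship between numeric values and text labels can be sketched in plain Python. This is only an illustration of the idea, not Statistica code: the variable, its values, and the labels below are hypothetical.

```python
# Hypothetical numeric variable "Gender" stored as integer codes.
gender_values = [1, 2, 2, 1, 3]

# Hypothetical text labels: each label is attached to one numeric value.
gender_text_labels = {1: "Male", 2: "Female"}


def display_value(value, text_labels):
    """Return the text label for a value, or the number itself if no label exists."""
    return text_labels.get(value, str(value))


# The stored data stay numeric; only the displayed identity changes.
displayed = [display_value(v, gender_text_labels) for v in gender_values]
print(displayed)
```

The value 3 has no label in this sketch, so it is shown as the plain number.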

Case name - Each row can have a header called a case name. These headers mainly help the user orient among the rows; case names can also be used in computations (e.g., to assign a value to a variable based on the case name).

Bundles - Variables can be grouped into bundles. Bundles are one of the properties of a Statistica spreadsheet. After defining a bundle, the user can use it in variable selections to choose variables more easily: a single click on the bundle selects all of its variables. In the variable selection process, bundles are displayed in square brackets.

MD - Abbreviation for missing data.

Long name - A property of each variable that holds more detailed information about the variable (it can be displayed as part of the variable header). It can be used to annotate the variable or to hold a formula that transforms the variable.

Workspace terms

Node - A building block of visual workflows (workspaces) that provides a particular functionality.

Annotation - Users can comment on anything in the workspace. Annotations appear as text boxes on the workspace canvas. They can be edited, renamed, or deleted, and they are saved as part of the workspace. In addition, every node has its own annotation section in the Home tab of the node.

Downstream document - A document that can be used in subsequent analysis (the output table available after clicking the bottom-right corner of the node icon).

Deploy - Deploy is equivalent to Save, except that it saves objects into the shared Enterprise metarepository rather than to disk. In other words, it deploys analyses, reports, or models into production or shares them with other team members.

SVB - Abbreviation for Statistica Visual Basic. In workspaces, the SVB label on a node icon marks a legacy node type that was scripted for a particular functionality; such nodes offer fewer options and do not follow the interactive menu of that functionality (for further details, please look here).

General

Case - Whenever the tool refers to cases, it means the rows of a table. Typically, the various measures or pieces of information for one individual are stored in one row.

Variable - Whenever the tool refers to a variable, it means a column of the table. Typically, one specific kind of information is stored in one column.

Response - The variable that we would like to predict or model. Also known as the target variable.

Predictors - Variables that are expected to be used to construct the prediction of a response variable (also called independent variables).

Navigation

Ribbon bar - The main set of tabs for choosing functionality, located in the upper part of the Statistica client application.

Enterprise - In the tool, the word Enterprise refers to the metarepository structure (part of the Statistica Server product) that is used for sharing work within a team. There you can share Statistica objects, manage access permissions, define database connections, introduce versioning, etc.

Analysis

By Group - By Group means analyzing the data separately for several groups (typically defined by one or more categorical variables). If the By Group button is visible inside an analysis or node, the analysis can be run per group. This means the user can define the groups for the analysis inside the node.
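The idea behind a By Group analysis can be sketched in plain Python: the same computation (here, a mean) is repeated separately for each level of a categorical variable. This is a conceptual illustration only, not Statistica code; the data and the names `region` and `sales` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cases (rows); "region" is the categorical grouping variable.
cases = [
    {"region": "North", "sales": 10.0},
    {"region": "South", "sales": 30.0},
    {"region": "North", "sales": 20.0},
    {"region": "South", "sales": 50.0},
]


def mean_by_group(rows, group_var, value_var):
    """Compute the mean of value_var separately for each level of group_var."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[group_var]].append(row[value_var])
    return {level: mean(values) for level, values in groups.items()}


group_means = mean_by_group(cases, "region", "sales")
print(group_means)
```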

Select Cases - Select Cases is equivalent to a filter. Users can see the Select Cases button inside some nodes and interactive analyses. It lets the user set a filter inside the node or inside the analysis (it does not need to be set up beforehand); the analysis is then conducted only on the matching subset of the data.
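Conceptually, a case selection condition keeps only the rows that satisfy it before the analysis runs. The following plain-Python sketch illustrates this; it is not Statistica's selection syntax, and the data and the `age` condition are hypothetical.

```python
# Hypothetical cases (rows) with two variables.
cases = [
    {"age": 25, "income": 30000},
    {"age": 42, "income": 52000},
    {"age": 17, "income": 0},
    {"age": 63, "income": 41000},
]


def select_cases(rows, condition):
    """Return only the rows for which the condition holds."""
    return [row for row in rows if condition(row)]


# Keep adults only; the analysis then sees just this subset.
adults = select_cases(cases, lambda row: row["age"] >= 18)
mean_income = sum(row["income"] for row in adults) / len(adults)
print(len(adults), mean_income)
```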

Weights (W) - Some analyses support weighted computation (e.g., the user would like to compute a weighted mean). The button is typically displayed as a capital W inside the analyses or nodes. It lets the user specify a weight variable and conduct weighted computations.
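As a worked illustration of the weighted mean mentioned above, each value is multiplied by its weight and the sum is divided by the total weight. The sketch below is plain Python with hypothetical data; it does not show how Statistica applies weights internally.

```python
def weighted_mean(values, weights):
    """Weighted mean: sum(w * x) / sum(w)."""
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("the sum of the weights must not be zero")
    return sum(v * w for v, w in zip(values, weights)) / total_weight


# Hypothetical variable and weight variable: (2*1 + 4*1 + 6*2) / 4
values = [2.0, 4.0, 6.0]
weights = [1, 1, 2]
print(weighted_mean(values, weights))
```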

PMML - Some nodes with modeling functionality can produce PMML (Predictive Model Markup Language) code. PMML is an XML-based standard format for storing predictive models. It can be used inside Statistica as well as in external scoring applications.
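To give a feel for what an external scoring application does with PMML, the sketch below parses a small regression model and applies it to one input. The PMML fragment is hand-written for illustration and is not output from Statistica; the field names and coefficients are hypothetical, and the scorer handles only the intercept and linear numeric predictors.

```python
import xml.etree.ElementTree as ET

# Hand-written, hypothetical PMML regression model: y = 1.5 + 2.0 * x
PMML_TEXT = """<PMML version="4.4" xmlns="http://www.dmg.org/PMML-4_4">
  <Header description="Hypothetical linear model"/>
  <DataDictionary numberOfFields="2">
    <DataField name="x" optype="continuous" dataType="double"/>
    <DataField name="y" optype="continuous" dataType="double"/>
  </DataDictionary>
  <RegressionModel functionName="regression">
    <MiningSchema>
      <MiningField name="x"/>
      <MiningField name="y" usageType="target"/>
    </MiningSchema>
    <RegressionTable intercept="1.5">
      <NumericPredictor name="x" coefficient="2.0"/>
    </RegressionTable>
  </RegressionModel>
</PMML>"""

NS = {"pmml": "http://www.dmg.org/PMML-4_4"}


def score(pmml_text, inputs):
    """Score one case with the intercept and linear terms of a RegressionTable."""
    root = ET.fromstring(pmml_text)
    table = root.find("pmml:RegressionModel/pmml:RegressionTable", NS)
    result = float(table.get("intercept"))
    for predictor in table.findall("pmml:NumericPredictor", NS):
        result += float(predictor.get("coefficient")) * inputs[predictor.get("name")]
    return result


print(score(PMML_TEXT, {"x": 3.0}))
```

A production scorer would rely on a full PMML engine, since real models include transformations, categorical predictors, and other model types that this sketch ignores.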

PI - PI refers to a special kind of data storage from OSIsoft that is fully supported by Statistica. PI data sources are typical in the manufacturing vertical.

Parts of the solution

MAS - Monitoring and alerting server, more details here.

SDMS - The Statistica Document Management System is the part of the solution that handles versioning of objects inside the Enterprise metarepository. It can be enabled or disabled from within the Enterprise Manager.

Data Entry - Spotfire Statistica® Data Entry Server (DES) is a web server that integrates with Spotfire Statistica® Server. It enables the creation of web forms, double-blind data entry, etc. (more information here).
