Jump to content
  • Create a Custom Transformation in Spotfire®


    Introduction

    Data is often not formatted for immediate analysis, and it may contain errors. The transformations framework adds a layer between data reading and the creating of a data table, providing the means to modify the data before it is imported. A data transformation applies a rule or a function to the retrieved data to prepare it for analysis.

    With the transformation framework it is possible to create a custom transformation. When deployed in the Spotfire® environment, the transformation appears in the following places:

    • File > Add Data Tables
    • File > Replace Data Table
    • Insert > Column
    • Insert > Rows

    This tutorial describes a simple example of a custom transformation that is capable of changing the data type of a column or replacing null values in a column.

    Prerequisites 

    • Spotfire® Developer (SDK), see download instructions here.
    • Spotfire® Analyst, download from edelivery.tibco.com.
    • Microsoft Visual Studio® 2013 or higher. The free Community edition is available for download here.

    See also

    • SDK example: SpotfireDeveloper.CleanupTransformationExample - contains the complete code for the Cleanup Transformation example described in this tutorial.

    Overview

    The extension point for the transformation framework is the CustomDataTransformation class, which can be inherited to implement a transformation. This class is then registered in the RegisterDataTransformations method on the AddIn class together with a CustomTypeIdentifier.

    Other classes in the framework are:

    • DataTransformationConnection - Represents a connected data transformation. A connection is used to perform prompting. Either use a default instance or create a derived class.
    • CustomDataRowReader - Base class for custom data readers. Used by CustomDataSource and CustomTransformation. It is also possible to create a default wrapper around an IDataReader by using one of the CreateReader methods on the base class, DataRowReader.

    If the transformation should have a UI that prompts for user input, a view class needs to be registered in the RegisterViews method on the Addin class. A view class should inherit from System.Windows.Form to run in the Analyst client. In the web client, transformations can be executed but there is no way to add new transformations.

    Basic Example

    The core functionality of this custom transformation is implemented by three classes:

    • CleanupTransformation: Defines the logic of the transformation.
    • CleanupTransformationDialog: The dialog retrieving settings for the transformation.
    • CleanupTransformationReader: Performs the actual transformation. It is returned by the CleanupTransformation when the Connect method is called.

    CleanupTransformation

    The CleanupTransformation class is implemented by deriving from CustomDataTransformation. It handles the prompting for settings and creates a DataRowReader for the transformation.

    1. Override ConnectCore.

      This method checks if prompting is required and creates a DataTransformationConnection which prompts the registered transformation dialog and creates a CleanupTransformationReader.
       

    2. Override GenerateDataHistoryCore.

      This method must be implemented if the transformation requires settings to execute. In the method, all applied settings are added as details to a DataHistoryBuilder.
       

    3. Add public properties.

      The CleanupTransformation class is passed to the dialog and the specified settings are then stored in it.

      The properties of the CleanupTransformation class must be public to enable the dialog to store the settings.
       

    4. Implement serialization.

      The settings for the transformation must be serialized.

     

    CleanupTransformationDialog

    The user must be able to specify parameters for the transformation and therefore a dialog must be created. For the CleanupTransformation the user must be able select a column and specify whether the column should change the type or whether null values ought to be replaced.

    cleanuptransformationdialog.png.e22c24e2fca92dd1ff2b62f9c3dc023e.png


    If the column shall change type, the replacement type must be specified. 

    If null values shall be replaced, the replacement value must be specified.

    Update the model with user information.

    When the user presses the OK button, the specified parameters are stored in the model. The stored values are used as parameters when creating the DataRowReader.

    CleanupTransformationReader

    1. Implement a class deriving from CustomDataRowReader.

      DataRowReader is the class that performs the transformation of the data.

      The reader will step through the rows of the data set. For each row, the data will be calculated in the MoveNextCoremethod.
       

    2. Implement the constructor.

      Store the settings for the transformation and create a list for the cursors used to transform the data.

      Each column is added to a list of columns. For the column to be transformed a custom cursor is assigned. If the data type shall be changed, the column gets a new data type. The transformation cursor and the input cursor is stored; these will be used in the MoveNextCore method.
       

    3. Implement GetColumnsCore.

      This method returns the columns that the DataRowReader can return.
       

    4. Implement GetResultPropertiesCore.

      This method returns the result properties for the DataRowReader.

      Properties to describe the performed transformation are added to the result property collection.
       

    5. Implement MoveNextCore.

      This method moves the cursors to the next row of the data set. First, a check is performed that the input reader can move to the next row. Then, all cursors in the list of transformation cursors are processed.

      For each cursor, the output value will be computed:

      If column type is to be changed, the input value is converted to the new type.

      If the null values are to be replaced, the invalid input values are replaced by the defined value.
       

    6. Implement ResetCore.

      This method resets the DataRowReader.

     

    Code Examples

    Simple CustomDataTransformation and CustomDataRowReader

    This data transformation example creates a custom transformation that returns every other row from the input and all columns.

    First create a transformation:

    internal class EveryOtherTransformation : CustomDataTransformation
    {
        public EveryOtherTransformation() { }
     
        public EveryOtherTransformation(
                SerializationInfo info, StreamingContext context)
            : base(info, context) { }
     
        protected override void GetObjectData(
                SerializationInfo info, StreamingContext context)
        {
            base.GetObjectData(info, context);
        }
     
        protected override DataTransformationConnection ConnectCore(
                ImportContext importContext, DataRowReader input)
        {
             return DataTransformationConnection.CreateConnection(
                        delegate {
                            return new EveryOtherTransformationReader(input);
                        });
        }
    }
     

    A reader also needs to be created. It is this class that performs the actual work:

    internal class EveryOtherTransformationReader : CustomDataRowReader
    {
        private DataRowReader inputReader;
     
        public EveryOtherTransformationReader(DataRowReader inputReader)
        {
            this.inputReader = inputReader;
        }
     
        protected override IEnumerable<DataRowReaderColumn> GetColumnsCore()
        {
            return this.inputReader.Columns;
        }
     
        protected override ResultProperties GetResultPropertiesCore()
        {
            return this.inputReader.ResultProperties;
        }
     
        protected override void ResetCore()
        {
            this.inputReader.Reset();
        }
     
        protected override bool MoveNextCore()
        {
            return this.inputReader.MoveNext() && this.inputReader.MoveNext();
        }
    }
     

    Setting Metadata

    This custom transformation propagates all columns from the input reader, but assigns additional metadata to all real columns.

    Two new metadata properties are added for each real column:

    • Custom.Lamba: A real value, randomly calculated, between 0 and 1.
    • Custom.Propagate: A boolean value specifying whether the columns should be propagated from a custom transformation later in the transformation chain.

    The transformation is very similar to the Simple CustomDataTransformation example above, and this is the reader:

    private class LambdaTransformationReader : CustomDataRowReader
    {
        private DataRowReader inputReader;
        private List<DataRowReaderColumn> columns;
     
        public LambdaTransformationReader(
                    ImportContext importContext, DataRowReader inputReader)
        {
            Random random = new Random();
            this.inputReader = inputReader;
     
            this.columns = new List<DataRowReaderColumn>();
            foreach (DataRowReaderColumn col in inputReader.Columns)
            {
                DataColumnProperties properties = col.Properties;
                if (col.DataType == DataType.Real)
                {
                    double lambda = random.NextDouble();
                    bool propagate = random.Next(0, 2) != 0;
     
                    // we need to clone the input properties to be able to set
                    // new properties.
                    properties = properties.Propagate(importContext);
     
                    properties.SetProperty("Custom.Lambda", lambda);
                    properties.SetProperty("Custom.Propagate", propagate);
                }
     
                // keep Name, DataType and Cursor since we do not modify the
                // values in the columns.
                this.columns.Add(
                    new DataRowReaderColumn(
                    col.Name,
                    col.DataType,
                    properties,
                    col.Cursor));
            }
        }
     
        protected override IEnumerable<DataRowReaderColumn> GetColumnsCore()
        {
            return this.columns;
        }
     
        protected override ResultProperties GetResultPropertiesCore()
        {
            return this.inputReader.ResultProperties;
        }
     
        protected override void ResetCore()
        {
            this.inputReader.Reset();
        }
     
        protected override bool MoveNextCore()
        {
            return this.inputReader.MoveNext();
        }
    }
     

    Using Properties from a Previous Transform

    This is a custom transformation which uses the properties added in the Lambda transformation example above. It performs the Box-Cox transformation on all real columns that have the Custom.Lambda property set using that value as lambda. Columns which do not have Custom.Lambda set are just sent through the transform. It also uses the Custom.Propagate property to decide whether the original column should also be returned for transformed columns. The code sample only shows the reader:

    private class BoxCoxTransformationReader : CustomDataRowReader
    {
        private DataRowReader inputReader;
        private List<DataRowReaderColumn> columns;
     
        private List<MutableValueCursor<double>> transformedCursors;
        private List<DataValueCursor<double>> transformInputCursors;
        private List<double> lambdas;
     
        public BoxCoxTransformationReader(DataRowReader inputReader)
        {
            this.inputReader = inputReader;
            this.transformedCursors = new List<MutableValueCursor<double>>();
            this.transformInputCursors = new List<DataValueCursor<double>>();
            this.lambdas = new List<double>();
     
            this.columns = new List<DataRowReaderColumn>();
            foreach (DataRowReaderColumn col in inputReader.Columns)
            {
                bool propagate = true;
                bool isLambda = false;
                DataColumnProperties properties = col.Properties;
                if (col.DataType == DataType.Real &&
                    col.Properties.HasPropertyValue("Custom.Lambda"))
                {
                    isLambda = true;
     
                    double lambda =
                            (double)col.Properties.GetProperty("Custom.Lambda");
     
                    propagate =
                            (bool)col.Properties.GetProperty("Custom.Propagate");
     
                    this.lambdas.Add(lambda);
     
                    MutableValueCursor<double> cursor = (MutableValueCursor <double>)DataValueCursor.CreateMutableCursor(DataType.Real);
                    this.transformedCursors.Add(cursor);
                     
                    this.transformInputCursors.Add(
                            (DataValueCursor<double>)col.Cursor);
     
                    this.columns.Add(
                        new DataRowReaderColumn(
                            col.Name,
                            col.DataType,
                            properties,
                            cursor));
                }
                 
                if(propagate)
                {
                    // does not have a lambda property.
                    string columnName = col.Name;
                    if(isLambda)
                    {
                        // add original to propagated transformed columns.
                        columnName =
                            string.Format("Original {0}", columnName);
                    }
                    this.columns.Add(
                        new DataRowReaderColumn(
                        columnName,
                        col.DataType,
                        properties,
                        col.Cursor));
                }
            }
        }
     
        protected override IEnumerable<DataRowReaderColumn> GetColumnsCore()
        {
            return this.columns;
        }
     
        protected override ResultProperties GetResultPropertiesCore()
        {
            return this.inputReader.ResultProperties;
        }
     
        protected override void ResetCore()
        {
            this.inputReader.Reset();
        }
     
        protected override bool MoveNextCore()
        {
            if (!this.inputReader.MoveNext())
            {
                return false;                   
            }
            for (int i = 0; i < this.transformedCursors.Count; ++i)
            {
                DataValueCursor<double> input = this.transformInputCursors[i];
                MutableValueCursor<double> output = this.transformedCursors[i];
     
                if (input.IsCurrentValueValid)
                {
                    output.MutableDataValue.IsValid = true;
                    double val = input.CurrentValue;
                    double lambda = this.lambdas[i];
                    if (lambda == 0)
                    {
                        val = Math.Log(val);
                    }
                    else if(lambda != 1)
                    {
                        val = (Math.Pow(val, lambda) - 1) / lambda;
                    }
                    output.MutableDataValue.Value = val;
                }
                else
                {
                    output.MutableDataValue.IsValid = false;
                    output.MutableDataValue.ErrorValue =
                            input.CurrentDataValue.ErrorValue;
                }
            }
            return true;
        }
    }
     

    Prompting the User

    This custom transformation shows how to prompt the user for values to be used.

    Prompting is performed by returning the prompt models in the DataTransformationConnection object. It is usually sufficient to use the static factory methods that allow a list of prompt models as input.

    • When the ExecuteTransformation method is called, all the prompt models will have been successfully prompted.
    • If the prompting was canceled, then the ExecuteTransformation method will not be called.
    private bool NeedsPrompting
    {
        get
        {
            // check if all parameter needs for transformation execution is set.
            return true;
        }
    }
     
     
    protected override DataTransformationConnection ConnectCore(
        ImportContext importContext, DataRowReader input)
    {
        PromptService ps = importContext.PromptService;
     
        // create a new list were we add the prompt models if needed
        List<object> promptModels = new List<object>();
     
        // check if we are allowed to prompt.
        if (ps.IsPromptingAllowed)
        {
            switch (ps.PromptMode)
            {
                case PromptMode.NotAllowed:
                    // require prompting but are not allowed.
                    if (NeedsPrompting)
                    {
                        throw
                            new ImportException(
                                 "Lambda transformation could not be performed."
                                );
                    }
                    break;
                case PromptMode.Allowed:
                    if (NeedsPrompting)
                    {
                        // add this as a prompt object
                        promptModels.Add(this);
     
                        // the registred prompt view on this object
                        // will modify this objects settings.
     
                    }
                    break;
                case PromptMode.Always:
                    // add this as a prompt object
                    promptModels.Add(this);
     
                    // the registred prompt view on this object
                    // will modify this objects settings.
     
                    break;
            }
        }
     
        // create connection using the factory method and provide
        // the promptmodels as the final parameter.
        return DataTransformationConnection.CreateConnection(
            delegate
            {
                // when this delegate is called all prompting will
                // already has been performed.
                return new LambdaTransformationReader(importContext, input);
            },
            promptModels);           
    }
     

    The transformation may either provide a specific prompt object or use itself, as in this example, to perform the prompting. In addition to this code, a UI component which performs the prompting must be added; a normal windows forms component that is registered in the AddIn for the custom transformation. Here, the windows forms dialog LambdaTransformationPromptDialog is registered as a view for the LambdaTransformation, thereby enabling the code above to work as expected.
     

    protected override void RegisterViews(ViewRegistrar registrar)
    {
        base.RegisterViews(registrar);
     
        registrar.Register(
            typeof(Form),
            typeof(LambdaTransformation),
            typeof(LambdaTransformationPromptDialog));
    }
     

    Creating a New Data Table

    This custom transformation creates a new data table where the data is transformed and all settings are explicitly set.

    AnalysisApplication app = this.Context.GetService<AnalysisApplication>();
     
    // create the data source.
    DataSource ds = app.Document.Data.CreateFileDataSource("c:\\test.txt");
     
    // connect
    DataSourceConnection connection = ds.Connect(app.Document.Data,
                                        DataSourcePromptMode.RequiredOnly);
     
    // create the reader
    DataRowReader reader = connection.ExecuteQuery2();
     
    DataFlowBuilder dfb = new DataFlowBuilder(
       connection.DataSource, // use the data source from the reader.
       app.ImportContext);
     
    // configure pivot using known columns from the reader
    PivotTransformation pt = new PivotTransformation();
    pt.CategoryColumns =
       new DataColumnSignature[] {
           new DataColumnSignature(reader.Columns["Category1"]),
           new DataColumnSignature(reader.Columns["Category2"])
       };
     
    pt.IdentityColumns =
        new DataColumnSignature[] {
            new DataColumnSignature(reader.Columns["Identity"]),
        };
     
    pt.ValueColumns =
        new ColumnAggregation[] {
            new ColumnAggregation(
                new DataColumnSignature(reader.Columns["Values"]),
                "Sum")
        };
     
    // add the transformation.
    dfb.AddTransformation(
        pt);
     
    // add more transformations here if needed.
     
    // create the flow
    DataFlow flow = dfb.Build();
     
    // create a new table from the flow.
    DataTable newTable = app.Document.Data.Tables.Add(flow.DocumentTitle, flow);
    
     

    New Data Table and Show UI

    This custom transformation creates a new data table where the data is transformed and the UI needs to be shown.

    In this example, the same operation as in the Creating a New Data Table example is performed, but in this case we do not set any settings on the transformations added and use the prompting inside the transformation to display the user interface instead.

    AnalysisApplication app = this.Context.GetService<AnalysisApplication>();
     
    // create the data source.
    DataSource ds = app.Document.Data.CreateFileDataSource("c:\\test.txt");
     
    DataFlowBuilder dfb = new DataFlowBuilder(
        ds,
        app.ImportContext);
     
    // add the transformation.
    dfb.AddTransformation(
        new PivotTransformation());
     
    // show the pivot prompt UI, the execute step is not really nessecary
    // since all needed UI:s will be shown when the table is created.
    // but can be useful to control when the UI should be shown.
    dfb.Execute(1, DataSourcePromptMode.RequiredOnly);
     
    // add an unpivot transformation after
    dfb.AddTransformation(
        new UnpivotTransformation());
     
    // show the unpivot prompt UI
    dfb.Execute(2, DataSourcePromptMode.RequiredOnly);
     
    // create the flow
    DataFlow flow = dfb.Build();
     
    // create a new table from the flow.
    DataTable newTable = app.Document.Data.Tables.Add(flow.DocumentTitle, flow);
     
     
     

    User Feedback

    Recommended Comments

    There are no comments to display.


×
×
  • Create New...