  • Using TIBCO Statistica® Automated Neural Network for TIBCO Now Data Challenge


    This article provides a step-by-step guide on using TIBCO Statistica® Automated Neural Network to analyze and visualize data from the TIBCO Now Data Challenge, focusing on predicting happiness scores and evaluating model accuracy.

    Steps:

    1. Open TIBCO Statistica® from the desktop and click the open-folder icon in the upper left-hand corner. Open 2017.txt from the TIBCO Now Data Challenge folder on the Desktop, and make sure to check the option 'Take a variable name from the first row of the file'.

      [Screenshot: varname.thumb.png]

    2. From the Statistics tab select Neural nets.

    3. From the pop-up window, select 'Regression' and click 'OK'.

    4. Next, select the variables for the analysis by clicking the 'Variables' button.

      [Screenshot: varselection.thumb.png]

    5. The continuous target is the variable we are going to predict (here, the happiness score); the continuous inputs are the explanatory variables. Select them accordingly and ignore the categorical inputs.

    6. Set 'Networks to retain' to 1 for simplicity.

      [Screenshot: sannconfig.thumb.png]

    7. Feel free to adjust the sampling or other options if you have experience doing so; otherwise keep the defaults and click 'OK'. In the next pop-up, again configure the options or keep the defaults, then click 'Train'.

    8. You will be provided with the results of the model:

      [Screenshot: results.thumb.png]

    9. Click the 'Predictions' button and a table will appear. It shows the actual happiness score next to the predicted score for each model created.

    10. Just left of the table is an object explorer. There you will see an item named 'Predictions spreadsheet for happiness score' with a table icon just to its left. Right-click this icon and click 'Save item as':

      [Screenshot: exportmodel.thumb.png]

    11. Save it as a CSV file. When prompted, make sure to select 'Put variable name in first row':

      [Screenshot: saveconfig.thumb.png]

    12. Back in Spotfire add a new data table.

    13. This can be done from File >> Add data table >> Add button on the right >> "file" from the list of options. Select the CSV table just created.

    14. When the import settings appear, select 'Ignore' for the first two rows and make sure the row containing Happiness Score and Happiness Score - Output is set as the name row:

      [Screenshot: importsettings.png]

    15. Now let's make sense of these results with a Spotfire visualization. Create a scatter plot from the menu: Insert -> Visualization -> Scatterplot. Using the axis selectors, set the x-axis to the first column of the CSV file (named "Happiness Score") and the y-axis to the column "Happiness Score - Output".

    16. As in the previous task, create a residuals plot. Don't hesitate to go back to the section (link to section) to review the steps involved. Out of the three models created, which is the most accurate? Recall that you can gauge accuracy by looking at the statistics generated by the plot.
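
The accuracy comparison in step 16 can also be cross-checked by hand. The following is a minimal Python sketch, separate from Statistica and Spotfire; the function name `residual_stats` and the sample values are hypothetical and chosen only for illustration. It computes the residuals (actual minus predicted), the root-mean-square error (RMSE), and R² for paired actual and predicted scores.

```python
import math


def residual_stats(actual, predicted):
    """Return (residuals, RMSE, R^2) for paired actual/predicted values."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty lists of equal length")
    # Residual = actual value minus the model's prediction
    residuals = [a - p for a, p in zip(actual, predicted)]
    ss_res = sum(r * r for r in residuals)
    mean_actual = sum(actual) / len(actual)
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    rmse = math.sqrt(ss_res / len(actual))
    # R^2 is undefined when every actual value is identical
    r2 = 1.0 - ss_res / ss_tot if ss_tot else float("nan")
    return residuals, rmse, r2


# Made-up illustrative values, NOT taken from 2017.txt
actual = [7.5, 6.0, 4.5, 3.0]
predicted = [7.0, 6.5, 4.0, 3.5]
residuals, rmse, r2 = residual_stats(actual, predicted)
print(f"RMSE = {rmse:.3f}, R^2 = {r2:.3f}")
```

When comparing models on the same data, a lower RMSE and an R² closer to 1 indicate predictions that track the actual scores more closely.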

    Note: Spotfire visualizations default to the first data table added to your analysis, so the column selectors may refer to that table's columns rather than those of the newly added data table. To change this, look to the right of your scatter plot (where the legend is located) and select the data table from the drop-down list.
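
To sanity-check the exported CSV outside Spotfire, the import settings from step 14 (ignore the first two rows, then treat the next row as column names) can be mirrored in a few lines of Python. This is a sketch under assumptions: the helper `read_predictions` is hypothetical, and the two leading rows in the sample are placeholders, since their real content depends on the export. Only the column names "Happiness Score" and "Happiness Score - Output" come from the steps above.

```python
import csv
import io


def read_predictions(handle, skip_rows=2):
    """Skip leading rows, then read (actual, predicted) pairs from the CSV."""
    for _ in range(skip_rows):
        next(handle, None)  # discard rows that precede the header row
    reader = csv.DictReader(handle)  # first remaining row supplies column names
    return [
        (float(row["Happiness Score"]), float(row["Happiness Score - Output"]))
        for row in reader
    ]


# Hypothetical file contents; the first two rows stand in for the rows to ignore
sample = io.StringIO(
    "placeholder row 1\n"
    "placeholder row 2\n"
    "Happiness Score,Happiness Score - Output\n"
    "7.5,7.0\n"
    "6.0,6.5\n"
)
pairs = read_predictions(sample)
print(pairs)
```

For a real export, the same function could be called on a file opened with `open(path, newline="")`, where `path` is wherever the CSV was saved in step 11.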

     

