How do I see my rotated and centered data after principal component analysis (PCA) in Team Studio?


I would like to do a PCA analysis on my data. Column 1 contains a class variable; I would like to use columns 2 through 14 in the PCA analysis.

Once the PCA is finished, I'd like to see the resulting centered and rotated data, attached to the class variable (column 1) for further analysis.

Using the PCA tool in Team Studio, all I see displayed is the rotation matrix.

It appears the PCA tool can generate some output results that are placed into my collection of data, but the default name of this new table ("alp@user_id_@flow_id_pca_0_1") is not intuitive. Temporary data tables like these seem to become permanent fixtures that mingle with actual data in my data sources collection, and it becomes difficult to remember where they came from or whether they are important.

Is there some way I can simply continue my analysis in Team Studio by connecting an output from the PCA tool to the downstream analysis?

  • 4 weeks later...

Hi Peter,

You can add a "Predictor" operator after the PCA operator to transform your original data. Note that there are two incoming links to the Predictor operator - one from your original dataset, the other from the PCA operator.

 

 

The Predictor operator will calculate the transformation on the principal component axes. In my example, it produced the 3 transformed columns (y_0_PCA, ...) and appended them to the original data.
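For anyone who wants to see what this transformation amounts to outside Team Studio, here is a minimal NumPy sketch. It is not the operator's actual implementation: it assumes a covariance-based PCA (the operator may scale the data differently), and the random data and the `pca_scores` helper are made up for illustration.

```python
import numpy as np

def pca_scores(X, n_components):
    """Center X, then rotate it onto its leading principal axes."""
    X = np.asarray(X, dtype=float)
    mean = X.mean(axis=0)
    centered = X - mean
    # Eigen-decomposition of the sample covariance matrix (symmetric, so eigh applies).
    cov = np.cov(centered, rowvar=False)
    eigenvalues, eigenvectors = np.linalg.eigh(cov)
    # eigh returns eigenvalues in ascending order; keep the largest ones first.
    order = np.argsort(eigenvalues)[::-1][:n_components]
    rotation = eigenvectors[:, order]
    scores = centered @ rotation
    return mean, rotation, scores

# Hypothetical stand-in for the table in the question:
# one class label column plus 13 numeric columns.
rng = np.random.default_rng(0)
labels = rng.integers(0, 3, size=50)
features = rng.normal(size=(50, 13))

mean, rotation, scores = pca_scores(features, n_components=3)

# Keep the class label next to the centered and rotated columns
# (the analogue of y_0_PCA, y_1_PCA, y_2_PCA in the Predictor output).
result = np.column_stack([labels, scores])
```

Each row of `result` holds the class label followed by that row's coordinates on the first three principal axes, which is the shape of table the original question asks for.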

 

 

 

In your case, the class variable in your original dataset will be passed along through the Predictor operator.

 

Chia-Yui LEE

TIBCO Data Science

  • 2 weeks later...

Sorry, I'm being dense, but I am not able to attach an output from the PCA. The outbound arrow appears, but it does not attach to the Predictor node. Is there some setting I need to address in the configuration of the PCA node?

 

screen_shot_2019-07-11_at_1.15.21_pm.png

 

The PCA is happily computing some output, so that is not the issue. I've tried adjusting the carryover columns, with no luck so far:

 

screen_shot_2019-07-11_at_1.15.37_pm.png

  • 2 weeks later...

Peter,

We found the reason for this, and I'll document it here for completeness.

You couldn't join the PCA operator to the predictor because the workflow uses a database datasource. We were able to join the operators after switching to a Hadoop datasource.

In general, the Hadoop operators offer richer functionality than their database counterparts.

On the PCA operator for database, the product documentation has this paragraph:

Output to Succeeding Operator: Database

Stored database tables that can be accessed by other Operators

The PCA Operator for the database is technically a "terminal" operator, meaning that no other Operator directly follows it in the workflow. However, the PCA Operator stores its Principal Component Results (and Eigenvalue Output details) in two database tables that can then be accessed as the data source for a new workflow, if applicable. The following example shows the results of the database PCA Operator being saved as pcaOperatorResultsIris and pcaOperatoreEigenOutputIris. The tables can be brought into the workflow and the derived Principal Components can be fed into an Alpine Forest Operator, for example, and the classification results analyzed in the Confusion Matrix in order to understand if the reduced set of Variables created by the PCA Operator provide an accurate enough model.

You can try this approach if your data has to stay in a database. With Hadoop datasources, you won't need to access the intermediate tables yourself, as the workflow knows where to pick them up.

Chia-Yui LEE

TIBCO Data Science
