Specific Requirements for Statistica Software


Thiago Leo

Hello everyone,

I'm new to the Statistica platform, and I need to present some requirements to my company and show whether the software can meet them.

Could you help me confirm whether Statistica can do the following?

- Read the file homesite.train.csv. (Note: this file was provided in one of Kaggle's competitions.)

- Convert the columns with categorical data (strings) to numeric values.

- Save the converted data to Hadoop HDFS.

- From Spark, read the HDFS data and load it into an RDD. Cache the RDD and demonstrate this on the Spark console.

- Randomly split the RDD into training and validation sets (70%/30%).

- With the training data, train a model using Spark's RandomForestClassifier.

- Validate the trained model with the validation data.

- Calculate the AUC-ROC score in Spark and display it in the solution interface.
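
To make three of the steps above concrete (string-to-numeric conversion, the random 70%/30% split, and the AUC-ROC score), here is a minimal sketch in plain Python. It is not Spark or Statistica code; it only illustrates the logic I would expect the platform to reproduce at scale. The function names and the sample column names (`state`, `x`) are hypothetical and are not taken from homesite.train.csv.

```python
import random


def encode_categoricals(rows):
    """Replace string values with integer codes, column by column.

    `rows` is a list of dicts (one dict per record). Returns the encoded
    rows and the mapping used, so the same codes can be reapplied later.
    """
    mappings = {}  # column name -> {string value: integer code}
    encoded = []
    for row in rows:
        new_row = {}
        for col, value in row.items():
            if isinstance(value, str):
                codes = mappings.setdefault(col, {})
                # A value gets the next unused code the first time it is seen.
                new_row[col] = codes.setdefault(value, len(codes))
            else:
                new_row[col] = value
        encoded.append(new_row)
    return encoded, mappings


def random_split(rows, train_fraction=0.7, seed=42):
    """Shuffle a copy of the rows and cut it into training and validation parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    cut = int(round(len(shuffled) * train_fraction))
    return shuffled[:cut], shuffled[cut:]


def auc_roc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly.

    Ties count as half a correct pair. This pairwise version is quadratic
    in the number of rows, so it is only meant for small illustrations.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Hypothetical sample records standing in for rows of the CSV file.
sample = [{"state": "CA", "x": 1}, {"state": "NJ", "x": 2}, {"state": "CA", "x": 3}]
encoded_sample, sample_mappings = encode_categoricals(sample)
train_part, validation_part = random_split(encoded_sample)
```

In the real solution the split and the score would come from Spark itself, with the model's predicted probabilities on the validation set used as the `scores`.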
