Summary
Overview
Random Forest is a supervised machine-learning algorithm that aggregates the predictions of many decision trees, each trained on a different subset of the data. Aggregating the trees typically makes the model more accurate on new data than a single decision tree. It can be used both to make predictions and to determine variable importance. This point-and-click template uses a distributed random forest trained in H2O for best-in-market training performance. The response can be either numeric or binary (e.g., good / bad), and the predictors can be a mixture of numeric and categorical columns. Version 2 adds automated machine learning to optimize the model tuning parameters.
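The aggregation idea described above can be illustrated with a small, self-contained sketch. This is not the H2O implementation used by the template; it is a toy version under simplifying assumptions: each "tree" is a one-split stump on a randomly chosen feature, each stump is trained on a bootstrap sample of the rows, and the forest predicts a binary response by majority vote. All function names (fit_stump, fit_forest, predict_forest) and the sample data are hypothetical and chosen only for illustration.

```python
import random
from collections import Counter


def fit_stump(rows, labels, rng):
    """Fit a one-split tree (stump) on one randomly chosen feature."""
    feature = rng.randrange(len(rows[0]))
    best = None
    # Try each observed value of the feature as a split threshold.
    for threshold in sorted({r[feature] for r in rows}):
        left = [y for r, y in zip(rows, labels) if r[feature] <= threshold]
        right = [y for r, y in zip(rows, labels) if r[feature] > threshold]
        if not left or not right:
            continue
        left_label = Counter(left).most_common(1)[0][0]
        right_label = Counter(right).most_common(1)[0][0]
        errors = sum(y != left_label for y in left)
        errors += sum(y != right_label for y in right)
        if best is None or errors < best[0]:
            best = (errors, threshold, left_label, right_label)
    if best is None:
        # No valid split exists: always predict the majority class.
        majority = Counter(labels).most_common(1)[0][0]
        return (feature, float("inf"), majority, majority)
    return (feature, best[1], best[2], best[3])


def predict_stump(stump, row):
    feature, threshold, left_label, right_label = stump
    return left_label if row[feature] <= threshold else right_label


def fit_forest(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap sample (rows drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        sample_rows = [rows[i] for i in idx]
        sample_labels = [labels[i] for i in idx]
        forest.append(fit_stump(sample_rows, sample_labels, rng))
    return forest


def predict_forest(forest, row):
    """Aggregate the individual predictions by majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in forest)
    return votes.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical data: two numeric predictors, binary response.
    rows = [(1, 2), (2, 1), (3, 3), (2, 2), (8, 9), (9, 8), (7, 7), (8, 8)]
    labels = ["bad"] * 4 + ["good"] * 4
    forest = fit_forest(rows, labels)
    print(predict_forest(forest, (1, 1)))
    print(predict_forest(forest, (9, 9)))
```

A production random forest, such as the distributed one the template trains in H2O, grows much deeper trees, handles categorical predictors and numeric responses, and reports variable importance; the sketch only shows why voting over trees trained on different subsets can be steadier than relying on any single tree.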
Details
A community article describes the methodology in detail.
Release P2.0
Published: January 2018
Added an option to automate the optimization of model tuning parameters, which makes the template easier to use for business users and citizen data scientists.
Release P1.0
Published: March 2017
Initial Release