Machine Learning Module
Regularization
You can now apply a Regularization type to Linear Regression and
Logistic Regression models to help manage overfitting. You can select from the LASSO, Ridge
Regression, or Elastic Net types. You can fine-tune regularization with these new fields,
which are available in both stages:
- Value of alpha
- Value of lambda
- Search for optimal value of lambda
- Stop early
- Maximum lambdas to search
- Maximum active predictors
For more information, see "Configuring Advanced Options" for Linear Regression or Logistic Regression in the Machine Learning Guide.
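As a rough illustration of how the alpha and lambda values interact, the sketch below computes an elastic net penalty using the widely used glmnet-style convention, where alpha blends the LASSO (L1) and Ridge (L2) terms and lambda scales the overall penalty. This convention is an assumption; these notes do not state the exact formula the stages use, and the function name is hypothetical.

```python
def elastic_net_penalty(coefficients, alpha, lam):
    """Penalty added to the model loss (assumed glmnet-style convention).

    alpha = 1.0 gives a pure LASSO penalty, alpha = 0.0 a pure Ridge penalty,
    and values in between give Elastic Net. lam scales the whole penalty.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    if lam < 0.0:
        raise ValueError("lambda must be non-negative")
    l1 = sum(abs(b) for b in coefficients)   # sum of absolute coefficients
    l2 = sum(b * b for b in coefficients)    # sum of squared coefficients
    return lam * (alpha * l1 + 0.5 * (1.0 - alpha) * l2)


coefs = [0.5, -2.0, 0.0]  # illustrative coefficients, not real model output
lasso = elastic_net_penalty(coefs, alpha=1.0, lam=0.1)
ridge = elastic_net_penalty(coefs, alpha=0.0, lam=0.1)
mixed = elastic_net_penalty(coefs, alpha=0.5, lam=0.1)
print(lasso, ridge, mixed)
```

Under this convention, a larger lambda means stronger shrinkage of the coefficients, and an alpha closer to 1 pushes more coefficients to exactly zero.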
Model Metrics Port
The new output Model Metrics Port lets you attach a data output stage
to send model assessment metrics to a data file. This information helps you compare many
models generated from within and outside of Spectrum™ Technology Platform and perform other
data processing tasks on the metrics. Alternatively, you can add an inspection point and run
inspection on the dataflow to view these metrics in Enterprise Designer. This new output port
is available for the following stages:
- K-Means Clustering
- Linear Regression
- Logistic Regression
- Principal Component Analysis
- Random Forest Classification
- Random Forest Regression
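Once metrics have been written to a data file, comparing models can be as simple as the sketch below. It assumes a comma-delimited file with one row per model; the column names (model_name, model_type, rmse) and the sample values are hypothetical and do not reflect the actual layout produced by the port.

```python
import csv
import io

# Hypothetical contents of a metrics file written by a data output stage.
# Column names and values are illustrative assumptions only.
SAMPLE = """model_name,model_type,rmse
sales_lr_v1,Linear Regression,4.82
sales_lr_ridge,Linear Regression,4.31
sales_rf_v1,Random Forest Regression,4.57
"""


def best_model(metrics_file, metric, lower_is_better=True):
    """Return the row with the best value for the given metric column."""
    rows = list(csv.DictReader(metrics_file))
    if not rows:
        raise ValueError("no metric rows found")
    pick = min if lower_is_better else max
    return pick(rows, key=lambda row: float(row[metric]))


# io.StringIO stands in for an opened file so the example is self-contained.
best = best_model(io.StringIO(SAMPLE), "rmse")
print(best["model_name"], best["rmse"])
```

The same function would work on a real file by passing an object returned by open(), provided the metric column holds numeric values.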
Machine Learning Model Management Updates
The Model Assessment page in Machine Learning Model Management has the following updates in
this release:
- Model Type—A new field that indicates the type of model that was created by the dataflow.
- Creation Time—A new field that indicates the date and time the model was created.
- Value of lambda—Linear Regression and Logistic Regression contain this new metric, which controls the amount of regularization applied.
- K-Means Clustering—Figures for the following three fields on the Model Summary tab of the
Model Detail page now come from training data:
  - Within cluster sum of squares
  - Total sum of squares
  - Between cluster sum of squares
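For readers unfamiliar with these three figures, the sketch below shows the standard textbook definitions on a tiny hand-made data set. It is not the product's implementation; the function names and sample points are hypothetical. With these definitions, the total sum of squares equals the within cluster figure plus the between cluster figure.

```python
def centroid(points):
    """Coordinate-wise mean of a list of equal-length points."""
    n = len(points)
    return [sum(p[i] for p in points) / n for i in range(len(points[0]))]


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def sums_of_squares(clusters):
    """Return (within, total, between) for points already grouped by cluster."""
    all_points = [p for cluster in clusters for p in cluster]
    grand_mean = centroid(all_points)
    centers = [centroid(cluster) for cluster in clusters]

    # Spread of each point around its own cluster center.
    within = sum(
        sq_dist(p, center)
        for cluster, center in zip(clusters, centers)
        for p in cluster
    )
    # Spread of every point around the overall mean.
    total = sum(sq_dist(p, grand_mean) for p in all_points)
    # Spread of cluster centers around the overall mean, weighted by size.
    between = sum(
        len(cluster) * sq_dist(center, grand_mean)
        for cluster, center in zip(clusters, centers)
    )
    return within, total, between


# Illustrative training data: two clusters of two points each.
clusters = [[(1.0, 1.0), (1.0, 3.0)], [(5.0, 5.0), (7.0, 5.0)]]
within, total, between = sums_of_squares(clusters)
print(within, total, between)
```

A small within cluster value relative to the total indicates tight, well-separated clusters on the data used to compute the figures.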