A Machine Learning Workflow

A typical machine learning workflow includes the following steps that take place in one or more dataflows:
  1. Access the data using other Spectrum modules, such as Data Integration.
  2. Prepare the data using stages from other Spectrum modules, such as those in Data Integration, Data Quality, and the Core modules.
  3. Fit a machine learning model, run the dataflow, and review the contents of the Model Output tab in the model stage. If necessary, tweak the model and rerun the dataflow. Then review the full set of model assessment output in the Machine Learning Model Management tool, where you can review one model at a time or compare two models.
  4. (Optional) If the model will be used to score data, expose it in the Machine Learning Model Management tool, which makes it available to the Java Model Scoring stage.
    1. Create a Spectrum™ Technology Platform dataflow that uses steps 1-2 above, replacing step 3 with the Java Model Scoring stage. Set up this dataflow to run in batch mode to populate a file with model scores applied to refreshed data (the fields used as Xs, or inputs, are refreshed in steps 1-2 as a natural part of doing business).
    2. Alternatively, use a web service in Spectrum™ Technology Platform to score data on demand. For example, when a customer accesses your website, retrieve the customer ID and model inputs, score them, and return the score to a process that customizes web content for that customer. A minimal scoring-request sketch follows this list.
  5. (Optional) You can also deploy model scores into a Data Hub graph database as an entity property, onto maps, or into CES applications.
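The sketch below illustrates the on-demand scoring call described in step 4.2: a client posts one customer's model inputs to a scoring dataflow exposed as a REST web service and reads back the score. The server URL, service name (ScoreCustomer), field names, and request layout are hypothetical placeholders, not the documented Spectrum™ Technology Platform API; the actual endpoint, authentication, and request/response format depend on how the service is exposed on your server.

```java
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

public class ScoreOnDemand {

    public static void main(String[] args) throws Exception {
        // Hypothetical endpoint for a scoring dataflow exposed as a REST web
        // service; replace host, port, and service name with your deployment's
        // actual values (and add credentials if the service requires them).
        String endpoint = "http://spectrum-server:8080/rest/ScoreCustomer/results.json";

        // Model inputs (the Xs) for a single customer. Field names here are
        // placeholders and must match the input fields of your scoring dataflow.
        String requestBody = "{ \"Input\": { \"Row\": [ { \"CustomerId\": \"C-1001\", "
                + "\"Age\": \"42\", \"AvgMonthlySpend\": \"135.50\" } ] } }";

        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(endpoint))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(requestBody))
                .build();

        // Send the request and print the returned payload containing the model
        // score, so it can be handed to a downstream process (for example, one
        // that customizes web content for this customer).
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        System.out.println(response.body());
    }
}
```

In practice this call would sit inside the process that serves the customer's web session, so the score can be applied while the page is being built rather than in a later batch run.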