Exporting the Final Model to Build the Scoring Application

Once you have finalized your model, you can export the model formula as a routine that can be used to score new data outside of RStat.

RStat offers three main types of export options:

The Exportability column in the Model tab displays the main export file types above that apply to each model.

The Model Export dialog box displays all supported file type options: C, Java, PMML, XML, and Teradata UDF. Additionally, the All option allows you to export the model as all available file types simultaneously in a single session.

Exporting Using C, Java, or PMML

Models can be exported as C code to be compiled as a user-defined function (UDF) for use with WebFOCUS servers, as a Teradata UDF in C with an SQL template for execution directly on the Teradata database, as Java for deployment to Hadoop, and as PMML for consumption by middleware.

RStat generates a user-defined function (UDF) with each model that is exported as Java. The UDF is a Hive wrapper that calls the Java export and contains the import statements for the libraries needed to run the model in Hive. The Java export name and the number of arguments are entered dynamically in the Java UDF template. You can customize the UDF so that the Java export works on any device that supports Java.

Procedure: How to Export Using C, Java, or PMML

You can use this procedure to export using C, Java, or PMML.

  1. Select the Model tab.
  2. Click Export from the RStat toolbar. The Model Export dialog box opens, as shown in the following image.
    Export dialog box
  3. Select C Files, Java Files, PMML Files, or All Files from the Type drop-down list to view existing files of the selected type within the current path, as shown in the following image.
    Type drop-down list
  4. Select the Export File Type(s) from the drop-down list, as shown in the following image. When the file type is selected, the corresponding file extension is added to the exported model name.
    export file type dropdown menu

    Note: Select Teradata UDF to export scoring functions contained within C files to be consumed as a Teradata User Defined Function (UDF). For more information, see RStat Export for Teradata User Defined Function.

  5. Select an Export Target Type. Select Class to generate a model that predicts the most likely classification or category of your input data. Select Probabilities to generate a model that uses your input data to predict the numerical likelihood of an event.
  6. Select an Include option. These options allow you to include the PMML and the model metadata in the C routine for reference. They are embedded as text strings within the routine, which can be extracted using the RStat command in WebFOCUS (see Displaying Model Information With the RStat Query Command). Depending on the model you have built, these text strings can be very large. In some environments, you may need to exclude them because the routine would otherwise be too large to compile. Options include:
    • PMML. Includes the PMML within the routine.
    • Meta Data. Includes key model metadata within the routine. This is the model output from R which appears on the Model Textview on the Model Tab, as highlighted in the following image.

    Note: Clear the check box next to either of these include options to exclude these from the exported routine.

  7. Click Save. The file containing your scoring routine is generated and saved in the selected location under the selected file name.

    Note: When saving a file, an Overwrite Alert displays if the file already exists, as shown in the following image. Click Replace to overwrite the original file. Click Cancel to close the Alert without saving your file.

    overwrite alert
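The difference between the Class and Probabilities export target types in step 5 can be illustrated with a small sketch. This is not the routine that RStat generates. It assumes a logistic regression model with made-up coefficients, hypothetical input fields (age, income), and a 0.5 cutoff, purely to show how the two targets relate: Probabilities returns a numerical likelihood, and Class returns the most likely category.

```python
import math

# Hypothetical values for illustration only. A real exported routine embeds
# the coefficients of the model you built in RStat.
INTERCEPT = -1.5
COEFFICIENTS = {"age": 0.03, "income": 0.00002}


def score_probability(row):
    """Return the predicted likelihood of the event (Probabilities target)."""
    linear = INTERCEPT + sum(COEFFICIENTS[name] * row[name] for name in COEFFICIENTS)
    return 1.0 / (1.0 + math.exp(-linear))


def score_class(row, threshold=0.5):
    """Return the most likely category, 1 or 0 (Class target).

    The 0.5 threshold is an assumption made for this sketch.
    """
    return 1 if score_probability(row) >= threshold else 0


if __name__ == "__main__":
    customer = {"age": 40, "income": 50000}
    print(score_probability(customer))  # a value between 0 and 1
    print(score_class(customer))        # 1 or 0
```

Either function takes the same input row. Which one you export depends on whether the consuming application needs the category itself or the likelihood behind it.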

RStat Export for Teradata User Defined Function

As of RStat Version 1.3.1, you can export scoring functions contained within C files to be consumed as a Teradata User Defined Function (UDF).

For Big Data Analytics, RStat routines can be deployed for in-database execution. This means that the scoring of large amounts of data is performed in the database engine, so the data does not need to be extracted before it is scored. When scoring large amounts of data, running the predictive model as an in-database function may result in significant performance gains.
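The idea of in-database scoring can be sketched with a stand-in database. The example below uses Python's built-in sqlite3 module instead of Teradata, and a made-up scoring rule instead of an exported RStat model, so it only illustrates the pattern: a scoring function is registered with the database engine and is then called from an SQL SELECT, so the rows are scored where they are stored. The table name, column names, and scoring formula are all assumptions.

```python
import sqlite3


def custscore(age, income):
    """Hypothetical scoring rule standing in for an exported RStat model."""
    return round(0.01 * age + 0.00001 * income, 4)


def score_in_database(rows):
    """Load rows into an in-memory table and score them with an SQL SELECT."""
    conn = sqlite3.connect(":memory:")
    # Register the Python function so that SQL statements can call it by name.
    conn.create_function("custscore", 2, custscore)
    conn.execute("CREATE TABLE customers (id INTEGER, age REAL, income REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?, ?)", rows)
    result = conn.execute(
        "SELECT id, custscore(age, income) FROM customers ORDER BY id"
    ).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for customer_id, score in score_in_database([(1, 40, 50000), (2, 30, 20000)]):
        print(customer_id, score)
```

With Teradata, the function is compiled from the exported C file and defined through the SQL template rather than registered from a client program, but the SELECT that calls it has the same shape.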

Procedure: How to Create a C File for a Teradata User Defined Function

After creating the predictive model, click the Export button. Select the Teradata UDF option on the Export dialog box and then Run. A C file is created in Teradata UDF format, along with the SQL template needed to define the C program to Teradata. All the input fields are automatically defined in the SQL template. However, you must edit the SQL template to specify the actual location of the uploaded C file. Once you have the location and the SQL, follow the UDF creation process defined by Teradata.

Once defined as a Teradata UDF, the RStat model can then be defined as an in-database function to WebFOCUS, either at the Metadata or the WebFOCUS procedure level.

Because the RStat predictive model runs as an in-database function, you can easily incorporate it into ETL, reporting, dashboard, or any other native Teradata client application.

  1. In App Studio, from the Home tab, in the Modeling group, click Predictive Modeling.

    RStat opens.

  2. In the Filename field, on the Data tab, click the folder to browse and select a data file.
  3. Click Open.
  4. Click Run to load the data.

    RStat refreshes to display your data.

  5. Click the Model tab and select your model.
  6. Click Run.
  7. On the toolbar, click Export to export your model.
  8. In the Model Export dialog box, select the check box for Teradata UDF, as shown in the following image.

    This creates a C file on your local drive that contains the scoring function in Teradata UDF format.

    Note: You can save the C file to any location using the Browse for other folders options.

    Along with the C file, an SQL template is created. In the following line of the template, replace the items in double angle brackets (<< >>) with your own values:

    << ENTER FULL PATH OF UDF C FILE HERE AND PUT IN SINGLE QUOTE >>
        EXAMPLE 'CS!<<model name>>!/home/testdrive/ibi/apps/<<model name>>.c!F! <<model name>>'

    The following image shows an SQL template that has been modified and pasted into a Basic Teradata Query (BTEQ) tool to create a Teradata UDF.

    When you use the UDF in an SQL SELECT statement, note that the function name, custscore, matches the name under which the C file was deployed in Teradata.

    The following image shows in-database scored values (indicated by arrows) returned in a report.

    RStat models are completely integrated with Information Builders tools. For example, using predictive models in DataMigrator allows you to segment data by customer market segment.
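The placeholder substitution in the SQL template line can be sketched as a small helper. The example line is copied verbatim from the template text above, and custscore is the model name used in this section. The helper name fill_template_line is hypothetical, and the path shown is only the documentation example, so replace it with the actual location of your uploaded C file.

```python
PLACEHOLDER = "<<model name>>"

# Example line copied verbatim from the SQL template text in this section.
EXAMPLE_LINE = "'CS!<<model name>>!/home/testdrive/ibi/apps/<<model name>>.c!F! <<model name>>'"


def fill_template_line(line, model_name):
    """Replace every <<model name>> placeholder with the given model name."""
    filled = line.replace(PLACEHOLDER, model_name)
    # Guard against any other bracketed placeholder that was left unfilled.
    if "<<" in filled or ">>" in filled:
        raise ValueError("unfilled placeholder remains: " + filled)
    return filled


if __name__ == "__main__":
    print(fill_template_line(EXAMPLE_LINE, "custscore"))
```

The check for leftover angle brackets is a design choice for this sketch: it stops a partially edited template from being submitted to Teradata.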
