Deploying R models

Once you have created a model, you can deploy it to a SQL Server table and use it later for predictions. You can also do the predictions in R and store just the results in a SQL Server table.

Let's start by creating another model in R. This time, the model uses the Logistic Regression algorithm. The model uses the SQL Server data and the dbo.vTargetMail view to learn how the values of the NumberCarsOwned, TotalChildren, Age, and YearlyIncome input variables influence the value of the BikeBuyer target variable. The following code sets the execution context back to SQL Server, creates the model with the RevoScale rxLogit() function, and shows the summary of the model:

rxSetComputeContext(srvEx); 
bbLogR <- rxLogit(BikeBuyer ~ 
          NumberCarsOwned + TotalChildren + Age + YearlyIncome, 
          data = sqlTM); 
summary(bbLogR); 

You can use the model to perform predictions. In the following example, the model is used to make predictions on the same dataset that was used for training the model. In a real-life situation, you would perform predictions on a new dataset. The code stores the predictions, together with the input values used, in a SQL Server table. The RxSqlServerData() function prepares the connection to the SQL Server database and the target table name. The rxPredict() function performs the predictions, physically creates the SQL Server table, and inserts the data. Of course, the database user used to connect to SQL Server must have appropriate permissions to create a table:

bbLogRPredict <- RxSqlServerData(connectionString = sqlConnStr, 
                                 table = "dbo.TargetMailLogR"); 
rxPredict(modelObject = bbLogR, 
          data = sqlTM, outData = bbLogRPredict, 
          predVarNames = "BikeBuyerPredict",  
          type = "response", writeModelVars = TRUE); 

If you get a warning message when executing the rxPredict() function, just ignore it. Warnings should be gone with the next version of the RevoScaleR library. Just check the results. You can use a T-SQL query to check the results, as shown here:

USE AdventureWorksDW2014; 
SELECT *  
FROM dbo.TargetMailLogR; 

The partial results are shown here. Values exceed 0.5 mean positive predictions:

As mentioned, you can store a model in a SQL Server table. The following T-SQL code creates a table where the models are going to be stored and a stored procedure that actually inserts a model:

CREATE TABLE dbo.RModels 
(Id INT NOT NULL IDENTITY(1,1) PRIMARY KEY, 
 ModelName NVARCHAR(50) NOT NULL, 
 Model VARBINARY(MAX) NOT NULL); 
GO 
CREATE PROCEDURE dbo.InsertModel 
(@modelname NVARCHAR(50), 
 @model NVARCHAR(MAX)) 
AS 
BEGIN 
    SET NOCOUNT ON;   
    INSERT INTO dbo.RModels (ModelName, Model) 
   VALUES (@modelname, CONVERT(VARBINARY(MAX), @model, 2)); 
END; 

The infrastructure is created. Now you can store the model in a SQL Server table from R. However, in order to call a stored procedure, you need to use an ODBC connection. Therefore, the following code first loads the RODBC library to memory, and creates a connection to the SQL Server database. Then it serializes the model to a binary variable, and creates a string from the binary variable using the paste() function. The same function is used to prepare a string with the T-SQL code to insert the model in the table. Finally, the sqlQuery() function sends the T-SQL command to SQL Server. Again, the R user used to execute this code must have permission to execute the stored procedure:

library(RODBC); 
conn <- odbcDriverConnect(sqlConnStr); 
modelbin <- serialize(bbLogR, NULL); 
modelbinstr=paste(modelbin, collapse=""); 
sqlQ <- paste("EXEC dbo.InsertModel @modelname='bbLogR', @model='",  
               modelbinstr,"'", sep=""); 
sqlQuery(conn, sqlQ); 
close(conn); 

The final step is to use the model from T-SQL. The following code uses the sys.sp_execute_external_script system procedure to use the model and perform a prediction on a single case. First, it creates an input dataset that consists of a single row with four input variables. Then, it retrieves the stored model. Then the R code is executed, which un-serializes the model and uses the rxPredict() function again to generate the output dataset, which includes the input variables and the prediction:

DECLARE @input AS NVARCHAR(MAX) 
SET @input = N' 
    SELECT *  
    FROM (VALUES  
          (0, 2, 44, 90000)) AS  
          inpQ(NumberCarsOwned, TotalChildren, Age, YearlyIncome);'  
DECLARE @mod VARBINARY(max) = 
 (SELECT Model  
  FROM DBO.RModels 
  WHERE ModelName = N'bbLogR');   
EXEC sys.sp_execute_external_script 
 @language = N'R',   
 @script = N'   
  mod <- unserialize(as.raw(model));   
  OutputDataSet<-rxPredict(modelObject = mod, data = InputDataSet,
  outData =  NULL,    
          predVarNames = "BikeBuyerPredict", type = "response",  
           checkFactorLevels=FALSE, 
           writeModelVars = TRUE, overwrite = TRUE);   
 ',   
  @input_data_1 = @input,   
  @params = N'@model VARBINARY(MAX)', 
  @model = @mod   
WITH RESULT SETS (( 
 BikeBuyerPredict FLOAT, 
 NumberCarsOwned INT, 
 TotalChildren INT, 
 Age INT, 
 YearlyIncome FLOAT));   

The results are as follows:

BikeBuyerPredict       NumberCarsOwned TotalChildren Age  YearlyIncome
---------------------- --------------- ------------- ---- -------------
0.733910292274223      0               2             44   90000

The input values used were the same as the values in the first row of the batch predictions from the dbo.vTargetMail view used in the previous prediction example. You can see that the predicted value is also the same.

You can also deploy a model from T-SQL code directly. The following code creates a Decision Trees predictive model using the rxDTree() function using the same data as the logistic regression model did. The data is this time read directly from the SQL Server database. The R code then serializes the model  and passes the serialized model as an output parameter. T-SQL code inserts it into the table with the models.

DECLARE @model VARBINARY(MAX);
EXECUTE sys.sp_execute_external_script
@language = N'R'
,@script = N'
bbDTree <- rxDTree(BikeBuyer ~ NumberCarsOwned +
TotalChildren + Age + YearlyIncome,
data = sqlTM);
model <- rxSerializeModel(bbDTree, realtimeScoringOnly = TRUE);'
,@input_data_1 = N'
SELECT CustomerKey, BikeBuyer, NumberCarsOwned,
TotalChildren, Age, YearlyIncome
FROM dbo.vTargetMail;'
,@input_data_1_name = N'sqlTM'
,@params = N'@model VARBINARY(MAX) OUTPUT'
,@model = @model OUTPUT;
INSERT INTO dbo.RModels (ModelName, Model)
VALUES('bbDTree', @model);

You can also deploy a model from T-SQL code directly. The following code creates a Decision Trees predictive model using the rxDTree() function using the same data as the logistic regression model did. The data is this time read directly from the SQL Server database. The R code then serializes the model and passes the serialized model as an output parameter. T-SQL code inserts it into the table with the models.

In SQL Server 2017, there is a new PREDICT() T-SQL function. This function generates predictions based on stored models. In order to use this function, you don't even need to have ML Services with R and Python installed. You just need to serialize your model in a SQL Server table. However, there is an important limitation for using this function. The model you are using for the predictions must have been created with one of the supported algorithms from the RevoScaleR package. You can find the list of supported algorithms at https://docs.microsoft.com/en-us/sql/advanced-analytics/real-time-scoring#bkmk_rt_supported_algos.

The following code uses the PREDICT() T-SQL function to make predictions on the target mail data using the decision trees model just created. You can compare the predictions with the predictions from the logistic regression model to see which model performs better. You will learn more about model evaluation in the next chapter:

DECLARE @model VARBINARY(MAX) = 
(
SELECT Model
FROM dbo.RModels
WHERE ModelName = 'bbDTree'
);
SELECT d.CustomerKey, d.BikeBuyer,
d.NumberCarsOwned, d.TotalChildren, d.Age,
d.YearlyIncome, p.BikeBuyer_Pred
FROM PREDICT(MODEL = @model, DATA = dbo.vTargetMail AS d)
WITH(BikeBuyer_Pred FLOAT) AS p
ORDER BY d.CustomerKey;
..................Content has been hidden....................

You can't read the all page of ebook, please click here login for view all page.
Reset
18.216.96.94