Predictor Screening Platform Overview
Predictor screening helps you identify significant predictors among a large number of candidates. For example, if you have hundreds of Xs, you can use it to determine which of them are most significant as predictors of an outcome.
The Predictor Screening platform uses a bootstrap forest to identify potential predictors of your response. For each response, the platform builds a bootstrap forest model of 100 decision trees, then ranks each predictor by its column contribution to that model. Because the bootstrap forest method involves a random component, column contributions can differ when you rerun the report. For more information about decision trees, see the Partition Models chapter in the Predictive and Specialized Modeling book.
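To make the ranking idea concrete, here is a minimal Python sketch. It is not JMP's implementation. It assumes a continuous response, grows 100 one-split trees (stumps) on bootstrap samples, considers a random subset of predictors for each tree, and credits each split's reduction in the sum of squares to the predictor that was split on. The function names (sse, best_split, screen_predictors), the toy data, and the subset size of roughly the square root of the number of predictors are all assumptions made for illustration.

```python
import random

def sse(values):
    """Sum of squared deviations from the mean (0.0 for an empty list)."""
    if not values:
        return 0.0
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values)

def best_split(rows, y, features):
    """Return (feature, gain) for the single split that most reduces the sum of squares."""
    parent = sse(y)
    best_feature, best_gain = None, 0.0
    for f in features:
        for threshold in sorted({r[f] for r in rows}):
            left = [yi for r, yi in zip(rows, y) if r[f] <= threshold]
            right = [yi for r, yi in zip(rows, y) if r[f] > threshold]
            if not left or not right:
                continue  # every row fell on one side, so this is not a split
            gain = parent - sse(left) - sse(right)
            if gain > best_gain:
                best_feature, best_gain = f, gain
    return best_feature, best_gain

def screen_predictors(rows, y, n_trees=100, seed=None):
    """Rank predictors by total split gain across a forest of bootstrap stumps."""
    rng = random.Random(seed)
    names = list(rows[0])
    per_tree = max(1, round(len(names) ** 0.5))  # assumed subset size per tree
    contributions = {name: 0.0 for name in names}
    for _ in range(n_trees):
        # Bootstrap sample: draw row indices with replacement.
        sample = [rng.randrange(len(rows)) for _ in rows]
        candidates = rng.sample(names, per_tree)
        feature, gain = best_split(
            [rows[i] for i in sample], [y[i] for i in sample], candidates
        )
        if feature is not None:
            contributions[feature] += gain
    return sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)

# Toy data (made up): x1 drives the response, x2 and x3 are pure noise.
data_rng = random.Random(0)
rows = [
    {"x1": data_rng.random(), "x2": data_rng.random(), "x3": data_rng.random()}
    for _ in range(60)
]
y = [10 * r["x1"] + data_rng.gauss(0, 0.5) for r in rows]

ranked = screen_predictors(rows, y, seed=1)
for rank, (name, contribution) in enumerate(ranked, start=1):
    print(rank, name, round(contribution, 2))
```

Changing the seed changes the bootstrap samples and the predictor subsets, so the contributions change too. This mirrors why the report can differ between reruns.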
Example of Predictor Screening
The Bands data table contains measurements from machinery in the rotogravure printing business. The data set contains 539 records and 38 variables. The response Y is the column Banding?, and its values are “BAND” and “NOBAND”. You want to understand which properties are most likely to contribute to the response.
1. Select Help > Sample Data Library and open Bands.
2. Select Analyze > Screening > Predictor Screening.
3. Select Banding? as Y, Response.
4. Select the grouped columns grain screened to chrome content and click X.
5. Click OK.
Figure 19.2 Ranked Column Contributions
Note: Because this analysis is based on the Bootstrap Forest method that has a random selection component, your results can differ slightly from those in Figure 19.2. See “Bootstrap Forest”.
The columns are sorted and ranked in order of contribution in the bootstrap forest model. Predictors with the highest contributions are strong candidate predictors for the response of interest.
Launch the Predictor Screening Platform
Launch the Predictor Screening platform by selecting Analyze > Screening > Predictor Screening.
Figure 19.3 Predictor Screening Launch Window
Y, Response
The response columns.
X
The predictor columns.
By
A column or columns whose levels define separate analyses. For each level of the specified column, the corresponding rows are analyzed using the other variables that you have specified. The results are presented in separate reports. If more than one By variable is assigned, a separate report is produced for each possible combination of the levels of the By variables.
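The effect of assigning one or more By variables can be pictured as partitioning the rows before any analysis runs. The following Python sketch is only an illustration of that partitioning, not JMP code. The split_by helper and the column names press and shift are hypothetical, and the sketch produces a group only for combinations of levels that actually occur in the rows.

```python
from collections import defaultdict

def split_by(rows, by_columns):
    """Group rows by the combination of values found in `by_columns`."""
    groups = defaultdict(list)
    for row in rows:
        key = tuple(row[c] for c in by_columns)
        groups[key].append(row)
    return dict(groups)

# Made-up rows; "press" and "shift" are hypothetical By columns.
rows = [
    {"press": "A", "shift": "day", "x": 1.0},
    {"press": "A", "shift": "night", "x": 2.0},
    {"press": "B", "shift": "day", "x": 3.0},
    {"press": "A", "shift": "day", "x": 4.0},
]

# Each group would be analyzed on its own and shown in its own report.
for key, group in split_by(rows, ["press", "shift"]).items():
    print(key, len(group))
```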
The Predictor Screening Report
The report (Figure 19.2) shows the list of predictors with their respective contributions and rank. Predictors with the highest contributions are likely to be important in predicting Y.
The Contribution column shows the contribution of each predictor to the bootstrap forest model. The Portion column in the report shows the percent contribution of each variable.
You can select the important predictors in the Predictor Screening report. Doing so selects the corresponding columns in the data table, which makes it easy to enter those columns into the launch windows of modeling platforms. In this way, the Predictor Screening platform streamlines the modeling process.
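As a rough illustration of how the Contribution, Portion, and rank values relate, the Python sketch below turns a set of contributions into a ranked table and picks the top few predictors. It assumes that Portion is each contribution divided by the total (multiply by 100 for a percentage). The contribution values, the predictor names, and the helpers summarize and top_predictors are made up for the example.

```python
def summarize(contributions):
    """Return report rows sorted by contribution, with portion and rank added."""
    total = sum(contributions.values())
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)
    return [
        {
            "predictor": name,
            "contribution": value,
            "portion": value / total if total else 0.0,  # share of the total
            "rank": rank,
        }
        for rank, (name, value) in enumerate(ranked, start=1)
    ]

def top_predictors(report, k):
    """Names of the k highest-ranked predictors, for use in a later model."""
    return [row["predictor"] for row in report[:k]]

# Hypothetical contributions, not taken from any real report.
report = summarize({"x2": 3.0, "x1": 6.0, "x3": 1.0})
for row in report:
    print(row["rank"], row["predictor"], row["contribution"], round(row["portion"], 3))

selected = top_predictors(report, 2)
print(selected)
```

The selected list plays the role of the columns that you would carry forward into a modeling platform.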
Predictor Screening Platform Options
See the JMP Reports chapter in the Using JMP book for more information about the following options:
Redo
Contains options that enable you to repeat or relaunch the analysis. In platforms that support the feature, the Automatic Recalc option immediately reflects the changes that you make to the data table in the corresponding report window.
Save Script
Contains options that enable you to save a script that reproduces the report to several destinations.
Save By-Group Script
Contains options that enable you to save a script that reproduces the platform report for all levels of a By variable to several destinations. Available only when a By variable is specified in the launch window.