Validation methods

ML is a highly iterative process. Typically hundreds, thousands, and sometimes millions of model variations are generated along the way. At this scale, even a low per-test probability of a false positive essentially guarantees not just one but several false positives. The traditional application of statistics with a 95% confidence level breaks down at this number of iterations, because most statistical tests assume that only one hypothesis is being tested. With ML, you are trying out thousands to millions of hypotheses in the search for the optimal model, so some models will pass all the statistical tests by sheer odds.
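To put rough numbers on this, the following is a minimal sketch of the arithmetic, assuming each model variation is an independent test with a 5% false positive rate (the complement of the 95% confidence level). Independence is a simplifying assumption, and the function name false_positive_stats is purely illustrative.

```python
def false_positive_stats(n_trials, alpha=0.05):
    """Return (expected false positives, P(at least one false positive)).

    Assumes n_trials independent tests, each with false positive rate alpha.
    """
    expected = n_trials * alpha
    p_at_least_one = 1.0 - (1.0 - alpha) ** n_trials
    return expected, p_at_least_one


if __name__ == "__main__":
    for n in (1, 100, 1_000, 1_000_000):
        expected, p_any = false_positive_stats(n)
        print(f"{n:>9} trials: expected false positives = {expected:,.1f}, "
              f"P(at least one) = {p_any:.4f}")
```

Under these assumptions, the chance of at least one false positive is already above 99% at 100 trials, and the expected number of false positives grows linearly with the number of variations tried.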

Following the principle of mistrust, no model is considered acceptable until it has been tested against data it has not seen before. This process is called validation.
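As an illustration of why unseen data matters, here is a small sketch using only the Python standard library. It is a toy setup, not a recommended validation procedure: the labels are pure noise, each "model" is just a random guesser, and the function name and parameters are hypothetical. The best of many guessers is selected on the training data and then scored on a holdout set it was never selected on.

```python
import random


def best_of_random_models(n_models=1000, n_train=100, n_holdout=100, seed=0):
    """Pick the best of many random guessers on training data, then
    score that same guesser on held-out data.

    Returns (train_accuracy, holdout_accuracy) of the selected model.
    """
    rng = random.Random(seed)
    n_total = n_train + n_holdout

    # Labels are coin flips, so no model can genuinely predict them.
    labels = [rng.randint(0, 1) for _ in range(n_total)]

    def accuracy(preds, start, stop):
        hits = sum(preds[i] == labels[i] for i in range(start, stop))
        return hits / (stop - start)

    best_preds, best_train_acc = None, -1.0
    for _ in range(n_models):
        # Each "model" is a fixed vector of random predictions.
        preds = [rng.randint(0, 1) for _ in range(n_total)]
        train_acc = accuracy(preds, 0, n_train)
        if train_acc > best_train_acc:
            best_preds, best_train_acc = preds, train_acc

    # Validation: score the selected model on data not used for selection.
    holdout_acc = accuracy(best_preds, n_train, n_total)
    return best_train_acc, holdout_acc


if __name__ == "__main__":
    train_acc, holdout_acc = best_of_random_models()
    print(f"selected model, training accuracy: {train_acc:.2f}")
    print(f"selected model, holdout accuracy:  {holdout_acc:.2f}")
```

Because the winner was chosen for its training score, that score is inflated by selection. The holdout score played no part in the selection, so it should hover around chance level, which is the honest estimate for a model that has learned nothing.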
