Hypothesis testing principle

Hypothesis testing relies on two basic statistical concepts, normalization and standard normalization:

  • Normalization: The meaning of normalization depends on the context. In its simplest form, it is the process of adjusting values measured on different scales to a common scale before computing descriptive statistics. A common form is min-max scaling, which maps every value to the range [0, 1]:

      $$x' = \frac{x - x_{\min}}{x_{\max} - x_{\min}}$$

  • Standard normalization: Standard normalization is similar to normalization, except that the rescaled data has a mean of 0 and a standard deviation of 1. It is denoted by the following equation, where μ is the mean and σ is the standard deviation of the data:

      $$z = \frac{x - \mu}{\sigma}$$
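The two rescalings above can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, assuming min-max scaling for normalization and the population standard deviation for standard normalization; the function names and the `speeds` data are hypothetical and not part of the original text.

```python
from statistics import mean, pstdev


def min_max_normalize(values):
    """Rescale values to the range [0, 1] (min-max normalization)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical, so the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale values to mean 0 and standard deviation 1 (z-scores)."""
    mu = mean(values)
    sigma = pstdev(values, mu)  # population standard deviation (assumption)
    if sigma == 0:
        raise ValueError("standard deviation is zero")
    return [(v - mu) / sigma for v in values]


# Hypothetical typing speeds in words per minute.
speeds = [32, 36, 38, 40, 44]
scaled = min_max_normalize(speeds)
z_scores = standardize(speeds)

print(scaled)    # smallest value maps to 0.0, largest to 1.0
print(z_scores)  # values centered on 0 with unit spread
```

After standardizing, values from differently scaled variables can be compared directly, because each one is expressed as a number of standard deviations from its own mean.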

Besides these concepts, we need to know some important terms used in hypothesis testing:

  • The null hypothesis (H0) is the default assumption, based on domain knowledge, that we hold until the data gives us reason to doubt it. For example, the average typing speed of a person is 38-40 words per minute.
  • An alternative hypothesis is a competing claim that contradicts the null hypothesis. The main task is to decide, based on the experimental results, whether the data provides enough evidence to reject the null hypothesis in favor of the alternative. For example, the average typing speed of a person is less than 38 words per minute. Because this is a claim about an average, a single typist cannot settle it: finding one person who types 38 words per minute does not disprove the claim. Instead, we collect a sample of typing speeds and check whether its mean is low enough to be inconsistent with the null hypothesis.
  • Type I error and Type II error: Whenever we decide whether or not to reject the null hypothesis, we can make one of two kinds of error:
    • False-positive: A Type I error is when we reject the null hypothesis (H0) when H0 is true.
    • False-negative: A Type II error is when we do not reject the null hypothesis (H0) when H0 is false.
  • P-values: This is also referred to as the probability value or asymptotic significance. It is the probability of observing a result at least as extreme as the one actually obtained, assuming that the null hypothesis is true. It is not the probability that the null hypothesis is true. Generally, if the P-value is lower than a predetermined threshold, we reject the null hypothesis.
  • Level of significance: This is one of the most important concepts to be familiar with before testing a hypothesis. The level of significance (α) is the probability of a Type I error that we are willing to tolerate, that is, the chance of rejecting the null hypothesis when it is actually true. Note that no test can decide with 100% certainty. We generally select a level of significance based on our subject and domain. Generally, it is 0.05 or 5%. This means that we accept a 5% risk of rejecting a true null hypothesis, which corresponds to a 95% confidence level.
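To make the P-value concrete, here is a minimal sketch of a one-sided z-test for the typing-speed example, using only the Python standard library. It assumes the population standard deviation is known (here 4 words per minute), which is a simplifying assumption; the function name `one_sided_z_test` and the sample data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sided_z_test(sample, mu0, sigma):
    """Return (z, p) for H0: mean == mu0 against H1: mean < mu0.

    Assumes the population standard deviation `sigma` is known.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))
    # P-value: probability of a z-score this low or lower if H0 is true.
    p = NormalDist().cdf(z)
    return z, p


# Hypothetical sample of 16 typing speeds (words per minute), mean 35.
sample = [31, 33, 34, 35, 35, 36, 37, 39,
          31, 33, 34, 35, 35, 36, 37, 39]

z, p = one_sided_z_test(sample, mu0=38, sigma=4)
print(f"z = {z:.2f}, p = {p:.5f}")  # z = -3.00, p = 0.00135
```

With a significance level of 0.05, a P-value this small would lead us to reject the null hypothesis that the average speed is 38 words per minute.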

To summarize, the decision rule for the null hypothesis is as follows:

# Reject H0
p <= α
# Fail to reject H0
p > α
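The rule above can be written as a small helper. This is only a sketch; the function name `decide` and its default `alpha` of 0.05 are illustrative choices, not part of any library.

```python
def decide(p_value, alpha=0.05):
    """Apply the decision rule: reject H0 when p <= alpha."""
    if not 0.0 < alpha < 1.0:
        raise ValueError("alpha must lie strictly between 0 and 1")
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("p_value must lie between 0 and 1")
    return "reject H0" if p_value <= alpha else "fail to reject H0"


print(decide(0.01))   # reject H0
print(decide(0.20))   # fail to reject H0
print(decide(0.05))   # reject H0 (the boundary counts as a rejection)
```

Note that the second outcome is worded as "fail to reject" rather than "accept": a large P-value means the data is compatible with the null hypothesis, not that the null hypothesis has been proven.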

Generally, we set the significance level before we collect data or compute the test statistic, so that the threshold is not chosen to fit the result. Next, we will see how we can use the stats library to perform hypothesis testing.
