Let's perform hypothesis testing using the statsmodels library. Consider the following scenario.
In a study about mental health in youth, 48% of parents believed that social media was the cause of their teenagers' stress:
- Population: Parents with a teenager (age >= 18)
- Parameter of interest: p, the proportion of parents who believe that social media is the cause of their teenager's stress
- Null hypothesis: p = 0.48
- Alternative hypothesis: p > 0.48
Data: 4,500 people were surveyed, and 65% of them believed that their teenagers' stress was due to social media.
Let's start the hypothesis testing:
- First, import the required libraries:
import statsmodels.api as sm
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
- Next, let's declare the variables:
n = 4500
pnull = 0.48
phat = 0.65
- Now, we can use the proportions_ztest function to calculate the test statistic and the P-value. Because the alternative hypothesis is p > 0.48, we pass alternative='larger'. Check out the following snippet:
sm.stats.proportions_ztest(phat * n, n, pnull, alternative='larger')
The output of the preceding code is as follows:
(23.90916877786327, 1.2294951052777303e-126)
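The first value is the z statistic and the second is the P-value. Our calculated P-value of about 1.23e-126 is far below any conventional significance level, so we reject the null hypothesis that p = 0.48 in favor of the alternative. The data suggest that more than 48% of parents believe that social media is the cause of their teenagers' stress.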
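To see where these numbers come from, the z statistic can be recomputed by hand. The following is a minimal sketch using only the standard library; the helper name one_sided_z_test and its var_prop argument are made up for this illustration and are not part of statsmodels. It assumes the usual one-sample z-test for a proportion, z = (phat - pnull) / sqrt(v * (1 - v) / n), where v is the proportion used for the variance. By default, proportions_ztest takes v from the sample proportion; passing prop_var=pnull would use the null proportion instead, which is the more common textbook choice and gives a somewhat smaller z.

```python
import math

n = 4500      # sample size
pnull = 0.48  # proportion under the null hypothesis
phat = 0.65   # observed sample proportion


def one_sided_z_test(phat, pnull, n, var_prop):
    """Hypothetical helper: z statistic and upper-tail P-value for H1: p > pnull.

    var_prop is the proportion used to estimate the standard error.
    """
    se = math.sqrt(var_prop * (1 - var_prop) / n)
    z = (phat - pnull) / se
    # Upper-tail probability of the standard normal: P(Z > z) = 0.5 * erfc(z / sqrt(2))
    p_value = 0.5 * math.erfc(z / math.sqrt(2))
    return z, p_value


# Variance from the sample proportion (the proportions_ztest default)
z_sample, p_sample = one_sided_z_test(phat, pnull, n, var_prop=phat)

# Variance from the null proportion (comparable to prop_var=pnull)
z_null, p_null = one_sided_z_test(phat, pnull, n, var_prop=pnull)

print(z_sample, p_sample)
print(z_null, p_null)
```

The first pair should agree with the proportions_ztest output shown above (z of about 23.9). The second pair uses the null variance and gives a z of about 22.8. Either way, the P-value is vanishingly small and the conclusion is the same.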