Week 7: Hypothesis Testing | Python Data Science Tutorials

Learning objectives: By the end of this week you will be able to apply hypothesis testing concepts to real datasets, write executable Python code for each technique, and complete both graded assignments independently.

Session 1: Foundations - t-tests and Effect Size

A hypothesis test starts with H0 (null) and H1 (alternative). The p-value is the probability of observing data at least as extreme as ours IF H0 is true. Reject H0 when p < alpha (typically 0.05). Type I error (false positive) rate = alpha. Type II error (false negative) rate = beta. Power = 1 - beta. Always report effect size (Cohen's d for t-tests) alongside p-value: statistical significance does not imply practical importance.
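To make the p-value definition concrete, the sketch below computes a one-sample t-statistic and its two-sided p-value by hand, then compares them with scipy. The sample values and the name mu_0 are made up for illustration and are not part of the course dataset.

```python
import numpy as np
from scipy import stats

# Hypothetical processing times in days (illustrative values only)
sample = np.array([3.1, 3.8, 2.9, 4.2, 3.6, 3.3, 4.0, 3.5])
mu_0 = 3.0  # mean claimed by H0

n = sample.size
std_error = sample.std(ddof=1) / np.sqrt(n)      # standard error of the mean
t_manual = (sample.mean() - mu_0) / std_error    # distance from H0 in standard errors

# Two-sided p-value: probability, if H0 is true, of a t-statistic
# at least as extreme as the one observed (df = n - 1)
p_manual = 2 * stats.t.sf(abs(t_manual), df=n - 1)

# Cross-check against scipy's built-in test
t_scipy, p_scipy = stats.ttest_1samp(sample, popmean=mu_0)
print(f'manual: t={t_manual:.4f}, p={p_manual:.4f}')
print(f'scipy:  t={t_scipy:.4f}, p={p_scipy:.4f}')
```

The two lines of output should agree, which shows that the p-value is simply a tail area of the t distribution evaluated at the observed statistic.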

import numpy as np
from scipy import stats

np.random.seed(42)
processing_times = np.random.normal(loc=3.4, scale=0.9, size=45)

# One-sample t-test: H0: mean = 3.0 days (regulatory standard)
t_stat, p_value = stats.ttest_1samp(a=processing_times, popmean=3.0)
print(f'Sample mean: {processing_times.mean():.3f} days')
print(f't-statistic: {t_stat:.4f}')
print(f'p-value:     {p_value:.4f}')

decision = 'Reject H0' if p_value < 0.05 else 'Fail to reject H0'
print(f'Decision at alpha=0.05: {decision}')

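# Cohen's d effect size: mean difference in units of the sample standard deviation
cohens_d = (processing_times.mean() - 3.0) / processing_times.std(ddof=1)
# Cohen's conventional benchmarks: 0.2 small, 0.5 medium, 0.8 large
abs_d = abs(cohens_d)
label = 'negligible' if abs_d < 0.2 else 'small' if abs_d < 0.5 else 'medium' if abs_d < 0.8 else 'large'
print(f"Cohen's d: {cohens_d:.3f}  ({label} effect)")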
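The Type I error rate and power defined in the Session 1 notes can be estimated by simulation. The sketch below is one possible approach, not a prescribed method: it repeats the one-sample t-test on many simulated samples, once with H0 true and once with H0 false. The helper name rejection_rate and its default settings (2000 simulations, the same mean, spread, and sample size as the example above) are assumptions chosen for illustration.

```python
import numpy as np
from scipy import stats

def rejection_rate(true_mean, popmean=3.0, sd=0.9, n=45,
                   alpha=0.05, n_sims=2000, seed=0):
    """Fraction of simulated one-sample t-tests that reject H0: mean == popmean."""
    rng = np.random.default_rng(seed)
    # Each row is one simulated sample of size n
    samples = rng.normal(loc=true_mean, scale=sd, size=(n_sims, n))
    _, p_values = stats.ttest_1samp(samples, popmean=popmean, axis=1)
    return float(np.mean(p_values < alpha))

# H0 is true: every rejection is a false positive (Type I error)
type_i_rate = rejection_rate(true_mean=3.0)
# H0 is false (true mean is 3.4): every rejection is a correct detection
power = rejection_rate(true_mean=3.4)

print(f'Estimated Type I error rate:  {type_i_rate:.3f}')
print(f'Estimated power:              {power:.3f}')
print(f'Estimated Type II error rate: {1 - power:.3f}')
```

The estimated Type I error rate should land close to alpha, while power depends on the true effect size, the spread, and the sample size.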

Session 2: Two-Sample Tests and ANOVA

Independent samples t-test compares means of two unrelated groups. Always run Levene's test for equality of variances first; use equal_var=False (Welch's t-test) if variances are unequal. One-way ANOVA tests whether means of 3+ groups are equal. A significant F-test tells you differences exist but not which pairs differ - follow up with Tukey HSD. Chi-square tests independence between two categorical variables.

from scipy import stats
from statsmodels.stats.multicomp import pairwise_tukeyhsd
import numpy as np
import pandas as pd

np.random.seed(0)
branch_A = np.random.normal(680, 75, 120)
branch_B = np.random.normal(710, 70, 95)

# Levene + two-sample t-test
_, lev_p = stats.levene(branch_A, branch_B)
t_stat, p_val = stats.ttest_ind(branch_A, branch_B, equal_var=(lev_p > 0.05))
print(f'Levene p={lev_p:.4f}, t={t_stat:.4f}, p={p_val:.4f}')

# One-way ANOVA
primary   = np.random.normal(150000, 40000, 80)
secondary = np.random.normal(280000, 60000, 120)
tertiary  = np.random.normal(450000, 90000, 100)
f_stat, p_anova = stats.f_oneway(primary, secondary, tertiary)
print(f'ANOVA F={f_stat:.4f}, p={p_anova:.6f}')

if p_anova < 0.05:
    all_loans = np.concatenate([primary, secondary, tertiary])
    groups = ['Primary']*80 + ['Secondary']*120 + ['Tertiary']*100
    tukey = pairwise_tukeyhsd(all_loans, groups, alpha=0.05)
    print(tukey)
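The chi-square test of independence mentioned in the Session 2 notes is not covered by the code above. A minimal sketch follows, assuming a contingency table of branch against loan product. The counts are hypothetical. Cramer's V is added as an effect size, in line with the advice to report one alongside the p-value.

```python
import numpy as np
from scipy import stats

# Hypothetical counts: rows = branch (A, B),
# columns = loan product (personal, auto, mortgage)
observed = np.array([[40, 50, 30],
                     [45, 25, 25]])

# H0: branch and loan product are independent
chi2, p_chi2, dof, expected = stats.chi2_contingency(observed)
print(f'chi2={chi2:.4f}, dof={dof}, p={p_chi2:.4f}')
print('Expected counts under H0:')
print(expected.round(1))

# Cramer's V: effect size for a contingency table, ranging from 0 to 1
n_total = observed.sum()
cramers_v = np.sqrt(chi2 / (n_total * (min(observed.shape) - 1)))
print(f"Cramer's V: {cramers_v:.3f}")
```

By default scipy applies Yates' continuity correction only when the table has one degree of freedom (a 2x2 table), so it does not affect this 2x3 example.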

Session 3: Non-Parametric Tests

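Non-parametric tests do not assume normality, although they still assume independent observations. Use them when the sample size is below 30 and normality cannot be assumed, the data is ordinal, or the distribution is severely skewed. Mann-Whitney U (alternative to the independent samples t-test) tests whether one group tends to have higher values than another. Kruskal-Wallis (alternative to one-way ANOVA) tests whether 3+ groups come from the same distribution; it can be read as a comparison of medians only when the group distributions have similar shapes. Check normality first with Shapiro-Wilk.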

from scipy import stats
import numpy as np

np.random.seed(3)
# Right-skewed, non-normal data
group_control   = stats.expon(scale=12).rvs(30)
group_treatment = stats.expon(scale=8).rvs(30)

# Shapiro-Wilk normality test
_, p_sw_c = stats.shapiro(group_control)
_, p_sw_t = stats.shapiro(group_treatment)
print(f'Shapiro-Wilk p (control): {p_sw_c:.4f}')
print(f'Shapiro-Wilk p (treatment): {p_sw_t:.4f}')
# p < 0.05 is evidence against normality; in that case use Mann-Whitney

u_stat, p_mw = stats.mannwhitneyu(group_control, group_treatment, alternative='two-sided')
print(f'Mann-Whitney U={u_stat:.1f}, p={p_mw:.4f}')
print('Decision:', 'Reject H0 - groups differ' if p_mw < 0.05 else 'Fail to reject H0')
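Kruskal-Wallis is described in the Session 3 notes but not demonstrated above. The sketch below runs it on three hypothetical right-skewed groups, then follows up with pairwise Mann-Whitney tests using a Bonferroni adjustment. The group names, scales, and the choice of Bonferroni are assumptions for illustration. Dunn's test is another common post-hoc option, but it is not part of scipy.

```python
from itertools import combinations

import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
# Hypothetical right-skewed data for three groups
groups = {
    'A': rng.exponential(scale=12, size=30),
    'B': rng.exponential(scale=8, size=30),
    'C': rng.exponential(scale=20, size=30),
}

# Omnibus test: H0 is that all groups come from the same distribution
h_stat, p_kw = stats.kruskal(*groups.values())
print(f'Kruskal-Wallis H={h_stat:.3f}, p={p_kw:.4f}')

# Pairwise follow-up with Bonferroni adjustment: multiply each p-value by
# the number of comparisons and cap at 1. These are normally interpreted
# only when the omnibus test is significant.
pairs = list(combinations(groups, 2))
adjusted = {}
for name_1, name_2 in pairs:
    _, p_pair = stats.mannwhitneyu(groups[name_1], groups[name_2],
                                   alternative='two-sided')
    adjusted[(name_1, name_2)] = min(p_pair * len(pairs), 1.0)
    print(f'{name_1} vs {name_2}: adjusted p={adjusted[(name_1, name_2)]:.4f}')
```

Bonferroni is conservative, so with many groups it can miss real differences; it is used here because it is simple to compute and to explain.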

Week 7 Assignments

Submit completed notebooks to your GitHub repository before the next session. Feedback within 48 hours.

Assignment 1: Complete an A/B test analysis: state H0/H1, check normality with Shapiro-Wilk, choose an appropriate test, compute the p-value and effect size, and write a 200-word business decision memo.

Assignment 2: Run ANOVA or Kruskal-Wallis on a dataset with 4+ groups. Run the appropriate post-hoc test. Visualise with annotated box plots showing significant pairwise comparisons.
