Birth Weight & Maternal Smoking

Statistical comparison of baby birth weight across maternal smoking-status groups using descriptive statistics, global testing, pairwise two-sample tests, Bonferroni correction, Tukey HSD, confidence intervals, and assumption checks.

Course: Introductory Case Studies
Dataset: Babies / Mothers Data
Main Task: Group Comparison
Tools: Python / Jupyter

Project Overview

This project analyzes whether baby birth weight differs across maternal smoking-status categories. The dataset contains information about newborns and their mothers, including birth weight, gestation, infant sex, mother’s age, education, height, weight, income, and smoking behavior.

The work was completed as Project II of the Introductory Case Studies course at TU Dortmund. The focus was on comparing multiple distributions and applying correct statistical testing procedures, including global testing and post-hoc pairwise comparisons.

Research Motivation

Birth weight is an important health indicator for newborns. Maternal smoking is often studied as a possible risk factor for lower birth weight. This project investigates whether the birth-weight distributions differ between mothers who never smoked, currently smoke, smoked until the current pregnancy, or smoked in the past but not now.

Main research question: Do baby birth weights differ significantly between maternal smoking-status categories?

Dataset

The dataset contains 1236 samples and 23 variables related to newborns and their mothers. The analysis focuses mainly on the variable wt, which represents baby birth weight in ounces, and smoke, which represents the mother’s smoking history.

Variable | Description | Use in Project
wt | Baby birth weight in ounces; 999 means unknown | Main continuous outcome variable
smoke | Mother’s smoking status | Main grouping variable
gestation | Length of gestation in days | Relevant health-related background variable
sex | Infant sex: 1 = male, 2 = female, 9 = unknown | Possible descriptive subgroup information
age | Mother’s age in years; 99 means unknown | Maternal background variable
ed | Mother’s education level | Socio-demographic background variable
inc | Family yearly income category | Socio-economic background variable

Maternal Smoking Categories

The smoking variable records the mother’s smoking history. These categories define the groups whose baby birth weights were compared in the statistical analysis.

Code | Smoking Category | Interpretation
0 | Never smoked | Reference group for non-smoking mothers
1 | Smokes now | Current smoking during the observed pregnancy period
2 | Until current pregnancy | Smoked until the current pregnancy, then quit
3 | Once did, not now | Past smoker, not currently smoking
9 | Unknown | Missing or unknown smoking status

Research Tasks

The assignment required both descriptive and inferential analysis. The main goal was not only to test whether smoking groups differ, but also to correctly handle multiple pairwise comparisons and explain how adjustment methods affect statistical conclusions.

  1. Describe the distributions of birth weight and maternal smoking status.
  2. Use a global test to check whether birth weights differ between smoking categories.
  3. Conduct pairwise two-sample tests between all smoking-category pairs.
  4. Adjust pairwise tests using Bonferroni correction.
  5. Apply Tukey’s Honest Significant Difference method and calculate Tukey confidence intervals.
  6. Compare adjusted and non-adjusted test results.
  7. Check assumptions before applying the statistical tests.

Statistical Methods

The project uses classical statistical methods for comparing several groups. Descriptive statistics summarize the distribution of birth weights inside each smoking group, while global and pairwise tests evaluate whether observed differences are statistically meaningful.

Method | Purpose | Interpretation
Descriptive Statistics | Summarize birth weight by smoking category | Shows central tendency and spread in each group
Boxplots | Visualize group-wise birth-weight distributions | Highlights medians, spread, and potential outliers
Global Test / ANOVA | Test whether at least one group mean differs | Answers the overall group-difference question
Two-Sample Tests | Compare pairs of smoking categories | Identifies which specific groups differ
Bonferroni Correction | Adjust p-values for multiple testing | Controls family-wise error conservatively
Tukey HSD | Post-hoc comparison of all group means | Provides adjusted comparisons and confidence intervals

Hypotheses

The global test compares the mean birth weights across all smoking-status categories. The pairwise tests then compare every pair of categories separately.

Global Test

  • Null hypothesis H0: Mean baby birth weight is the same across all maternal smoking-status groups.
  • Alternative hypothesis H1: At least one smoking-status group has a different mean birth weight.

Pairwise Tests

  • Null hypothesis H0: The two compared smoking groups have equal mean birth weight.
  • Alternative hypothesis H1: The two compared smoking groups have different mean birth weight.
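
In symbols, the hypotheses above can be written as follows. The notation is introduced here for illustration and is not taken from the assignment: \mu_g denotes the mean birth weight in smoking group g, and only the four known categories 0 to 3 are assumed to enter the comparison.

```latex
\begin{aligned}
\text{Global:}\quad & H_0: \mu_0 = \mu_1 = \mu_2 = \mu_3
  \quad\text{vs.}\quad
  H_1: \mu_g \neq \mu_h \ \text{for at least one pair } g \neq h, \\
\text{Pairwise:}\quad & H_0^{(g,h)}: \mu_g = \mu_h
  \quad\text{vs.}\quad
  H_1^{(g,h)}: \mu_g \neq \mu_h, \qquad 0 \le g < h \le 3 .
\end{aligned}
```

Under that assumption there are \binom{4}{2} = 6 pairwise hypotheses.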

Data Preparation

Before analysis, unknown or invalid values need to be handled carefully. For example, birth weight value 999 means unknown, and smoking category 9 means unknown smoking status. These values should not be treated as valid numeric observations in the statistical tests.

  • Remove or mark unknown birth weight values such as wt = 999.
  • Remove observations with unknown smoking status (smoke = 9), or treat them as a separate category.
  • Keep only valid birth-weight and smoking-status observations for group comparison.
  • Convert smoking codes into readable category labels.
  • Inspect group sizes before applying tests.
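
The preparation steps above can be sketched with pandas roughly as follows. This is a minimal illustration, not the notebook's actual code: the column names wt and smoke come from the variable table, while the helper prepare, the label strings, and the tiny example frame are assumptions made up for this sketch.

```python
import pandas as pd

# Readable labels for the known smoking codes (label text is an assumption).
SMOKE_LABELS = {
    0: "never",
    1: "smokes now",
    2: "until current pregnancy",
    3: "once did, not now",
}


def prepare(df: pd.DataFrame) -> pd.DataFrame:
    """Keep rows with a known birth weight and a known smoking status."""
    valid = df[(df["wt"] != 999) & (df["smoke"] != 9)].copy()
    valid["smoke_label"] = valid["smoke"].map(SMOKE_LABELS)
    return valid


# Tiny made-up frame, only to show the behaviour of the filter.
toy = pd.DataFrame({"wt": [120, 999, 113, 128, 108],
                    "smoke": [0, 1, 1, 9, 3]})
clean = prepare(toy)

# Inspect group sizes before applying any test.
group_sizes = clean["smoke_label"].value_counts()
print(group_sizes)
```

Dropping the coded unknowns before any summary matters because a value such as 999 would otherwise be averaged in as if it were a real weight in ounces.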

Descriptive Analysis

The first stage describes the distribution of baby birth weights and counts how many observations fall into each smoking-status category. Since birth weight is continuous, group-wise mean, median, standard deviation, quartiles, and boxplots are appropriate. Since smoking status is categorical, frequency counts are appropriate.

Recommended Outputs

  • Count of mothers in each smoking-status category.
  • Mean and median baby birth weight per category.
  • Standard deviation and interquartile range per category.
  • Boxplot of baby birth weight grouped by maternal smoking status.
  • Histogram or density plot of birth weight for each group.
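
A minimal sketch of how the numeric outputs listed above could be produced with pandas. The helper name describe_by_group, the column smoke_label, and the example values are hypothetical; they are not results from the real dataset.

```python
import pandas as pd


def describe_by_group(df: pd.DataFrame, value: str = "wt",
                      group: str = "smoke_label") -> pd.DataFrame:
    """Count, mean, median, standard deviation and IQR per group."""
    grouped = df.groupby(group)[value]
    summary = grouped.agg(n="count", mean="mean", median="median", sd="std")
    # Interquartile range = 75th percentile minus 25th percentile.
    summary["iqr"] = grouped.quantile(0.75) - grouped.quantile(0.25)
    return summary


# Made-up values, only to show the shape of the output.
toy = pd.DataFrame({
    "smoke_label": ["never"] * 4 + ["smokes now"] * 3,
    "wt": [100, 110, 120, 130, 90, 100, 110],
})
summary = describe_by_group(toy)
print(summary)
```

For the plots, pandas' DataFrame.boxplot(column="wt", by="smoke_label") is one option if matplotlib is installed.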

Assumption Checking

The assignment explicitly required checking that the assumptions of each test hold before applying it, so the checks below come before the ANOVA-style tests.

Assumption | How to Check | Why It Matters
Independence | Study design and one observation per baby | Tests assume observations are independent
Normality within groups | Q-Q plots or Shapiro-Wilk test | Important for small samples and parametric tests
Equal variances | Levene’s test or Bartlett’s test | Standard ANOVA assumes similar group variances
No invalid coding | Check 999 and 9 codes | Unknown values must not distort estimates
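
The two formal checks in the table could be run with SciPy along these lines. This is a sketch under stated assumptions: the helper check_assumptions is hypothetical, and the groups are synthetic normal draws with made-up means and spreads, not the real birth weights.

```python
import numpy as np
from scipy import stats


def check_assumptions(groups):
    """Return Shapiro-Wilk p-values per group and the Levene p-value."""
    shapiro_p = {name: float(stats.shapiro(x).pvalue)
                 for name, x in groups.items()}
    # center="median" is Levene's test in its Brown-Forsythe form.
    levene_p = float(stats.levene(*groups.values(), center="median").pvalue)
    return shapiro_p, levene_p


# Synthetic data only; the parameters below are invented for the demo.
rng = np.random.default_rng(0)
groups = {
    "never": rng.normal(120, 18, 80),
    "smokes now": rng.normal(112, 18, 60),
    "until current pregnancy": rng.normal(121, 18, 40),
    "once did, not now": rng.normal(122, 18, 40),
}
shapiro_p, levene_p = check_assumptions(groups)
for name, p in shapiro_p.items():
    print(f"Shapiro-Wilk {name}: p = {p:.3f}")
print(f"Levene: p = {levene_p:.3f}")
```

With several hundred observations per group, Shapiro-Wilk can flag deviations that are too small to matter in practice, so Q-Q plots remain a useful complement to the p-values.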

Global Test

The global test evaluates whether the mean baby birth weight differs across maternal smoking groups. This is the first inferential step because it tests the overall question before examining individual pairs.

If assumptions are approximately satisfied, a one-way ANOVA can be used. If normality or variance assumptions are strongly violated, a non-parametric alternative such as the Kruskal-Wallis test may be considered.

Interpretation: A significant global test means that not all smoking groups have the same average birth weight. It does not directly tell which groups differ; pairwise post-hoc tests are needed for that.
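
As an illustration of the global step, the sketch below runs both the one-way ANOVA and the Kruskal-Wallis test with SciPy. The helper global_tests and the three small samples are invented for this example; which test the project finally reported depends on the assumption checks.

```python
from scipy import stats


def global_tests(samples, alpha=0.05):
    """One-way ANOVA and Kruskal-Wallis on a list of samples."""
    f_stat, p_anova = stats.f_oneway(*samples)
    h_stat, p_kw = stats.kruskal(*samples)
    return {
        "anova_F": float(f_stat),
        "anova_p": float(p_anova),
        "reject_anova": bool(p_anova < alpha),
        "kruskal_H": float(h_stat),
        "kruskal_p": float(p_kw),
        "reject_kruskal": bool(p_kw < alpha),
    }


# Made-up birth weights in ounces for three hypothetical groups.
toy = [[118, 122, 126], [110, 114, 118], [120, 124, 128]]
result = global_tests(toy)
print(result)
```

Either test only answers whether the groups differ at all; neither says which pair is responsible.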

Pairwise Comparisons

After the global test, all pairs of smoking-status categories are compared. Pairwise two-sample tests identify which specific categories differ in baby birth weight.

However, testing many pairs increases the probability of false positives. Therefore, the project compares unadjusted p-values with Bonferroni-adjusted results and Tukey HSD results.
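
One possible way to run all pairwise tests and report raw next to Bonferroni-adjusted p-values is sketched below. Several choices here are assumptions of the sketch rather than facts about the project: the helper pairwise_tests is hypothetical, Welch's t-test (equal_var=False) stands in for the unspecified two-sample test, and the data are synthetic.

```python
from itertools import combinations

import numpy as np
from scipy import stats


def pairwise_tests(groups, alpha=0.05):
    """Two-sample tests for every pair, raw and Bonferroni-adjusted."""
    pairs = list(combinations(groups, 2))
    m = len(pairs)  # number of tests in the family
    rows = []
    for a, b in pairs:
        # Welch's t-test is an assumption of this sketch.
        res = stats.ttest_ind(groups[a], groups[b], equal_var=False)
        p_raw = float(res.pvalue)
        p_bonf = min(1.0, p_raw * m)  # Bonferroni: multiply by m, cap at 1
        rows.append({"pair": (a, b), "p_raw": p_raw, "p_bonf": p_bonf,
                     "reject_raw": p_raw < alpha,
                     "reject_bonf": p_bonf < alpha})
    return rows


# Synthetic groups with invented parameters, for demonstration only.
rng = np.random.default_rng(42)
groups = {
    "never": rng.normal(123, 17, 100),
    "smokes now": rng.normal(114, 18, 100),
    "until current pregnancy": rng.normal(123, 17, 40),
    "once did, not now": rng.normal(124, 18, 40),
}
rows = pairwise_tests(groups)
for row in rows:
    print(row["pair"], f"raw={row['p_raw']:.4f}", f"bonf={row['p_bonf']:.4f}")
```

Multiplying each p-value by the number of tests is equivalent to comparing the raw p-value against α divided by the number of tests.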

Why Adjustment Is Needed

  • Many pairwise tests increase the family-wise error rate.
  • Unadjusted p-values may report too many significant differences.
  • Bonferroni correction controls the family-wise error rate, but it is conservative and can miss real differences.
  • Tukey HSD is designed for all-pair mean comparisons after ANOVA.
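
A short arithmetic sketch of why the adjustment matters. It assumes four compared groups (codes 0 to 3) and, as a simplification, treats the pairwise tests as independent; pairwise tests that share a group are not truly independent, so the error-rate figure is an approximation.

```python
from math import comb

k = 4                 # smoking groups compared (assumes code 9 is excluded)
alpha = 0.05          # per-test significance level
m = comb(k, 2)        # number of pairwise tests: 6

# Chance of at least one false positive if all m tests were independent.
fwer_unadjusted = 1 - (1 - alpha) ** m      # about 0.265

# Bonferroni: test each pair at alpha / m instead of alpha.
bonferroni_alpha = alpha / m                # about 0.0083

print(m, round(fwer_unadjusted, 3), round(bonferroni_alpha, 4))
```

So six unadjusted tests at the 5% level would carry roughly a one-in-four chance of at least one false positive, even if no groups truly differed.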

Bonferroni vs. Tukey HSD

Bonferroni and Tukey HSD both adjust for multiple comparisons, but they work differently. Bonferroni adjusts the significance threshold or p-values based on the number of tests. Tukey HSD is specifically designed for comparing all group means and also provides confidence intervals for mean differences.

Method | Strength | Weakness
Unadjusted tests | Simple and sensitive | Higher risk of false positives
Bonferroni correction | Strong family-wise error control | Can be too conservative
Tukey HSD | Designed for all pairwise mean comparisons | Assumes ANOVA-style model conditions
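
Tukey HSD with simultaneous confidence intervals can be sketched with scipy.stats.tukey_hsd (available in SciPy 1.8 and later). Whether the original notebook used SciPy or another library is not stated, so treat this as one possible route; the data are synthetic with invented parameters.

```python
import numpy as np
from scipy import stats

# Synthetic groups, for demonstration only.
rng = np.random.default_rng(7)
groups = {
    "never": rng.normal(123, 17, 100),
    "smokes now": rng.normal(114, 18, 100),
    "until current pregnancy": rng.normal(123, 17, 40),
    "once did, not now": rng.normal(124, 18, 40),
}
names = list(groups)

res = stats.tukey_hsd(*groups.values())
ci = res.confidence_interval(confidence_level=0.95)

tukey_rows = []
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        tukey_rows.append({
            "pair": (names[i], names[j]),
            "mean_diff": float(res.statistic[i, j]),  # mean_i - mean_j
            "p_tukey": float(res.pvalue[i, j]),
            "ci_low": float(ci.low[i, j]),
            "ci_high": float(ci.high[i, j]),
        })

for row in tukey_rows:
    print(row["pair"], f"diff={row['mean_diff']:.2f}",
          f"CI=({row['ci_low']:.2f}, {row['ci_high']:.2f})",
          f"p={row['p_tukey']:.4f}")
```

Unlike Bonferroni, which can be applied to any set of p-values, this procedure works on the group means directly and returns an interval for each mean difference.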

Interpretation Strategy

The key question is not only whether smoking groups differ, but also how robust that conclusion is after multiple-testing correction. If a difference is significant in the unadjusted test but disappears after Bonferroni or Tukey adjustment, it should be interpreted cautiously.

  • Significant global test: evidence that at least one group differs.
  • Significant unadjusted pairwise test only: weak evidence, may be false positive.
  • Significant Bonferroni result: stronger evidence under conservative correction.
  • Significant Tukey HSD result: strong post-hoc evidence for mean difference.
  • Tukey confidence interval excluding zero: supports a significant difference between two group means.
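
The reading rules above can be condensed into a small helper for one pair of groups. The function evidence_label and its wording ("robust", "moderate", "weak") are hypothetical shorthand invented for this sketch; they are not standard statistical terms or part of the assignment.

```python
def evidence_label(p_raw, p_bonf, tukey_ci, alpha=0.05):
    """Summarize how well one pairwise difference survives adjustment.

    tukey_ci is the (low, high) Tukey interval for the mean difference,
    assumed to be computed at confidence level 1 - alpha.
    """
    ci_low, ci_high = tukey_ci
    tukey_sig = ci_low > 0 or ci_high < 0  # interval excludes zero
    bonf_sig = p_bonf < alpha
    if bonf_sig and tukey_sig:
        return "robust: significant under Bonferroni and Tukey"
    if bonf_sig or tukey_sig:
        return "moderate: survives one adjustment only"
    if p_raw < alpha:
        return "weak: unadjusted only, possible false positive"
    return "no evidence of a difference"


# Made-up inputs illustrating three of the cases.
print(evidence_label(0.001, 0.006, (3.0, 14.0)))
print(evidence_label(0.03, 0.18, (-1.0, 12.0)))
print(evidence_label(0.40, 1.00, (-6.0, 9.0)))
```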

Technical Skills Demonstrated

This project demonstrates applied statistical reasoning and careful analysis workflow rather than complex machine learning. It is useful for showing that I understand group comparisons, hypothesis testing, test assumptions, multiple-testing correction, and clear interpretation of statistical results.

  • 1236 samples in the original babies and mothers dataset.
  • 23 variables describing newborns and mothers.
  • α = 0.05 significance level used for statistical testing.

Limitations

The project is an introductory statistical comparison and should be interpreted carefully. Observational data can reveal associations but cannot prove that maternal smoking alone caused changes in birth weight. Other factors such as gestation length, maternal age, health, income, education, and pre-pregnancy weight may also influence birth weight.

  • The dataset is observational, not a randomized experiment.
  • Unknown values such as wt = 999 and smoke = 9 must be handled carefully.
  • Maternal smoking may be confounded with socio-economic and health variables.
  • Birth weight can also depend strongly on gestation length and maternal characteristics.
  • ANOVA-style conclusions depend on assumptions such as approximate normality and equal variances.

Outcome

This project strengthened my understanding of comparison of multiple distributions, descriptive statistics, group-wise analysis, ANOVA-style testing, two-sample tests, multiple-testing correction, Tukey HSD confidence intervals, and assumption checking.

It is a valuable portfolio project for demonstrating applied statistics, health-data analysis, and careful interpretation of statistical testing results in Python/Jupyter.

Tags: Birth Weight, Maternal Smoking, ANOVA, Two-Sample Tests, Bonferroni, Tukey HSD, Confidence Intervals, Assumption Checking, Python, Jupyter Notebook