Measures of Central Tendency and Variability
Central Tendency
Mean (Average)
x̄ = Σx / n
Sum of all values divided by count
Median
Middle value when ordered (mean of the two middle values when n is even)
50th percentile; less affected by outliers than the mean
Mode
Most frequent value
Value(s) that appear most often
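As a minimal sketch, the three measures above can be computed with Python's standard statistics module. The data set is hypothetical, with a deliberate outlier to show how the mean and the median react differently.

```python
from statistics import mean, median, multimode

# Hypothetical data set; the last value is a deliberate outlier.
values = [2, 3, 3, 5, 7, 100]

avg = mean(values)         # sum of all values divided by the count -> 20
mid = median(values)       # mean of the two middle ordered values  -> 4.0
modes = multimode(values)  # list of the most frequent value(s)     -> [3]

print(avg, mid, modes)
```

The single outlier pulls the mean up to 20, while the median stays at 4.0.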
Measures of Variability
Range
Max - Min
Difference between highest and lowest values
Variance (Sample)
s² = Σ(x - x̄)² / (n-1)
Sum of squared deviations from the mean, divided by n - 1 (roughly the average squared deviation)
Standard Deviation
s = √(variance)
Square root of variance
Interquartile Range
IQR = Q₃ - Q₁
Range of middle 50% of data
Five Number Summary
Minimum, Q₁ (25th percentile), Median (Q₂), Q₃ (75th percentile), Maximum
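The variability measures and the five number summary can be sketched with the standard statistics module. The data set is hypothetical. Quartile conventions differ between textbooks and software; this sketch assumes the default "exclusive" method of statistics.quantiles, so other tools may report slightly different Q₁ and Q₃.

```python
from statistics import quantiles, stdev, variance

# Hypothetical data set, already sorted for readability.
data = [2, 4, 4, 4, 5, 5, 7, 9]

data_range = max(data) - min(data)  # Max - Min -> 7
sample_var = variance(data)         # divides by n - 1 -> 32/7, about 4.571
sample_sd = stdev(data)             # square root of the sample variance

# Assumption: default 'exclusive' quartile method.
q1, q2, q3 = quantiles(data, n=4)   # -> 4.0, 4.5, 6.5
iqr = q3 - q1                       # -> 2.5

five_number = (min(data), q1, q2, q3, max(data))
print(data_range, sample_var, sample_sd, iqr, five_number)
```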
Probability Distributions
Discrete Distributions
Binomial Distribution
P(X = k) = C(n,k) × p^k × (1-p)^(n-k)
Number of successes in a fixed number n of independent trials, each with constant success probability p
Poisson Distribution
P(X = k) = (λ^k × e^(-λ)) / k!
Number of events in a fixed interval when events occur independently at a constant average rate λ
Geometric Distribution
P(X = k) = (1-p)^(k-1) × p
Number of trials up to and including the first success (k = 1, 2, 3, ...)
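The three probability mass functions above translate almost directly into code. This is a sketch using only the math module; the function names and the example arguments are illustrative choices, not part of the formulas.

```python
from math import comb, exp, factorial

def binomial_pmf(k, n, p):
    """P(X = k): k successes in n independent trials with success probability p."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def poisson_pmf(k, lam):
    """P(X = k): k events in an interval with average rate lam."""
    return lam**k * exp(-lam) / factorial(k)

def geometric_pmf(k, p):
    """P(X = k): first success occurs on trial k (k = 1, 2, ...)."""
    return (1 - p)**(k - 1) * p

print(binomial_pmf(2, 4, 0.5))  # 6 * 0.25 * 0.25 = 0.375
print(poisson_pmf(0, 2.0))      # e^(-2), about 0.1353
print(geometric_pmf(3, 0.5))    # 0.5 * 0.5 * 0.5 = 0.125
```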
Continuous Distributions
Normal Distribution
f(x) = (1/(σ√(2π))) × e^(-(x-μ)²/(2σ²))
Bell-shaped, symmetric distribution
Uniform Distribution
f(x) = 1/(b-a) for a ≤ x ≤ b
All values equally likely in interval
Exponential Distribution
f(x) = λe^(-λx) for x ≥ 0
Time between events in Poisson process
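The three density functions above can be written as plain Python functions. This is a sketch with illustrative function names; each density returns 0 outside the support stated in its formula.

```python
from math import exp, pi, sqrt

def normal_pdf(x, mu, sigma):
    """Normal density with mean mu and standard deviation sigma."""
    return exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

def uniform_pdf(x, a, b):
    """Uniform density on [a, b]; zero outside the interval."""
    return 1 / (b - a) if a <= x <= b else 0.0

def exponential_pdf(x, lam):
    """Exponential density with rate lam; zero for negative x."""
    return lam * exp(-lam * x) if x >= 0 else 0.0

print(normal_pdf(0, 0, 1))      # peak of the standard normal, about 0.3989
print(uniform_pdf(0.5, 0, 2))   # 1 / (2 - 0) = 0.5
print(exponential_pdf(0, 2.0))  # lam * e^0 = 2.0
```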
Standard Normal Distribution
Z = (X - μ) / σ
Converts any normal distribution to standard normal (μ=0, σ=1)
Statistical Inference
Confidence Intervals
x̄ ± t(α/2) × (s/√n)
Interval estimate for the population mean at confidence level 1 - α; t(α/2) is taken at n - 1 degrees of freedom (σ unknown)
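A sketch of the interval formula, assuming the critical value is read from a t table and passed in by hand, since the Python standard library has no t distribution. The function name is illustrative. The numbers reuse the sample from Example 3 (x̄ = 22.5, s = 4, n = 25, critical value 2.064 for df = 24).

```python
from math import sqrt

def mean_confidence_interval(sample_mean, s, n, t_crit):
    """Return (lower, upper) for x-bar +/- t_crit * s / sqrt(n)."""
    margin = t_crit * s / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# t_crit = 2.064 is the table value for df = 24 at a 95% confidence level.
lower, upper = mean_confidence_interval(22.5, 4, 25, 2.064)
print(round(lower, 4), round(upper, 4))  # margin = 2.064 * 0.8 = 1.6512
```

The resulting interval, roughly 20.85 to 24.15, does not contain 20, which agrees with the two-sided test in Example 3 rejecting H₀: μ = 20.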
Hypothesis Testing Steps
- State Hypotheses: H₀ (null) and H₁ (alternative)
- Choose Significance Level: α (typically 0.05)
- Calculate Test Statistic: t, z, χ², etc.
- Find P-value: Probability of observing data at least as extreme as the sample, assuming H₀ is true
- Make Decision: Reject H₀ if p-value < α
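The five steps above can be sketched end to end for a two-sided z-test, chosen here only because the standard library's NormalDist can supply the p-value. All input numbers are hypothetical, and σ is assumed known.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: hypotheses. H0: mu = 100, H1: mu != 100 (hypothetical values).
mu_0 = 100.0
sigma = 15.0          # assumed known population standard deviation
sample_mean, n = 106.0, 36

# Step 2: significance level.
alpha = 0.05

# Step 3: test statistic. 6 / (15 / 6) = 2.4
z = (sample_mean - mu_0) / (sigma / sqrt(n))

# Step 4: two-sided p-value, the probability of a |Z| at least this large under H0.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

# Step 5: decision.
reject_h0 = p_value < alpha
print(z, round(p_value, 4), reject_h0)
```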
Common Test Statistics
One-Sample t-test
t = (x̄ - μ₀) / (s/√n)
Test if sample mean differs from hypothesized value
Two-Sample t-test
t = (x̄₁ - x̄₂) / √(s₁²/n₁ + s₂²/n₂)
Compare means of two independent samples (unpooled Welch form; equal variances not assumed)
Z-test
z = (x̄ - μ₀) / (σ/√n)
When population standard deviation is known
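The three statistics above differ only in the standard error in the denominator. A sketch with illustrative function names and hypothetical arguments:

```python
from math import sqrt

def one_sample_t(sample_mean, mu_0, s, n):
    """t = (x-bar - mu_0) / (s / sqrt(n)), using the sample standard deviation."""
    return (sample_mean - mu_0) / (s / sqrt(n))

def two_sample_t(mean_1, mean_2, s1, s2, n1, n2):
    """t = (x-bar_1 - x-bar_2) / sqrt(s1^2/n1 + s2^2/n2), unpooled variances."""
    return (mean_1 - mean_2) / sqrt(s1**2 / n1 + s2**2 / n2)

def z_statistic(sample_mean, mu_0, sigma, n):
    """z = (x-bar - mu_0) / (sigma / sqrt(n)), population sigma known."""
    return (sample_mean - mu_0) / (sigma / sqrt(n))

# Hypothetical inputs chosen so the arithmetic is easy to follow.
print(one_sample_t(52, 50, 6, 9))         # 2 / (6 / 3) = 1.0
print(two_sample_t(10, 8, 3, 4, 9, 16))   # 2 / sqrt(1 + 1), about 1.414
print(z_statistic(52, 50, 6, 9))          # same arithmetic, sigma in place of s
```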
Type I and Type II Errors
- Type I Error (α): Rejecting true H₀ (false positive)
- Type II Error (β): Failing to reject false H₀ (false negative)
- Power: 1 - β, probability of correctly rejecting false H₀
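To make α, β, and power concrete, here is a sketch that computes the power of a two-sided z-test for a given true mean. It assumes σ is known and the sample mean is normally distributed; the function name and the input numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_test_power(mu_0, mu_true, sigma, n, alpha=0.05):
    """Probability that a two-sided z-test rejects H0: mu = mu_0 when the mean is mu_true."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Under mu_true the z statistic is normal with this mean and unit variance.
    shift = (mu_true - mu_0) / (sigma / sqrt(n))
    # Rejection happens when the statistic falls below -z_crit or above +z_crit.
    return std_normal.cdf(-z_crit - shift) + (1 - std_normal.cdf(z_crit - shift))

power = z_test_power(100, 106, 15, 36)
beta = 1 - power                               # Type II error rate
type_i_rate = z_test_power(100, 100, 15, 36)   # H0 true: rejection rate equals alpha
print(round(power, 3), round(beta, 3), round(type_i_rate, 3))
```

When the true mean equals μ₀ the rejection probability reduces to α, the Type I error rate. Raising n in the call increases the power.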
Worked Examples
Example 1: Descriptive Statistics
Data: 12, 15, 18, 20, 22, 25, 28
Calculations:
• Mean: (12+15+18+20+22+25+28)/7 = 140/7 = 20
• Median: Middle value = 20
• Range: 28 - 12 = 16
• Variance: Σ(x-x̄)²/(n-1) = (64+25+4+0+4+25+64)/6 = 186/6 = 31
• Standard Deviation: √31 ≈ 5.57
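The calculations in this example can be checked by applying the formulas step by step. This sketch uses plain Python and spells out the squared deviations rather than calling a library routine.

```python
data = [12, 15, 18, 20, 22, 25, 28]
n = len(data)

mean = sum(data) / n
median = sorted(data)[n // 2]          # n is odd, so there is one middle value
data_range = max(data) - min(data)

# Squared deviation of each value from the mean.
squared_devs = [(x - mean) ** 2 for x in data]
variance = sum(squared_devs) / (n - 1)  # sample variance
std_dev = variance ** 0.5

print(mean, median, data_range)
print(squared_devs, variance, round(std_dev, 2))
```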
Example 2: Normal Distribution
Problem: Heights are normally distributed with μ=70 inches, σ=3 inches. Find P(X ≤ 73).
Solution:
• Standardize: Z = (73-70)/3 = 1
• Look up P(Z ≤ 1) in standard normal table
• P(Z ≤ 1) ≈ 0.8413
Answer: About 84.13% of people are 73 inches or shorter
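Instead of a printed table, the probability can be obtained from the standard library's NormalDist. The sketch computes it both through the standardized z value and directly from the original distribution, which should agree.

```python
from statistics import NormalDist

mu, sigma, x = 70, 3, 73

z = (x - mu) / sigma                     # standardize: (73 - 70) / 3 = 1.0
p_from_z = NormalDist().cdf(z)           # P(Z <= 1) on the standard normal
p_direct = NormalDist(mu, sigma).cdf(x)  # P(X <= 73) without standardizing

print(z, round(p_from_z, 4), round(p_direct, 4))  # both probabilities round to 0.8413
```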
Example 3: Hypothesis Test
Problem: Test if sample mean 22.5 (n=25, s=4) differs from μ₀=20 at α=0.05.
Solution:
• H₀: μ = 20, H₁: μ ≠ 20
• t = (22.5-20)/(4/√25) = 2.5/0.8 = 3.125
• df = 24, critical values: ±2.064
• Since |3.125| > 2.064, reject H₀
Conclusion: At α = 0.05, there is significant evidence that μ ≠ 20
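The arithmetic of this test can be reproduced in a few lines. The critical value 2.064 is copied from the example (t table, df = 24, two-sided α = 0.05) rather than computed, because the standard library does not provide the t distribution.

```python
from math import sqrt

sample_mean, mu_0, s, n = 22.5, 20, 4, 25
t_crit = 2.064  # table value taken from the example, not computed here

df = n - 1
t = (sample_mean - mu_0) / (s / sqrt(n))  # 2.5 / 0.8
reject_h0 = abs(t) > t_crit               # two-sided decision rule

print(df, t, reject_h0)
```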
Example 4: Correlation
Data: X: [1,2,3,4,5], Y: [2,4,6,8,10]
Calculation:
• Every Y value equals 2 × X: perfect positive linear relationship
• Correlation coefficient r = 1.0
• Regression line: y = 2x
• R² = 1.0 (100% of variance explained)
Interpretation: Perfect positive correlation
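The values in this example can be derived from the sums of squares and cross-products. This sketch assumes the usual Pearson correlation and least-squares formulas, which the example itself does not spell out, and writes them out by hand.

```python
from math import sqrt

x = [1, 2, 3, 4, 5]
y = [2, 4, 6, 8, 10]
n = len(x)

mean_x = sum(x) / n
mean_y = sum(y) / n

# Sums of cross-products and squares around the means.
sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))  # 20
sxx = sum((xi - mean_x) ** 2 for xi in x)                         # 10
syy = sum((yi - mean_y) ** 2 for yi in y)                         # 40

r = sxy / sqrt(sxx * syy)             # Pearson correlation coefficient
slope = sxy / sxx                     # least-squares slope
intercept = mean_y - slope * mean_x   # least-squares intercept
r_squared = r ** 2

print(r, slope, intercept, r_squared)
```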