Computational Statistics & Data Analysis (MVComp2)

Lecture 1: Basic concepts in probability theory

Tristan Bereau

Institute for Theoretical Physics, Heidelberg University

Introduction

Literature

Today’s lecture
Based on: Ch. 2-6 in Wackerly, Mendenhall, and Scheaffer (2014)
Concepts of probability
Cox (1946), Berger (2013)

Probability: Intuition (or lack thereof)

Monty Hall problem

Suppose you’re on a game show, and you’re given the choice of three doors: Behind one door is a car; behind the others, goats. You pick a door, say No. 1, and the host, who knows what’s behind the doors, opens another door, say No. 3, which has a goat. He then says to you, “Do you want to pick door No. 2?” Is it to your advantage to switch your choice?
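A quick way to build intuition for this puzzle is to simulate it. The sketch below assumes the standard rules (the host always opens a goat door that the contestant did not pick); the function names `play_monty_hall` and `win_rate` are illustrative only.

```python
import random

def play_monty_hall(switch: bool, rng: random.Random) -> bool:
    """Play one round; return True if the contestant wins the car."""
    doors = [0, 1, 2]
    car = rng.choice(doors)
    pick = rng.choice(doors)
    # The host opens a door that is neither the contestant's pick nor the car.
    opened = rng.choice([d for d in doors if d != pick and d != car])
    if switch:
        # Switch to the only remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == car

def win_rate(switch: bool, num_rounds: int = 100_000, seed: int = 0) -> float:
    """Estimate the winning probability as a relative frequency."""
    rng = random.Random(seed)
    wins = sum(play_monty_hall(switch, rng) for _ in range(num_rounds))
    return wins / num_rounds

print(f"Win rate when staying:   {win_rate(switch=False):.3f}")
print(f"Win rate when switching: {win_rate(switch=True):.3f}")
```

Under these assumptions the estimates should settle near 1/3 for staying and 2/3 for switching.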

Birthday paradox

23 people stand in a room. What is the probability that at least 2 people share a birthday (i.e., only day and month)?

Rolling the dice

Probability of rolling seven 1’s out of 10 dice?

Simple calculations of expectation values
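One possible reading of the dice question is "exactly seven 1's when rolling 10 fair dice", which is a binomial count. The sketch below makes that assumption explicit; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

num_dice, num_ones = 10, 7
p_one = Fraction(1, 6)  # assumption: fair dice

# Binomial probability: choose which dice show a 1, times the probability of that pattern.
prob_exactly = comb(num_dice, num_ones) * p_one**num_ones * (1 - p_one) ** (num_dice - num_ones)

print(f"P(exactly {num_ones} ones in {num_dice} dice) = {float(prob_exactly):.3e}")
```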

Are the dice balanced?

How do we test for such a hypothesis?

More difficult questions

What is the probability that the Big Bang gave rise to at least one life-harboring planet in the Universe?

How can you even define probabilities if there was only ever one realization?

What is the probability for a physical constant to lie within certain limits?

The constant is unique and deterministic, but our knowledge is incomplete!

Probability: Conceptual views

Probability views

Frequentist view
Relative frequency over infinitely many repetitions (i.e., ensemble)

\[ P(\text{event}) = \lim_{N\to\infty} \frac{\# \text{ events}}{N} \]
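The limit above can be illustrated numerically: the relative frequency of an event approaches its probability as the number of repetitions grows. The following is a minimal sketch assuming a fair die and the event "roll a six"; `relative_frequency` is an illustrative helper.

```python
import random

def relative_frequency(num_trials: int, rng: random.Random) -> float:
    """Fraction of fair-die rolls that show a six."""
    hits = sum(rng.randint(1, 6) == 6 for _ in range(num_trials))
    return hits / num_trials

rng = random.Random(42)
for num_trials in (10, 1_000, 100_000):
    print(f"N = {num_trials:>7}: {relative_frequency(num_trials, rng):.4f}")
print(f"Exact value: {1 / 6:.4f}")
```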

Subjective view
Reasonable expectation and personal belief, e.g., Bayesian view

Set-theoretic concepts

Probability space

Probability space
triple \((\Omega, \mathcal{F}, \mathbb{P})\)
\(\Omega\)
Sample space: set of possible outcomes from an experiment
\(\mathcal{F}\)
Event space: set of events, i.e., subsets of \(\Omega\) (all possible subsets when \(\Omega\) is discrete)
\(\mathbb{P}\)
Probability measure: mapping from event \(E \subset \Omega\) to number in \([0, 1]\)

Sample space

Sample space
set/space of all elementary outcomes \(\omega_k \in \Omega\) of a probability experiment; elementary outcomes cannot be further dissected
Discrete sample space
\(\Omega = \{ \omega_1, \dots, \omega_K \}\) or, e.g., \(\Omega = \mathbb{N}\); finite or countably infinite
Continuous sample space
e.g., \(\Omega = \mathbb{R}\), \(\Omega = [0,1]^d\)

Events

Events E
subsets of the sample space; the set of events must form a \(\sigma\)-algebra

Roll the dice: \(\Omega = \{ 1,2,3,4,5,6 \}\)

  • \(E_0 = \emptyset\)
  • \(E_1 = \{2, 4, 6 \}\) (“even numbers”)
  • \(E_2 = \{1, 3, 5 \}\) (“odd numbers”)
  • \(E_3 = \Omega\)
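For a finite collection of events, the \(\sigma\)-algebra conditions reduce to containing \(\Omega\) and being closed under complements and pairwise unions. The sketch below checks this for the four events listed above; `is_sigma_algebra` is an illustrative helper, not a library function.

```python
omega = frozenset({1, 2, 3, 4, 5, 6})

def is_sigma_algebra(events, sample_space):
    """Check the sigma-algebra conditions for a finite collection of events."""
    events = set(events)
    contains_omega = sample_space in events
    closed_complement = all(sample_space - e in events for e in events)
    # For a finite collection, pairwise unions are enough to guarantee countable unions.
    closed_union = all(a | b in events for a in events for b in events)
    return contains_omega and closed_complement and closed_union

even = frozenset({2, 4, 6})
odd = frozenset({1, 3, 5})

print(is_sigma_algebra({frozenset(), even, odd, omega}, omega))  # the four events above
print(is_sigma_algebra({frozenset(), even, omega}, omega))       # 'odd' removed
```

Removing the odd numbers breaks closure under complements, so the second collection is not a valid event space.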

Partition

Set \(\{ A_1, \dots, A_K \}\) such that

  1. \(A_i \cap A_j = \emptyset \quad \forall i \neq j\)
  2. \(\Omega = A_1 \cup A_2 \cup \dots \cup A_K\)

then for any \(B \subseteq \Omega: B = (B \cap A_1) \cup (B \cap A_2) \cup \dots \cup (B \cap A_K)\)
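The two partition conditions and the resulting decomposition of \(B\) can be checked with Python sets. The particular partition and the set `B` below are arbitrary choices for illustration.

```python
from itertools import combinations

omega = set(range(1, 7))
partition = [{1, 2}, {3, 4}, {5, 6}]  # illustrative choice of A_1, A_2, A_3
B = {2, 3, 5}                         # illustrative subset of omega

# Condition 1: the blocks are pairwise disjoint.
pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(partition, 2))
# Condition 2: the blocks cover the sample space.
covers_omega = set().union(*partition) == omega

# Decompose B into its intersections with each block and reassemble it.
pieces = [B & a for a in partition]
reassembled = set().union(*pieces)

print(pairwise_disjoint, covers_omega)
print(pieces, reassembled == B)
```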

Probability measure

Function \(P: E \to [0, 1]\) with the following axioms/properties for any \(A \in E\):

  1. \(P(A) \geq 0\)
  2. \(P(\Omega) = 1\)
  3. Sum rule: \(A_i \cap A_j = \emptyset \quad \forall i\neq j \Rightarrow P(A_1 \cup A_2 \cup \dots) = \sum_{i=1}^{\infty} P(A_i)\)
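As a sanity check, the three axioms can be verified for a concrete measure on a finite sample space. The sketch below uses a hypothetical loaded die; the weights are made up for illustration.

```python
from fractions import Fraction

# Hypothetical loaded die: faces 1 and 2 are twice as likely as the others.
weights = {1: Fraction(1, 4), 2: Fraction(1, 4), 3: Fraction(1, 8),
           4: Fraction(1, 8), 5: Fraction(1, 8), 6: Fraction(1, 8)}
omega = set(weights)

def prob(event):
    """Probability of an event as the sum of its elementary probabilities."""
    return sum((weights[w] for w in event), Fraction(0))

A, B = {1, 2}, {5}  # two disjoint events

print(all(p >= 0 for p in weights.values()))  # axiom 1 on elementary outcomes
print(prob(omega) == 1)                       # axiom 2
print(prob(A | B) == prob(A) + prob(B))       # axiom 3 for disjoint A and B
```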

Probability measure

The probability function \(P\) is not fixed by the sample space alone; it needs to be constructed for each probability experiment, e.g.,

  1. Prob. function for throwing a single fair die: uniform
  2. Sample space for throwing two dice \(\Omega = \{ (m,n) | m, n \in \{1,2,3,4,5,6\}\}\). Prob. of getting both numbers above 4?
  3. A and B play two tennis games. Odds are 2:1 for A to win any game. What is the probability that A wins at least one game? (example 2.4 in Wackerly, Mendenhall, and Scheaffer (2014))
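Examples 2 and 3 can be worked out by enumerating the sample space. The sketch below assumes fair dice, reads "odds 2:1" as \(P(\text{A wins a game}) = 2/3\), and assumes the two games are independent.

```python
from fractions import Fraction
from itertools import product

# Example 2: two fair dice, both numbers above 4.
omega_dice = list(product(range(1, 7), repeat=2))
both_above_4 = [(m, n) for m, n in omega_dice if m > 4 and n > 4]
p_both_above_4 = Fraction(len(both_above_4), len(omega_dice))

# Example 3: two independent tennis games, odds 2:1 for A in each game.
p_win = {"A": Fraction(2, 3), "B": Fraction(1, 3)}
p_a_at_least_one = sum(
    p_win[g1] * p_win[g2]
    for g1, g2 in product("AB", repeat=2)
    if "A" in (g1, g2)
)

print(p_both_above_4)    # both dice show 5 or 6
print(p_a_at_least_one)  # A wins at least one of the two games
```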

Intersections, unions, subsets

Complement
\[ \overline A = \Omega\setminus A \] such that \(A \cup \overline A = \Omega\)

Basic set rules: De Morgan’s laws

Let \(A, B \subseteq \Omega\)

\[ \overline{A \cup B} = \overline{A} \cap \overline{B} \]

\[ \overline{A \cap B} = \overline{A} \cup \overline{B} \]

Basic set rules (cont’d)

Let \(A, B, C \subseteq \Omega\)

\[ A \cap (B \cup C) = (A \cap B) \cup (A \cap C) \]

\[ A \cup (B \cap C) = (A \cup B) \cap (A \cup C) \]
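De Morgan's laws and the distributive rules can be checked by brute force over all subsets of a small sample space. This is a numerical check on one example, not a proof; the four-element \(\Omega\) and the helper names are illustrative.

```python
from itertools import combinations, product

omega = frozenset(range(1, 5))

def subsets(s):
    """All subsets of a finite set, as frozensets."""
    items = sorted(s)
    return [frozenset(c) for r in range(len(items) + 1) for c in combinations(items, r)]

def complement(a):
    return omega - a

all_subsets = subsets(omega)

de_morgan_holds = all(
    complement(a | b) == complement(a) & complement(b)
    and complement(a & b) == complement(a) | complement(b)
    for a, b in product(all_subsets, repeat=2)
)
distributive_holds = all(
    a & (b | c) == (a & b) | (a & c)
    and a | (b & c) == (a | b) & (a | c)
    for a, b, c in product(all_subsets, repeat=3)
)

print(de_morgan_holds, distributive_holds)
```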

Combinatorial rules

23 people stand in a room. What is the probability that at least 2 people share a birthday (i.e., only day and month)?

  • Sample space: \(N=365^{23}\) elements in \(\Omega\)
  • Number of elements with distinct birthdays: \(K=365 \times 364 \times \dots \times 343\)

\[P = 1 - \frac KN\]

Counting permutations

Number of possibilities to assign \(N\) distinct objects to \(K\leq N\) slots

  • Without replacement \[ N \times (N-1) \times \dots \times (N-K+1) = \frac{N!}{(N-K)!} \]

  • With replacement \[ N^K \]
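Both counting formulas can be confirmed by explicit enumeration for small numbers. The sketch below uses \(N=5\) and \(K=3\) as arbitrary illustrative values.

```python
from itertools import permutations, product
from math import factorial, perm

N, K = 5, 3  # illustrative values
objects = range(N)

# Without replacement: ordered selections of K distinct objects.
without_replacement = len(list(permutations(objects, K)))
# With replacement: every slot can hold any of the N objects.
with_replacement = len(list(product(objects, repeat=K)))

print(without_replacement, factorial(N) // factorial(N - K), perm(N, K))
print(with_replacement, N**K)
```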

Combinatorial rules

23 people stand in a room. What is the probability that at least 2 people share a birthday (i.e., only day and month)?

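from math import factorial

num_people = 23
# Size of the sample space: each person can have any of 365 birthdays
num_ele_space = 365**num_people
# Outcomes with all birthdays distinct: 365 * 364 * ... * 343 (exact integer)
num_ele_distinct = factorial(365) // factorial(365 - num_people)
# Complement rule: P(at least one shared birthday) = 1 - K/N
prob = 1. - num_ele_distinct / num_ele_space

print(f'Probability for same birthday is {prob:5.3f}')
Probability for same birthday is 0.507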
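The same calculation can be wrapped in a function to see how the probability depends on the group size. The sketch below assumes 365 equally likely birthdays; `prob_shared_birthday` is an illustrative name.

```python
from math import perm

def prob_shared_birthday(num_people: int, num_days: int = 365) -> float:
    """P(at least two of num_people share a birthday), assuming uniform birthdays."""
    if num_people > num_days:
        return 1.0  # pigeonhole principle
    return 1.0 - perm(num_days, num_people) / num_days**num_people

# Smallest group size for which a shared birthday is more likely than not.
smallest = next(n for n in range(1, 366) if prob_shared_birthday(n) > 0.5)

print(smallest)
print(f"{prob_shared_birthday(22):.3f} {prob_shared_birthday(23):.3f}")
```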

Basic probability laws

Conditional probability

Conditional probability of event A given that event B has occurred (with \(P(B) > 0\)) \[ P(A|B) = \frac{P(A \cap B)}{P(B)} \] If A and B are independent, then \[ P(A \cap B) = P(A)P(B) \] so that \(P(A|B) = P(A)\)
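To make the definition concrete, the sketch below enumerates two fair dice and computes a conditional probability by counting. The events (sum at least 10, first die shows 6, second die even) are illustrative choices.

```python
from fractions import Fraction
from itertools import product

omega = list(product(range(1, 7), repeat=2))  # two fair dice

def prob(event):
    """Probability of an event given as a predicate on outcomes (uniform measure)."""
    return Fraction(sum(1 for w in omega if event(w)), len(omega))

def sum_at_least_10(w):
    return w[0] + w[1] >= 10

def first_is_6(w):
    return w[0] == 6

def second_is_even(w):
    return w[1] % 2 == 0

# P(A|B) = P(A and B) / P(B)
p_a_and_b = prob(lambda w: sum_at_least_10(w) and first_is_6(w))
p_a_given_b = p_a_and_b / prob(first_is_6)

# Independence check: the two dice do not influence each other.
p_joint = prob(lambda w: first_is_6(w) and second_is_even(w))
independent = p_joint == prob(first_is_6) * prob(second_is_even)

print(prob(sum_at_least_10), p_a_given_b, independent)
```

Knowing that the first die shows a 6 raises the probability of a sum of at least 10, so these two events are not independent, whereas events on different dice are.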

Additive law

\[ P(A\cup B) = P(A) + P(B) - P(A \cap B) \]
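The subtraction of \(P(A \cap B)\) corrects for outcomes counted twice. A small check on a fair die, with illustrative events "even" and "greater than 3":

```python
from fractions import Fraction

omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """Uniform probability measure on a fair die."""
    return Fraction(len(event), len(omega))

A = {2, 4, 6}  # even numbers
B = {4, 5, 6}  # numbers greater than 3

lhs = prob(A | B)
rhs = prob(A) + prob(B) - prob(A & B)

print(lhs, rhs)
```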

Event composition

Patient with a disease will respond to treatment with prob 0.9. If 3 patients are treated independently, find the probability that at least one will respond

  • \(A\): At least one patient will respond
  • \(B_1\): The first patient will not respond
  • \(B_2\): The second patient will not respond
  • \(B_3\): The third patient will not respond

\[ P(A) = 1 - P(\overline A) = 1 - P(B_1 \cap B_2 \cap B_3) = 1 - P(B_1)P(B_2)P(B_3) = 1 - 0.1^3 = 0.999 \]
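The complement trick can be compared with a direct enumeration of all response patterns. The sketch below uses the response probability 0.9 from the example and treats the three patients as independent, as stated.

```python
from fractions import Fraction
from itertools import product

p_respond = Fraction(9, 10)

def pattern_prob(pattern):
    """Probability of one response pattern, e.g. (True, False, True), under independence."""
    p = Fraction(1)
    for responds in pattern:
        p *= p_respond if responds else 1 - p_respond
    return p

patterns = list(product([True, False], repeat=3))

# Direct route: sum over all patterns with at least one responder.
p_enum = sum(pattern_prob(pt) for pt in patterns if any(pt))
# Complement route: 1 - P(nobody responds).
p_complement = 1 - (1 - p_respond) ** 3

print(p_enum, p_complement)
```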

Law of total probability

Assume \(\{ A_1, A_2, \dots, A_K \}\) is a partition of \(\Omega\), then \[ P(B) = \sum_{i=1}^K P(B | A_i) P(A_i) = \sum_{i=1}^K P(B, A_i) \]

\(P(B, A_i)\)
the joint (or bivariate) probability function for \(B\) and \(A_i\)
\(P(B)\)
is called a marginal probability
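A small worked example of marginalizing over a partition: suppose three machines produce all items of a factory (the partition \(A_i\)) and \(B\) is the event "item is defective". All numbers below are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical shares of production, P(A_i); they sum to 1 (partition).
p_machine = [Fraction(1, 2), Fraction(3, 10), Fraction(1, 5)]
# Hypothetical defect rates, P(B | A_i).
p_defect_given_machine = [Fraction(1, 100), Fraction(2, 100), Fraction(5, 100)]

# Joint probabilities P(B, A_i) = P(B | A_i) P(A_i).
p_joint = [pb * pa for pb, pa in zip(p_defect_given_machine, p_machine)]
# Marginal probability P(B) = sum_i P(B, A_i).
p_defect = sum(p_joint)

print(p_joint, p_defect)
```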

Bayes’ rule

\[ P(A_j | B) = \frac{P(B|A_j)P(A_j)}{\sum_{i=1}^K P(B|A_i)P(A_i)} = \frac{P(B|A_j)P(A_j)}{P(B)} \]

\(P(A_j)\)
prior (or a-priori) probability
\(P(A_j|B)\)
posterior probability
\(P(B|A_j)\)
likelihood
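A common illustration of prior, likelihood, and posterior is a diagnostic test. The sketch below uses the partition {disease, no disease}; the prevalence, sensitivity, and false-positive rate are hypothetical numbers chosen for illustration.

```python
from fractions import Fraction

prior = {"disease": Fraction(1, 100), "healthy": Fraction(99, 100)}      # P(A_j), hypothetical
likelihood = {"disease": Fraction(95, 100), "healthy": Fraction(5, 100)}  # P(positive | A_j), hypothetical

# Denominator of Bayes' rule: law of total probability, P(positive).
evidence = sum(likelihood[a] * prior[a] for a in prior)

# Posterior P(A_j | positive) for each element of the partition.
posterior = {a: likelihood[a] * prior[a] / evidence for a in prior}

print(evidence)
print(posterior["disease"], float(posterior["disease"]))
```

With these numbers the posterior probability of disease after a positive test is about 16%, far below the test's sensitivity, because the prior is small.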

Summary

Summary

  • Many scientific problems are probabilistic
  • Concept/philosophy: frequentist vs. subjective (Bayesian)
  • Mathematical framework: Probability space triplet \((\Omega, \mathcal{F}, \mathbb{P})\): sample space, event space, probability measure
  • Operations on sets: Union, intersection, complement
  • Probability laws: conditional probability, additive law, law of total probability, Bayes’ rule

References

Berger, James O. 2013. Statistical Decision Theory and Bayesian Analysis. Springer Science & Business Media.
Cox, Richard T. 1946. “Probability, Frequency and Reasonable Expectation.” American Journal of Physics 14 (1): 1–13.
Savage, Leonard J. 1961. “The Foundations of Statistics Reconsidered.” In Proceedings of the Fourth Berkeley Symposium on Mathematical Statistics and Probability, Volume 1: Contributions to the Theory of Statistics, 4:575–87. University of California Press.
Wackerly, Dennis, William Mendenhall, and Richard L Scheaffer. 2014. Mathematical Statistics with Applications. Cengage Learning.