

1. Probability Theory
1.1 Set Theory
  • Sample space S is the set of all possible outcomes of a particular experiment.
  • Two events A and B are disjoint (mutually exclusive) if A\cap B=\emptyset.
  • The events A_1,A_2,\cdots are pairwise disjoint if A_i\cap A_j=\emptyset for all i\neq j.
  • If A_1,A_2,\cdots are pairwise disjoint and \bigcup_{i=1}^{\infty}A_i=S, then \{A_i\} forms a partition of S.


1.2 Basics of Probability Theory
  • A collection of subsets of S is called a sigma algebra (or Borel field), denoted by \mathcal{B}, if it contains \emptyset and is closed under complementation and countable unions. For example, if S=\{1,2,3\}, then the power set \mathcal{B}=\{\emptyset,\{1\},\{2\},\{3\},\{1,2\},\{1,3\},\{2,3\},\{1,2,3\}\} is a sigma algebra.
  • Bonferroni's Inequality is useful when it is difficult to calculate the intersection probability: p(A\cap B)\geq p(A)+p(B)-1.
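For example, if p(A)=0.95 and p(B)=0.90, the inequality gives p(A\cap B)\geq 0.85, a useful lower bound even when p(A\cap B) itself is hard to compute.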


1.3 Conditional Probability and Independence
  • Bayes' Rule: p(A|B)=\frac{p(B|A)p(A)}{p(B)}
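As a quick worked example (numbers chosen for illustration), if p(A)=0.01, p(B|A)=0.99, and p(B|A^c)=0.05, then p(B)=0.99\cdot 0.01+0.05\cdot 0.99=0.0594, so p(A|B)=0.0099/0.0594\approx 0.167.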


1.4 Random Variable
  • A random variable is a function from a sample space S into the real numbers.


1.5 Distribution Functions

The cumulative distribution function (cdf) of a random variable X, denoted by F_X(x), is defined by

\begin{matrix}F_X(x)=p_X(X\leq x), & \text{for all } x\end{matrix}

The function F_X(x) is a cdf if and only if the following three conditions hold:

  1. \lim_{x\rightarrow -\infty}F_X(x)=0, and \lim_{x\rightarrow \infty}F_X(x)=1.
  2. F_X(x) is a nondecreasing function of x.
  3. F_X(x) is right-continuous; that is, \lim_{\epsilon \rightarrow 0^+} F_X(x+\epsilon)=F_X(x), for every number x \in \mathbb{R}.
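For example, the Exponential(1) cdf, F_X(x)=1-e^{-x} for x\geq 0 and F_X(x)=0 for x<0, satisfies all three conditions.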


1.6 Density and Mass Functions

The probability mass function (pmf) of a discrete random variable X is given by

\begin{matrix}f_X(x)=p(X=x)=F_X(x)-F_X(x^-),& x\in S\end{matrix}

The probability density function (pdf) of a continuous random variable X is the function that satisfies

f_X(x) = F^{'}_X(x)

wherever the derivative F^{'}_X(x) exists. Equivalently,

F_X(x) = \int_{-\infty}^x f_X(t)dt
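As a quick numerical sanity check of this relationship, here is a minimal sketch (the Exponential(1) density is an illustrative choice, not from the notes): integrating the pdf from the lower end of its support up to x should recover the cdf.

```python
import math

# Exponential(1): f_X(t) = exp(-t) for t >= 0, with cdf F_X(x) = 1 - exp(-x).

def pdf(t):
    return math.exp(-t) if t >= 0 else 0.0

def cdf_numeric(x, steps=100_000):
    # midpoint rule on [0, x]; the pdf is 0 below 0, so this is the full integral
    if x <= 0:
        return 0.0
    h = x / steps
    return sum(pdf((i + 0.5) * h) for i in range(steps)) * h

for x in (0.5, 1.0, 2.0):
    print(x, round(cdf_numeric(x), 6), round(1 - math.exp(-x), 6))
```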


2. Transformations and Expectations
2.1 Distributions of Functions of a Random Variable

Formally, if we write y=g(x), the function g(x) defines a mapping from the original sample space of X, \mathcal{X}, to a new sample space \mathcal{Y}.

g(x): \mathcal{X} \rightarrow \mathcal{Y}


We associate with g an inverse mapping, denoted by g^{-1}, which is a mapping from subsets of \mathcal{Y} to subsets of \mathcal{X}, and is defined by

g^{-1}(A)=\left \{ x\in \mathcal{X}: g(x) \in A \right \}


If the random variable Y is now defined by Y=g(X), we can write for any subset A \subset \mathcal{Y},

\begin{align*}p(Y\in A) &= p(g(X) \in A)\\&=p(\left \{x\in \mathcal{X} : g(x)\in A \right \})\\&=p(X\in g^{-1}(A))\end{align*}


If X is a discrete random variable, then \mathcal{X} is countable. The sample space for Y=g(X) is also a countable set. Thus, Y is also a discrete random variable. The pmf of Y is

\begin{align*}f_Y(y)&=p(Y=y)\\&=p(g(X)=y)\\&=\sum_{x\in g^{-1}(y)} p(X=x)\\&=\sum_{x\in g^{-1}(y)} p_X(x)\end{align*}
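As a small sketch of this formula (the choice of X uniform on \{-2,-1,0,1,2\} and g(x)=x^2 is illustrative, not from the notes), the pmf of Y is obtained by summing f_X(x) over the preimage g^{-1}(y):

```python
from collections import defaultdict
from fractions import Fraction

# X uniform on {-2, -1, 0, 1, 2}; Y = g(X) = X**2.
f_X = {x: Fraction(1, 5) for x in (-2, -1, 0, 1, 2)}
g = lambda x: x * x

f_Y = defaultdict(Fraction)
for x, p in f_X.items():
    f_Y[g(x)] += p          # accumulate mass over the preimage of each y

print(dict(f_Y))            # masses: f_Y(0) = 1/5, f_Y(1) = 2/5, f_Y(4) = 2/5
```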


If X and Y are continuous random variables with X \sim f_X(x), the cdf of Y is

\begin{align*}F_Y(y)&=p(Y\leq y)\\&=p(g(X)\leq y)\\&=p(\left \{x\in \mathcal{X} : g(x)\leq y\right \})\\&=\int_{\left \{x\in \mathcal{X} : g(x) \leq y\right \}} f_X(x)dx\end{align*}


Binomial transformation

A discrete random variable X has a binomial distribution if its pmf is of the form

\begin{matrix}f_X(x)=p(X=x)=\binom{n}{x}p^x(1-p)^{n-x},& (x=0,1,\cdots,n)\end{matrix}

If Y=n-X, its pmf is

\begin{align*}f_Y(y)&=p(Y=y)\\&=p((n-X)=y)\\&=p(X=(n-y))\\&=f_X(n-y)\\&=\binom{n}{n-y} p^{n-y}(1-p)^y \sim Bin(n,1-p)\end{align*}
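A quick numerical check of this identity, with hypothetical values n=5 and p=0.3: f_X(n-y) should equal the Binomial(n,1-p) pmf at y.

```python
from math import comb

n, p = 5, 0.3
for y in range(n + 1):
    lhs = comb(n, n - y) * p ** (n - y) * (1 - p) ** y   # f_X(n - y)
    rhs = comb(n, y) * (1 - p) ** y * p ** (n - y)       # Binomial(n, 1-p) pmf at y
    assert abs(lhs - rhs) < 1e-12
print("Y = n - X follows Binomial(n, 1 - p)")
```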


Uniform transformation

Suppose X has a uniform distribution on the interval (0,2\pi) and let Y=g(X)=\sin^2(X), that is,

f_X(x)=\begin{cases}\frac{1}{2\pi} & 0<x<2\pi \\ 0 & \text{otherwise}\end{cases}

\begin{align*}F_Y(y)&=p(Y\leq y)\\&=p(g(X)\leq y)\\&=p(X\leq x_1)+p(x_2\leq X \leq x_3)+p(x_4\leq X \leq 2\pi)\\&=F_X(x_1)+(F_X(x_3)-F_X(x_2))+(F_X(2\pi)-F_X(x_4))\end{align*}

where 0<x_1<x_2<x_3<x_4<2\pi are the four solutions of \sin^2(x)=y in (0,2\pi) for 0<y<1.


If y=g(x) is a monotone function, then g^{-1} is single-valued; that is, g^{-1}(y)=x if and only if y=g(x). If g(x) is increasing,

F_Y(y)=\int_{\left \{x\in \mathcal{X} : x \leq g^{-1}(y)\right \}} f_X(x)dx=\int_{-\infty}^{g^{-1}(y)}f_X(x)dx=\color{red}{F_X(g^{-1}(y))}

If g(x) is decreasing, we have

F_Y(y)=\int_{\left \{x\in \mathcal{X} : x \geq g^{-1}(y)\right \}} f_X(x)dx=\int^{\infty}_{g^{-1}(y)}f_X(x)dx=\color{blue}{1-F_X(g^{-1}(y))}


Let X have pdf f_X(x) and let Y=g(X), where g is a monotone function. Suppose that f_X(x) is continuous on \mathcal{X} and that g^{-1}(y) has a continuous derivative on \mathcal{Y}. Then the pdf of Y is given by

f_Y(y)=\begin{cases}f_X(g^{-1}(y))\left | \frac{d}{dy}g^{-1}(y)\right | & y \in \mathcal{Y} \\ 0 & \text{otherwise}\end{cases}
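A minimal simulation sketch of this formula (the choices X \sim Exponential(1) and Y=g(X)=\sqrt{X} are illustrative assumptions): here g^{-1}(y)=y^2 and \left|\frac{d}{dy}g^{-1}(y)\right|=2y, so f_Y(y)=2y\,e^{-y^2} and F_Y(y)=1-e^{-y^2}, which the empirical cdf of simulated values should match.

```python
import math, random

random.seed(0)
samples = [math.sqrt(random.expovariate(1.0)) for _ in range(200_000)]

for y in (0.5, 1.0, 1.5):
    empirical = sum(s <= y for s in samples) / len(samples)   # empirical cdf at y
    formula = 1 - math.exp(-y * y)                            # F_Y(y) from the formula
    print(f"y={y}: empirical {empirical:.4f}  vs  formula {formula:.4f}")
```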


Probability integral transformation

Let X have continuous cdf F_X(x) and Y=F_X(X). Then Y is uniformly distributed on (0,1).
(p(Y\leq y)=y, 0<y<1)

\begin{matrix}Y=F_X(X) & \Rightarrow & Y \sim Uniform(0,1)\end{matrix}

\begin{align*}F_Y(y)&=p(Y\leq y)\\&=p(F_X(X)\leq y)\\&=p(F_X^{-1}[F_X(X)] \leq F_X^{-1}(y))\\&=p(X\leq F_X^{-1}(y))\\&=F_X(F_X^{-1}(y))\\&=y\end{align*}

One application is in the generation of random samples from a particular distribution. For many distributions there are many other methods of generating observations that take less computing time, but this method is still useful because of its general applicability.
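A minimal sketch of this idea (inverse transform sampling; the Exponential(\lambda) target is an illustrative choice): draw U \sim Uniform(0,1) and set X=F_X^{-1}(U).

```python
import math, random

def sample_exponential(lam, n, seed=0):
    # F_X(x) = 1 - exp(-lam*x), so F_X^{-1}(u) = -log(1 - u) / lam
    rng = random.Random(seed)
    return [-math.log(1.0 - rng.random()) / lam for _ in range(n)]

xs = sample_exponential(lam=2.0, n=100_000)
print(sum(xs) / len(xs))   # should be close to the true mean 1/lam = 0.5
```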


2.2 Expected Values

The expected value or mean of a random variable g(X), denoted by Eg(X), is

Eg(X)=\begin{cases}\sum_x g(x)f_X(x) & \text{if }X\text{ is discrete} \\\int g(x)f_X(x)dx & \text{if }X\text{ is continuous} \end{cases}

If E|g(X)|=\infty, we say that Eg(X) does not exist.
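For example, if X is the outcome of a roll of a fair six-sided die, EX=\sum_{x=1}^{6}x\cdot\frac{1}{6}=3.5.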


Cauchy random variable

An example of a random variable whose expected value does not exist.
\left (\int_{-\infty}^{\infty}f_X(x)dx=1 \text{, but } E|X|=\infty \right )

\begin{matrix}f_X(x)=\frac{1}{\pi}\cdot \frac{1}{1+x^2}, & (-\infty < x < \infty)\end{matrix}
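A short simulation sketch of this pathology: running means of standard Cauchy draws keep jumping around instead of settling, unlike draws from a distribution with a finite mean. (A standard Cauchy variate can be generated as \tan(\pi(U-\frac{1}{2})) with U \sim Uniform(0,1).)

```python
import math, random

rng = random.Random(1)
total = 0.0
for i in range(1, 100_001):
    total += math.tan(math.pi * (rng.random() - 0.5))   # one standard Cauchy draw
    if i in (10, 100, 1_000, 10_000, 100_000):
        print(i, total / i)   # running mean does not converge
```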


2.3 Moments and Moment Generating Functions
  • The n^\text{th} moment of X (or of F_X(x)): {\mu_n}'=EX^n
  • The n^\text{th} central moment of X \begin{matrix}\mu_n=E(X-\mu)^n,& \mu={\mu_1}'=EX\end{matrix}
  • Moment Generating Function (MGF) M_X(t)=Ee^{tX}
  • The n^\text{th} moment is equal to the n^\text{th} derivative of M_X(t) evaluated at t=0 (see the sketch after this list): EX^n=\left . \frac{d^n}{dt^n}M_X(t)\right |_{t=0}
  • The sequence of moments does not uniquely determine a distribution function.
    (That is, there may be two distinct random variables having the same moments.)
  • If X and Y have bounded support, then F_X(u) = F_Y(u) for all u iff EX^r=EY^r for all integers r=0,1,2,\cdots.
  • If the MGFs exist and M_X(t)=M_Y(t) for all t, then F_X(u)=F_Y(u) for all u.
  • Convergence of MGFs: if \lim_{i\rightarrow \infty} M_{X_i}(t)=M_X(t) for all t in a neighborhood of 0 and M_X(t) is an MGF, then \lim_{i\rightarrow \infty} F_{X_i}(x) = F_X(x) for all x at which F_X(x) is continuous.
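As a check of the moment relation in the list above, here is a small symbolic sketch (it assumes the sympy library and uses the standard normal, whose MGF is M_X(t)=e^{t^2/2}, as an illustrative example; its first four moments are 0, 1, 0, 3).

```python
import sympy as sp

t = sp.symbols('t')
M = sp.exp(t**2 / 2)            # MGF of the standard normal

for n in range(1, 5):
    # n-th derivative of the MGF evaluated at t = 0 gives EX^n
    print(n, sp.diff(M, t, n).subs(t, 0))
```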


Poisson approximation

Binomial probabilities can be approximated by Poisson probabilities when n is large and p is small. Suppose that X\sim Binomial(n,p) and Y\sim Poisson(\lambda), with \lambda=np.

\begin{align*}M_X(t)&=\left [ pe^t + (1-p)\right ]^n\\M_Y(t)&=e^{\lambda(e^t-1)}\end{align*}

\begin{align*}\lim_{n\rightarrow \infty}M_X(t)&=\lim_{n\rightarrow \infty}\left [pe^t+(1-p)\right ]^n\\&=\lim_{n\rightarrow \infty}\left [\frac{\lambda}{n}e^t+ \left ( 1-\frac{\lambda}{n} \right ) \right ]^n\\&=\lim_{n\rightarrow \infty}\left [1+\frac{\lambda(e^t-1)}{n}\right ]^n=e^{\lambda(e^t-1)}=M_Y(t)\end{align*}
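A numerical look at this approximation with hypothetical values n=1000 and p=0.002 (so \lambda=np=2): the two pmfs should agree closely for small x.

```python
from math import comb, exp, factorial

n, p = 1000, 0.002
lam = n * p

for x in range(6):
    binom = comb(n, x) * p**x * (1 - p)**(n - x)      # Binomial(n, p) pmf
    poisson = exp(-lam) * lam**x / factorial(x)       # Poisson(lam) pmf
    print(x, round(binom, 6), round(poisson, 6))
```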


2.4 Differentiating Under an Integral Sign

Leibnitz's Rule

If f(x,\theta) is differentiable with respect to \theta, then \frac{d}{d\theta}\int_a^b f(x,\theta)dx=\int_a^b \frac{\partial}{\partial\theta}f(x,\theta)dx
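A small symbolic sketch of the rule (it assumes the sympy library; the integrand f(x,\theta)=x^\theta on [0,1] is an illustrative choice): differentiating the integral in \theta should equal integrating the partial derivative.

```python
import sympy as sp

x, theta = sp.symbols('x theta', positive=True)

lhs = sp.diff(sp.integrate(x**theta, (x, 0, 1)), theta)      # d/dtheta of the integral
rhs = sp.integrate(sp.diff(x**theta, theta), (x, 0, 1))      # integral of the partial derivative
print(sp.simplify(lhs - rhs))   # expected: 0
```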


Lebesgue's Dominated Convergence Theorem

Suppose that h(x,y) is continuous at y_0 for each x, and there exists a function g(x) satisfying

  1. |h(x,y)| \leq g(x) for all x and y,
  2. \int g(x)dx < \infty.

Then \lim_{y\rightarrow y_0}\int h(x,y)dx=\int \lim_{y \rightarrow y_0} h(x,y)dx


Suppose that f(x,\theta) is differentiable in \theta and there exists a function g(x,\theta) such that

  1. \left | \left . \frac{\partial}{\partial \theta} f(x,\theta) \right | _{\theta={\theta}'} \right | \leq g(x,\theta) for all {\theta}' such that |{\theta}'-\theta|\leq\delta_0,
  2. \int_{-\infty}^{\infty} g(x,\theta)dx < \infty.

Then \frac{d}{d\theta}\int_{-\infty}^{\infty} f(x,\theta)dx=\int_{-\infty}^{\infty} \frac{\partial}{\partial\theta}f(x,\theta)dx
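A matching symbolic sketch for the infinite-range version (again assuming sympy; f(x,\theta)=\theta e^{-\theta x} on (0,\infty) is an illustrative choice): since the integral is identically 1, both sides should be 0.

```python
import sympy as sp

x, theta = sp.symbols('x theta', positive=True)
f = theta * sp.exp(-theta * x)

lhs = sp.diff(sp.integrate(f, (x, 0, sp.oo)), theta)   # d/dtheta of 1
rhs = sp.integrate(sp.diff(f, theta), (x, 0, sp.oo))   # integral of the partial derivative
print(sp.simplify(lhs), sp.simplify(rhs))              # expected: 0 0
```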


