Suppose you choose a real number $X$ from the interval $[2,10]$ with a probability density function of the form \(f(x)=\frac{C}{x}.\)
We know that the density must integrate to $1$ over the sample space $\Omega_x = [2,10]$. Thus,
\[\begin{eqnarray} \int_2^{10} \frac{C}{x}\, dx &=& \left. C\log(x)\right|_2^{10}\\ & = & C\left[\log(10)-\log(2)\right] \\ \Rightarrow C & = & \frac{1}{\log(10)-\log(2)} = \frac{1}{\log(5)} \approx 0.6213 \end{eqnarray}\]

First, factor $g(X) = X^2-12X+35=(X-5)(X-7)$. So $g(X)=0$ when $X=5$ and $X=7$, and $g(X)<0$ between these two roots. Thus, \(\begin{eqnarray} P(X^2-12X+35>0) &=& \int_2^5 \frac{C}{x}\,dx + \int_{7}^{10} \frac{C}{x}\, dx\\ & =& \frac{\log(5/2) + \log(10/7)}{\log(5)}\\ &\approx& 0.791 \end{eqnarray}\)
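As a quick numerical sanity check, the short Python sketch below evaluates the normalizing constant and the probability from the closed-form expressions derived above:

```python
import math

# Normalizing constant for f(x) = C/x on [2, 10]: C = 1/(log 10 - log 2) = 1/log 5
C = 1.0 / (math.log(10) - math.log(2))

# P(X^2 - 12X + 35 > 0) = P(X < 5) + P(X > 7)
p = C * (math.log(5 / 2) + math.log(10 / 7))

print(round(C, 4))  # 0.6213
print(round(p, 3))  # 0.791
```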
A radioactive material emits $\alpha$-particles at a rate described by the density function \(f(t) = 0.1\exp(-0.1t).\) Find the probability that a particle is emitted in the first 10 seconds, given that:
First, note that the cumulative distribution function for this density is given by
\[F(t) = 1 - \exp(-0.1t).\]

Now, let $E_{t<10}$ denote the event that a particle is emitted in the first 10 seconds and let $E_{t>1}$ denote the event that no particle is emitted in the first second. We are interested in the conditional probability $\mathbb{P}(E_{t<10} \mid E_{t>1})$. From the product rule, we have
\[\begin{eqnarray} \mathbb{P}(E_{t<10} \mid E_{t>1}) &=& \frac{\mathbb{P}(E_{t<10}, E_{t>1})}{\mathbb{P}(E_{t>1})}\\ &=& \frac{F(10)-F(1)}{1.0-F(1)}\\ & = &\frac{\exp(-0.1)-\exp(-1.0)}{\exp(-0.1)}\\ & \approx & 0.593 \end{eqnarray}\]

Similar to above,
\[\begin{eqnarray} \mathbb{P}(E_{t<10} \mid E_{t>5}) &=& \frac{\mathbb{P}(E_{t<10}, E_{t>5})}{\mathbb{P}(E_{t>5})}\\ &=& \frac{F(10)-F(5)}{1.0-F(5)}\\ & = &\frac{\exp(-0.5)-\exp(-1.0)}{\exp(-0.5)}\\ & \approx & 0.393 \end{eqnarray}\]

The probability that a particle is emitted in the first ten seconds, given that a particle is emitted in the first 3 seconds, is just $1$, since $E_{t<3} \subset E_{t<10}$.
Using Bayes’ rule, we have \(\begin{eqnarray} \mathbb{P}(E_{t<10} \mid E_{t<20}) &=& \frac{\mathbb{P}(E_{t<20} \mid E_{t<10}) \mathbb{P}(E_{t<10})}{\mathbb{P}(E_{t<20})}\\ & = & \frac{\mathbb{P}(E_{t<10})}{\mathbb{P}(E_{t<20})}\\ & = & \frac{ F(10)}{F(20)} \\ & = & \frac{1.0 - \exp(-1)}{1.0 - \exp(-2)}\\ & \approx &0.731 \end{eqnarray}\)
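The three conditional probabilities above can be checked directly from the exponential CDF $F(t) = 1 - \exp(-0.1t)$; a minimal Python sketch:

```python
import math

def F(t):
    """CDF of the exponential density f(t) = 0.1*exp(-0.1 t)."""
    return 1.0 - math.exp(-0.1 * t)

# P(emitted in first 10 s | no particle in first second)
p1 = (F(10) - F(1)) / (1.0 - F(1))
# P(emitted in first 10 s | no particle in first 5 seconds)
p2 = (F(10) - F(5)) / (1.0 - F(5))
# P(emitted in first 10 s | emitted in first 20 seconds), via Bayes' rule
p3 = F(10) / F(20)

print(round(p1, 3), round(p2, 3), round(p3, 3))  # 0.593 0.393 0.731
```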
In the 1700s, scientists observed that there seemed to be more male births than female births. But what was the real ratio of male to female births? In the late 1700s, Laplace used this question to help form his statistical theories, including his refinement of Bayes’ rule. From birth records in Paris between 1745–1770, Laplace had access to the following data:
| Gender | Live Births |
|---|---|
| female | 241,945 |
| male | 251,527 |
Let $\rho$ denote the probability that a newborn (in Laplace’s time) is male. Assume a binomial likelihood function for the data in the table above with parameter $\rho$ and a uniform prior density.
The prior density is uniform, which means that it is proportional to $1$ over the sample space. Here, the sample space is just the interval $\Omega_\rho = [0,1]$. Thus, the prior density function is given by \(f(\rho) = \left\{\begin{array}{cc} 1 & \rho\in[0,1]\\ 0 & \text{otherwise} \end{array}\right.\)
Let $m$ be the number of male births and $f$ the number of female births. Since $\rho$ is the probability of a male birth, the binomial likelihood function is given by \(f(m,f\mid \rho) = \left(\begin{array}{c}m+f\\ m \end{array}\right)\rho^m (1-\rho)^f.\) This can also be written in terms of the total number of births, $n=f+m$: \(f(m,n\mid \rho) = \left(\begin{array}{c}n\\ m \end{array}\right)\rho^m (1-\rho)^{(n-m)}.\)
Using Bayes’ rule, we have \(\begin{eqnarray} f(\rho \mid m,n) &=& \frac{f(m,n\mid \rho) f(\rho)}{\int_0^1 f(m,n\mid \rho) f(\rho) d\rho} \\ &=& C_1 \left(\begin{array}{c}n\\ m \end{array}\right) \rho^m (1-\rho)^{(n-m)}, \end{eqnarray}\) where the normalizing constant $C_1$ is given by \(\begin{eqnarray} \frac{1}{C_1} &=& \left(\begin{array}{c}n\\ m \end{array}\right) \int_0^1 \rho^m (1-\rho)^{(n-m)} d\rho\\ \end{eqnarray}\)
Note that
\[\begin{eqnarray} \left(\begin{array}{c}n\\ m \end{array}\right) &=& \frac{n!}{m!(n-m)!} \\ &=& \frac{\Gamma(n+1)}{\Gamma(m+1)\Gamma(n-m+1)}. \end{eqnarray}\]

Thus, the normalizing constant can be expressed as \(\begin{eqnarray} \frac{1}{C_1} &=& \frac{\Gamma(n+1)}{\Gamma(m+1)\Gamma(n-m+1)}\int_0^1 \rho^m (1-\rho)^{(n-m)} d\rho. \end{eqnarray}\) The integral is the Beta function $B(m+1,n-m+1) = \Gamma(m+1)\Gamma(n-m+1)/\Gamma(n+2)$, so we obtain \(\begin{eqnarray} \frac{1}{C_1} &=& \left[\frac{\Gamma(n+1)}{\Gamma(m+1)\Gamma(n-m+1)}\right] \frac{\Gamma(m+1)\Gamma(n-m+1)}{\Gamma(n+2)}\\ &=& \frac{\Gamma(n+1)}{\Gamma(n+2)}\\ &=& \frac{n!}{(n+1)!}\\ \Rightarrow C_1 & = & n+1. \end{eqnarray}\)
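The identity $C_1 = n+1$ can be verified numerically for small $n$ by integrating the unnormalized posterior directly (a sketch using a simple midpoint rule; the helper `inv_C1` is introduced here for illustration):

```python
import math

def inv_C1(n, m, steps=200_000):
    """Numerically integrate C(n,m) * rho^m * (1-rho)^(n-m) over [0, 1]."""
    binom = math.comb(n, m)
    h = 1.0 / steps
    # Midpoint rule: evaluate the integrand at the center of each subinterval
    total = sum(((i + 0.5) * h) ** m * (1 - (i + 0.5) * h) ** (n - m)
                for i in range(steps))
    return binom * total * h

# For any 0 <= m <= n, the integral should equal 1/(n+1)
for n, m in [(5, 2), (10, 7), (20, 0)]:
    print(n, m, round(1.0 / inv_C1(n, m)))  # prints n+1 each time
```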
First, note that the uniform prior can be written in this form when $a=1$ and $b=1$, thus \(f(\rho) = \rho^{1-1}(1-\rho)^{1-1}.\)
The posterior is given by \(f(\rho \mid m, f) = \frac{\Gamma(n+2)}{\Gamma(m+1)\Gamma(n-m+1)} \rho^{m}(1-\rho)^{(n-m)}.\) This clearly has the form specified in the question when $a^\ast = m+1$ and $b^\ast = n-m+1$. In general, both the prior and posterior densities are specific cases of the Beta family of probability density functions, which take the form \(f(x) = \frac{x^{a-1}(1-x)^{b-1}}{B(a,b)},\) where \(B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}.\)
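Plugging the birth counts from the table into the Beta posterior gives a concrete summary. The sketch below uses the standard mean and variance formulas for a Beta$(a,b)$ distribution, $\mathbb{E}[\rho] = a/(a+b)$ and $\mathrm{Var}[\rho] = ab/[(a+b)^2(a+b+1)]$, plus a normal approximation to gauge how far $\rho = 0.5$ sits from the posterior mean:

```python
import math

m = 251_527          # male births (Paris, 1745-1770)
f = 241_945          # female births
n = m + f

# Beta posterior parameters under the uniform prior
a_star = m + 1
b_star = n - m + 1

# Mean and standard deviation of a Beta(a, b) distribution
mean = a_star / (a_star + b_star)
std = math.sqrt(a_star * b_star /
                ((a_star + b_star) ** 2 * (a_star + b_star + 1)))

print(round(mean, 4))          # 0.5097
# rho = 0.5 lies roughly 13-14 posterior standard deviations below the
# mean, so P(rho > 0.5) is essentially 1 -- Laplace's conclusion that a
# male birth is more probable than a female birth.
print(round((0.5 - mean) / std, 1))
```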
Complete the coding exercise in BirthSimulation.ipynb.
See the solution repository.
This question is ungraded.
The uniform prior distribution is commonly used to model situations where we have a “lack of information” about a random variable. However, a uniform distribution is not always an “uninformative” choice.
Consider two random variables $\theta \sim U[0,2\pi]$ and $\psi\sim U\left[-\frac{\pi}{2},\frac{\pi}{2} \right]$ used to describe the position (in spherical coordinates) on a unit sphere ($r=1$). Do you think this uniform distribution is uninformative? If not, can you derive a different density on $\theta$ and $\psi$ that produces a less informative prior?
Hint: Consider two patches $A$ and $B$ of equal area anywhere on the surface of the sphere. Does the uniform distribution on $\theta$ and $\psi$ result in $P(A)=P(B)$ for any $A$ and $B$?
For a direct solution to this problem, check out this blog post. While not exactly the same, this idea is related to the concept of the Jeffreys prior, which depends on the form of the likelihood function (but not the observations) and defines a prior that is invariant under changes of coordinates.
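A quick numerical illustration of the hint, without giving away the derivation: take as one patch the polar cap with latitude $\psi > \pi/3$ (a hypothetical choice made for this sketch) and compare the probability the uniform-$\psi$ prior assigns to it against the cap's actual share of the sphere's surface area.

```python
import math
import random

random.seed(0)
N = 200_000

# Sample the latitude psi uniformly on [-pi/2, pi/2], as in the prior above
psi = [random.uniform(-math.pi / 2, math.pi / 2) for _ in range(N)]

# Fraction of samples landing in the polar cap psi > pi/3
frac_sampled = sum(p > math.pi / 3 for p in psi) / N
print(round(frac_sampled, 3))   # ~0.167, i.e. 1/6 of the mass

# The cap's true share of the sphere's surface area is (1 - sin(pi/3))/2
frac_area = (1 - math.sin(math.pi / 3)) / 2
print(round(frac_area, 3))      # 0.067
```

The "uniform" prior puts roughly 1/6 of its mass on a cap covering under 7% of the sphere, so equal-area patches near the poles and near the equator do not receive equal probability.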