Reproducing an example

The  beta distribution

Beta distribution function is “conjugate prior” for Bernoulli distribution (and others). It means that the “form” of posterior distribution is the same as the prior distribution.

The Bernoulli distribution:

p (y | θ) = ch_6_1.png

This “feature” of beta distribution allows us to keep using the same methods over and over because the posterior distribution’s form does not change. We can apply Bayesian steps in repetition.

There is a potential matter of confusion here:
Beta Distribution is different from the Beta function.

Following is the beta distribution:
p (θ | a, b) = beta(θ | a, b)
                     = ch_6_2.png

Where,

B(a, b) = ch_6_3.png

is a  normalizing constant or beta function.

One of the most useful figures to understand what a beta distribution does, is Figure 6.1. It shows how the values of α and β change relative to each other. Let’s try to reproduce it.

Mathematica, of course, has the functions: BetaDistribution and Beta.

ch_6_4.png

ch_6_5.gif

The image above has too much information. Let’s make it exactly like the Figure 6.1:

ch_6_6.png

Graphics:α=0.1, β=0.1 Graphics:α=0.1, β=1 Graphics:α=0.1, β=2 Graphics:α=0.1, β=3
Graphics:α=1, β=0.1 Graphics:α=1, β=1 Graphics:α=1, β=2 Graphics:α=1, β=3
Graphics:α=2, β=0.1 Graphics:α=2, β=1 Graphics:α=2, β=2 Graphics:α=2, β=3
Graphics:α=3, β=0.1 Graphics:α=3, β=1 Graphics:α=3, β=2 Graphics:α=3, β=3

Exercises

Exercise 6.1

(A) Start with a prior distribution that expresses some uncertainty that a coin is fair: beta(θ|4,4). Flip the coin once; suppose we get a head. What is the posterior distribution?

(B) Use the posterior from the previous flip as the prior for the next flip. Suppose we flip again and get a head. Now what is the new posterior?

(C) Using that posterior as the prior for the next flip, flip a third time and get a tail. Now what is the new posterior?

(D) Do the same three updates, but in the order T, H, H instead  of H, H, T. Is the final posterior distribution the same for both orderings of the flip results?

Exploration

For (A) (B) and (C), we just have to increase α and β values as flip’s results come in. Plotting the results is simple, the same as shown in “Reproducing an example” section above.

ch_6_23.png

ch_6_24.gif

Now for (D), the approach is exactly the same, the order of updating α and β changes:

ch_6_25.png

ch_6_26.gif

Resulting distributions are the same, showing data order invariance.

Exercise 6.2

Suppose an election is approaching, and you are interested in knowing whether the general population prefers candidate A or candidate B. There is a just-published poll in the newspaper, which states that of 100 randomly sampled people, 58 preferred candidate A and the remainder preferred candidate B.

(A) Suppose that before the newspaper poll, your prior belief was a uniform distribution. What is the 95% HDI on your beliefs after learning of the newspaper poll results?

(B) You want to conduct a follow-up poll to narrow down your estimate of the population’s preference. In your follow-up poll, you randomly sample 100 other people and find that 57 prefer candidate A and the remainder prefer candidate B. Assuming peoples’ opinions have not changed between polls, what is the 95% HDI on the posterior?

Exploration

Since our prior beliefs about the matter is a uniform distribution, we don’t have to come up with our own values of α and β.
Let’ s plot the distribution of the original data to see what it looks like:

ch_6_27.png

ch_6_28.gif

Let’s see where the peak lies, which tells us “Mode”:

ch_6_29.png

ch_6_30.png

To calculate 95% HDI, some finicky things are involved. They are finicky to me at least:

ch_6_31.png

ch_6_32.png

Thus, 95% HDI lies within the range of 0.482444 - 0.674524.

Next, we update our beliefs given new data we collect ourselves. The results of (A) will become our prior distribution.
We replicate the same process, but now with second set of samples included:

ch_6_33.gif

ch_6_34.gif

ch_6_35.gif

ch_6_36.png

The mode has changed from 0.5816 to 0.5757.

ch_6_37.png

ch_6_38.png

Notice that the initial 95% HDI was {0.4824, 0.6745}, which changed to {0.5060, 0.6425} - it got narrower:

ch_6_39.gif

ch_6_40.png

ch_6_41.png

Exercises 6.3, 6.4, 6.5

These exercises are practically the same as 6.2. As results, they’re interesting. But implementation in Mathematica is just plugging in values of α and β and seeing the plots.
IOW, I am lazy.

Created with the Wolfram Language