STA 247 - Week 3 lecture summary

Bayes' Rule

Bayes' Rule provides a way of switching around conditional probabilities, useful if we know P(A|B) but need P(B|A). It has several forms. If A and B are events with P(A)>0 and P(B)>0,

P(B|A) = P(A|B)P(B) / P(A)
as can easily be proved by substituting the definition of the conditional probabilities. We can expand the denominator using the law of total probability to get
P(B|A) = P(A|B)P(B) / [ P(A|B)P(B) + P(A|Bc)P(Bc) ]
If B1, B2, ... are disjoint events whose union is the whole sample space (that is, they are a partition of S), then for any i we can write
P(Bi|A) = P(A|Bi) P(Bi) / [ P(A | B1) P(B1) + P(A | B2) P(B2) + P(A | B3) P(B3) + ... ]

Example:

Suppose 2/3 of the computers in our computer lab have a model B disk drive, with the rest having a different model disk drive. Suppose we choose a computer at random from the lab after some time when the computers weren't used, and test whether its disk drive is working correctly. Let B be the event that the computer chosen has a model B drive, and F be the event that the drive has failed. Suppose that previous testing has shown that the probability of failure over this period for model B drives is 0.01, and for the other model of drive the probability of failure is 0.05. What is the probability that the computer has a model B drive if its drive has failed?

What we want is P(B|F). We can get it with Bayes' Rule, as

P(B|F) = P(F|B)P(B) / [ P(F|B)P(B) + P(F|Bc)P(Bc) ] = 0.01 x (2/3) / [ 0.01 x (2/3) + 0.05 x (1/3) ] = 2/7
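As a quick check, here is the same calculation in R (variable names chosen for readability):

```r
p_B <- 2/3             # P(B): the computer has a model B drive
p_F_given_B  <- 0.01   # P(F|B): failure probability for model B drives
p_F_given_Bc <- 0.05   # P(F|Bc): failure probability for the other model

p_B_given_F <- p_F_given_B * p_B /
  (p_F_given_B * p_B + p_F_given_Bc * (1 - p_B))
p_B_given_F   # 2/7, about 0.286
```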

Example:

We roll a die, and then flip a coin the number of times showing on the die. Let B6 be the event of rolling a six, and let A be the event that at least one coin flip is a head. What is P(B6|A)?

Using Bayes' Rule with the partition B1, B2, ..., B6, we get

P(B6|A) = (1/6)(1-1/64) / [ (1/6)(1-1/2) + (1/6)(1-1/4) + ... + (1/6)(1-1/64) ] = (63/64) / (321/64) = 21/107, or about 0.196
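Evaluating this in R, using the fact that 1-(1/2)^i is the probability of at least one head in i flips:

```r
p_A_given_B <- 1 - (1/2)^(1:6)   # P(A|Bi) for i = 1, ..., 6
p_B <- rep(1/6, 6)               # each face of the die equally likely

p_B6_given_A <- p_A_given_B[6] * p_B[6] / sum(p_A_given_B * p_B)
p_B6_given_A   # 21/107, about 0.196
```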

Bayes' Rule in terms of odds

Rather than work with the probability of an event B, we can work with the odds for event B over Bc, which are defined as P(B)/P(Bc). It's easy to show that if R=P(B)/P(Bc), then P(B)=R/(1+R), so we can go back and forth between probability and odds, using whichever is convenient.

In terms of odds, Bayes' Rule is simple:

P(B|A)/P(Bc|A) = [P(B)/P(Bc)] x [P(A|B)/P(A|Bc)]

Using this for the disk drive example above, we see that the "prior odds", P(B)/P(Bc), are (2/3)/(1/3)=2, and the ratio P(F|B)/P(F|Bc) is 0.01/0.05=1/5, so the odds for B given F are 2x(1/5)=2/5. The probability of B given F is therefore (2/5)/(1+(2/5))=2/7, the same as found above.
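In R, the same odds calculation looks like this:

```r
prior_odds <- (2/3) / (1/3)        # P(B)/P(Bc) = 2
likelihood_ratio <- 0.01 / 0.05    # P(F|B)/P(F|Bc) = 1/5
posterior_odds <- prior_odds * likelihood_ratio   # 2/5

# Convert the posterior odds back to a probability with R/(1+R).
posterior_odds / (1 + posterior_odds)   # 2/7
```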

More on interpretations of probability

Recall that there are at least three approaches to specifying and interpreting probabilities. The Multiplication Rule and Bayes' Rule are especially useful when we know some probabilities from frequencies of occurrence or from our personal beliefs, and need to find other probabilities from them. When we assume all outcomes are equally likely, we could in theory find all probabilities by just counting numbers of outcomes in events, but even in this case it may sometimes be easier to use Bayes' Rule or the Multiplication Rule.

For what sample spaces are outcomes equally likely?

Recall that we earlier counted how many outcomes are in the sample space for drawing k balls from an urn containing n balls, with or without replacement, and with or without paying attention to the order. Now, let's ask for each of these four scenarios whether it is reasonable to assume that the outcomes are equally likely.

Drawing with replacement, ordered result:

The sample space has n^k outcomes. There is no reason to consider any of these to be more likely than any other.

Drawing without replacement, ordered result:

There are n!/(n-k)! outcomes. Again, there is no reason to distinguish any of these as being more likely than any other.

Drawing without replacement, unordered result:

There are C(n,k) outcomes. We might wonder whether regarding these as equally likely is consistent with regarding the ordered outcomes as equally likely. It is, because each unordered outcome corresponds to k! ordered outcomes - the same number for each unordered outcome. So equally likely ordered outcomes imply equally likely unordered outcomes.

Drawing with replacement, unordered result:

There are C(k+n-1,k) outcomes. In this situation, considering these to be equally likely would be inconsistent with considering the ordered outcomes to be equally likely, because the number of ordered outcomes for each unordered outcome varies. Another way of seeing that assuming the outcomes are equally likely isn't reasonable is that there are differences in outcomes that can lead to their not being equally likely. If there are three balls, R, G, B, and we draw four times, the outcome where we draw R four times has a different probability from the outcome where we draw R once, G once, and B twice: there's only one way to draw R four times, but there are twelve ways (4!/2!) to draw R once, G once, and B twice (RGBB, BRGB, BBGR, etc.)
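We can confirm these counts in R by enumerating all 3^4 = 81 ordered draws with replacement:

```r
# All 3^4 = 81 ordered results of drawing four times, with replacement,
# from balls R, G, B.
draws <- expand.grid(rep(list(c("R","G","B")), 4), stringsAsFactors = FALSE)

n_all_R <- sum(apply(draws, 1, function (d) all(d == "R")))
n_RGBB  <- sum(apply(draws, 1, function (d)
                 sum(d == "R") == 1 && sum(d == "G") == 1 && sum(d == "B") == 2))

c(n_all_R, n_RGBB)   # 1 way to draw RRRR, 12 ways to draw one R, one G, two B
```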

The Monty Hall puzzle

Monty Hall hosts a TV game show in which contestants try to win cars. Before each contestant plays, a car is put behind one of three closed doors. The other two doors have goats behind them. The contestant gets to choose one of the three doors. Monty Hall then opens one of the other two doors - one which does NOT contain the car - and then gives the contestant the chance to switch to the remaining unopened door if they wish. Should the contestant switch?

We know from past shows that this is always how the game works, that the car is equally likely to be behind any of the three doors, and that Monty Hall is equally likely to open either door when he has a choice.

NOTE: The above may or may not correspond to the actual operation of the ancient game show "Let's Make a Deal" that Monty Hall hosted.
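Under exactly the assumptions stated above, we can estimate by simulation in R how often a contestant wins by staying and how often by switching:

```r
set.seed(1)
n_games <- 100000
wins_stay <- 0
wins_switch <- 0

for (g in 1:n_games) {
  car <- sample(3, 1)    # door hiding the car, equally likely
  pick <- sample(3, 1)   # contestant's initial choice
  # Monty opens a door that is neither the pick nor the car,
  # choosing at random when he has a choice.
  openable <- setdiff(1:3, c(pick, car))
  opened <- if (length(openable) == 1) openable else sample(openable, 1)
  switched <- setdiff(1:3, c(pick, opened))
  wins_stay <- wins_stay + (pick == car)
  wins_switch <- wins_switch + (switched == car)
}

c(stay = wins_stay/n_games, switch = wins_switch/n_games)
```

Running this makes the answer to the puzzle apparent.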

The boys and girls puzzle

A couple you've just met invite you over to dinner, saying "come by around 5pm, and we can talk for a while before our three kids come home from their school at 6pm".

You arrive at the appointed time, and are invited into the house. Walking down the hall, your host points to three closed doors and says, "those are the kids' bedrooms". You stumble a bit when passing one of these doors, and accidentally push the door open. There you see a dresser with a jewelry box, and a bed on which a dress has been laid out. "Ah", you think to yourself, "I see that at least one of their three kids is a girl".

Your hosts sit you down in the kitchen, and leave you there while they go off to get goodies from the stores in the basement. While they're away, you notice a letter from the principal of the local school tacked up on the refrigerator. "Dear Parent", it begins, "Each year at this time, I write to all parents, such as yourself, who have a boy or boys in the school, asking you to volunteer your time to help the boys' hockey team..." "Umm", you think, "I see that they have at least one boy as well".

That, of course, leaves only two possibilities: Either they have two boys and one girl, or two girls and one boy. What are the probabilities of these two possibilities?

NOTE: You should assume that a child is equally likely to be a boy or girl when you know nothing about them (though in reality the probabilities of boys and girls are not exactly equal), and you should assume that there are no identical twins or triplets (who are always the same sex). You should assume all other things that it seems you're meant to assume, and not assume things that you aren't told to assume. If things can easily be imagined in either of two ways, you should assume that they are equally likely. For example, you may be able to imagine a reason that a family with two boys and a girl would be more likely to have invited you to dinner than one with two girls and a boy. If so, this would affect the probabilities of the two possibilities. But if your imagination is that good, you can probably imagine the opposite as well. You should assume that any such extra information not mentioned in the story is not available.
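We can explore the puzzle by simulation in R. The sketch below commits to one natural reading of the story: the room you stumbled into belongs to a uniformly chosen child, and the letter tells you only that at least one child is a boy. Deciding whether that reading is the right one is part of the puzzle, so treat the numbers it produces as an answer under those assumptions only.

```r
set.seed(1)
n_fam <- 100000

# Each of three kids is independently a boy or a girl with probability 1/2.
kids <- matrix(sample(c("B","G"), 3*n_fam, replace = TRUE), ncol = 3)
n_boys <- rowSums(kids == "B")

# The room you stumbled into belongs to a uniformly chosen child.
seen <- kids[cbind(1:n_fam, sample(3, n_fam, replace = TRUE))]

# Keep only families consistent with the evidence: the seen room is a
# girl's, and the letter shows there is at least one boy.
consistent <- seen == "G" & n_boys >= 1

mean(n_boys[consistent] == 2)   # estimate of P(two boys and one girl)
mean(n_boys[consistent] == 1)   # estimate of P(one boy and two girls)
```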

Demo of a probability calculation in R

Suppose we draw k balls from an urn containing n balls, of which r are red and n-r are green. What is the probability of the event E that all the red balls you draw are drawn before any green balls?

It's reasonable to assume equally likely ordered outcomes, so P(E)=#(E)/#(S). The number of outcomes in the sample space, #(S), is n!/(n-k)!. To find the number of outcomes in E, we can sum over each possible number, j, of red balls that you draw. For each j, we find the number of outcomes in E in which we draw j red balls by multiplying the number of ways of drawing j red balls from r red balls in the urn by the number of ways of drawing k-j green balls from the n-r green balls in the urn. This gives the following result:

#(E) = SUM(over j) [ r! / (r-j)! ] x [ (n-r)! / ((n-r)-(k-j))! ]
The range of j (possible numbers of red balls drawn) is limited by the number of balls drawn (k), the number of red balls in the urn (r), and the number of green balls in the urn (n-r). The lowest possible value for j is max(0,k-(n-r)); the highest possible value is min(k,r).

An R function that computes the probability of E can be defined as follows:

# Compute P(E), the probability that all red balls drawn come before
# any green balls, when drawing k balls without replacement from an
# urn with n balls, of which r are red.
f <- function (n, r, k)
{
  # Possible numbers of red balls drawn, j, limited by k, r, and n-r.
  j <- max(0,k-(n-r)) : min(k,r)

  # Sum the counts of ordered outcomes in E over j, then divide by
  # the total number of ordered outcomes, n!/(n-k)!.
  sum ( 
        (factorial(r)/factorial(r-j)) * (factorial(n-r)/factorial((n-r)-(k-j))) 
      ) / (factorial(n)/factorial(n-k))
}
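As a sanity check, we can compare f against a Monte Carlo estimate, here for n=10 balls (3 red, 7 green) with k=4 drawn (f is repeated so this snippet stands alone):

```r
# f repeated from above so this check stands alone.
f <- function (n, r, k)
{
  j <- max(0,k-(n-r)) : min(k,r)
  sum ( (factorial(r)/factorial(r-j)) * (factorial(n-r)/factorial((n-r)-(k-j))) ) /
    (factorial(n)/factorial(n-k))
}

# Monte Carlo estimate for n=10 balls (3 red, 7 green), drawing k=4.
set.seed(1)
in_E <- replicate (100000, {
  balls <- sample(rep(c("R","G"), times=c(3,7)))[1:4]  # one ordered draw
  j <- sum(balls == "R")                               # number of reds drawn
  all(balls[seq_len(j)] == "R")                        # all reds before any green?
})

c(exact = f(10,3,4), estimate = mean(in_E))   # both should be near 0.35
```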

Mutual Independence

This wasn't covered in the third week's lectures, but is actually needed for Question 2 of Assignment 1, so I've added it here.

Definition: We say that n events A1, A2, ..., An are mutually independent if the probability of the intersection of any subset of these events is equal to the product of the probabilities of the events in the subset. For example, when n=3, A1, A2, A3 are mutually independent if all the following are true:

P(A1 intersect A2 intersect A3) = P(A1) P(A2) P(A3)
P(A1 intersect A2) = P(A1) P(A2)
P(A1 intersect A3) = P(A1) P(A3)
P(A2 intersect A3) = P(A2) P(A3)

One can show that if A1, A2, ..., An are mutually independent, then so are A1c, A2, ..., An (where A1c is the complement of A1), and similarly if any other of the events A1, A2, ..., An are complemented.

When, as in question 2 of assignment 1, several events are said to be independent without further qualification, you can assume that mutual independence is meant. However, there is also the concept of pairwise independence, which means that any two of the events are independent. Mutual independence implies pairwise independence, but it is possible for three or more events to be pairwise independent but not be mutually independent.
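A classic example uses two fair coin flips: let A1 be the event that the first flip is heads, A2 that the second flip is heads, and A3 that the two flips match. A quick enumeration in R of the four equally likely outcomes confirms that these events are pairwise independent but not mutually independent:

```r
# Sample space: the four equally likely outcomes of two fair coin flips.
flips <- expand.grid(first = c("H","T"), second = c("H","T"),
                     stringsAsFactors = FALSE)
p <- rep(1/4, 4)

A1 <- flips$first == "H"             # first flip is heads
A2 <- flips$second == "H"            # second flip is heads
A3 <- flips$first == flips$second    # the two flips match

prob <- function (e) sum(p[e])

# Each pair is independent ...
prob(A1 & A2) == prob(A1) * prob(A2)   # TRUE
prob(A1 & A3) == prob(A1) * prob(A3)   # TRUE
prob(A2 & A3) == prob(A2) * prob(A3)   # TRUE

# ... but the three events are not mutually independent.
prob(A1 & A2 & A3) == prob(A1) * prob(A2) * prob(A3)   # FALSE: 1/4 vs 1/8
```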