# Probability

## The Formal Theory Of Probability

In Kolmogorov's theory, probabilities are numerical values that are assigned to "events." The numbers are non-negative; they have a maximum value of 1; and the probability that one of two mutually exclusive events occurs is the sum of their individual probabilities. Stated more formally, given a set Ω and a privileged set F of subsets of Ω, probability is a function P from F to the real numbers that obeys, for all X and Y in F, the following three axioms:
A1. P(X) ≥ 0 (Non-negativity)
A2. P(Ω) = 1 (Normalization)
A3. P(X ∪ Y) = P(X) + P(Y) if X ∩ Y = Ø (Additivity)
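
For a finite set of outcomes, the three axioms can be checked mechanically. The following is a minimal sketch, assuming that every subset of the outcome set counts as an event; the names `power_set`, `satisfies_axioms`, `uniform`, and `squared`, and the three-outcome example, are illustrative and do not come from Kolmogorov's text.

```python
from fractions import Fraction
from itertools import chain, combinations


def power_set(omega):
    """Return all subsets of omega as frozensets (assumed here to be the events)."""
    items = list(omega)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1)
    )
    return [frozenset(s) for s in subsets]


def satisfies_axioms(omega, P):
    """Check non-negativity, normalization, and additivity on a finite omega."""
    events = power_set(omega)
    non_negative = all(P(X) >= 0 for X in events)
    normalized = P(frozenset(omega)) == 1
    # Additivity is only required for pairs of disjoint events.
    additive = all(
        P(X | Y) == P(X) + P(Y)
        for X in events
        for Y in events
        if not (X & Y)
    )
    return non_negative and normalized and additive


# Hypothetical example: three equally likely outcomes.
omega = {"a", "b", "c"}


def uniform(X):
    """Assign each event the fraction of outcomes it contains."""
    return Fraction(len(X), len(omega))


def squared(X):
    """Non-negative and normalized, but not additive."""
    return Fraction(len(X), len(omega)) ** 2


print(satisfies_axioms(omega, uniform))  # True
print(satisfies_axioms(omega, squared))  # False
```

The second function fails only the additivity check, which suggests that the third axiom is what distinguishes a probability from an arbitrary normalized, non-negative set function.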

Kolmogorov goes on to give an infinite generalization of (A3), so-called countable additivity. He also defines the conditional probability of A given B by the formula:
P(A|B) = P(A ∩ B) / P(B), provided P(B) > 0

Thus, we can say that the probability that the toss of a fair die results in a 6 is 1/6, but the probability that it results in a 6, given that it results in an even number, is (1/6) / (1/2) = 1/3.
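
The die calculation can be reproduced with exact arithmetic. This is a small sketch, assuming a uniform probability over the six faces; the helper names `prob` and `cond_prob` are illustrative.

```python
from fractions import Fraction

# Outcomes of one toss of a fair die (assumed equally likely).
OMEGA = frozenset(range(1, 7))


def prob(event):
    """Uniform probability of an event, given as a set of outcomes."""
    return Fraction(len(event & OMEGA), len(OMEGA))


def cond_prob(a, b):
    """Conditional probability of a given b; undefined when b has probability 0."""
    if prob(b) == 0:
        raise ValueError("conditioning event must have positive probability")
    return prob(a & b) / prob(b)


six = frozenset({6})
even = frozenset({2, 4, 6})

print(prob(six))             # 1/6
print(cond_prob(six, even))  # 1/3
```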

Important consequences of these axioms include various forms of Bayes's theorem, notably:
P(H|E) = [P(H)/P(E)] P(E|H) = P(H)P(E|H)/[P(H)P(E|H) + P(∼H)P(E|∼H)]

This theorem provides the basis for Bayesian confirmation theory, which appeals to such probabilities in its account of the evidential support that a piece of evidence E provides for a hypothesis H. P(E|H) is called the "likelihood" (the probability that the hypothesis gives to the evidence) and P(H) the "prior probability" of H (the probability of the hypothesis in the absence of any evidence whatsoever).
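
To see how the prior and the likelihoods combine, here is a brief numerical sketch of the second form of Bayes's theorem. The scenario and all of its numbers are hypothetical: H is the hypothesis that a die is rigged to always show 6, with an assumed prior of 1/10, and E is the evidence that a single toss shows 6.

```python
from fractions import Fraction


def posterior(prior_h, lik_e_given_h, lik_e_given_not_h):
    """Return P(H|E) from P(H), P(E|H), and P(E|~H) via Bayes's theorem."""
    prior_not_h = 1 - prior_h
    # Total probability of the evidence: P(H)P(E|H) + P(~H)P(E|~H).
    p_e = prior_h * lik_e_given_h + prior_not_h * lik_e_given_not_h
    if p_e == 0:
        raise ValueError("evidence has probability 0; posterior undefined")
    return prior_h * lik_e_given_h / p_e


# Hypothetical inputs: prior 1/10, P(E|H) = 1, P(E|~H) = 1/6 for a fair die.
result = posterior(Fraction(1, 10), Fraction(1), Fraction(1, 6))
print(result)  # 2/5
```

Under these assumed numbers, observing a 6 raises the probability of the hypothesis from 1/10 to 2/5, which is the sense in which E supports H.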

Events A and B are said to be independent if P(A ∩ B) = P(A) P(B). If P(A) > 0 and P(B) > 0, this is equivalent to P(A|B) = P(A) and to P(B|A) = P(B). Intuitively, information about the occurrence of one of the events does not alter the probability of the other. Thus, the outcome of a particular coin toss is presumably independent of the result of the next presidential election. Independence plays a central role in probability theory. For example, it underpins the various important "laws of large numbers," whose content is roughly that certain well-behaved processes are very likely in the long run to yield frequencies that would be expected on the basis of their probabilities.
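
Both ideas can be illustrated with a short sketch, assuming two tosses of a fair coin with four equally likely outcomes. The first part tests the product condition exactly; the second simulates independent tosses with a pseudo-random generator, so the printed frequencies depend on the seed and are only suggestive of the long-run behaviour. The names `independent` and `heads_frequency` are illustrative.

```python
import random
from fractions import Fraction
from itertools import product

# Two tosses of a fair coin: four equally likely outcomes (an assumption).
OMEGA = frozenset(product("HT", repeat=2))


def prob(event):
    """Uniform probability of an event, given as a set of outcomes."""
    return Fraction(len(event), len(OMEGA))


def independent(a, b):
    """Test the product condition for independence of events a and b."""
    return prob(a & b) == prob(a) * prob(b)


first_heads = frozenset(w for w in OMEGA if w[0] == "H")
second_heads = frozenset(w for w in OMEGA if w[1] == "H")
both_heads = first_heads & second_heads

print(independent(first_heads, second_heads))  # True
print(independent(first_heads, both_heads))    # False


def heads_frequency(n, seed=0):
    """Relative frequency of heads in n simulated independent fair tosses."""
    rng = random.Random(seed)
    return sum(rng.random() < 0.5 for _ in range(n)) / n


# Frequencies tend to settle near 1/2 as the number of tosses grows.
for n in (10, 1_000, 100_000):
    print(n, heads_frequency(n))
```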

While the mathematics of Kolmogorov's probability theory is well understood and thoroughly developed (a classic text is Feller), its interpretation remains controversial. We now turn to several rival accounts of what probabilities are and how they are to be determined (see Hájek for more detailed discussion).