Recall that we defined the variance of a random variable, *X*,
to be Var(*X*) = E[(*X*-E(*X*))^{2}].

A different formula for variance is sometimes more convenient:

Var(X) = E(X^{2}) - [E(X)]^{2}

We can prove this as follows:

Var(X) = E[(X-E(X))^{2}]

= E[X^{2} - 2XE(X) + [E(X)]^{2}]

= E(X^{2}) - 2E(X)E(X) + [E(X)]^{2}

= E(X^{2}) - [E(X)]^{2}

The third step uses linearity of expectation and the fact that E(X) is a constant, so it can be taken outside the expectation.

This formula is useful theoretically, but isn't recommended for computation on computers using approximate floating-point arithmetic. Since the two quantities subtracted might be large, but their difference small, the formula can result in much of the precision of the computation being lost.
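A small sketch of this precision problem, using made-up data with a large mean but a small spread (the numbers here are chosen purely to provoke cancellation in IEEE double-precision arithmetic):

```python
# Hypothetical data: small spread around a huge mean.
xs = [1e9 + d for d in (0.0, 0.5, 1.0, 1.5, 2.0)]

n = len(xs)
mean = sum(xs) / n

# Naive formula: E(X^2) - [E(X)]^2.  Both terms are near 1e18,
# so their small difference is swamped by rounding error.
naive_var = sum(x * x for x in xs) / n - mean * mean

# Definitional (two-pass) formula: E[(X - E(X))^2].
# The deviations are small and computed exactly here.
two_pass_var = sum((x - mean) ** 2 for x in xs) / n

print(naive_var, two_pass_var)
```

On a machine with IEEE double precision, the two-pass formula gives the exact answer of 0.5, while the naive formula loses all significant digits (here it comes out as 0.0).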

Looking at the above formula this way can also be useful:

E(X^{2}) = Var(X) + [E(X)]^{2}

For any random variables, *X* and *Y*, we saw earlier that
E(*X*) = E(E(*X*|*Y*)). There's a similar formula
for the variance of *X*:

Var(X) = E(Var(X|Y)) + Var(E(X|Y))

For example, *X* might be the time in seconds that some program
takes to run, and *Y* might be the input to this program. The run
time *X* will vary randomly, for two reasons. First, the run
time may vary randomly even for a given input (eg, due to random device
interrupts occupying time while the program is running). This is what
the first term above corresponds to. Secondly, the input may come
randomly from some distribution, and the average run time may be
different for different inputs. This is what the second term above
corresponds to.
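We can check this decomposition numerically on a toy discrete model of the run-time example. The distributions below are invented for illustration: *Y* is one of two equally likely inputs, and given the input, *X* is uniform over a few possible run times.

```python
# Hypothetical toy model: Y is the input, X | Y is the run time.
p_y = {0: 0.5, 1: 0.5}                 # distribution of the input Y
x_given_y = {0: [1.0, 2.0, 3.0],       # run times for input 0, equally likely
             1: [10.0, 12.0]}          # run times for input 1, equally likely

def mean(vs): return sum(vs) / len(vs)
def var(vs):
    m = mean(vs)
    return sum((v - m) ** 2 for v in vs) / len(vs)

# Direct Var(X), from the joint distribution of (Y, X).
joint = [(p_y[y] / len(x_given_y[y]), x)
         for y in p_y for x in x_given_y[y]]
e_x = sum(p * x for p, x in joint)
var_x_direct = sum(p * (x - e_x) ** 2 for p, x in joint)

# E(Var(X|Y)): average of the within-input variances.
e_var = sum(p_y[y] * var(x_given_y[y]) for y in p_y)

# Var(E(X|Y)): variance of the per-input mean run times.
e_cond = sum(p_y[y] * mean(x_given_y[y]) for y in p_y)
var_e = sum(p_y[y] * (mean(x_given_y[y]) - e_cond) ** 2 for y in p_y)

print(var_x_direct, e_var + var_e)   # the two results agree
```

Most of the variance here comes from the second term, since the two inputs have very different average run times.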

We can prove this formula as follows:

Var(X) = E(X^{2}) - [E(X)]^{2}

= E(E(X^{2}|Y)) - [E(E(X|Y))]^{2}

= E(Var(X|Y) + [E(X|Y)]^{2}) - E([E(X|Y)]^{2}) + Var(E(X|Y))

= E(Var(X|Y)) + Var(E(X|Y))

The third line rewrites E(X^{2}|Y) as Var(X|Y) + [E(X|Y)]^{2}, and rewrites [E(E(X|Y))]^{2} as E([E(X|Y)]^{2}) - Var(E(X|Y)), applying the earlier identity to the random variable E(X|Y). The two E([E(X|Y)]^{2}) terms then cancel, giving the final result.