BELIEVE ME NOT! -- A SKEPTIC'S GUIDE


Statistical Analysis

It's all very well to say that one should always report the results of measurements with uncertainties (or ``errors'' as they are often misleadingly called) specified; but this places a burden of judgement on the experimenter, who must estimate uncertainties in a manner fraught with individual idiosyncrasies. Wouldn't it be nice if there were a way to measure one's uncertainty in a rigorous fashion?

Well, there is. It is a little tedious and complicated, but easily understood: one must make a large number of repeated measurements of the same thing and analyze the ``scatter'' of the answers!

Suppose we are trying to determine the ``true'' value of the quantity x. (We usually refer to unspecified things as ``x'' in this business.)  It could be your pulse rate or some other simple physical observable.

We make N independent measurements $x_i$ $(i=1,2,3,\dots,N)$ under as close to identical conditions as we can manage. Each measurement, we suspect, is not terribly precise; but we don't know just how imprecise. (It could be largely due to some factor beyond our control; pulse rates, for instance, fluctuate for many reasons.)

Now, the $x_i$ will ``scatter'' around the ``true'' x in a distribution that will put some $x_i$ smaller than the true x and others larger. We assume that whatever the cause of the scatter, it is basically random - i.e. the exact value of one measurement $x_{i+1}$ is not directly influenced by the value $x_i$ obtained on the previous measurement. (Actually, perfect randomness is not only hard to define, but rather difficult to arrange in practice; it is sufficient that most fluctuations are random enough to justify the treatment described here.)  It is intuitively obvious (and can even be rigorously proved in most cases) that our best estimate for the ``true'' x is the average or mean value, $\bar{x}$, given by:

\begin{displaymath}\bar{x} \equiv {1 \over N} \sum_{i=1}^N \; x_i.
\end{displaymath} (5.7)
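For readers who like to see this done concretely, here is a minimal Python sketch of Eq. (5.7); the pulse-rate values are invented purely for illustration:

    # Hypothetical repeated measurements of a pulse rate, in beats per minute;
    # the numbers are invented for this example.
    measurements = [72.0, 75.0, 71.0, 74.0, 73.0, 76.0, 72.0, 73.0]

    N = len(measurements)
    x_bar = sum(measurements) / N          # Eq. (5.7): the mean
    print(f"mean = {x_bar:.2f}")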

But what is the uncertainty in $\bar{x}$? Let's call it $\bar{\sigma}_x$.

How can we find $\bar{\sigma}_x$ mathematically from the data? Well, if we assume that each individual measurement $x_i$ has the same single-measurement uncertainty $\sigma_x$, then the distribution of the $x_i$ should look like a ``bell-shaped curve'' or Gaussian distribution:


  
Figure: A typical graph of ${\cal D}(x)$, the distribution of x, defined as the relative frequency of occurrence of different values of x from successive measurements. The ``centre'' of the distribution is at $\bar{x}$, the average or mean of x. The ``width'' of the distribution is $2\sigma$ (one $\sigma$ on either side of the mean).
[Figure: PS/distrib.ps]

Obviously, $\Delta x_i \equiv x_i - \bar{x}$ is a measure of the ``error'' in the $i^{\rm th}$ measurement, but we cannot just find the average of the $\Delta x_i$, since by definition the sum of all the $\Delta x_i$ is zero (there are just as many negative errors as positive errors). The way out of this dilemma is to take the average of the squares of the $\Delta x_i$, which are all positive. This ``mean square'' error is called the variance, $s_x^2$:

\begin{displaymath}s_x^2 \equiv {1 \over N} \sum_{i=1}^{N} \; (x_i - \bar{x})^2
\end{displaymath} (5.8)

and its square root, the ``root mean square error'', is called the standard deviation - which can be shown (rigorously, in many cases, although not without a good deal of math) to be the best possible estimate for the single-measurement uncertainty $\sigma_x$.
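Continuing the same illustrative sketch, Eq. (5.8) and the standard deviation take only a couple of lines (note that we divide by N, as in the text; many statistics texts divide by N - 1 instead, to get an unbiased estimate):

    # Eq. (5.8): the variance ("mean square error"), dividing by N as in the text;
    # many statistics texts divide by N - 1 instead, for an unbiased estimate.
    variance = sum((x - x_bar) ** 2 for x in measurements) / N
    sigma_x = variance ** 0.5   # standard deviation: single-measurement uncertainty
    print(f"variance = {variance:.3f}, sigma_x = {sigma_x:.3f}")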

So we actually have a way of ``calculating'' our uncertainty directly from the data!  This is quite remarkable. But wait. We have not just measured x once; we have measured it N times. Our instincts (?) insist that our final best estimate of x, namely the mean, $\bar{x}$, is determined more precisely than we would get from just a single measurement. This is indeed the case. The uncertainty in the mean, $\bar{\sigma}_x$, is smaller than $\sigma_x$. By how much?  Well, it takes a bit of math to derive the answer, but you will probably not find it implausible to accept the result that $\bar{\sigma}_x^2$ is smaller than $\sigma_x^2$ by a factor of 1/N. That is,

\begin{displaymath}\bar{\sigma}_x = { \sigma_x \over \sqrt{N} }.
\end{displaymath} (5.9)
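In the same sketch, Eq. (5.9) is a one-liner:

    import math

    # Eq. (5.9): the uncertainty in the mean is smaller by a factor of sqrt(N).
    sigma_x_bar = sigma_x / math.sqrt(N)
    print(f"x = {x_bar:.2f} +/- {sigma_x_bar:.2f}")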

Thus 4 measurements give an average that is twice as precise as a single measurement, 9 give an improvement of 3, 100 give an improvement of 10, and so on. This is an extremely useful principle to remember, and it is worth thinking about its implications for a while.
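One way to convince yourself of this (a simulation sketch, with an arbitrary ``true'' value and spread, assuming Gaussian scatter) is to generate many artificial experiments of various sizes and watch the scatter of their means shrink like $1/\sqrt{N}$:

    import random
    import statistics

    TRUE_X, SIGMA = 73.0, 2.0   # invented "true" value and single-measurement spread
    TRIALS = 2000               # simulated experiments per sample size

    for N in (1, 4, 9, 100):
        # Run many N-measurement experiments and record the mean of each.
        means = [statistics.fmean(random.gauss(TRUE_X, SIGMA) for _ in range(N))
                 for _ in range(TRIALS)]
        spread = statistics.pstdev(means)   # observed scatter of the means
        print(f"N = {N:3d}: scatter of means = {spread:.3f}"
              f"  (sigma/sqrt(N) = {SIGMA / N ** 0.5:.3f})")

With these (arbitrary) settings the observed scatter tracks $\sigma_x/\sqrt{N}$ closely, apart from the sampling noise of the simulation itself.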

COMMENT:
 

The above analysis of statistical uncertainties explains how to find the best estimate (the mean) from a number N of independent measurements with unknown but similar individual uncertainties. Sometimes we can estimate the uncertainty $\sigma_{x_i}$ in each measurement $x_i$ by some independent means like ``common sense'' (watch out for that one!). If this is the case, and if the measurements are not all equally precise (as, for instance, in combining all the world's best measurements of some esoteric parameter in elementary particle physics), then it is wrong to give each measurement equal weight in the average. There is then a better way to define the average, namely the ``weighted mean'':

\begin{displaymath}\bar{x} = { \sum_{i=1}^{N} w_i x_i \over
\sum_{i=1}^{N} w_i } \end{displaymath}

where $w_i \equiv 1 / \sigma_{x_i}^2 $. If the reader is interested in the proper way to estimate the uncertainty $\bar{\sigma}_x$ in the mean under these circumstances, it is time to consult a statistics text; the answer is not difficult, but it needs some explanation that is beyond the scope of this HyperReference.
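As a final sketch (again with invented numbers), the weighted mean translates directly into code, each weight being the inverse square of that measurement's uncertainty:

    # Hypothetical measurements with unequal, independently estimated uncertainties,
    # as (x_i, sigma_i) pairs; the numbers are invented for this example.
    data = [(9.81, 0.05), (9.79, 0.02), (9.83, 0.10)]

    weights = [1.0 / s ** 2 for _, s in data]   # w_i = 1 / sigma_i^2
    x_bar = sum(w * x for (x, _), w in zip(data, weights)) / sum(weights)
    print(f"weighted mean = {x_bar:.4f}")

Notice how the most precise measurement (the one with $\sigma = 0.02$) dominates the result, exactly as the weighting is meant to arrange.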


Jess H. Brewer
1998-09-15