Statistics

Inferential Statistics

Expressing a collection of data in some useful form, as described above, is often only the first step in a statistician's work. The next step will be to decide what conclusions, predictions, and other statements, if any, can be made based on those data. A number of sophisticated mathematical techniques have now been developed to make these judgments.

An important fundamental concept in inferential statistics is the null hypothesis. A null hypothesis is a statement made by a researcher at the beginning of an experiment that says, essentially, that nothing is happening in the experiment; that is, nothing other than natural events is going on. At the conclusion of the experiment, the researcher submits the data to some kind of statistical analysis to see whether the results are consistent with the null hypothesis, that is, whether anything other than normal statistical variability has taken place. If the results are consistent with the null hypothesis, then the experiment provides no evidence of an effect on the subjects. If the results would be very unlikely under the null hypothesis, the researcher rejects it and is justified in putting forth some alternative hypothesis that will explain the effects that were observed. The role of statistics in this process is to provide mathematical tests for deciding whether or not the null hypothesis should be rejected.

A simple example of this process is deciding on the effectiveness of a new medication. In testing such medications, researchers usually select two groups, one the control group and one the experimental group. The control group does not receive the new medication; it receives a neutral substance instead. The experimental group receives the medication. The null hypothesis in an experiment of this kind is that the medication will have no effect and that both groups will respond in exactly the same way, whether they have been given the medication or not.

Suppose that the results of one experiment of this kind were as follows, with the numbers shown being the number of individuals who improved or did not improve after taking part in the experiment.

At first glance, it would appear that the new medication was at least partially successful, since the number of those who took it and improved (62) was greater than the number who took it and did not improve (38). But a statistical test is available that will give a more precise answer, one that expresses, as a percentage, the probability of obtaining results at least this extreme if the null hypothesis were true. This test, called the chi-square test, involves comparing the observed frequencies in the table above with a set of expected frequencies that can be calculated from the totals in the table. The calculated value of chi-square can then be compared to values in a table to see how likely it is that the results were due to chance rather than to some real effect of the medication.
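The comparison of observed and expected frequencies described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the experimental-group counts (62 and 38) come from the text, but the control-group counts are hypothetical values chosen for the example, and the function name `chi_square` is likewise an invention of this sketch. The calculation shown is the plain chi-square statistic without any continuity correction.

```python
# Observed counts: [improved, did not improve] for each group.
# Only the experimental row (62, 38) comes from the text; the control
# row is an ASSUMED, hypothetical illustration.
observed = [
    [62, 38],  # experimental group
    [45, 55],  # control group (hypothetical values)
]


def chi_square(table):
    """Return the chi-square statistic for a table of observed counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, obs in enumerate(row):
            # Expected count if group and outcome were unrelated
            # (that is, if the null hypothesis held).
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (obs - expected) ** 2 / expected
    return statistic


chi2 = chi_square(observed)

# Standard critical value for 1 degree of freedom at the 5% level.
CRITICAL_5_PERCENT_DF1 = 3.841

print(f"chi-square = {chi2:.2f}")
print("reject null hypothesis at the 5% level:", chi2 > CRITICAL_5_PERCENT_DF1)
```

With these assumed counts the statistic exceeds the critical value, so the null hypothesis would be rejected at the 5% level; different control-group counts could easily lead to the opposite conclusion.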

Another example of a statistical measure is the Pearson correlation coefficient. The Pearson correlation coefficient is a way of determining the extent to which two variables are associated, or correlated, with each other. For example, many medical studies have attempted to determine the connection between smoking and lung cancer. One way to do such studies is to measure the amount of smoking a person has done in her or his lifetime and compare that amount with the rate of lung cancer among those individuals. A mathematical formula allows the researcher to calculate the Pearson correlation coefficient between these two sets of data: the rate of smoking and the risk of lung cancer. That coefficient can range between 1.0, meaning the two are perfectly correlated, and -1.0, meaning the two have a perfect inverse relationship (when one is high, the other is low). A value near 0 means there is little or no linear relationship between the two.
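As a rough sketch of the formula mentioned above, the coefficient can be computed by dividing the sum of the products of the paired deviations from the two means by the square root of the product of the two sums of squared deviations. The function name `pearson_r` and the sample numbers below are hypothetical, chosen only to show the two extreme values; they are not real smoking or cancer data.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length lists."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 values")

    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)

    # Sum of products of paired deviations from the means.
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable.
    ss_x = sum((x - mean_x) ** 2 for x in xs)
    ss_y = sum((y - mean_y) ** 2 for y in ys)

    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return co_deviation / sqrt(ss_x * ss_y)


# Hypothetical data: y rises in step with x, then falls in step with x.
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]
falling = [10, 8, 6, 4, 2]

print(pearson_r(x, rising))   # perfect positive correlation
print(pearson_r(x, falling))  # perfect inverse relationship
```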

The correlation coefficient is a good example of the limitations of statistical analysis. Suppose that the Pearson correlation coefficient in the example above turned out to be 1.0. That number would mean that people who smoke the most are always the most likely to develop lung cancer. But what the correlation coefficient does not say is what the cause-and-effect relationship, if any, might be. By itself, it does not say that smoking causes cancer.

The chi-square test and the correlation coefficient are only two of dozens of statistical tools now available for use by researchers. The specific kinds of data collected and the kinds of information a researcher wants to obtain from these data determine the specific test to be used.

David E. Newton

KEY TERMS

Continuous variable: A variable that may take any value whatsoever.
Deviation: The difference between any one measurement and the mean of the set of scores.
Discrete variable: A variable that can take only certain specific numerical values that are clearly separated from one another.
Frequency polygon: A type of frequency distribution graph that is made by joining the midpoints of the top lines of each bar in a histogram to each other.
Histogram: A bar graph that shows the frequency distribution of a variable by means of solid bars without any space between them.
Mean: A measure of central tendency found by adding all the numbers in a set and dividing the sum by the count of numbers in the set.
Measure of central tendency: An average; a single value, such as the mean, median, or mode, that summarizes the center of a set of data.
Measure of variability: A general term for any method of measuring the spread of measurements around some measure of central tendency.
Median: The middle value in a set of measurements when those measurements are arranged in sequence from least to greatest.
Mode: The value that occurs most frequently in any set of measurements.
Normal curve: A frequency distribution curve with a symmetrical, bell-shaped appearance.
Null hypothesis: A statistical statement that nothing unusual is taking place in an experiment.
Population: A complete set of individuals, objects, or events that belong to some category.
Range: The difference between the largest and the smallest values in a set of measurements.
Standard deviation: The square root of the variance.
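Several of the terms defined above (mean, median, mode, deviation, and standard deviation) can be illustrated with Python's standard `statistics` module. The scores below are hypothetical, and the sketch assumes they form a complete population, which is why the population versions of the variance and standard deviation functions are used.

```python
import statistics

# Hypothetical set of scores, treated as a complete population (ASSUMPTION).
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(scores)      # sum of the scores divided by their count
median = statistics.median(scores)  # middle value of the sorted scores
mode = statistics.mode(scores)      # most frequently occurring score

# Deviation of each score from the mean; these always sum to zero.
deviations = [score - mean for score in scores]

variance = statistics.pvariance(scores)  # mean of the squared deviations
std_dev = statistics.pstdev(scores)      # square root of the variance

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("deviations:", deviations)
print("variance:", variance)
print("standard deviation:", std_dev)
```

If the scores were instead a sample drawn from a larger population, `statistics.variance` and `statistics.stdev` would be the usual choices.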
