ANOVA Primer

Overview

ANOVA stands for ANalysis Of VAriance. It is a standard statistical technique, widely applied in biology, for formally comparing the effects of different treatments or categorical factors on one or more measured quantitative variables. Typical applications include the analysis of data from ecological or agricultural field experiments and from clinical drug trials. An example of the former would be the comparison of wheat yield (a quantitative variable) between field plots where different types or quantities of fertilizer (the treatment, factor or categorical variable) were applied. An example of ANOVA in a clinical application would be the assessment of whether a drug dosage regimen (the categorical factor) significantly improved some measured aspect of patient health (the measured or response variable) relative to a placebo control.

Conceptual Basis

Simply put, ANOVA examines in a formal statistical way how the variability of the response variable is distributed around the mean for each treatment type, and whether the treatments show enough separation, or too much overlap, to suggest a notable effect. Consider a simple graph that plots the average value for each treatment type together with its spread of values or "typical range" (i.e. the variance or, more correctly, approximately two standard deviations either side of the mean). If the spreads for different treatment types overlap substantially, one would conclude that the effect of the treatments is not statistically significant. This outcome results if the means are very similar and/or if the associated variances are large. Large variances occur if there is much natural variability, or if there are large measurement errors in the quantitative observations. Conversely, if the means are well separated and variability is low, one can confidently conclude that the different treatments have a statistically measurable, significant effect. This, in a nutshell, is the objective of ANOVA: to provide a statistical method for assessing whether treatments or factors differ significantly in their effect, given the observed variability in a quantitative measurement variable.
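The graph described above can be sketched numerically. The following is a minimal illustration only: the wheat yield figures, the treatment names and the helper functions typical_range and ranges_overlap are all hypothetical, and the "mean plus or minus two standard deviations" range is the informal picture from the text, not the formal test.

```python
from statistics import mean, stdev

# Hypothetical wheat yields (tonnes per hectare) for three fertilizer treatments.
yields = {
    "control": [4.1, 3.8, 4.4, 4.0, 3.9],
    "low_dose": [4.6, 4.9, 4.3, 4.7, 4.5],
    "high_dose": [6.2, 5.8, 6.5, 6.0, 6.1],
}

def typical_range(values):
    """Return (mean - 2*SD, mean + 2*SD) for one treatment group."""
    m, s = mean(values), stdev(values)
    return (m - 2 * s, m + 2 * s)

def ranges_overlap(a, b):
    """True if two (low, high) intervals share any values."""
    return a[0] <= b[1] and b[0] <= a[1]

ranges = {name: typical_range(values) for name, values in yields.items()}

for name, (low, high) in ranges.items():
    print(f"{name:10s} mean={mean(yields[name]):.2f} range=({low:.2f}, {high:.2f})")

# Informal comparison of each treatment against the control.
for name in ("low_dose", "high_dose"):
    overlap = ranges_overlap(ranges["control"], ranges[name])
    print(f"control vs {name}: ranges overlap = {overlap}")
```

With these made-up numbers the high-dose range is clearly separated from the control, while the low-dose range overlaps it. Overlap of this kind is only a visual guide; whether a difference is significant is decided by the formal F-ratio calculation, which also takes sample size into account.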

Methodological Framework

The above illustrates the basic principle of the method, but what happens formally in an ANOVA is a decomposition of the observed variance into its component parts by treatment type. This involves calculating the proportion of the total observed variance that is attributable to a given categorical variable or treatment factor. The result is an F-ratio statistic (F-value), which is compared against the F-distribution to obtain a probability (P). The relevant F-distribution depends on the degrees of freedom, which are determined by the sample size (n) and the number of treatment levels and factors involved. P in turn indicates the probability level at which the observed differences are significant, with differences generally being deemed significant if P <= 0.05 (5%). Note that this means that, when in reality there are no differences, there is only a 5% chance of incorrectly concluding that differences are significant. The corollary is that even if the ANOVA shows a significant difference, there is always a chance that this is a statistical artifact arising from errors in the data. The lower the significance threshold chosen for P, the lower the chance of concluding wrongly, but also the larger the sample size (number of data points) required to detect an effect. Generally, the larger the sample size on which the analysis is based, the greater the statistical confidence in the outcome. Also, the more categorical factors an ANOVA includes, the more data hungry the analysis will be. If there are insufficient degrees of freedom, due to small sample sizes or limited replication, then the ANOVA cannot be undertaken.
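As a rough sketch of the decomposition just described, the code below computes a one-way ANOVA by hand using only the Python standard library. The data and the function name one_way_anova are hypothetical, chosen to match the wheat example; this is an illustration of the textbook calculation, not the implementation used by any particular system.

```python
from statistics import mean

# Hypothetical wheat yields (tonnes per hectare) for three fertilizer treatments.
groups = {
    "control": [4.1, 3.8, 4.4, 4.0, 3.9],
    "low_dose": [4.6, 4.9, 4.3, 4.7, 4.5],
    "high_dose": [6.2, 5.8, 6.5, 6.0, 6.1],
}

def one_way_anova(groups):
    """Split total variation into between-group and within-group parts."""
    all_values = [x for values in groups.values() for x in values]
    grand_mean = mean(all_values)
    k = len(groups)            # number of treatment levels
    n_total = len(all_values)  # total number of observations

    df_between = k - 1
    df_within = n_total - k
    if df_between < 1 or df_within < 1:
        raise ValueError("insufficient degrees of freedom for ANOVA")

    # Variation of the group means around the grand mean (treatment effect).
    ss_between = sum(
        len(values) * (mean(values) - grand_mean) ** 2
        for values in groups.values()
    )
    # Variation of observations around their own group mean (residual noise).
    ss_within = sum(
        (x - mean(values)) ** 2 for values in groups.values() for x in values
    )
    ss_total = sum((x - grand_mean) ** 2 for x in all_values)

    # F-ratio: mean square between groups over mean square within groups.
    f_ratio = (ss_between / df_between) / (ss_within / df_within)
    return {
        "ss_between": ss_between,
        "ss_within": ss_within,
        "ss_total": ss_total,
        "df_between": df_between,
        "df_within": df_within,
        "f_ratio": f_ratio,
        "proportion_explained": ss_between / ss_total,
    }

result = one_way_anova(groups)
print(f"F({result['df_between']}, {result['df_within']}) = {result['f_ratio']:.2f}")
print(f"proportion of variation explained = {result['proportion_explained']:.3f}")
```

The F-ratio is then compared with the F-distribution for the stated degrees of freedom. For 2 and 12 degrees of freedom the tabulated 5% critical value is approximately 3.89, so an F-ratio above that would be deemed significant at P <= 0.05. If SciPy is installed, the probability itself can be obtained with scipy.stats.f.sf(f_ratio, df_between, df_within).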

The examples given above are of what is referred to as One-Way ANOVA: the analysis of variance based on a single categorical variable or factor examined in isolation (e.g. the effect of fertilizer concentration on wheat yield). In reality, however, wheat yield may also be influenced by additional factors (e.g. pesticide treatment, light exposure, field slope etc.) that may operate singly or in complex synergistic (combined) ways. Multifactor ANOVA (i.e. two-way ANOVA and its multi-way extensions) provides a generalized statistical methodology that extends the basic one-way approach to examine the effects of multiple categorical factors on a measurement variable simultaneously. It assesses whether each factor has a significant impact AND whether interactions between factors have significant effects. For example, it is quite likely that both pesticide application and higher light levels will promote better wheat yield. However, high light may also degrade pesticide chemicals and render them inactive, such that interactions between these factors become important in influencing the crop yield observed under different treatment regimes. ANOVA essentially provides a complete decomposition of the observed variance amongst the relevant categorical factors in a manner that also explicitly accounts for possible interactions between factors. Again, this decomposition and the assessment of factor and interaction significance is based on the calculation of F-ratio statistics and associated probability levels. Such multifactor ANOVA is precisely what has been implemented in the Reef Check WRAS ANOVA system.
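The pesticide and light example can be sketched as a two-way ANOVA with an interaction term. The code below is a minimal illustration that assumes a balanced design (the same number of replicate plots in every cell); the yield figures, level names and the function two_way_anova are hypothetical and are not taken from the Reef Check WRAS ANOVA system.

```python
from statistics import mean

# Hypothetical wheat yields keyed by (pesticide level, light level),
# with three replicate plots per combination.
cells = {
    ("no_pesticide", "low_light"): [3.0, 3.2, 2.8],
    ("no_pesticide", "high_light"): [4.0, 4.3, 3.7],
    ("pesticide", "low_light"): [5.0, 5.2, 4.8],
    ("pesticide", "high_light"): [4.5, 4.8, 4.2],
}

def two_way_anova(cells):
    """F-ratios for factor A, factor B and their interaction (balanced design)."""
    a_levels = sorted({a for a, _ in cells})
    b_levels = sorted({b for _, b in cells})
    n = len(next(iter(cells.values())))  # replicates per cell
    if any(len(values) != n for values in cells.values()):
        raise ValueError("this sketch assumes a balanced design")
    if n < 2:
        raise ValueError("insufficient replication: no error degrees of freedom")

    grand = mean([x for values in cells.values() for x in values])
    cell_mean = {key: mean(values) for key, values in cells.items()}
    a_mean = {a: mean([cell_mean[(a, b)] for b in b_levels]) for a in a_levels}
    b_mean = {b: mean([cell_mean[(a, b)] for a in a_levels]) for b in b_levels}

    # Main effects: variation of each factor's level means around the grand mean.
    ss_a = n * len(b_levels) * sum((a_mean[a] - grand) ** 2 for a in a_levels)
    ss_b = n * len(a_levels) * sum((b_mean[b] - grand) ** 2 for b in b_levels)
    # Interaction: what the cell means show beyond the two additive main effects.
    ss_ab = n * sum(
        (cell_mean[(a, b)] - a_mean[a] - b_mean[b] + grand) ** 2
        for a in a_levels
        for b in b_levels
    )
    # Error: variation of replicates around their own cell mean.
    ss_error = sum(
        (x - cell_mean[key]) ** 2 for key, values in cells.items() for x in values
    )

    df_a, df_b = len(a_levels) - 1, len(b_levels) - 1
    df_ab = df_a * df_b
    df_error = len(a_levels) * len(b_levels) * (n - 1)
    ms_error = ss_error / df_error

    return {
        "factor_a": ((ss_a / df_a) / ms_error, df_a, df_error),
        "factor_b": ((ss_b / df_b) / ms_error, df_b, df_error),
        "interaction": ((ss_ab / df_ab) / ms_error, df_ab, df_error),
    }

table = two_way_anova(cells)
for source, (f_ratio, df_effect, df_error) in table.items():
    print(f"{source:12s} F({df_effect}, {df_error}) = {f_ratio:.2f}")
```

Each F-ratio is judged against the F-distribution with its own degrees of freedom; for 1 and 8 degrees of freedom the tabulated 5% critical value is approximately 5.32. In this made-up data set the pesticide effect and the interaction exceed that value while the light effect on its own does not, which is the kind of pattern the degradation scenario above would produce.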

Finally, it is important to note that ANOVA is referred to as a parametric technique, that is, it makes certain assumptions about the shape of the distribution that the variability takes (i.e. a normal or bell-shaped distribution). Another assumption is that the data are "homoscedastic", meaning that variances are similar across treatment groups and that there are no trends or correlations between variances and means in the data; otherwise significant biases will be introduced into the analysis. These assumptions are fine if one is dealing with variables that are well behaved (normally distributed), such as the heights of individuals in a population of people. Here, ANOVA can confidently be applied to the raw data to assess whether alternative diet treatments had a significant impact on the heights of people in populations fed different diets. In cases where these assumptions are not met, as is the case for abundance estimates or other population census data, transformation of the raw data is necessary to stabilize variances prior to analysis via ANOVA.
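The effect of a variance-stabilizing transformation can be seen in a small sketch. The abundance counts below are hypothetical, and the log(x + 1) transformation is an assumption made for illustration; it is a common choice for count data, but the text above does not say which transformation should be used.

```python
from math import log1p
from statistics import mean, variance

# Hypothetical abundance counts at three survey sites. The spread of the
# counts grows with the mean, which violates the equal-variance assumption.
counts = {
    "site_a": [2, 5, 3, 8, 4],
    "site_b": [20, 45, 30, 70, 38],
    "site_c": [150, 400, 260, 610, 330],
}

def variance_ratio(groups):
    """Largest group variance divided by the smallest (1.0 means equal)."""
    variances = [variance(values) for values in groups.values()]
    return max(variances) / min(variances)

# Assumed transformation: natural log of (count + 1), so zero counts are allowed.
transformed = {
    site: [log1p(x) for x in values] for site, values in counts.items()
}

for site in counts:
    print(
        f"{site}: raw mean={mean(counts[site]):7.1f} "
        f"raw var={variance(counts[site]):9.1f} "
        f"log var={variance(transformed[site]):.3f}"
    )

raw_ratio = variance_ratio(counts)
log_ratio = variance_ratio(transformed)
print(f"variance ratio, raw data:         {raw_ratio:.1f}")
print(f"variance ratio, transformed data: {log_ratio:.2f}")
```

In the raw counts the group variances differ by several orders of magnitude and rise with the mean; after transformation they are of similar size, so the transformed values are a more suitable input for ANOVA.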

References

Bailey, N.T.J. 1981. Statistical Methods in Biology (2nd Ed.). Hodder & Stoughton. London.

Sokal, R.R. & F.J. Rohlf. 1981. Biometry (2nd Ed.). W.H. Freeman & Company. New York.

Web site hosted at ReefBase