Association, Causation and Ecological Correlation
Association, causation and ecological correlation are concepts that people often misinterpret and use interchangeably. The following sections discuss each of the three concepts in turn, together with some related concepts.
Association
Association is the relationship of one variable to another. To speak of association is not to speak of causation.
Association does not mean that one variable causes another to happen. Rather, association is a prerequisite for determining whether two variables exhibit causation (Good & Hardin, 2003). In statistics, the relationship between two variables can be quantified, and one way to quantify association is by correlation. Correlation is a statistical measure that describes the degree of association between two variables. The number that a correlation analysis produces is called the correlation coefficient.
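As an illustration, the correlation coefficient can be computed directly from two lists of measurements. The sketch below assumes the Pearson coefficient, which is the usual choice but is not named in the sources cited here; the function name `pearson_r` and the data are hypothetical and chosen only to make the arithmetic easy to follow.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the two means
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations for each variable
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical data, for illustration only
x = [1, 2, 3, 4, 5]
rising = [2, 4, 6, 8, 10]   # increases in step with x
falling = [10, 8, 6, 4, 2]  # decreases in step with x
curved = [4, 1, 0, 1, 4]    # equals (x - 3) squared: related to x, but not linearly

r_rising = pearson_r(x, rising)
r_falling = pearson_r(x, falling)
r_curved = pearson_r(x, curved)
print(r_rising, r_falling, r_curved)
```

The first two pairs give coefficients of 1 and -1. The third gives 0 even though `curved` is completely determined by `x`, because the coefficient measures only the linear part of an association.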
This number is used to interpret the degree of association between two variables. A correlation coefficient can take any value from -1 to 1. A coefficient of -1 means that there is a perfect negative linear relationship between the two variables. A coefficient of 1 means that there is a perfect positive linear relationship. A correlation of zero, on the other hand, means that there is no linear relationship between the two variables (Agresti & Franklin, 2007).
Causation
Causation is a relationship between two variables in which one variable brings about the other.
Another definition says that a change in one variable will cause a change in the other, provided that there is a relationship between the two variables (Sirkin, 2005). Causation can only be present when there is association. Association, on the other hand, does not imply causation. For example, smoking is associated with lung cancer, but that association alone does not allow one to say that smoking causes lung cancer; the causal claim has to be established by further evidence. Regression is a method used to model the relationship between variables that are correlated, but a fitted regression does not by itself show that one variable causes the other (Yale University, 1997).
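A small sketch may help show what a regression does and does not establish. It assumes simple least-squares linear regression; the helper `fit_line` and the data are hypothetical. The same data are fitted in both directions, first predicting y from x and then x from y.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical measurements of two associated variables
x = [1, 2, 3, 4, 5]
y = [3, 5, 7, 9, 11]

slope_yx, intercept_yx = fit_line(x, y)  # model y as a function of x
slope_xy, intercept_xy = fit_line(y, x)  # model x as a function of y

print(slope_yx, intercept_yx)
print(slope_xy, intercept_xy)
```

Both fits describe the data equally well, so the regression alone gives no reason to prefer "x brings about y" over "y brings about x". The direction of causation, if there is any, has to come from outside the model.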
Many factors affect the interpretation of the relationship between two variables. Three of those factors are outliers, influential observations and lurking variables. When a regression model has been fitted to two variables, one might see data values that lie far away from the curve fitted to the rest of the data. These values are called outliers. An outlier may be an erroneous data point, and it can distort conclusions about causation by producing a poorly fitted regression curve.
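The effect of an outlier on a fit can be sketched as follows. The example assumes a least-squares line and treats the point with the largest absolute residual as the suspected outlier; that rule, the helper `fit_line` and the data are illustrative assumptions rather than a method taken from the cited sources.

```python
def fit_line(xs, ys):
    """Least-squares fit of ys = slope * xs + intercept; returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical data: five points on the line y = 2x, plus one stray value at x = 3
x = [1, 2, 3, 4, 5, 3]
y = [2, 4, 6, 8, 10, 20]

slope_all, intercept_all = fit_line(x, y)

# Residual = observed value minus the value predicted by the fitted line
residuals = [yi - (slope_all * xi + intercept_all) for xi, yi in zip(x, y)]

# Index of the point that lies farthest from the fitted line
worst = max(range(len(x)), key=lambda i: abs(residuals[i]))

# Refit without the suspected outlier
x_clean = [xi for i, xi in enumerate(x) if i != worst]
y_clean = [yi for i, yi in enumerate(y) if i != worst]
slope_clean, intercept_clean = fit_line(x_clean, y_clean)

print("suspected outlier:", (x[worst], y[worst]))
print("fit with outlier:   ", slope_all, intercept_all)
print("fit without outlier:", slope_clean, intercept_clean)
```

With the stray value included, the fitted line is pulled upward and passes through none of the other five points. Whether such a point should actually be removed depends on whether it is a genuine error, which the code cannot decide.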
Influential observations are data points that may lie close to the regression curve but are far from the other data in the horizontal direction. These observations can have a large impact on the fitted relationship. For example, an influential observation in a linear regression can noticeably change the slope of the regression line (Yale University, 1997). Lurking variables are another factor that can influence conclusions about causation. Lurking variables are variables that are not included in the model of the relationship. They are also called third variables.
These variables can nevertheless affect the relationship between the variables being tested (Agresti & Franklin, 2007).
Ecological Correlation
Ecological correlations are correlations of averages. This means that group averages, not individual measurements, are used to determine the strength of the relationship. This type of correlation can be misleading because it is usually computed to be very high. The high correlation arises because individual variability is disregarded (Agresti & Franklin, 2007).
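The following sketch compares a correlation of group averages with the correlation of the underlying individual measurements. It assumes the Pearson coefficient, and the three groups of data are hypothetical, constructed so that the group averages line up exactly while the individuals within each group do not.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of two equal-length samples."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Hypothetical individual measurements (x values, y values) in three groups
groups = [
    ([1, 2, 3], [3, 2, 1]),
    ([4, 5, 6], [6, 5, 4]),
    ([7, 8, 9], [9, 8, 7]),
]

# Individual-level correlation: pool every measurement from every group
all_x = [x for xs, _ in groups for x in xs]
all_y = [y for _, ys in groups for y in ys]
r_individual = pearson_r(all_x, all_y)

# Ecological correlation: replace each group by its pair of averages
mean_x = [sum(xs) / len(xs) for xs, _ in groups]
mean_y = [sum(ys) / len(ys) for _, ys in groups]
r_ecological = pearson_r(mean_x, mean_y)

print("individual-level r:", r_individual)
print("ecological r:      ", r_ecological)
```

Averaging removes the variation within each group, so the three pairs of averages fall on a straight line and the ecological coefficient comes out higher than the coefficient for the individuals.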
As noted above, one should be aware that ecological correlations are usually very high and may differ from the correlations of individual measurements. This can lead to a misleading interpretation of the association between two variables (Waller & Gotway, 2004). Association, causation and ecological correlation are clearly different from one another, and none of them should be used in place of another. Improper use of these concepts will lead to misinterpretation of the data being analyzed.
References
Agresti, A., & Franklin, C. (2007). Statistics: The art and science of learning from data. Upper Saddle River, NJ: Pearson Prentice Hall.
Good, P., & Hardin, J. (2003). Common errors in statistics: And how to avoid them. Hoboken, NJ: John Wiley & Sons, Inc.
Sirkin, M. (2005). Statistics for the social sciences. Thousand Oaks, CA: Sage Publications, Inc.
Waller, L., & Gotway, C. (2004). Applied spatial statistics for public health data. Hoboken, NJ: John Wiley & Sons, Inc.
Yale University. (1997). Linear Regression. Retrieved June 11, 2009, from http://www.stat.yale.edu/Courses/1997-98/101/linreg.htm