5E Correlation and Causation
A correlation exists between two variables A and B when
- higher values of A consistently go with higher values of B (positive correlation), or
- higher values of A consistently go with lower values of B (negative correlation).
strength of the correlation: the more closely two variables follow the general trend, the stronger the correlation. In a perfect correlation, all data points lie on a straight line.
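One common way to put a number on the strength of a correlation is the Pearson correlation coefficient r. These notes do not name it; it is used here only as an illustration. r is +1 or -1 when all points lie exactly on a rising or falling straight line, and moves toward 0 as the points follow the trend less closely. The function name pearson_r and all data values below are made up for this sketch.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of products of deviations from the means (numerator of r).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (they form the denominator of r).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Made-up data for illustration only.
a = [1, 2, 3, 4, 5]
perfect_positive = [2 * x + 1 for x in a]   # all points on a rising line
perfect_negative = [10 - 3 * x for x in a]  # all points on a falling line
scattered = [2, 1, 4, 3, 5]                 # follows the trend only loosely

print(pearson_r(a, perfect_positive))
print(pearson_r(a, perfect_negative))
print(pearson_r(a, scattered))
```

The first two data sets are perfect correlations (positive and negative); the third is positively correlated but less strongly, so its r lies between 0 and 1.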
The following exercises list pairs of variables.
State possible measurement units for the variables.
Are the variables correlated?
If yes, positively or negatively?
Is the correlation strong or weak?
- height and weight:
  measurement units: inches and lbs
  positively correlated
  weak correlation
- price and demand
- altitude on a mountain hike and surrounding air temperature
- weight of a car and price of the car
scatter diagram: each point represents the values of two variables.
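A scatter diagram can be sketched even in plain text. The following is a minimal illustration, not a plotting tool: the function text_scatter and the height and weight values are invented for this example, and the function assumes that neither variable is constant (otherwise the scaling would divide by zero).

```python
def text_scatter(xs, ys, width=20, height=8):
    """Return a crude text scatter diagram: one '*' per (x, y) data point."""
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    grid = [[" "] * width for _ in range(height)]
    for x, y in zip(xs, ys):
        # Scale each value to a column / row index inside the grid.
        col = round((x - x_min) / (x_max - x_min) * (width - 1))
        row = round((y - y_min) / (y_max - y_min) * (height - 1))
        grid[height - 1 - row][col] = "*"  # row 0 is the top of the plot
    return "\n".join("".join(line).rstrip() for line in grid)

# Hypothetical heights (inches) and weights (lbs) of eight people.
heights = [60, 62, 64, 66, 68, 70, 72, 74]
weights = [115, 140, 125, 160, 150, 185, 170, 200]

print(text_scatter(heights, weights))
```

Each star stands for one person. The stars drift from lower left to upper right, which is what a positive correlation looks like on a scatter diagram.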
exercise 9
guidelines for establishing causality:
- the effect is correlated with the suspected cause, even while other factors vary
- the effect is present in groups that include the suspected cause and absent in groups that do not include it
- a larger amount of the suspected cause produces a larger amount of the effect
- if the effect has several possible causes, test by eliminating all but one cause and see whether the effect is still present
- if possible, test the suspected cause with an experiment. If the test cannot ethically be performed on humans, consider doing it with cell cultures or computer models.
- try to determine the physical mechanism by which the suspected cause produces the effect.
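The guideline about larger amounts of the suspected cause can be checked mechanically once the data are grouped by amount. The sketch below is one possible way to do that; the function names and the (amount, effect) records are hypothetical, and a real check would also have to consider sample size and chance variation.

```python
def mean_effect_by_amount(records):
    """Group (amount, effect) pairs by amount; return {amount: mean effect}."""
    totals = {}
    for amount, effect in records:
        running_sum, count = totals.get(amount, (0.0, 0))
        totals[amount] = (running_sum + effect, count + 1)
    # Sort by amount so the means come out from smallest to largest amount.
    return {amount: s / n for amount, (s, n) in sorted(totals.items())}

def effect_rises_with_amount(records):
    """True if the mean effect strictly increases from each amount to the next."""
    means = list(mean_effect_by_amount(records).values())
    return all(low < high for low, high in zip(means, means[1:]))

# Made-up records: (amount of suspected cause, measured effect).
records = [(0, 1.0), (0, 2.0), (1, 3.0), (1, 4.0), (2, 6.0), (2, 5.0)]

print(mean_effect_by_amount(records))
print(effect_rises_with_amount(records))
```

A result of True is consistent with the suspected cause, but by itself it does not establish causality; the other guidelines still apply.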
possible explanations for a correlation:
- coincidence
- both variables might be directly influenced by some common underlying cause
- one of the variables might cause the other
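A small simulation can show how a common underlying cause produces a correlation between two variables that do not influence each other. Everything here is an assumption made for illustration: the variables A, B and the common cause C are simulated random numbers, and the technique used to remove the influence of C (correlating the residuals of straight-line fits, often called partial correlation) is not part of these notes.

```python
import random
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists of numbers."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

def residuals(ys, zs):
    """Remove from ys the part explained by a straight-line fit on zs."""
    n = len(zs)
    mean_z, mean_y = sum(zs) / n, sum(ys) / n
    slope = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
             / sum((z - mean_z) ** 2 for z in zs))
    return [(y - mean_y) - slope * (z - mean_z) for z, y in zip(zs, ys)]

rng = random.Random(0)  # fixed seed so the run is repeatable
n = 500
common = [rng.gauss(0, 1) for _ in range(n)]   # simulated common cause C
a = [c + rng.gauss(0, 0.5) for c in common]    # A depends on C, not on B
b = [c + rng.gauss(0, 0.5) for c in common]    # B depends on C, not on A

r_ab = pearson_r(a, b)
r_ab_without_c = pearson_r(residuals(a, common), residuals(b, common))

print("correlation of A and B:            ", round(r_ab, 2))
print("after removing the influence of C: ", round(r_ab_without_c, 2))
```

A and B come out strongly correlated even though neither one causes the other. Once the influence of C is removed, little correlation is left, which is the signature of a common underlying cause.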
Are the phenomena correlated? Positively or negatively? Strongly or
weakly?
State the correlation clearly.
Then state whether the correlation is most likely due to coincidence,
a common underlying cause, or a direct cause.
- In one US city, the crime rate increased at the same time that the number of people in prison increased.
- Sales of tea in a local restaurant are positively correlated with ticket sales at the local swimming pool.
- You are trying to identify the cause of late afternoon headaches that plague you several days each week. For each of the following observations, explain which of the six guidelines for establishing causality you used and what you concluded:
  - the headaches occur only on days that you go to work.
  - if you stop drinking Coke at lunch, the headaches persist.
  - in the summer, the headaches occur less frequently if you open the windows of your office slightly. They occur even less often if you open your office window fully.
  Conclusions?
Broad levels of confidence in causality:
- possible cause: there is a correlation, but one cannot yet determine whether the correlation implies causality (=> start investigation)
- probable cause: there is good reason to suspect that the correlation involves cause (=> warrant for search or wiretap)
- cause beyond reasonable doubt: a physical model has been found that is so successful in explaining how one thing causes another that doubting the causality would seem unreasonable (=> usual standard for conviction)