If a study is valid, then it truly represents what it
was intended to represent.
Experimental validity refers to the manner in
which variables influence both the results of
the research and the generalizability of those
results to the population at large.
It is broken down into two groups: (1)
Internal Validity and (2) External Validity.
Internal validity refers to a study’s
ability to determine if a causal relationship exists
between one or more independent variables and one or
more dependent variables.
In other words, can we be reasonably sure
that the change (or lack of change) was caused by
the independent variable?
Researchers must be aware of aspects that may
reduce the internal validity of a study and do
whatever they can to control for these threats.
These threats, if ignored, can reduce
validity to the point that any results are
meaningless, rendering the entire study invalid.
There are eight major threats to internal
validity that are discussed below and summarized in
Table 7.1.
History refers to any event outside of the
research study that can alter or affect subjects’
performance. Because research does not occur within a vacuum, subjects
often experience environmental events that are
different from one another.
These events can play a role in their
performance and must therefore be addressed.
One way to ensure that these events do not
impact the study is to control them, or make
everyone’s experience identical except for the
independent variable.
Because this is often impossible, randomization
procedures are often used to minimize this
risk, making it likely that outside events that occur in one
group also occur in the other.
While not a major concern in very short
studies such as a survey study, maturation can play
a major role in longer-term studies.
Maturation refers to the natural
physiological or psychological changes that take
place as we age.
This is especially important in childhood and
must be addressed through subject matching or
randomization.
For instance, an episode of major depression
typically decreases significantly within a six-month
period even without treatment.
Imagine we tested a new medication designed
to treat depression.
If our results showed that subjects who took
this medication showed a significant decrease in
depressive symptoms within six months, could we
truly say that the medication caused the decrease in
symptoms? Probably not, especially since maturation alone would have
shown similar results.
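The medication example can be made concrete with a small simulation. This is only a sketch under stated assumptions: the symptom scale, the size of the natural improvement, and the noise level are all hypothetical numbers chosen for illustration, and the medication is deliberately given no effect at all.

```python
import random
import statistics

def mean_improvement(n, natural_remission, drug_effect, rng):
    """Average drop in symptom score from baseline to the six-month follow-up."""
    improvements = []
    for _ in range(n):
        baseline = rng.gauss(30, 5)  # hypothetical symptom score at intake
        followup = baseline - natural_remission - drug_effect + rng.gauss(0, 4)
        improvements.append(baseline - followup)
    return statistics.mean(improvements)

rng = random.Random(42)
# Assumption: the drug does nothing; everyone improves by about 10 points anyway.
treated = mean_improvement(500, natural_remission=10, drug_effect=0, rng=rng)
control = mean_improvement(500, natural_remission=10, drug_effect=0, rng=rng)

print(f"treated group improved by {treated:.1f} points")
print(f"control group improved by {control:.1f} points")
print(f"difference attributable to the drug: {treated - control:.1f}")
```

Looking at the treated group alone suggests a large benefit, while the comparison with an untreated control group shows that nearly all of the change would have happened anyway.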
People tend to perform better at any activity
the more they are exposed to that activity.
Testing is no exception.
When subjects, especially in single-group
studies, are given a test as a pretest and then the
same test as a posttest, there is a real chance that they will
perform better the second time merely because of
practice.
For this reason, two-group studies with a
control group are recommended.
Statistical regression, or regression to the
mean, is a concern especially in studies with
extreme scores.
It refers to the tendency for subjects who
score very high or very low to score more toward the
mean on subsequent testing.
If you get a 99% on a test, for instance, the
odds that your score will be lower the second time
are much greater than the odds of increasing your
score.
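Regression to the mean is easy to reproduce in a simulation. The sketch below assumes, purely for illustration, that each test score is a stable ability plus independent random noise on each sitting; the means, spreads, and the 5% cutoff are hypothetical, and scores are not capped at 100.

```python
import random
import statistics

def simulate_retest(n_subjects=10_000, seed=0):
    """Simulate two sittings of a test as stable ability plus independent noise."""
    rng = random.Random(seed)
    ability = [rng.gauss(70, 10) for _ in range(n_subjects)]
    first = [a + rng.gauss(0, 10) for a in ability]
    second = [a + rng.gauss(0, 10) for a in ability]
    return first, second

def mean_change_of_top_scorers(first, second, top_fraction=0.05):
    """Average (second - first) among the highest scorers on the first test."""
    cutoff = sorted(first)[int(len(first) * (1 - top_fraction))]
    changes = [s - f for f, s in zip(first, second) if f >= cutoff]
    return statistics.mean(changes)

first, second = simulate_retest()
top_change = mean_change_of_top_scorers(first, second)
overall_change = statistics.mean(s - f for f, s in zip(first, second))

print(f"top scorers changed by {top_change:+.1f} points on average")
print(f"all subjects changed by {overall_change:+.1f} points on average")
```

Nothing was done to the subjects between sittings, yet the highest scorers drop noticeably on the retest while the group as a whole barely moves.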
If the measurement device(s) used in your
study change during the course of the study,
changes in scores may be related to the instrument
rather than the independent variable.
For instance, if your pretest and posttest
are different, the change in scores may be a result
of the second test being easier than the first
rather than the teaching method employed.
For this reason, it is recommended that pre-
and posttests be identical or at least highly
similar.
Selection refers to the manner in which
subjects are selected to participate in a study and
the manner in which they are assigned to groups.
If there are differences between the groups
prior to the study taking place, these differences
will continue throughout the study and may appear as
a change in a statistical analysis.
Addressing these differences through subject
matching or randomization is highly recommended.
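Why randomization helps can be sketched in a few lines. The example assumes a hypothetical "motivation" score for each subject and compares random assignment with the worst case of self-selection, in which the most motivated half all end up in the same group. The function name and every number are illustrative only.

```python
import random
import statistics

def randomly_assign(subjects, seed=None):
    """Shuffle the subjects and split them into two equal-sized groups."""
    rng = random.Random(seed)
    shuffled = list(subjects)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

rng = random.Random(1)
motivation = [rng.gauss(50, 10) for _ in range(1000)]  # hypothetical scores

# Random assignment: pre-existing differences are spread across both groups.
group_a, group_b = randomly_assign(motivation, seed=2)
random_gap = statistics.mean(group_a) - statistics.mean(group_b)

# Extreme self-selection: the most motivated half forms one group.
ranked = sorted(motivation)
self_selected_gap = statistics.mean(ranked[500:]) - statistics.mean(ranked[:500])

print(f"gap in motivation with random assignment: {random_gap:.1f}")
print(f"gap in motivation with self-selection: {self_selected_gap:.1f}")
```

With random assignment the two groups start out nearly equal in motivation, so a later difference in outcomes is harder to blame on who was in which group.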
We engage in research in order to learn
something new or to support a belief or theory.
Therefore, we as researchers may be biased toward the results
we want.
This bias can affect our observations and possibly even
result in blatant research errors that skew the
study in the direction we want.
Using an experimenter who is unaware of the
anticipated results (usually called a double-blind
study because the tester, as well as the subjects, is blind to the anticipated results)
works best to control for this bias.
Mortality, or subject dropout, is always a
concern to researchers.
Dropouts can drastically affect the results when
the mortality rate or the kind of subject lost differs
between groups.
Imagine, in the work experience study, that many
motivated students dropped out of one group due to
illness and many less motivated students dropped out
of the other group due to personal factors.
The result would be a difference in
motivation between the two groups at the end and
could therefore invalidate the results.
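The dropout scenario can also be sketched as a simulation. Everything here is an assumption made for illustration: the motivation scale, the thresholds for "motivated" and "less motivated", the 50% dropout probability, and the choice of which group loses which kind of student.

```python
import random
import statistics

def make_group(n, rng):
    """Draw hypothetical motivation scores for one group of students."""
    return [rng.gauss(50, 10) for _ in range(n)]

def drop_out(group, should_drop, drop_probability, rng):
    """Remove each subject matching should_drop with the given probability."""
    return [m for m in group
            if not (should_drop(m) and rng.random() < drop_probability)]

rng = random.Random(7)
work_group = make_group(2000, rng)
softball_group = make_group(2000, rng)
gap_before = statistics.mean(work_group) - statistics.mean(softball_group)

# Assumption: motivated students tend to leave the work group,
# less motivated students tend to leave the softball group.
work_left = drop_out(work_group, lambda m: m > 55, 0.5, rng)
softball_left = drop_out(softball_group, lambda m: m < 45, 0.5, rng)
gap_after = statistics.mean(work_left) - statistics.mean(softball_left)

print(f"motivation gap at the start: {gap_before:+.1f}")
print(f"motivation gap after dropout: {gap_after:+.1f}")
```

The groups begin with essentially the same average motivation, and the difference at the end comes entirely from who dropped out rather than from any treatment.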
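Table 7.1: Controlling for Threats to Internal Validity
Threat to Internal Validity: Control
History: selection, random assignment
Maturation: subject matching
Testing: control group
Statistical regression: extreme scores, randomization
Instrumentation: consistency, assure alternative form
Selection: selection, random assignment
Experimenter bias: double-blind study
Mortality: matching and omission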
External Validity. External validity refers
to the generalizability of a study.
In other words, can we be reasonably sure that
the results of our study, which is based on a sample of
the population, truly represent the entire population?
Threats to external validity can produce
significant results within a sample group that
cannot be generalized to the population at large.
Four of these threats are discussed below and
summarized in Table 7.2.
Subjects are often provided with cues to the
anticipated results of a study.
When asked a series of questions about
depression, for instance, subjects may become wise
to the hypothesis that certain treatments work
better in treating mental illness.
When subjects become wise to anticipated
results (often called a placebo effect), they can
begin to exhibit performance that they believe is
expected of them. Making sure that subjects are not aware of anticipated
outcomes (referred to as a blind study) reduces the
possibility of this threat.
Similar to the placebo effect is the finding that
the mere presence of others watching your
performance causes a change in your performance.
If this change is significant, can we be
reasonably sure that it will also occur when no one
is watching? Addressing
this issue can be tricky, but employing a control
group to measure the Hawthorne effect in those not
receiving any treatment can be very helpful. In this sense, the control group is also being observed and
will exhibit similar changes in its behavior as
the experimental group, therefore negating the
Hawthorne effect.
Order Effects (or Carryover Effects).
Order effects refer to the order in which
treatment is administered and can be a major threat
to external validity if multiple treatments are
used. If subjects are given medication for two months, therapy for
another two months, and no treatment for another two
months, it would be possible, and even likely, that
the level of depression would be lowest after the
final no-treatment phase.
Does this mean that no treatment is better
than the other two treatments?
It more likely means that the benefits of the
first two treatments have carried over to the last
phase, artificially elevating the apparent success of the no-treatment
phase.
The term interaction refers to the fact that
treatment can affect people differently depending
on the subject’s characteristics.
Potential threats to external validity include
the interaction between treatment and any of the following:
selection, history, and testing.
As an example, assume a group of subjects volunteers
for a study on work experience and college grades.
One group agrees to find part-time work the
summer before starting their freshman year and the
other group agrees to join a softball league over
the summer. The group that agreed to work is likely inherently different
from the group that agreed to play softball.
The selection itself may have placed more highly
motivated subjects in one group and less motivated
students in the other. If the work group earns higher grades in the first semester,
can we truly say it was caused by the work experience? It is likely that the motivation caused both the work experience
and the higher grades.
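Table 7.2: Controlling for Threats to External Validity
Threat to External Validity: Control
Placebo effect (awareness of anticipated results): blind study, control group
Hawthorne effect: control group
Order (carryover) effects: treatment order, multiple groups
Interaction effects: matching, naturalistic observation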