Inferential Procedures
Specific
procedures used to make inferences about an unknown
population or unknown score vary depending on the
type of data used and the purpose of making the
inference. There
are five main categories of inferential procedures
that will be discussed in this chapter: t-test,
ANOVA, Factor Analysis, Regression Analysis, and
Meta Analysis.
t-Test.
A t-test is perhaps the simplest of the
inferential statistics.
The purpose of this test is to determine if a
difference exists between the means of two groups
(think ‘t’ for two).
For example, to determine if the GPAs of
students with prior work experience differ from the
GPAs of students without this experience, we would
employ the t-test, comparing the mean GPA of one
group to that of the other.
To
compare these groups, the t-test statistical formula
includes the means, standard deviations, and number
of subjects for each group.
Each of these sets of data can be derived by
using descriptive statistics discussed in the
previous chapter.
Therefore, the t-test can be computed by hand
in a relatively short amount of time depending on
the number of subjects within each data set.
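As a rough sketch of the hand calculation described above, the function below computes a two-group t statistic from nothing more than each group's mean, standard deviation, and number of subjects. It assumes the pooled-variance (equal variances) form of the test, and the GPA figures passed to it are hypothetical, not taken from any study.

```python
import math

def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic from summary statistics (equal variances assumed)."""
    # Pool the two sample variances, weighting each by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    # Standard error of the difference between the two means.
    std_err = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    t = (mean1 - mean2) / std_err
    df = n1 + n2 - 2
    return t, df

# Hypothetical summary data: 25 students with work experience (mean GPA 3.2)
# and 25 students without (mean GPA 3.0), both with a standard deviation of 0.5.
t_stat, df = pooled_t(3.2, 0.5, 25, 3.0, 0.5, 25)
print(f"t = {t_stat:.3f}, df = {df}")
```

The resulting t value would then be compared with the critical value from a t table for the given degrees of freedom.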
ANOVA.
The term ANOVA is short for Analysis of
Variance and is typically used when there are one or
more independent variables and two or more dependent
variables. If
we were to study the effects of work experience on
college grades, we would have one independent and
one dependent variable and a simple t-test would
suffice. What
if we also wanted to understand the effects of age,
race, and economic background on college grades?
Using simple t-tests would mean performing
one t-test for every pair of variables. For this example, we would need to compare work and grades,
age and grades, race and grades, and income and
grades, resulting in four independent statistical
procedures. Add an additional dependent variable,
such as length of time it takes to graduate and we
double the number of procedures required to eight.
We
could do eight t-tests or we could simply do an
ANOVA, which analyzes all eight sets of data at one
time. The
ANOVA is superior for complex analyses for two
reasons, the first being its ability to combine
complex data into one statistical procedure. The second benefit over a simple t-test is the ANOVA’s
ability to determine what are called interaction
effects. With
a t-test we could determine if the means of older
and younger students are different on the variable
of grades (referred to as a main effect).
We could also determine whether or not the
means of whites and blacks differed in terms of
grades (main effect as well), but we could not
determine how these two variables (age and race)
interact with each other.
Consider the data in Table 9.1, representing
the number of data points we would have for a study
with just three independent variables (each with
only two levels) and two dependent variables.
If
you look at the data closely, you may notice that
the mean GPA for blacks is 3.0 and the mean GPA for
whites is also 3.0.
A simple t-test comparing the means of blacks
and whites would certainly not find a difference.
However, when you also take age into
account, the data look
completely different.
The mean GPA is 2.5 for older blacks, 3.5 for
older whites, 3.5 for younger blacks, and 2.5 for
younger whites.
Now we can see that there is a difference
between blacks and whites: (1) older whites have
higher GPAs than older blacks and (2) younger blacks
have higher GPAs than younger whites.
This represents the interaction effects of
race and age that would not have been detected by a
simple t-test.
Table 9.1: Hypothetical Three Way Analysis of Variance with Two Means
Independent variables: Work, Age, Race. Dependent variables: GPA, Time.
| Work | Age | Race | GPA | Time |
| --- | --- | --- | --- | --- |
| Yes | Older | Black | 3.0 | 12 |
| No | Older | Black | 2.0 | 8 |
| Yes | Older | White | 4.0 | 12 |
| No | Older | White | 3.0 | 8 |
| Yes | Younger | Black | 3.0 | 4 |
| No | Younger | Black | 4.0 | 8 |
| Yes | Younger | White | 2.0 | 4 |
| No | Younger | White | 3.0 | 8 |
Looking
at work experience and length of time to graduation
also reveals interesting results.
For those with work experience, the mean time
to graduation was eight years.
For those without work experience, the
average time to graduation was also eight years. But this simple main effect does not tell the whole story.
See if you can determine any interaction
effects that play a role in the length of time to
graduation.
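One way to check your answer is to compute the group means directly. The sketch below is not an ANOVA (it performs no significance test); it only tabulates the means from Table 9.1 so that main effects and interactions can be seen side by side. The function and variable names are made up for this illustration.

```python
from collections import defaultdict
from statistics import mean

FIELDS = ("work", "age", "race", "gpa", "time")

# The eight rows of Table 9.1.
ROWS = [
    ("Yes", "Older", "Black", 3.0, 12),
    ("No", "Older", "Black", 2.0, 8),
    ("Yes", "Older", "White", 4.0, 12),
    ("No", "Older", "White", 3.0, 8),
    ("Yes", "Younger", "Black", 3.0, 4),
    ("No", "Younger", "Black", 4.0, 8),
    ("Yes", "Younger", "White", 2.0, 4),
    ("No", "Younger", "White", 3.0, 8),
]

def cell_means(rows, factors, outcome):
    """Mean of `outcome` for every combination of levels of `factors`."""
    groups = defaultdict(list)
    for row in rows:
        record = dict(zip(FIELDS, row))
        key = tuple(record[f] for f in factors)
        groups[key].append(record[outcome])
    return {key: mean(values) for key, values in groups.items()}

# Main effect of race on GPA: both groups average 3.0.
print(cell_means(ROWS, ("race",), "gpa"))
# Race by age interaction on GPA: the cell means differ.
print(cell_means(ROWS, ("age", "race"), "gpa"))
# Main effect of work on time to graduation: both groups average 8.
print(cell_means(ROWS, ("work",), "time"))
# Work by age interaction on time to graduation.
print(cell_means(ROWS, ("work", "age"), "time"))
```

Grouping by a single variable reproduces the equal main-effect means, while grouping by two variables at once exposes the differences hidden inside them.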
Factor Analysis.
A factor analysis is used when an attempt is
being made to break down a large data set into
different subgroups or factors.
By using a somewhat complex procedure that is
typically performed using specialized software, a
factor analysis will look at each question within a
group of questions to determine how these questions
cluster together.
If
we were to give a class a test on basic mathematics
and then perform a factor analysis on the results,
for example, we would likely find that questions
related to addition tend to be answered at the same
rate and questions related to subtraction would tend
to be answered at the same rate.
In other words, students who are good at
addition would do well on most addition questions
and students who were poor at addition would score
poorly on most addition questions.
Therefore a math test consisting of addition
and subtraction would likely have two factors.
Regression Analysis.
When a correlation is used we are able to
determine the strength and direction of the
relationship between two or more variables.
If we determined that the correlation between
a midterm and a final exam was +.95, we could say
that these two tests are strongly and directly
related to each other.
In other words, a student who scored high on
one would likely score high on the other.
Regression
Analysis takes this a step further.
By creating a regression formula based on the
known data, we can predict a student’s score on
the final (for example) merely by knowing her score
on the midterm.
If two variables were correlated at +1.0 or
–1.0 (perfect correlations) this prediction would
be extremely accurate.
If the correlation coefficient was +/-0.9,
the prediction would be good but less accurate than
a perfect correlation.
The farther from a perfect correlation, the
less accurate the results of the prediction.
Take a look at the perfectly correlated
scores for the first five students below and see if
you can predict the final exam score for the sixth
student based on her score on the midterm.
Table 9.2: Hypothetical Test Scores
| Student | Midterm | Final |
| --- | --- | --- |
| Bob | 80 | 88 |
| Sue | 50 | 55 |
| Ling | 60 | 66 |
| Frank | 80 | 88 |
| Henry | 90 | 99 |
| Lisa | 70 | ?? |
When
the data set is much larger and the correlation less
than perfect, making a prediction requires the use
of statistical regression, which is basically the
equation of the straight line that best fits the
data. By using this statistic, we develop a formula that is used to
estimate one data point based on another data point
in a known correlation.
The formula for the data above would be
‘Final = Midterm × 1.1.’
Did you predict Lisa’s score on the final
correctly?
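The same result can be reached by fitting a line to the five known pairs of scores in Table 9.2. The sketch below uses the ordinary least-squares formulas for slope and intercept; the function names are chosen for this illustration only.

```python
import math

midterm = [80, 50, 60, 80, 90]   # Bob, Sue, Ling, Frank, Henry
final = [88, 55, 66, 88, 99]

def fit_line(x, y):
    """Least-squares slope and intercept for predicting y from x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def pearson(x, y):
    """Pearson correlation coefficient between x and y."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    return s_xy / math.sqrt(s_xx * s_yy)

slope, intercept = fit_line(midterm, final)
lisa_final = slope * 70 + intercept   # Lisa scored 70 on the midterm

print(f"r = {pearson(midterm, final):.2f}")
print(f"Final = {slope:.2f} x Midterm + {intercept:.2f}")
print(f"Predicted final for Lisa: {lisa_final:.0f}")
```

Because these scores are perfectly correlated, the fitted slope is 1.1, the intercept is zero (up to rounding), and Lisa's predicted final score is 77. With real, imperfectly correlated data the same code would still return the best-fitting line, but the predictions would carry error.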
Meta Analysis.
A meta analysis refers to the combining of
numerous studies into one larger study.
When this technique is used, each study
becomes one subject in the new meta study.
For instance, the combination of 12 studies
on work experience and college grades would result
in a meta study with 12 subjects.
While the process is a little more complex
than this in reality, the meta analysis basically
combines many studies together to determine if the
results of all of them, when taken as a whole, are
significant. The
meta study is especially helpful when different
related studies conducted in the past have found
different results.
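To give a feel for how conflicting studies can be combined, the sketch below pools four invented study results using inverse-variance weighting, one common approach that this chapter does not itself describe. Each study contributes an effect size and a standard error; all numbers and names are hypothetical.

```python
import math

# Hypothetical studies: (effect size, standard error).
# Two find a clear effect, two find little or none.
studies = [
    (0.30, 0.10),
    (0.10, 0.20),
    (-0.05, 0.20),
    (0.20, 0.10),
]

def pool_fixed_effect(results):
    """Inverse-variance weighted mean effect and its standard error."""
    # More precise studies (smaller standard error) get larger weights.
    weights = [1 / se ** 2 for _, se in results]
    pooled = sum(w * effect for w, (effect, _) in zip(weights, results)) / sum(weights)
    pooled_se = math.sqrt(1 / sum(weights))
    return pooled, pooled_se

# z score of each study taken on its own.
for effect, se in studies:
    print(f"study effect = {effect:+.2f}, z = {effect / se:+.2f}")

pooled, pooled_se = pool_fixed_effect(studies)
z = pooled / pooled_se
print(f"pooled effect = {pooled:.3f}, SE = {pooled_se:.3f}, z = {z:.2f}")
```

Taken one at a time, two of these made-up studies fall short of the conventional z cutoff of 1.96, yet the pooled estimate exceeds it, which illustrates how results taken as a whole can be significant even when individual studies disagree.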