Types of Distributions
When datasets are graphed, they form a picture that
can aid in the interpretation of the information.
The most commonly referenced type of distribution
is the normal distribution, or normal curve, often
called the bell-shaped curve because it looks like
a bell.
A normal distribution is symmetrical, meaning
the distribution and frequency of scores on the left
side match the distribution and frequency of
scores on the right side.
Many distributions approximate a normal curve,
especially when large samples of data are considered.
Examples include height, weight, IQ, SAT scores,
and GRE and GMAT scores, among many others.
This is important to understand because if a
distribution is normal, certain qualities are
consistent and help in quickly understanding the
scores within the distribution.
The
mean, median, and mode of a normal distribution are
identical and fall exactly in the center of the
curve.
This means that any score below the mean
falls in the lower 50% of the distribution of scores
and any score above the mean falls in the upper 50%.
Also, the shape of the curve allows for a
simple breakdown of sections.
For instance, we know that about 68% of the
population falls within one standard deviation of
the mean (see Measures of Variability below) and
that about 95% of the population falls within
two standard deviations of the mean.
Figure 8.1 shows the percentage of scores
that falls between each standard deviation.
Figure 8.1: The Normal Curve
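The percentages on the normal curve can be checked numerically. The following is a minimal sketch using Python's standard library; the helper name `share_within` is illustrative and not part of the original text.

```python
from statistics import NormalDist

# Standard normal curve: mean 0, standard deviation 1.
standard_normal = NormalDist(mu=0.0, sigma=1.0)

def share_within(k: float) -> float:
    """Proportion of scores within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Print the share of scores within 1, 2, and 3 standard deviations.
for k in (1, 2, 3):
    print(f"within {k} SD of the mean: {share_within(k):.1%}")
```

Running this should give roughly 68%, 95%, and 99.7%, the familiar breakdown of the normal curve.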
As an example, let's look at the normal curve associated
with IQ scores (see Figure 8.2).
The mean, median, and mode of Wechsler IQ scores
are 100, which means that 50% of IQs fall
at 100 or below and 50% fall at 100 or above.
Since 68% of scores on a normal curve fall
within one standard deviation of the mean and since
IQ scores have a standard deviation of 15, we know
that 68% of IQs fall between 85 and 115.
Comparing the estimated percentages on the
normal curve with the IQ scores, you can determine
the percentile rank of scores merely by looking at
the normal curve.
For example, a person who scores 115 performed
better than about 84% of the population, meaning
that a score of 115 falls at the 84th percentile.
Add up the percentages below a score of 115
(50% below the mean plus the 34% between 100 and 115)
and you will see how this percentile rank was determined.
See if you can find the percentile rank of
a score of 70.
Figure 8.2: IQ Score Distributions
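Percentile ranks can also be computed rather than read off the curve. The sketch below assumes the IQ scale described above (mean 100, standard deviation 15) and a perfectly normal distribution; `percentile_rank` is an illustrative helper name, not something defined in the text.

```python
from statistics import NormalDist

# IQ scale as described in the text: mean 100, standard deviation 15.
iq = NormalDist(mu=100, sigma=15)

def percentile_rank(score: float) -> float:
    """Percentage of the population scoring at or below `score`."""
    return 100 * iq.cdf(score)

# Scores at the mean and at one and two standard deviations from it.
for score in (70, 85, 100, 115, 130):
    print(f"IQ {score}: percentile rank {percentile_rank(score):.1f}")
```

Because real test scores are only approximately normal, these values are best treated as estimates.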
Skew.
The skew of a distribution refers to how the
curve leans.
When a curve has extreme scores on the right-hand
side of the distribution, it is said to be positively
skewed.
In other words, when high numbers are added to an
otherwise normal distribution, the curve gets pulled
in an upward or positive direction.
When the curve is pulled downward by extreme
low scores, it is said to be negatively skewed.
The more skewed a distribution is, the more
difficult it is to interpret.
Figure 8.3: Distribution Skew
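One common way to put a number on skew is the average of the cubed z-scores (population skewness): positive values indicate a tail of extreme high scores, negative values a tail of extreme low scores. The sketch below assumes that definition, which the text does not give, and uses made-up scores; the function name `skewness` is illustrative.

```python
from statistics import fmean, pstdev

def skewness(scores):
    """Population skewness: the mean of the cubed z-scores."""
    m = fmean(scores)
    s = pstdev(scores)
    return fmean([((x - m) / s) ** 3 for x in scores])

# Hypothetical scores, symmetric around 100.
symmetric = [85, 90, 95, 100, 100, 105, 110, 115]
# The same scores with extreme high or extreme low scores added.
right_tail = symmetric + [160, 175]
left_tail = symmetric + [25, 40]

print(f"symmetric:  {skewness(symmetric):+.2f}")   # zero skew
print(f"right tail: {skewness(right_tail):+.2f}")  # positive skew
print(f"left tail:  {skewness(left_tail):+.2f}")   # negative skew
```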
Kurtosis.
Kurtosis refers to the peakedness or flatness
of a distribution.
A normal distribution or normal curve is considered
a perfect mesokurtic distribution.
Curves that contain more scores in the center
than a normal curve tend to have higher peaks and
are referred to as leptokurtic.
Curves that have fewer scores in the center
than the normal curve and/or more scores on the outer
slopes of the curve are said to be platykurtic.
Figure 8.4: Distribution Kurtosis
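Kurtosis is often summarized as excess kurtosis: the mean of the z-scores raised to the fourth power, minus 3, so that a normal curve scores 0. Under that convention, positive values are usually labeled leptokurtic and negative values platykurtic. The sketch below assumes this definition, which the text does not state, and uses made-up scores; `excess_kurtosis` is an illustrative name.

```python
from statistics import fmean, pstdev

def excess_kurtosis(scores):
    """Mean of the fourth power of the z-scores, minus 3 (0 for a normal curve)."""
    m = fmean(scores)
    s = pstdev(scores)
    return fmean([((x - m) / s) ** 4 for x in scores]) - 3

# Hypothetical scores piled up in the center with a few far from it.
peaked = [100] * 16 + [90, 110, 70, 130]
# Hypothetical scores spread evenly from 70 to 130, with no central peak.
flat = list(range(70, 131, 5))

print(f"peaked: {excess_kurtosis(peaked):+.2f}")  # positive: leptokurtic
print(f"flat:   {excess_kurtosis(flat):+.2f}")    # negative: platykurtic
```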
Statistical procedures are designed to be used with
specific types of data, namely parametric and
nonparametric.
Parametric data consists of any data set that
is of the ratio or interval type and that falls on
a normally distributed curve.
Nonparametric data consists of nominal or
ordinal data that may or may not fall on a normal
curve.
When evaluating which statistic to use, it is
important to keep this distinction in mind.
Using a parametric test (see Summary of
Statistics in the Appendices) on nonparametric data
can produce inaccurate results
because of the difference in the quality of the
data.
Remember, in an ideal world, ratio data, or at
least interval data, is preferred, and the tests
designed for parametric data such as this tend to
be the most powerful.
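The rule of thumb above can be written as a small decision helper. This is a simplified sketch, not a complete test-selection procedure: the function name `suggested_test_family` and its arguments are hypothetical, and it assumes that interval or ratio data that is not approximately normal is handled with nonparametric tests.

```python
SCALES = {"nominal", "ordinal", "interval", "ratio"}

def suggested_test_family(scale: str, approx_normal: bool) -> str:
    """Suggest 'parametric' or 'nonparametric' from scale type and shape."""
    scale = scale.lower()
    if scale not in SCALES:
        raise ValueError(f"unknown measurement scale: {scale!r}")
    # Parametric tests: interval or ratio data on a roughly normal curve.
    if scale in {"interval", "ratio"} and approx_normal:
        return "parametric"
    # Everything else is treated as nonparametric in this sketch.
    return "nonparametric"

print(suggested_test_family("ratio", approx_normal=True))
print(suggested_test_family("ordinal", approx_normal=True))
print(suggested_test_family("interval", approx_normal=False))
```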
