Hypothesis Tests with Censored Data
450c 01-1

Hypothesis Tests for Environmental Problems
One-sample tests involve comparing data to a threshold concentration. This is often done to support risk assessment.
Two-sample tests are used when two distinct data sets need to be compared. This is needed, for example, when comparing site concentrations with background concentrations.
We will look at these in the context of censored data.

Hypothesis Tests with Censored Data
The most commonly used hypothesis tests are the one-sample and two-sample t-tests.
Both are parametric tests that assume the populations of interest are normally distributed.
For t-tests, the mean and variance of the underlying population must be estimated from the data.

Hypothesis Tests with Censored Data
The presence of censored data impairs our ability to estimate the mean and variance from the data and to assess the validity of the distributional assumption.
If data censoring makes it difficult to apply a t-test, are other testing methods available?
One possibility is to explore non-parametric testing alternatives.

Why Nonparametrics?
Non-parametric tests often require fewer assumptions about the underlying population than parametric tests.
The concepts involved with non-parametric test procedures are generally easier to understand and explain.
Many non-parametric tests are usable in situations where normal theory is not applicable.

Why Not Nonparametrics?
One disadvantage of non-parametric tests is that only the relative magnitudes of the data are incorporated into the hypothesis testing process, resulting in some loss of information.
In situations where the assumptions of a parametric test are satisfied, the corresponding non-parametric test has less power (in a statistical sense).

Ranking
Most common non-parametric tests are based on the relative ordering of the data.
Placing the data in order, relative to one another, is called ranking.
Once the data are ordered (or ranked), the values at specific positions in the ordered data set are called order statistics.

Gehan Ranking
Ranking for uncensored data is simple. The only complication is ties, and these are easily handled.
For censored data, some adjustment needs to be made for the non-detects.
A method that we use to deal with this situation is called the Gehan ranking scheme.
The best way to become familiar with Gehan ranking is to consider a small example.

Gehan Ranking
This example ranks the data directly, without first shifting the data by subtracting the action limit.
Suppose we have a sample with the following values. The "<" symbol in front of a value denotes a non-detect.
1, <4, 5, 7, <12, 15, 2, <4, 17, 8

Gehan Ranking
We can see that we have multiple detection limits to deal with in this data set.
First consider the ranks of the data as if there were no detection limit issues. These will be called the initial ranks.
1, 2, <4, <4, 5, 7, 8, <12, 15, 17

Gehan Ranking Principles
First rank the data in order as if the detection status were irrelevant.
Start with the highest value and work your way to the smallest value.
Determine the lowest and highest possible ranks that each datum can have.
Non-detects are assumed to be tied with all of the values below them.

Gehan Ranking
In this example, the value of 17 receives a rank of 10 because it is larger than all of the other values, even in the presence of non-detects.
The value of 15 similarly receives a rank of 9.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank                                             9    10

Gehan Ranking
The <12 value is a non-detect and therefore it is assumed to be tied with all of the values below it.
So we need to average all of the ranks up to and including the rank of <12: (1+2+3+4+5+6+7+8)/8 = 36/8 = 4.5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank                                        4.5  9    10

Gehan Ranking
The value of 8 is certainly greater than the 6 values below it, regardless of their detection status. Therefore the rank of 8 must be at least 7.
The value of 8 is treated as a tie with <12, but it is certainly less than 15 and 17. Therefore the rank of 8 can be at most 8.
The Gehan rank of 8 is the average of these two possible ranks: (7+8)/2 = 7.5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank                                   7.5  4.5  9    10

Gehan Ranking
The value of 7 is certainly greater than the 5 values below it, regardless of their detection status. Therefore the rank of 7 must be at least 6.
The value of 7 is treated as a tie with <12, but it is certainly less than 8, 15 and 17. Therefore the rank of 7 can be at most 7.
The Gehan rank of 7 is the average of these two possible ranks: (6+7)/2 = 6.5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank                              6.5  7.5  4.5  9    10

Gehan Ranking
The value of 5 is certainly greater than the 4 values below it, regardless of their detection status. Therefore the rank of 5 must be at least 5.
The value of 5 is treated as a tie with <12, but it is certainly less than 7, 8, 15 and 17. Therefore the rank of 5 can be at most 6.
The Gehan rank of 5 is the average of these two possible ranks: (5+6)/2 = 5.5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank                         5.5  6.5  7.5  4.5  9    10

Gehan Ranking
The two values of <4 are certainly less than 5, 7, 8, 15 and 17. Therefore the rank of <4 is at most 5.
The values of <4 are taken to be tied with the value of <12 as well as the values 1 and 2.
Therefore, the Gehan rank of the value <4 is the average of 1, 2, 3, 4 and 5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank                         5.5  6.5  7.5  4.5  9    10

Gehan Ranking
The values of <4 receive the Gehan rank that is the average of the ranks they could possibly take.
So the rank of <4 = (1+2+3+4+5)/5 = 15/5 = 3.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank               3    3    5.5  6.5  7.5  4.5  9    10

Gehan Ranking
The value of 2 is certainly greater than the value of 1. Therefore, the Gehan rank of 2 is at least 2.
The value of 2 is treated as tied with <4, <4, and <12.
The Gehan rank of the value of 2 is the average of 2, 3, 4, and 5: (2+3+4+5)/4 = 3.5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank          3.5  3    3    5.5  6.5  7.5  4.5  9    10

Gehan Ranking
Finally, the value of 1 is certainly less than all of the other detected values.
The value of 1 is treated as tied with the values <4, <4 and <12.
The Gehan rank of the value of 1 is (1+2+3+4)/4 = 2.5.
Value          1    2    <4   <4   5    7    8    <12  15   17
Initial Rank   1    2    3    4    5    6    7    8    9    10
Gehan Rank     2.5  3.5  3    3    5.5  6.5  7.5  4.5  9    10

Gehan Ranking Summary
In conclusion, we have the following Gehan ranks that can now be used in nonparametric hypothesis tests.
Using these ranks accounts for all possible values that the non-detects may have.
Value          1    <4   <4   2    <12  5    7    8    15   17
Initial Rank   1    3    4    2    8    5    6    7    9    10
Gehan Rank     2.5  3    3    3.5  4.5  5.5  6.5  7.5  9    10

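The ranking rules walked through above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a reference implementation: the function name `gehan_ranks` and the representation of the sample as a list of values plus a parallel list of non-detect flags are hypothetical choices made here. Each datum receives the average of its lowest and highest possible ranks.

```python
def gehan_ranks(values, censored):
    """Return Gehan ranks for one sample (illustrative sketch).

    values   -- measured values, or detection limits for non-detects
    censored -- True where the matching entry of values is a non-detect ("<")
    Both argument names are hypothetical, chosen for this example.
    """
    n = len(values)
    data = list(zip(values, censored))
    ranks = []
    for x, is_nondetect in data:
        if is_nondetect:
            # A non-detect may be smaller than everything: lowest possible rank is 1.
            lowest = 1
            # Detected values at or above the detection limit are certainly larger.
            larger = sum(1 for v, c in data if not c and v >= x)
        else:
            # Certainly smaller: smaller detected values, and non-detects
            # whose detection limit does not exceed x.
            smaller = sum(1 for v, c in data
                          if (c and v <= x) or (not c and v < x))
            lowest = 1 + smaller
            # Only larger detected values are certainly larger.
            larger = sum(1 for v, c in data if not c and v > x)
        highest = n - larger
        # The average of the consecutive ranks lowest..highest is their midpoint.
        ranks.append((lowest + highest) / 2)
    return ranks


# The example sample: 1, <4, 5, 7, <12, 15, 2, <4, 17, 8
values = [1, 4, 5, 7, 12, 15, 2, 4, 17, 8]
censored = [False, True, False, False, True, False, False, True, False, False]
ranks = gehan_ranks(values, censored)
print(ranks)  # [2.5, 3.0, 5.5, 6.5, 4.5, 9.0, 3.5, 3.0, 10.0, 7.5]
```

The output is in the original sample order and agrees with the summary table. The ranks still sum to 10(10+1)/2 = 55, as ordinary ranks would.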
Wilcoxon Nonparametric Tests
The Wilcoxon Signed Rank test is the non-parametric equivalent to the one-sample t-test (e.g., for comparison of a mean to a threshold of concern).
The Wilcoxon Rank Sum (WRS) test is the nonparametric equivalent to the two-sample t-test (comparison of two data sets, e.g., background and site concentrations).

Wilcoxon Rank Sum Test
The WRS test compares two populations by jointly ranking samples taken from each.
The sums of the ranks of the samples from the populations are compared.
The magnitudes of the results are not considered, only their ranks, so this is not a good test if magnitudes are important.
Gehan ranking allows nondetects to be accounted for in this test.

Wilcoxon Rank Sum Test
Combine two data sets of size n1 and n2 into one data set of size m.
Rank from 1 to m.
Sum the ranks of the n1 measurements from the first data set (W).
Calculate Z and compare to standard normal tables:
$$ Z = \frac{W - n_1 (m+1)/2}{\sqrt{n_1 n_2 (m+1)/12}} $$

TCE Example
We have the following concentration data for trichloroethene (TCE) at a site in New York.
We will demonstrate a 2-sample test by comparing site concentration data both before and after remediation treatment.
Is the post-treatment concentration distribution different from the pre-treatment concentration distribution?

TCE Pre and Post Treatment Data
Pre-treatment TCE data: <2*, <2, <6.5, 15, 32, 53, 76, 86, 100, 150, 790, 1300, 1500, 1900, 3000, 3400, 3700, <4200, 4300, 4400, 4700, 4800
Post-treatment TCE data: <3, <9, 17, 20, 25, 42, 46, 51, 59, 700, 3300, 3300

TCE Pre and Post Treatment Data
The first step of performing a Wilcoxon test is to pool the data into one data set.
Initially we will ignore the effect of non-detects (replace them with DLs), then we will use the Gehan ranking system to account for the nondetects.
<2, <2, <3, <6.5, <9, 15, 17, 20, 25, 32, 42, 46, 51, 53, 59, 76, 86, 100, 150, 700, 790, 1300, 1500, 1900, 3000, 3300, 3300, 3400, 3700, <4200, 4300, 4400, 4700, 4800

TCE Example: WRS Test Results
When we apply the Wilcoxon Rank Sum test to these data, we get the following results.
Wilcoxon Rank Sum Test: Z-statistic = -1.946, p-value = 0.026
Wilcoxon with Gehan ranking: Z-statistic = -1.787, p-value = 0.037

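As a rough check on the first result, the WRS calculation can be sketched in Python with non-detects replaced by their detection limits. Several things here are assumptions rather than part of the original material: the helper names `midranks`, `wrs_z` and `normal_cdf` are hypothetical, the post-treatment data are taken as the "first data set", tied values share the average of their ranks, and the p-value is read as a one-sided (lower-tail) probability.

```python
import math


def midranks(values):
    """Rank values from 1 to m, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average = ((i + 1) + (j + 1)) / 2
        for pos in range(i, j + 1):
            ranks[order[pos]] = average
        i = j + 1
    return ranks


def wrs_z(first, second):
    """Return (W, Z) for the Wilcoxon Rank Sum test using the normal approximation."""
    n1, n2 = len(first), len(second)
    m = n1 + n2
    ranks = midranks(list(first) + list(second))
    w = sum(ranks[:n1])  # rank sum of the first data set
    z = (w - n1 * (m + 1) / 2) / math.sqrt(n1 * n2 * (m + 1) / 12)
    return w, z


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))


# TCE data with non-detects replaced by their detection limits.
post = [3, 9, 17, 20, 25, 42, 46, 51, 59, 700, 3300, 3300]
pre = [2, 2, 6.5, 15, 32, 53, 76, 86, 100, 150, 790, 1300, 1500, 1900,
       3000, 3400, 3700, 4200, 4300, 4400, 4700, 4800]

w, z = wrs_z(post, pre)
p = normal_cdf(z)
print(f"W = {w}, Z = {z:.3f}, p = {p:.3f}")  # W = 156.0, Z = -1.946, p = 0.026
```

Under these assumptions the sketch gives the same Z-statistic and p-value as the first line of results. The Gehan-ranked result depends on implementation details that are not given here, so no attempt is made to reproduce it.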
Other Nonparametric Tests
The WRS test looks at the difference between distributions based on their rankings in a combined data set.
Two other tests, the Quantile test and the Slippage test, deal with differences that occur in the tails of the distributions.
This is often important when one of the distributions has a background component and a contamination component.

Quantile Test
The WRS test determines whether an unusually large proportion of the observations from one data set exceed the observations from the other data set.
The phrase "unusually large" takes into account the sample sizes of the data sets.
The Quantile test operates on the same basic principle as the WRS test.

Quantile Test
The Quantile test determines whether more of the observations in the top 20% (or chosen percentile) of the combined data set come from the site data set than would be expected by chance, given the relative sample sizes of the two data sets.
Selection of the quantile to test should be based on the decision context of the problem.

Quantile Test
1. Decide on a quantile of interest.
2. n = sample size of first data set; m = sample size of second data set.
3. Combine the data sets and calculate adjusted ranks = rank in combined data set / (m+n+1).
4. k = number of combined sample results with adjusted ranks larger than the quantile.
5. s = number of first data set results with adjusted ranks larger than the quantile.
Finally, calculate the p-value
$$ p = \sum_{i=s}^{k} \frac{\binom{m+n-k}{n-i} \binom{k}{i}}{\binom{m+n}{n}} $$
where the combinatoric symbol is interpreted as
$$ \binom{a}{b} = \frac{a!}{b!\,(a-b)!} $$
and
$$ a! = \prod_{j=1}^{a} j = a \times (a-1) \times (a-2) \times \cdots \times 2 \times 1 $$

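The steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `quantile_test` is hypothetical, non-detects are entered at their detection limits, the pre-treatment TCE data are taken as the "first data set", and ties that straddle the quantile cut-off are broken arbitrarily by sort order (none occur in this example).

```python
from math import comb


def quantile_test(first, second, q):
    """Return (k, s, p) for the Quantile test at quantile q (illustrative sketch)."""
    n, m = len(first), len(second)
    # Label each result with its source (0 = first data set) and pool them.
    pooled = sorted([(v, 0) for v in first] + [(v, 1) for v in second])
    k = s = 0
    for rank, (_, source) in enumerate(pooled, start=1):
        if rank / (m + n + 1) > q:  # adjusted rank larger than the quantile
            k += 1
            if source == 0:
                s += 1
    # Sum the hypergeometric tail; terms with i > n are zero and are skipped.
    tail = sum(comb(m + n - k, n - i) * comb(k, i)
               for i in range(s, min(k, n) + 1))
    return k, s, tail / comb(m + n, n)


# TCE data with non-detects replaced by their detection limits.
pre = [2, 2, 6.5, 15, 32, 53, 76, 86, 100, 150, 790, 1300, 1500, 1900,
       3000, 3400, 3700, 4200, 4300, 4400, 4700, 4800]
post = [3, 9, 17, 20, 25, 42, 46, 51, 59, 700, 3300, 3300]

for q in (0.90, 0.80, 0.70, 0.60):
    k, s, p = quantile_test(pre, post, q)
    print(f"q = {q:.2f}: k = {k}, s = {s}, p = {p:.3f}")
# q = 0.90: k = 3, s = 3, p = 0.257
# q = 0.80: k = 6, s = 6, p = 0.055
# q = 0.70: k = 10, s = 8, p = 0.211
# q = 0.60: k = 13, s = 11, p = 0.059
```

The p-value is the probability, under random mixing of the two data sets, that at least s of the k largest pooled results come from the first data set.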
TCE Example: Quantile Test Results
Here are some Quantile test results for the TCE data.
Quantile   p-value
0.90       0.257
0.80       0.055*
0.70       0.211*
0.60       0.059*
* There are non-detects in the data that are greater than the quantile of interest.
With Gehan ranking, the results remained the same in this case.

Slippage Test
The Slippage test determines if an unusually large number of the observations in one data set are greater than the maximum value (or less than the minimum value) from another data set.
This test may be appropriate when, for example, the question is whether a new source has increased the highest contamination levels over baseline conditions, or when the number of organisms surviving for a shorter period than the minimum survival time under baseline conditions is of concern.

Slippage Test
1. n = sample size of first data set; m = sample size of second data set.
2. s = number of results from the first data set that exceed the maximum result in the second data set.
Then, calculate the p-value based on n, m, and s:
$$ p = \sum_{i=s}^{n} \frac{m \, \prod_{j=1}^{n} j \; \prod_{j=1}^{m+n-i-1} j}{\prod_{j=1}^{n-i} j \; \prod_{j=1}^{m+n} j} $$
where
$$ \prod_{j=1}^{a} j = 1 \times 2 \times \cdots \times (a-2) \times (a-1) \times a $$
and an empty product (a = 0) is taken to be 1.

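The p-value calculation can be sketched in Python as below. This is an illustration only: the function names `slippage_count` and `slippage_p` and the two small data sets are hypothetical, and each product in the formula is evaluated with `math.factorial`. The sum is also compared with the equivalent closed form C(n, s) / C(m+n, s), which is the chance that the s largest pooled results all come from the first data set.

```python
from math import comb, factorial


def slippage_count(first, second):
    """Number of first-data-set results that exceed the maximum of the second."""
    largest = max(second)
    return sum(1 for v in first if v > largest)


def slippage_p(n, m, s):
    """Probability that at least s of the n first-data-set results exceed
    all m second-data-set results when the two sets are randomly mixed."""
    total = 0.0
    for i in range(s, n + 1):
        numerator = m * factorial(n) * factorial(m + n - i - 1)
        denominator = factorial(n - i) * factorial(m + n)
        total += numerator / denominator
    return total


# Hypothetical data, for illustration only (not the TCE data).
first = [3, 8, 12, 15, 20]
second = [1, 4, 6, 9]

n, m = len(first), len(second)
s = slippage_count(first, second)   # 12, 15 and 20 exceed 9, so s = 3
p = slippage_p(n, m, s)
closed_form = comb(n, s) / comb(m + n, s)
print(f"s = {s}, p = {p:.4f}, closed form = {closed_form:.4f}")
# s = 3, p = 0.1190, closed form = 0.1190
```

How non-detects enter the count s (for example, whether a non-detect with a high detection limit is counted as exceeding the maximum) is a separate choice that this sketch does not address.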
TCE Example: Slippage Test Results
When applied to the TCE data from New York to test whether the pre-treatment data have significantly more results above the largest post-treatment concentration than would be expected by chance:
Slippage Test: p-value = 0.049
Slippage Test with Gehan ranking: p-value = 0.056

Conclusions
Censored data often occur in environmental media sampling.
The conceptual site model and objectives of the analyses should be the basis for decisions on how to deal with censored data.
For any method of including censored data in statistical analyses, the impact of the method should be carefully considered.
