Attachment 2

Statistical Impact of Relying on a 6-Test Data Set Versus a 10-Test Set

The original "chi-squared"- and "Fisher's exact"-based statistical analysis of Colitag/reference method comparability has been compared to a revised analysis based on a subset of data (using only those tests with clearly established hold times); details from both analyses are included in Attachment 3. The conclusion (i.e., that the comparability study did not identify a "statistically significant difference" in performance between the reference methods and Colitag) did not change when the smaller data set was used.

Statistically significant differences in method performance are indicated where the chi-squared probability or Fisher's exact two-sided probability falls below 0.05. As noted below, in the case of both total coliforms and E. coli, and for both the complete (10-test) and smaller (6-test) data sets, both chi-squared and Fisher's exact probabilities indicate that the comparability study did not identify a statistically significant difference in method performance. Total coliform values are lower when using the smaller data set, though still well above the threshold. E. coli values using the smaller data set are greater than or equal to those associated with the larger data set; again, in the case of E. coli all values are well above the threshold.

                                       10-test    6-test                 Statistically
                                       data set   data set   Threshold   significant difference?
Total coliforms
  Chi-squared probability                0.84       0.51       0.05      no
  Fisher's exact 2-sided probability     0.92       0.60       0.05      no
E. coli
  Chi-squared probability                0.92       1.0        0.05      no
  Fisher's exact 2-sided probability     1.0        1.0        0.05      no
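
For illustration, the probabilities in the table above are the kind produced by applying the two tests to a 2x2 contingency table of positive/negative results for each method. The sketch below uses SciPy's chi2_contingency and fisher_exact; the counts shown are hypothetical placeholders (the actual study data are in Attachment 3, not reproduced here), so its output will not match the table.

    # Hypothetical sketch of the two significance tests described above.
    # The counts below are placeholders, not the actual Colitag study data.
    from scipy.stats import chi2_contingency, fisher_exact

    # Rows: Colitag, reference method; columns: positive, negative results.
    counts = [[18, 2],
              [17, 3]]

    chi2_stat, chi2_p, dof, expected = chi2_contingency(counts)
    odds_ratio, fisher_p = fisher_exact(counts, alternative="two-sided")

    # A statistically significant difference in method performance would be
    # indicated only if a probability fell below the 0.05 threshold.
    print(f"chi-squared probability:            {chi2_p:.2f}")
    print(f"Fisher's exact 2-sided probability: {fisher_p:.2f}")
    print("significant difference?", "yes" if min(chi2_p, fisher_p) < 0.05 else "no")
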
Having established the conclusion itself (per the above), EPA next examined the strength of such conclusion (i.e., the confidence that the conclusion is accurate) when using a 10-test study versus a 6-test study.

In discussing the "strength" of the conclusion, it is important to first understand the nature and limitations of the method comparison. EPA's analysis is designed only to identify whether there is a statistically significant difference between the two methods being compared (the new/proposed method and the reference method). If the chi-squared and Fisher's exact analysis of the comparability study results does not identify a statistically significant difference between the new method and the reference method, EPA is satisfied that the new method should be proposed for compliance monitoring.

In considering two methods, three situations are possible: (1) the method performance is identical; (2) the new method outperforms the reference method; or (3) the new method underperforms the reference method. Either of the first two possibilities is an acceptable situation; what EPA is concerned about in its assessment is accurately identifying those cases where the reference method outperforms the new method (i.e., the third possibility). What EPA has considered, therefore, is the possibility of declaring the two methods "comparable" when, in fact, the reference method performs better than the new method.

The potential for such error depends on the number of tests performed (N) and the "absolute" (but unknown) success rate (S) of the reference method (i.e., the degree to which the reference method outperforms the new method); by this definition, the reference method yields more accurate results "S% of the time". Based on these figures, and using a binomial distribution approach, one can calculate "beta", which represents "the likelihood of declaring the methods 'comparable' when, in fact, the reference method outperforms the new method". The likelihood of drawing the "right" conclusion may be represented as (1 - beta). The results of such an analysis are summarized below:

"
S"
(
the
absolute
success
rate
of
the
reference
method)
N
(
number
of
tests)
beta
N
(
number
of
tests)
beta
0.6
10
0.95
6
0.95
0.7
10
0.85
6
0.88
0.8
10
0.62
6
0.74
0.838
10
0.50
6
0.65
0.871
10
0.38
6
0.56
0.891
10
0.30
6
0.50
0.9
10
0.26
6
0.47
0.917
10
0.20
6
0.41
0.956
10
0.07
6
0.24
0.963
10
0.05
6
0.20
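
The derivation behind the beta values is not spelled out above, but the table is reproduced exactly under one plausible reading of the binomial approach (an assumption, not confirmed by the source): under the null hypothesis of comparable methods, each test is equally likely to favor either method (p = 0.5); "non-comparable" is declared only when the reference method's win count reaches the smallest value whose null tail probability is at or below 0.05 (9 of 10 tests, or 6 of 6); and beta is then the binomial probability, at true success rate S, of falling short of that critical count. A sketch of that reading:

    # Reconstruction of the beta table under the assumed model described
    # above: declare "non-comparable" only when the reference method wins
    # at least c of N tests, where c is the smallest count whose tail
    # probability under the null (p = 0.5) is at or below 0.05.
    from scipy.stats import binom

    def critical_count(n, alpha=0.05):
        """Smallest win count c with P(X >= c | N = n, p = 0.5) <= alpha."""
        return next(c for c in range(n + 2) if binom.sf(c - 1, n, 0.5) <= alpha)

    def beta(s, n):
        """Chance of declaring 'comparable' when the reference method truly
        yields more accurate results in a fraction s of tests."""
        return binom.cdf(critical_count(n) - 1, n, s)

    for s in (0.6, 0.7, 0.8, 0.838, 0.871, 0.891, 0.9, 0.917, 0.956, 0.963):
        print(f"S = {s:5.3f}   N = 10: beta = {beta(s, 10):.2f}   "
              f"N = 6: beta = {beta(s, 6):.2f}")

Run as written, this reproduces the tabulated values (e.g., beta = 0.50 at S = 0.838 for N = 10, and at S = 0.891 for N = 6).
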
In statistical analyses, one often wishes to establish the likelihood of drawing the "right" conclusion (in this case, properly identifying those cases where the reference method outperforms the new method) at a particular goal. One can do so by holding "1 - beta" constant (i.e., at the goal) and examining how "S" changes for various "N" values. Using such an approach, the above data suggest that if one wishes to establish the likelihood of drawing the "right" conclusion 50% of the time, a 10-test study would meet this goal if the reference method truly outperforms the new method 84% of the time or more; for a 6-test approach, the goal would be met if the reference method truly outperforms the new method 89% of the time or more.

If one wishes to establish the likelihood of drawing the "right" conclusion 80% of the time, a 10-test study would meet this goal if the reference method truly outperforms the new method 92% of the time or more; for a 6-test approach, the goal would be met if the reference method truly outperforms the new method 96% of the time or more.
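
Under the same assumed model, the 84%/89% and 92%/96% figures can be recovered by holding 1 - beta at the goal and solving for S numerically; since beta falls as S rises, a simple bisection suffices:

    # Smallest S meeting a stated goal for 1 - beta, found by bisection.
    # Continues the assumed model from the previous sketch.
    from scipy.stats import binom

    def beta(s, n, alpha=0.05):
        """As in the previous sketch."""
        c = next(c for c in range(n + 2) if binom.sf(c - 1, n, 0.5) <= alpha)
        return binom.cdf(c - 1, n, s)

    def s_for_goal(goal, n, lo=0.5, hi=1.0, tol=1e-6):
        """Smallest S at which 1 - beta(S, N) reaches the stated goal."""
        while hi - lo > tol:
            mid = (lo + hi) / 2
            if 1 - beta(mid, n) >= goal:
                hi = mid
            else:
                lo = mid
        return hi

    for goal in (0.50, 0.80):
        print(f"goal = {goal:.0%}:  N = 10 needs S >= {s_for_goal(goal, 10):.2f}; "
              f"N = 6 needs S >= {s_for_goal(goal, 6):.2f}")

This prints 0.84/0.89 for the 50% goal and 0.92/0.96 for the 80% goal, matching the figures above.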

The above points suggest that, even under a worst-case scenario, where data from only 6 tests are used in the analysis of method comparability, the conclusion does not change, nor is the strength of that conclusion substantially impacted.
