An algorithm to apply Fisher's exact test to any two-dimensional contingency table
There are two main difficulties in applying Fisher's exact test to contingency tables with more than two rows and two columns. Taken together, these two difficulties have led some textbook authors to say categorically that the test can be used only for 2 × 2 tables. The first difficulty is the lack of an algorithm that is both simple to program and known to be logically sound. The second is that the computation becomes extremely time-consuming as the table increases in size. This paper presents solutions to both difficulties, and an approximate method of estimating the magnitude of the second.
Consider the following 3 × 4 contingency table, which shows how many students (randomly sampled from a fictitious population) fall into four mutually exclusive subject categories and three mutually exclusive voting categories.

          Arts   Science   Divinity   Maths   Total
Party 1      3         1          0       6      10
Party 2      1         4          4       2      11
Party 3      0         1          5       0       6
Total        4         6          9       8      27
Given the marginal totals, and the null hypothesis that the row categories are independent of the column categories, the expected frequencies within the table would be as follows.

1.48   2.22   3.33   2.96
1.63   2.44   3.67   3.26
0.89   1.33   2.00   1.78
The observed departure from the expected frequencies could be measured, but not reliably tested, by Pearson's χ^{2} statistic, which in this example has a value of 17.675.
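This calculation can be reproduced with a short sketch (the variable names are mine; only the observed table comes from the example above):

```python
# Expected frequencies and Pearson's chi-squared for the 3 x 4 example.
observed = [
    [3, 1, 0, 6],
    [1, 4, 4, 2],
    [0, 1, 5, 0],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

# Under independence, E[i][j] = (row total i) * (column total j) / grand total.
expected = [[r * c / grand for c in col_totals] for r in row_totals]

chi2 = sum(
    (o - e) ** 2 / e
    for o_row, e_row in zip(observed, expected)
    for o, e in zip(o_row, e_row)
)
print(round(chi2, 3))  # -> 17.675
```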
In principle, Fisher's method is quite simple. We generate every possible table that is compatible with the given marginal totals, and calculate the exact probability p of each table, using Fisher's formula (1934). In the above example there are 8634 possible tables that would give the same marginal totals. We then need to partition the tables into those that are more extreme than the observed table (i.e. less compatible with the null hypothesis) and those that are less extreme. By summing the probabilities of the extreme tables we obtain a P value that is used in the usual classical way as a test of the null hypothesis.
One obvious approach is to calculate the exact probability that, given these marginal totals, the departure from the expected frequencies, as measured by χ^{2}, would be equal to or greater than 17.675.
An alternative and intuitively appealing approach, suggested by Anscombe (1981), was that the value of p calculated by Fisher's formula for a particular possible table should be used not only for calculating the sum P of the probabilities of extreme tables, but also as a measure of how extreme the table is. Those values of p that are smaller than the p value of the observed table are summed to find the total probability of all tables more extreme than the observed one.
A further detail remains to be decided: what do we do with tables that are exactly as extreme as the observed table? I have adopted another of Anscombe's suggestions here: the P value used to test H_{0} is the sum of the p values smaller than that of the observed table, plus half the sum of the p values equal to that of the observed table. This decision has been the subject of a number of studies, so I shall not elaborate on it here.
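Given the full list of p values, this rule is short to express in code (a sketch; the function name and the tolerance used to decide that two p values are "equal" are my own choices):

```python
def mid_p_value(all_ps, p_observed, tol=1e-12):
    """Anscombe's rule: sum the p of strictly more extreme tables
    (smaller p), plus half the p of tables exactly as extreme as the
    observed one (including the observed table itself)."""
    smaller = sum(p for p in all_ps if p < p_observed - tol)
    equal = sum(p for p in all_ps if abs(p - p_observed) <= tol)
    return smaller + 0.5 * equal
```

For example, with four possible tables of probabilities 0.1, 0.2, 0.3 and 0.4 and an observed table of probability 0.3, the rule gives 0.1 + 0.2 + 0.5 × 0.3 = 0.45.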
For contingency tables more complex than 2 × 2 tables, the difficulty is to design a systematic way of generating every possible table, so that it can be tested for extremity and its exact probability calculated.
The systematic method developed here is to attach a unique numerical value L to each possible table, and to work through the tables in order of their values of L, so that no table can be overlooked or counted twice. Consider the cells of the table, i.e. the elements of an array A, to be read like a book, from left to right and from top to bottom. These elements, read in that order, represent the digits of the number L (using arithmetic to any base greater than the largest element). When we are considering a particular element, we call the preceding elements the higher-order elements of the array, and the later elements the lower-order elements. We start with the table that has the smallest possible value of L, and then repeatedly increment L by the smallest possible increment consistent with the constraints of the given marginal totals, making sure that no element is ever less than zero. This algorithm depends on two subroutines described below.
The first subroutine (labelled as Minimise in the source code) minimises the value of the low-order elements of L, starting from row y and column x, without altering the higher-order elements, and without altering the marginal totals.
This subroutine depends on the following theorem, whose proof is rather too obvious to be worth spelling out: the element A(y,x) can be reduced without altering the higher-order elements or the marginal totals if and only if there is a nonzero element A(y+v, x+u), where u and v are positive integers.
The second subroutine (labelled as Increment in the source code) is designed to increment by 1 the lowest-order element of A that is capable of being incremented without altering the higher-order elements or the marginal totals.
This subroutine depends on the following theorem, whose proof is even more obvious: an element A(y,x) can be incremented without altering the higher-order elements or the marginal totals if and only if there exist two nonzero elements A(y+v, x) and A(y, x+u), where u and v are positive integers.
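Both theorems translate directly into predicates on the array A (a sketch; the function names are mine, and rows and columns are indexed from zero):

```python
def can_reduce(A, y, x):
    """A[y][x] can be reduced (higher-order elements and margins fixed)
    iff it is positive and some element strictly below and to the right
    of it, A[y+v][x+u] with u, v >= 1, is nonzero."""
    if A[y][x] == 0:
        return False
    return any(
        A[y + v][x + u] > 0
        for v in range(1, len(A) - y)
        for u in range(1, len(A[0]) - x)
    )

def can_increment(A, y, x):
    """A[y][x] can be incremented (higher-order elements and margins fixed)
    iff some A[y+v][x] below it and some A[y][x+u] to its right are nonzero
    (those cells absorb the compensating decrements)."""
    below = any(A[v][x] > 0 for v in range(y + 1, len(A)))
    right = any(A[y][u] > 0 for u in range(x + 1, len(A[0])))
    return below and right

# The observed 3 x 4 table from the example above.
A = [[3, 1, 0, 6],
     [1, 4, 4, 2],
     [0, 1, 5, 0]]
print(can_increment(A, 0, 0), can_reduce(A, 1, 3))  # -> True False
```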
The subroutines Increment and Minimise are used alternately (in the UK sense of the word).
There are reasonable grounds for believing that this systematic method of working through the possible contingency tables does, in fact, generate every possible table exactly once. Firstly, the program based on it has given the same answers as other software packages when applied to 2 × 2 tables. Secondly, it agrees exactly with the published analyses of Yates (1934) for 2 × 3 tables. Thirdly, when the probabilities calculated by equation 1 are summed for all possible contingency tables consistent with a given set of marginal totals, the answer has invariably been between 0.99999 and 1.00001, for numerous analyses in which the number of possible tables varies from 6 to 2954315. Fourthly, I have made the source code available so that users can check its logic (see http://www.statistics.org.uk).
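The sum-to-unity check can also be reproduced with a brute-force enumeration. The sketch below is not the Increment/Minimise algorithm: it simply recurses over rows, distributing each row total across the columns, but it generates the same set of tables and confirms the figures for the 3 × 4 example (8634 tables, probabilities summing to 1):

```python
from math import lgamma, exp

def ln_fact(n):
    # ln(n!) via the log-gamma function.
    return lgamma(n + 1)

def compositions(total, caps):
    """All ways to split `total` into non-negative parts, part k <= caps[k]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in compositions(total - first, caps[1:]):
            yield (first,) + rest

def all_tables(row_totals, col_totals):
    """Yield every non-negative integer table with the given margins."""
    if not row_totals:
        if all(c == 0 for c in col_totals):
            yield ()
        return
    for row in compositions(row_totals[0], tuple(col_totals)):
        reduced = tuple(c - x for c, x in zip(col_totals, row))
        for rest in all_tables(row_totals[1:], reduced):
            yield (row,) + rest

def fisher_ln_p(table, row_totals, col_totals, grand):
    """ln of equation 1: p = (prod n_i! prod m_j!) / (G! prod a_k!)."""
    num = sum(ln_fact(n) for n in row_totals) + sum(ln_fact(m) for m in col_totals)
    den = ln_fact(grand) + sum(ln_fact(a) for row in table for a in row)
    return num - den

rows, cols = [10, 11, 6], [4, 6, 9, 8]
tables = list(all_tables(rows, cols))
total_p = sum(exp(fisher_ln_p(t, rows, cols, 27)) for t in tables)
print(len(tables), round(total_p, 5))  # -> 8634 1.0
```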
For each of the possible contingency tables generated by the subroutine Increment, the calculation of its exact probability p is fairly simple, and is based on the work of Fisher (1934) and Yates (1934). If n_{1}, n_{2}, n_{3} . . . are the row totals, m_{1}, m_{2}, m_{3} . . . are the column totals, a_{1}, a_{2}, a_{3} . . . are the elements of the array, and G is the grand total, then

p = ( n_{1}! n_{2}! n_{3}! . . . m_{1}! m_{2}! m_{3}! . . . ) / ( G! a_{1}! a_{2}! a_{3}! . . . )    (1)
Apart from the difficulty of logical design, there remains the difficulty of lengthy computation. One method that I have used to reduce the computation time is to create a lookup table of logarithms of factorials at the start, so that the above formula can be evaluated quickly. Another obvious method is to calculate the logarithm of the common factor n_{1}! n_{2}! . . . m_{1}! m_{2}! . . . / G! only once at the start of the calculation, because it appears in the value of p for every possible contingency table with the given marginal totals.
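Both speed-ups can be sketched as follows (this illustrates the idea, not the paper's own source code; the function names are mine):

```python
from math import exp, log

def make_ln_fact(max_n):
    """Lookup table built once at the start: table[n] = ln(n!)."""
    table = [0.0] * (max_n + 1)
    for n in range(1, max_n + 1):
        table[n] = table[n - 1] + log(n)
    return table

def make_fisher_p(row_totals, col_totals):
    """Return a fast p(table) for equation 1, with the common factor
    ln(n_1! n_2! ... m_1! m_2! ... / G!) computed only once."""
    grand = sum(row_totals)
    lf = make_ln_fact(grand)
    common = (sum(lf[n] for n in row_totals)
              + sum(lf[m] for m in col_totals)
              - lf[grand])
    def p(table):
        return exp(common - sum(lf[a] for row in table for a in row))
    return p

# Smallest possible check: with margins (1, 1) / (1, 1) there are two
# tables, each of probability 1/2.
p_of = make_fisher_p([1, 1], [1, 1])
print(round(p_of([[1, 0], [0, 1]]), 6))  # -> 0.5
```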
A much more important method of reducing the computation time depends on recognising whole sets of possible tables that are all more extreme than the observed table, or are all less extreme. This saves us from having to generate them all one by one.
Table 1. Some examples of contingency tables analysed by the exact test. All these tables took less than 1 second of computing time with a 1.8 GHz Pentium 4 processor, except the last table, which took 62 seconds.

Observed                 Estimated number     Actual number        P value by
contingency table        of possible tables   of possible tables   Fisher's
                         with same            with same            exact test
                         marginal totals      marginal totals

25  1                             6                    6           0.109
18  4

50  4 12                         56                   35           0.017
 2  2  3

50  4  2                         64                   54           7.3 × 10^{-6}
 2  3  5

 6  9  2                         84                   74           0.101
 7  2  4

25  8 14                        864                  798           0.0098
55 27  9

 1  6  0                       1736                 1224           0.157
 0 12  3
 2  6  1
 1 14  0

 0  7  0                       1736                 1224           0.011
 0 11  4
 3  6  0
 1 14  0

 3  1  0  6                    7503                 8634           0.0033
 1  4  4  2
 0  1  5  0

 3  1  0  6                 5799660              2954315           0.020
 1  4  4  2
 0  1  5  0
 2  5  3  6
When only part of a possible contingency table has been filled in, it is sometimes possible to recognise that the completed table is bound to be extreme, or bound not to be extreme. It is also possible to calculate the exact probability of a partly specified table. So it is not necessary to calculate the exact probability of every individual table: a whole set of tables can often be processed in one short calculation.
How do we recognise when a partly specified table T_{0} is bound to lead only to extreme tables or only to non-extreme tables? Suppose we are using Fisher's p as a measure of how extreme a table is:

p = ( n_{1}! n_{2}! . . . m_{1}! m_{2}! . . . ) / ( G! a_{1}! a_{2}! . . . )
This expression can be factorised into ( n_{1}! n_{2}! . . . m_{1}! m_{2}! . . . ) / G! and a set of separate factors 1 / ( a_{1}! a_{2}! . . . ) for each row or column. For a given row or column with c cells still to be allocated and a total of N observations to be distributed among them, the maximum conceivable value of this factor, and hence of p, occurs when a_{1} = a_{2} = a_{3} . . . = N/c, and the minimum occurs when one of the cells contains N and the others contain zero. Thus the magnitude of the contribution of a particular row or column to ln(p), namely ln(a_{1}!) + ln(a_{2}!) + . . . , will always lie between c ln((N/c)!) and ln(N!).
If the column totals vary more than the row totals, this formula is best applied column by column; otherwise it is best applied row by row. This helps to make the limits narrower and more informative.
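For integer counts, the most even split distributes the N observations as evenly as the c cells allow. A sketch of the resulting bounds for one row or column (the helper is mine; note that when N/c is fractional the most even integer split can give a smaller minimum than the c ln((N/c)!) figure quoted in the worked example below):

```python
from math import lgamma

def ln_fact(n):
    # ln(n!) via the log-gamma function.
    return lgamma(n + 1)

def contribution_bounds(N, c):
    """Bounds on sum(ln(a_k!)) over c cells with non-negative integer
    counts summing to N.  The minimum occurs at the most even split
    (r cells hold q+1 and the rest hold q, where N = q*c + r); the
    maximum occurs when a single cell holds all N observations."""
    q, r = divmod(N, c)
    lo = r * ln_fact(q + 1) + (c - r) * ln_fact(q)
    hi = ln_fact(N)
    return lo, hi

lo, hi = contribution_bounds(10, 2)
print(round(lo, 6), round(hi, 5))  # -> 9.574983 15.10441
```

The printed values match row 2 of the table in the worked example (N = 10, c = 2).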
For example, suppose we have the partly specified table shown below.

 3   1   2   5   11
 4   6   ?   ?   20
 ?   ?   ?   ?    8
 ?   ?   ?   ?    4
 ?   ?   ?   ?    6
12  11  14  12   49
Rows 2, 3, 4 and 5 respectively have 10, 8, 4 and 6 observations still to be distributed. The contributions of these rows to ln(p) therefore lie between the following values.
Row no.    N    c   N/c   c ln((N/c)!)        ln(N!)
                          = min contribution  = max contribution
                          to ln(p)            to ln(p)
2         10    2   5        9.574983           15.10441
3          8    4   2        2.772589           10.6046
4          4    4   1        0                   3.178054
5          6    4   1.5      5.3174              6.579251
Total                       17.66497            35.46632
The contribution U of the unspecified cells to ln(p) is therefore somewhere between 17.66497 and 35.46632.
The contribution S of the six specified cells to ln(p) is ln(3!) + ln(1!) + ln(2!) + ln(5!) + ln(4!) + ln(6!) = 17.02970. We also use the fact that the contribution M of the marginal totals, ln( n_{1}! n_{2}! . . . m_{1}! m_{2}! . . . / G! ), is 18.30205.
The conclusion is that ln(p) = M - S - U, which lies between 18.30205 - 17.02970 - 35.46632 = -34.19397 and 18.30205 - 17.02970 - 17.66497 = -16.39262, and so p lies between 1.412 × 10^{-15} and 7.599 × 10^{-8}.
If the value of Fisher’s p for the observed table lies outside this range then we do not need to construct all possible tables that start with 3, 1, 2, 5, 4, 6 . . . because we know whether they are more extreme or less extreme than the observed table. We merely need to calculate the sum of their p values, and this can be done very quickly. It is the same as the a priori probability that the table will start with 3, 1, 2, 5, 4, 6.
The following procedure is designed only for tables that are filled in from left to right and from top to bottom, like writing a paragraph. The empty rows are all at the bottom, and the empty cells in the partly completed row are all to the right.
First, complete any row or column that has only one cell unspecified: this can be done in only one way that is compatible with the marginal totals.
Then, starting from the partly completed contingency table T_{0}, construct two complete contingency tables T_{1} and T_{2} as follows. T_{1} has the same column totals as T_{0} and comprises all the complete rows of T_{0}, plus one complementary row, which can be filled in in only one way that is compatible with the column totals. Construct T_{2} from the partly filled row of T_{0} plus one complementary row. The column totals of T_{2} are equal to the contents of the last row of T_{1}.
At this stage, the rationale of the calculation begins to be clear. The hypothetical allocation of numbers in all possible ways to complete the table T_{0} could be done in two stages, with calculable probabilities. First, the numbers could be allocated to the cells of T_{1}, and then the last row of T_{1} could be further divided among the cells of T_{2}. The exact probability of T_{0} is therefore the product of the probabilities of T_{1} and T_{2}.
The exact probability p_{1} of T_{1} is calculated using Fisher's formula. Before doing the same with T_{2}, the empty columns to the right of T_{2} are merged into one empty column, whose contents can be filled in in only one way that is compatible with the marginal totals. We can then calculate the exact probability p_{2} of T_{2}, and hence calculate p = p_{1} × p_{2}, which is the exact probability of the partly specified table T_{0}.
A worked example is shown below. Suppose our partly completed table T_{0} is as follows.

 3   1   2   5   11
 4   6   ?   ?   20
 ?   ?   ?   ?    8
 ?   ?   ?   ?    4
 ?   ?   ?   ?    6
12  11  14  12   49
Then table T_{1} is as follows, and its exact probability p_{1} is 0.0059862.

 3   1   2   5   11
 9  10  12   7   38
12  11  14  12   49
Table T_{2} is initially

 4   6   ?   ?   20
 5   4   ?   ?   18
 9  10  12   7   38
and when condensed it becomes

 4   6  10   20
 5   4   9   18
 9  10  19   38
Its exact probability p_{2} is 0.0727953. The exact probability of table T_{0} is therefore 0.0059862 × 0.0727953 = 0.000435767.
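The two factors can be checked directly against equation 1 (a sketch; fisher_p is my helper, which derives the margins from the complete table it is given):

```python
from math import exp, lgamma

def ln_fact(n):
    # ln(n!) via the log-gamma function.
    return lgamma(n + 1)

def fisher_p(table):
    """Equation 1 applied to a complete table; the margins are
    computed from the table itself."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand = sum(row_totals)
    ln_p = (sum(ln_fact(n) for n in row_totals)
            + sum(ln_fact(m) for m in col_totals)
            - ln_fact(grand)
            - sum(ln_fact(a) for row in table for a in row))
    return exp(ln_p)

t1 = [[3, 1, 2, 5], [9, 10, 12, 7]]   # T1 from the worked example
t2 = [[4, 6, 10], [5, 4, 9]]          # condensed T2
p1, p2 = fisher_p(t1), fisher_p(t2)
print(round(p1, 7), round(p2, 7), round(p1 * p2, 9))
```

The printed values reproduce p_{1} = 0.0059862, p_{2} = 0.0727953 and the product 0.000435767 quoted above.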
One special case needs to be catered for explicitly: if table T_{0} has no complete rows, then T_{1} is empty and p_{1} = 1.
Counting the Cost
It is wise before embarking on a calculation to try to count the cost. If we started a Fisher’s exact test with no estimate of the time required, we could be in the tantalising position of having run the program for several days and be reluctant to abandon the attempt since the solution may be only minutes away. Our decision would be easier if we had an approximate measure of the number of possible tables that would be compatible with the marginal totals.
Gail and Mantel proposed an approximate method of estimating the total number of possible tables that are compatible with given marginal totals. Their algorithm has been used by Mehta and Patel in their calculation of exact probabilities. I have used a similar but not identical method. My software agrees with the worked example of Gail and Mantel, but does not agree with the calculations of Mehta and Patel: I suspect that they have made a programming error.
A slight but useful refinement can be made to Gail and Mantel's method. Their formulae do not treat rows and columns as equivalent: the predicted number of possible tables changes when the table is rotated or transposed so that the rows become columns and the columns become rows. I have found that the prediction is substantially better if Gail and Mantel's method is carried out twice, rotating the table between runs, and then using the mean of the two estimates. See fig. 1.
As the number of rows and columns increases, and as the counts within the cells increase, the calculation soon becomes impracticably slow. But it is a calculation that lends itself very easily to vector processing or parallel processing, since there is no need to complete the analysis of one possible contingency table before beginning the analysis of other possible tables with the same marginal totals. We may therefore expect that, as parallel and vector processors become readily available, it will become easier to analyse large contingency tables in this way.
References

Fisher, R. A. (1934). The logic of inductive inference. Journal of the Royal Statistical Society, Series A, 98: 39-54.
Yates, F. (1934). Contingency tables involving small numbers and the chi-squared test. Journal of the Royal Statistical Society, Supplement 1: 217-235.
Anscombe, F. J. (1981). Computing in Statistical Science through APL. New York: Springer-Verlag. Page 295.