How Many Judges Should There Be in a Group?
Ann. Data. Sci. (2014) 1(3–4):359–368
DOI 10.1007/s40745-014-0026-4
Thomas L. Saaty · Mujgan Sağır Özdemir
Received: 30 September 2014 / Revised: 1 November 2014 / Accepted: 10 December 2014 /
Published online: 11 January 2015
© Springer-Verlag Berlin Heidelberg 2015
Abstract This paper briefly examines the question of how many judges are needed
to obtain valid and consistent judgments when using the analytic hierarchy process. It
turns out that if a judge is experienced and well versed in an area, he alone can be
sufficient to provide the judgments, rather than having his accuracy diluted by the
participation of others who may not be as good. Discovering such a person requires
criteria for judging his adequacy and that of others.
1 Introduction
We are often asked the question: What sample size of judges is best to provide the
judgments when using the analytic hierarchy process (AHP)? Frequently students ask
the question at universities where the advisor is a statistician. We thought it might be
useful to attempt to answer this question in a short note, with the hope that others
who are more expert in gathering and interpreting public opinion may help deepen the
inquiry.
2 Observations
There are many criteria to consider in determining the size of a statistical sample.
They can be found in the statistical literature and rely, among other things, on the
underlying probability distributions, the prescribed confidence levels and on the size
of the population sampled.
T. L. Saaty (corresponding author), University of Pittsburgh, Pittsburgh, USA
M. S. Özdemir, Eskisehir Osmangazi University, Eskisehir, Turkey
It is very different when one collects judgments in the AHP. Generally, AHP
applications frame the pairwise comparison questions in one of three ways. The first
is to ask which of the pair of elements is more dominant or important with respect to
an attribute or criterion, the second is to ask which is the more likely outcome, as in
presidential elections, and the third is to ask which element is preferred with respect
to the attribute, recognizing that preference is entirely subjective and depends on the
whims, likes, and dislikes of the individual. We believe that the preference question
can be answered by sampling as is done in statistics, and any judge can be free to
express his or her preference. Validation against what actually happens in the world
is of no consequence in preference choices. By contrast, answering both importance
and likelihood questions requires what is known as expert knowledge in the subject
in which the decision is made.
Two factors affect the sample size of judges required to make the comparisons in the
case of importance or likelihood questions. The first is the consistency of the judgments
and the second their validity in practice. There are two further kinds of situations to
consider. One is whether the judges must agree among themselves; for example a
jury must unanimously declare a verdict of guilty or not guilty and this necessarily
involves interdependence and agreement. The other is when the judgments involve
judges who are independently providing their judgments and may be at large in the
population and unable to carry out discussions with each other. There is a bit of a
dilemma here as we shall show that the need for consistency limits the number to not
more than 7 or 8 judges. What is particularly useful in the AHP is that the judges
themselves can be assigned priorities that make the judgments of a high priority judge
count more than those of judges with lower priority. This is done by raising their
individual judgments to the power of the respective judges’ priorities, then taking the
geometric mean, thus extending more weight to those judges who are believed to have
more expert knowledge. It is done not according to sample size as in taking statistics
about preference, but according to how much and how well they know the subject, based
on some criteria such as education, years of experience, and level of attainment in society.
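The priority-weighted combination of judgments described above can be illustrated with a minimal sketch. It assumes that each judge supplies a full reciprocal pairwise comparison matrix and that the judges' priorities are normalized to sum to one before being used as exponents. The function name `aggregate_judgments` and the two example matrices are hypothetical and are not taken from the paper.

```python
import math

def aggregate_judgments(matrices, priorities):
    """Combine pairwise comparison matrices entry by entry.

    Each judgment is raised to the power of its judge's normalized
    priority and the results are multiplied, i.e. a weighted
    geometric mean (computed here through logarithms).
    """
    total = sum(priorities)
    weights = [p / total for p in priorities]  # assumption: exponents sum to 1
    n = len(matrices[0])
    combined = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            log_sum = sum(w * math.log(m[i][j])
                          for m, w in zip(matrices, weights))
            combined[i][j] = math.exp(log_sum)
    return combined

# Hypothetical data: two judges compare three elements on the 1-9 scale.
judge_a = [[1, 3, 5], [1 / 3, 1, 2], [1 / 5, 1 / 2, 1]]
judge_b = [[1, 1, 3], [1, 1, 3], [1 / 3, 1 / 3, 1]]

# Judge A is believed to be the more expert and receives the higher priority.
group = aggregate_judgments([judge_a, judge_b], priorities=[0.75, 0.25])
```

With normalized exponents the combined matrix remains reciprocal, so `group[0][1] * group[1][0]` equals 1 up to rounding, and the judgment of the higher priority judge pulls the combined value toward his own.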
The usefulness and validity of judgments depend much on how richly or sparsely
the problem is structured. The structure determines how valuable or lacking a judge
may be in including the important criteria necessary to determine the outcome. If he
is unable to structure the decision in sufficient detail to match what is known in the
law books about past similar cases, his knowledge may be lacking and potentially faulty.
The usefulness and validity of the pairwise comparison judgments in making a
decision increase as the subcriteria used to compare the alternatives become finer.
That is because as the scope of the question narrows it is easier for people to make
comparison judgments and ensure their accuracy. The following example, done many
years ago with members of the Department of the Interior in Washington, shows the
structure arrived at by a group that was trying to decide at what level, half full or
full, a dam should be kept. The figures below show the gradual evolution of the model
from a simple three level hierarchy shown in Fig. 1 to the four level hierarchy shown
in Fig. 2 that includes decision makers, to the five level hierarchy shown in Fig. 3
which includes interests of the decision makers, to the final six level hierarchy shown
in Fig. 4 that includes people affected by the decision; the final figure is a structure
that has been elaborated to frame the entire story.
Fig. 1 Starting three level model
Focus: At what level should the dam be kept, full or half-full?
Decision criteria: Financial, Political, Env’t Protection, Social Protection
Alternatives: Half-Full Dam, Full Dam
Fig. 2 Four level hierarchy including decision makers
Focus: At what level should the dam be kept, full or half-full?
Decision criteria: Financial, Political, Env’t Protection, Social Protection
Decision makers: Congress, Dept. of Interior, Courts, State, Lobbies
Alternatives: Half-Full Dam, Full Dam
Fig. 3 Five level hierarchy including factors that distinguish the expertise of the decision makers
Focus: At what level should the dam be kept, full or half-full?
Decision criteria: Financial, Political, Env’t Protection, Social Protection
Decision makers: Congress, Dept. of Interior, Courts, State, Lobbies
Factors: Clout, Legal Position, Potential Financial Loss, Irreversibility of the Env’t,
Archeological Problems, Current Financial Resources
Alternatives: Half-Full Dam, Full Dam
This three level hierarchy involves comparing a practical third level, how full a
dam should be, with very general and abstract human concerns, a comparison that is
very hard to make whether by assigning numbers directly or by making comparisons. Again,
let us elaborate this hierarchy in a realistic way by attaching a level of decision makers.
This four level h (...truncated)