How Many Judges Should There Be in a Group?

Annals of Data Science, Jan 2015

This paper briefly examines the question of how many judges are needed to obtain valid and consistent judgments when using the analytic hierarchy process. It turns out that if a judge is experienced and well versed in an area, he can be sufficient to provide the judgments instead of diluting his accuracy with the participation of others who may not be as good. How to discover such a person requires criteria used to judge his adequacy and that of others.

Ann. Data. Sci. (2014) 1(3–4):359–368
DOI 10.1007/s40745-014-0026-4

Thomas L. Saaty · Mujgan Sağır Özdemir

Received: 30 September 2014 / Revised: 1 November 2014 / Accepted: 10 December 2014 / Published online: 11 January 2015
© Springer-Verlag Berlin Heidelberg 2015

T. L. Saaty, University of Pittsburgh, Pittsburgh, USA
M. S. Özdemir, Eskisehir Osmangazi University, Eskisehir, Turkey

1 Introduction

We are often asked the question: What sample size of judges is best to provide the judgments when using the analytic hierarchy process (AHP)? Students frequently ask it at universities where their advisor is a statistician. We thought it might be useful to attempt an answer in a short note, in the hope that others who are more expert in gathering and interpreting public opinion may help deepen the inquiry.

2 Observations

There are many criteria to consider in determining the size of a statistical sample. They can be found in the statistical literature and rely, among other things, on the underlying probability distributions, the prescribed confidence levels, and the size of the population sampled. It is very different when one collects judgments in the AHP. Generally, AHP applications are concerned with three different ways to frame the pairwise comparison questions.
The first is to ask which of the pair of elements is more dominant or important with respect to an attribute or criterion. The second is to ask which is the more likely outcome, as in presidential elections. The third is to ask which element is preferred with respect to the attribute, recognizing that preference is entirely subjective and depends on the whims, likes, and dislikes of the individual. We believe that the preference question can be answered by sampling, as is done in statistics, and any judge is free to express his or her preference. Validation against what actually happens in the world is of no consequence in preference choices. Answering importance and likelihood questions, by contrast, requires expert knowledge of the subject in which the decision is made.

Two factors affect the sample size of judges required to make the comparisons in the case of importance or likelihood questions. The first is the consistency of the judgments and the second is their validity in practice. There are two further kinds of situations to consider. One is whether the judges must agree among themselves; for example, a jury must unanimously declare a verdict of guilty or not guilty, and this necessarily involves interdependence and agreement. The other is when the judges provide their judgments independently and may be at large in the population, unable to carry out discussions with each other. There is a bit of a dilemma here because, as we shall show, the need for consistency limits the number to not more than 7 or 8 judges.

What is particularly useful in the AHP is that the judges themselves can be assigned priorities, so that the judgments of a high priority judge count more than those of a judge with lower priority. This is done by raising the individual judgments to the power of the respective judges' priorities and then taking the geometric mean, thus giving more weight to those judges who are believed to have more expert knowledge.
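The weighting of judges described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function name `aggregate_judgments`, the two sample matrices, and the judge priorities 0.75 and 0.25 are hypothetical, and the sketch assumes the priorities are normalized to sum to one so that the result is a weighted geometric mean.

```python
import math


def aggregate_judgments(matrices, judge_priorities):
    """Combine pairwise comparison matrices by a weighted geometric mean.

    Each judgment a_ij is raised to the power of its judge's priority and
    the powered judgments are multiplied together across judges.
    """
    # Assumption: rescale the priorities so that they sum to one.
    total = sum(judge_priorities)
    weights = [p / total for p in judge_priorities]

    n = len(matrices[0])
    combined = [[1.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            # A sum of weighted logs is the log of the product of powers.
            log_sum = sum(
                w * math.log(m[i][j]) for m, w in zip(matrices, weights)
            )
            combined[i][j] = math.exp(log_sum)
    return combined


# Hypothetical judgments from two judges comparing two elements.
judge_a = [[1, 4], [1 / 4, 1]]   # the more expert judge, priority 0.75
judge_b = [[1, 1], [1, 1]]       # the less expert judge, priority 0.25

result = aggregate_judgments([judge_a, judge_b], [0.75, 0.25])
print(result[0][1])  # 4 ** 0.75, about 2.83
```

With these sample numbers the combined judgment lies closer to the 4 given by the higher priority judge than an unweighted geometric mean (which would be 2) would. The combined matrix also stays reciprocal, so the product of the entries at positions (0, 1) and (1, 0) remains 1.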
This weighting is done not according to sample size, as in taking statistics about preference, but according to how much and how well the judges know the subject, based on criteria such as education, years of experience, and level of attainment in society.

The usefulness and validity of judgments depend much on how richly or sparsely the problem is structured. The structure determines how valuable or lacking a judge may be in including the important criteria necessary to determine the outcome. If he is unable to structure the decision in sufficient detail to match what is known in the law books about past similar cases, his knowledge may be lacking and potentially faulty. The usefulness and validity of the pairwise comparison judgments in making a decision are greater the finer the subcriteria used to compare the alternatives. That is because, as the scope of the question narrows, it is easier for people to make comparison judgments and ensure their accuracy.

The following example, done many years ago with members of the Department of the Interior in Washington, shows the structure arrived at by a group that was trying to decide at what level, half full or full, a dam should be kept. The figures below show the gradual evolution of the model from a simple three level hierarchy shown in Fig. 1, to the four level hierarchy shown in Fig. 2 that includes decision makers, to the five level hierarchy shown in Fig. 3 which includes interests of the decision makers, to the final six level hierarchy shown in Fig. 4 that includes people affected by the decision; the final figure is a structure that has been elaborated to frame the entire story.

Fig. 1 Starting three level model. Focus: at what level should the dam be kept, full or half-full. Decision criteria: Financial, Political, Env't Protection, Social Protection. Alternatives: Half-Full Dam, Full Dam.

Fig. 2 Four level hierarchy including decision makers. Focus: at what level should the dam be kept, full or half-full. Decision criteria: Financial, Political, Env't Protection, Social Protection. Decision makers: Congress, Dept. of Interior, Courts, State, Lobbies. Alternatives: Half-Full Dam, Full Dam.

Fig. 3 Five level hierarchy including factors that distinguish the expertise of the decision makers. Focus: at what level should the dam be kept, full or half-full. Decision criteria: Financial, Political, Env't Protection, Social Protection. Decision makers: Congress, Dept. of Interior, Courts, State, Lobbies. Factors: Clout, Legal Position, Potential Financial Loss, Inversibility of the Env't, Archeological Problems, Current Financial Resources. Alternatives: Half-Full Dam, Full Dam.

The three level hierarchy of Fig. 1 involves comparing a practical third level of how full a dam should be with very general and abstract human concerns, a process that is very hard to do whether by assigning numbers directly or by making comparisons. Again, let us elaborate this hierarchy in a realistic way by attaching a level of decision makers. This four level h (...truncated)


This is a preview of a remote PDF: https://link.springer.com/content/pdf/10.1007%2Fs40745-014-0026-4.pdf
Article home page: https://link.springer.com/article/10.1007/s40745-014-0026-4

Thomas L. Saaty, Mujgan Sağır Özdemir. How Many Judges Should There Be in a Group? Annals of Data Science, 2015, Volume 1, Issue 3-4, pp. 359-368, DOI: 10.1007/s40745-014-0026-4