The Use of Three-Option Multiple Choice Items for Classroom Assessment
International Journal of Assessment Tools in Education
2018, Vol. 5, No. 2, 314–324
DOI: 10.21449/ijate.421167
Published at http://www.ijate.net
http://dergipark.gov.tr/ijate
Research Article
Erkan Hasan Atalmış 1*
1 Kahraman Sutcu Imam University, Faculty of Education, Department of Educational Measurement and
Evaluation, Kahramanmaras, Turkey
Abstract: Although multiple-choice items (MCIs) are widely used for
classroom assessment, designing MCIs with a sufficient number of plausible
distracters is very challenging for teachers. In this regard, previous empirical
studies reveal that three-option MCIs offer various advantages over
four-option MCIs because they require less preparation and administration
time. This study examines how two elimination methods, namely the
least-selected and the random methods, influence item difficulty, item
discrimination, and test reliability when the number of options in
MCIs is reduced from four to three. The findings reveal that neither
method negatively affected item difficulty, item discrimination, or
test reliability. Results are discussed in relation to promoting
quality classroom assessment.
ARTICLE HISTORY
Received: 01 January 2018
Revised: 23 April 2018
Accepted: 30 April 2018
KEYWORDS
Classroom Assessment,
Item-Writing Guidelines,
Number of Options,
Multiple-Choice Items,
Test Quality
1. INTRODUCTION
Classroom assessment is an indispensable part of education and training. It determines the
extent to which students have attained the goals and behaviors they are expected to gain during
the semester, and it shows how much of what teachers think they are teaching is actually taught.
Therefore, it is highly important for teachers to carry out effective classroom assessment,
and teachers are required to spend a significant part of their
professional lives on classroom assessment activities (Darling-Hammond & Youngs, 2002;
Stiggins, 1991). The related literature underlines the significance of classroom assessment
and presents various recommendations in this context. One of these
recommendations is that paper-and-pencil tests, the most widely used method of classroom
assessment, should be prepared by the teachers themselves (Frey & Schmitt, 2010). This allows
the assessment tool to be consistent and compatible with class activities.
Multiple-choice items (MCIs) are among the most commonly used item types in classroom
assessment (Haladyna & Rodriguez, 2013). In the literature, both
theoretical and empirical studies regarding the reliability and validity of this item type were
ISSN-e: 2148-7456 /© IJATE 2018
conducted, and MCIs were found to be more reliable and valid than open-ended
items in particular (Collins, 2006; Tarrant, Knierim, Hayes, & Ware, 2006; Thorndike, 2005). However, these
studies also emphasized how challenging it is to prepare an appropriate number of plausible options for
MCIs, and researchers have therefore developed alternative approaches.
One of these alternatives is to reduce the number of
options. Although various studies revealed that reducing the number of options from 4 to 3 does
not have a negative effect on test reliability and item discrimination (Atalmis & Kingston,
2017; Delgado & Prieto, 1998), no consensus has been reached so far on how
three-option and four-option items compare in terms of item difficulty. That is, it cannot be
argued that one type is more difficult than the other in all circumstances. Even though
Rodriguez (2005) suggests that these inconsistent findings may result from the different
methods used to reduce the number of options from 4 to 3, this has not been demonstrated empirically.
In this regard, this study empirically examines whether the method used to reduce the number
of options from 4 to 3 has an impact on test reliability, item discrimination, and item difficulty.
The findings are expected to inform the use of three-option items in classroom
assessment.
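The two elimination methods named in the abstract can be illustrated with a minimal sketch. This is not the study's own procedure: it assumes a hypothetical four-option item described only by its correct answer and by how many students chose each option, and the function names `drop_least_selected` and `drop_random` are illustrative.

```python
import random


def drop_least_selected(counts, answer):
    """Return the options kept after removing the least-selected distractor.

    counts: dict mapping option label -> number of students who chose it
    answer: label of the correct option (never removed)
    """
    distractors = sorted(opt for opt in counts if opt != answer)
    # Ties go to the first label in alphabetical order (an arbitrary assumption).
    least = min(distractors, key=lambda opt: counts[opt])
    return [opt for opt in counts if opt != least]


def drop_random(counts, answer, rng=None):
    """Return the options kept after removing one distractor chosen at random."""
    rng = rng or random.Random()
    distractors = [opt for opt in counts if opt != answer]
    removed = rng.choice(distractors)
    return [opt for opt in counts if opt != removed]


# Hypothetical response counts for one item whose correct answer is "B".
counts = {"A": 12, "B": 55, "C": 4, "D": 29}
print(drop_least_selected(counts, answer="B"))
print(drop_random(counts, answer="B", rng=random.Random(0)))
```

In this made-up example the least-selected method removes option C, which only 4 students chose, whereas the random method may remove any of A, C, or D. Either way the correct answer is retained and three options remain.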
1.1. Classroom assessment activities (Assessment Criteria)
The quality of classroom assessment activities has been discussed by educators and researchers because
these activities play a significant role in improving educational outcomes.
In this sense, researchers have emphasized that classroom assessment activities should aim at
increasing the quality of learning in the classroom, rather than serving largely the traditional
purpose of deciding who passes and who fails exams (Chappuis & Stiggins, 2002; Leahy, Lyon, Thompson,
& Wiliam, 2005). Hence, classroom assessment must be able to answer questions such
as how well learners are learning and how effectively teachers teach (Angelo & Cross, 2001).
The most important way to achieve this is to use classroom assessment methods that provide
accurate and descriptive feedback to students and teachers about learning and teaching activities
in the classroom. This is only possible with reliable, valid, and useful measuring tools.
Reliability is defined as the accuracy or precision of a measurement procedure; that is,
the degree to which measurements are free from error (AERA, APA, & NCME, 2014;
Thorndike, 2005). Errors can arise from the measurement tool, the measured
characteristic, the person who performs the measurement, or the environment. In this context, test
reliability is negatively influenced by factors such as questions that students answer incorrectly
even though they know the answer, guessing, subjective scoring by
teachers, the testing environment, and cheating. Thus, classroom tests that
contain more questions, are scored objectively, and are administered in a carefully chosen testing
environment will be more reliable.
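The link between test length and reliability stated above is commonly quantified with the Spearman-Brown prophecy formula. The source does not name this formula; it is brought in here purely as an illustration, and it assumes that the added questions are parallel to the existing ones. The function name `spearman_brown` and the starting reliability of 0.60 are hypothetical.

```python
def spearman_brown(reliability, length_factor):
    """Predicted reliability when a test is lengthened by `length_factor`.

    Assumes the added items are parallel to the existing ones.
    """
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)


# Hypothetical classroom test with a reliability of 0.60.
for factor in (1, 1.5, 2):
    predicted = spearman_brown(0.60, factor)
    print(f"{factor}x length -> predicted reliability {predicted:.2f}")
```

Under these assumptions, doubling the number of questions raises the predicted reliability from 0.60 to 0.75, which is consistent with the claim that longer tests tend to be more reliable.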
Validity is the test quality that indicates the degree to which a measuring instrument
measures the desired property (AERA, APA, & NCME, 2014; Haladyna & Rodriguez, 2013).
The validity of a measurement tool is evaluated through different types of evidence, such as
content-related validity, construct validity, and criterion-related validity. Content-related
validity concerns how well the test covers the features it is intended to measure (Thorndike, 2005).
To illustrate, the extent to which a test prepared for a mathematics class covers the learning outcomes
of the unit to be measured relates to content-related validity. In this respect, a test with
more questions also increases content-related validity, just as it increases test reliability.
Construct validity refers to the degree to which the intended construct is measured without
interference from other constructs (Messick, 1989). For instance, (...truncated)