A meeting report: OECD-GESIS Seminar on Translating and Adapting Instruments in Large-Scale Assessments (2018)
Behr and Zabal Measurement Instruments for the Social Sciences
https://doi.org/10.1186/s42409-019-0011-y
(2019) 1:10
MEETING REPORT
Open Access
Dorothée Behr*
and Anouk Zabal
In memoriam of Fons van de Vijver
Abstract
This report summarizes the main themes and conclusions from the OECD-GESIS Seminar on Translating and
Adapting Instruments in Large-Scale Assessments, which took place at the Organization for Economic Co-operation
and Development (OECD), Paris, in June 2018. The five sessions covered the topics (1) etic (universal) vs. emic
(culture-specific) measurement instruments, (2) language- and culture-sensitive development of measurement
instruments, (3) international guidelines vs. implementation in countries and by translators, (4) tools and
technological developments, and (5) quality control of translations. Key players in the field presented on best
practice, lessons learned, and innovations and also made suggestions for moving the field forward.
Keywords: Cross-national, Cross-cultural, Translation, Adaptation, Comparability, Equivalence, Assessment, Test,
Questionnaire, Instrument
* Correspondence:
GESIS – Leibniz Institute for the Social Sciences, P.O. Box 12 21 55, 68072
Mannheim, Germany
Introduction
The OECD has recently launched a methodological
seminar series to foster discussion among and cross-fertilization across the different stakeholders involved
in designing, managing, and analyzing large-scale
assessments. The seminars address both theoretical and
practical developments (Organization for Economic
Co-operation and Development, 2018; Thorn, 2018).
With the Programme for International Student Assessment (PISA) and the Programme for the International
Assessment of Adult Competencies (PIAAC), to
name but two major OECD studies, the OECD is
one of the key players and drivers behind comparative assessment and, thus, very well placed to launch
this important series. The topic chosen for the 2018
seminar was translation and adaptation of measurement instruments, given its central importance in
achieving comparable data. William Thorn from the
OECD, together with Dorothée Behr and Anouk
Zabal from GESIS – Leibniz Institute for the Social
Sciences (Mannheim, Germany), were responsible for
setting up the agenda and bringing together a unique
group of speakers with wide-ranging international expertise. The talks by key players in the field, including both academics and practitioners, were followed
by 113 international participants. The overarching
questions “What is comparability?” and “How can
translations be produced that meet the objectives for
comparability?” were addressed across different stages
of instrument development and production. The
agenda was structured along the following topics (see
Table 1):
The structure of the seminar reflected the fact that
thinking about translation quality and comparability
should essentially start at the development stage of
the source instrument and not just at the translation
stage. After all, if translatability or other comparability
issues are only detected once the translation process
has started, it is often too late to modify the source
instrument to counteract these problems. The presenters in each session were encouraged to present
and discuss current implementations and best practice, limitations, and future directions. The sessions
were organized with a view to triggering a constructive
discussion among both presenters and the audience and
towards fostering an exchange of ideas between the very
heterogeneous players in the area of translation and
adaptation of measurement instruments. This report
is structured along the seminar topics, as outlined in
Table 1.
Table 1 Overview of topics covered at the seminar
Stage                          Topics
Source instrument development  1. Etic (universal) vs. emic (culture-specific) measurement instruments
                               2. Language- and culture-sensitive development of measurement instruments to ensure comparability, cultural relevance, and translatability
Translation                    3. International guidelines vs. implementation by countries and by translators
                               4. Tools and technological developments
Quality control                5. Quality control of translations
© The Author(s). 2019 Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0
International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and
reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to
the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver
(http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
Etic (universal) vs. emic (culture-specific)
measurement instruments
The first session raised the fundamental question as to
which kind of measurement instrument is best suited
to achieve comparability in cross-national studies. The
internationally acclaimed researcher Fons van de
Vijver (2018) set the scene for the entire seminar with
the first presentation. He made a convincing plea for
the need to combine both etic and emic instruments.
Etic instruments rely on the assumption of universally
applicable constructs that can be “transported” into
other cultures through translation. Advantages of such
instruments include the ease of direct cross-cultural
comparison and the use of tried-and-tested instruments. Emic instruments, on the other hand, rely on
culture-specific operationalization of constructs; advantages of these instruments include increased ecological
validity and construct coverage as well as the reduction
of Western bias in the case of non-Western countries.
Studies such as PISA or PIAAC predominantly follow
an etic approach that calls for translation of source instruments and allows for only minor adaptations within a clearly defined framework. As the
number of countries, and thus the cultural variation, in
such studies increases, three paradoxes come to the fore:
(a) the “analysis paradox,” according to which fewer
conclusions can be drawn because scalar equivalence,
the highest level of equivalence, which allows for the direct
comparison of means, is increasingly difficult to
achieve; (b) the “test design paradox,” according to
which the cultural coverage decreases since it is necessary to focus on content that has at least some relevance in all participating countries; and (c) the “test
length paradox,” according to which more items lead to
more design and analysis problems—longer instruments may be more informative for the different stakeholders, but they are also less likely to show a high
level of invariance. Against this backdro (...truncated)