Large-scale Assessments in Education

http://link.springer.com/journal/40536

List of Papers (Total 59)

Subject-specific strengths and weaknesses of fourth-grade students in Europe: a comparative latent profile analysis of multidimensional proficiency patterns based on PIRLS/TIMSS combined 2011

Background: In 2011, the Progress in International Reading Literacy Study (PIRLS) and the Trends in International Mathematics and Science Study (TIMSS) were conducted at fourth grade in a number of participating countries with a shared representative sample. In this article we investigate whether there are multidimensional proficiency patterns across the competency domains or not...

Context factors and student achievement in the IEA studies: evidence from TIMSS

Background: The present study investigates which factors related to the school context influence student achievement on TIMSS mathematics tests across countries. A systematic review of the literature on PIRLS, TIMSS, and ICCS was conducted beforehand to identify those school, teacher, and classroom factors shown to be useful predictors of student performance in previous IEA studies...

A statistical analysis of the characteristics of the intended curriculum for Japanese primary science and its relationship to the attained curriculum

This study statistically investigates the characteristics of the intended curriculum for Japanese primary science, focusing on the learning content. The study used the TIMSS 2011 Grade 4 Curriculum Questionnaire data as a major source for the learning content prescribed at the national level. Confirmatory factor analysis was used to determine the extent to which a topic area was...

Determinants of country differences in effects of parental education on children’s academic achievement

Background: In a previous study, the total, direct, and indirect effects of parental education on reading, mathematics, and science achievement were estimated for Grade 4 pupils in 37 countries that participated in the PIRLS and TIMSS 2011 studies (Gustafsson et al. in TIMSS and PIRLS 2011: Relationships among reading, mathematics, and science achievement at the fourth grade...

Assessment of fit of item response theory models used in large-scale educational survey assessments

Latent regression models are used for score-reporting purposes in large-scale educational survey assessments such as the National Assessment of Educational Progress (NAEP) and the Trends in International Mathematics and Science Study (TIMSS). One component of these models is based on item response theory. While there exists some research on assessment of fit of item response theory...

Who likes to learn new things: measuring adult motivation to learn with PIAAC data from 21 countries

Background: Despite the importance of lifelong learning as a key to individual and societal prosperity, we know little about adult motivation to engage in learning across the lifespan. Building on educational psychological approaches, this article presents a measure of Motivation-to-Learn using four items from the background questionnaire of the Programme for the International...

Causal inference with large-scale assessments in education from a Bayesian perspective: a review and synthesis

This paper reviews recent research on causal inference with large-scale assessments in education from a Bayesian perspective. I begin by adopting the potential outcomes model of Rubin (J Educ Psychol 66:688-701, 1974) as a framework for causal inference that I argue is appropriate with large-scale educational assessments. I then discuss the elements of Bayesian inference arguing...

Causal inferences with large scale assessment data: using a validity framework

To answer the calls for stronger evidence by the policy community, educational researchers and their associated organizations increasingly demand more studies that can yield causal inferences. International large scale assessments (ILSAs) have been targeted as a rich data source for causal research. It is in this context that we take up a discussion around causal inferences and...

Is computer availability at home causally related to reading achievement in grade 4? A longitudinal difference in differences approach to IEA data from 1991 to 2006

Research on effects of home computer use on children’s development of cognitive abilities and skills has yielded conflicting results, with some studies showing positive effects, others no effects, and yet others negative effects. These studies have typically used non-experimental designs and one of the main reasons for the conflicting results is that studies differ with respect...

Introduction to instrumental variables and their application to large-scale assessment data

In the social sciences, estimating causal effects is particularly difficult. Randomized experiments set the gold standard, but in many cases they are expensive or unenforceable for ethical and practical reasons. Recent research has drawn attention to techniques that, under some conditions, can estimate causal effects from non-experimental, observable data. One technique is the instrumental...

Does non-participation in preschool affect children’s reading achievement? International evidence from propensity score analyses

While expectations are high for early childhood education to support students’ reading literacy, research findings are inconclusive. The purpose of the study is to estimate the effect of preschool non-participation on reading literacy at the end of primary school. That is, what is the average achievement of children who did not attend preschool compared to what it would have been...

A multilevel analysis of Swedish and Norwegian students’ overall and digital reading performance with a focus on equity aspects of education

Background: The influence of external factors in general, and socioeconomic background factors in particular, on traditional reading performance has been extensively researched and debated. While traditional reading is well investigated in this respect, there is a lack of studies on equity aspects related to digital reading achievement, in spite of the fact that time spent on reading...

Gender differences in variability and extreme scores in an international context

This study examines gender differences in the variability of student performance in reading, mathematics and science. Twelve databases from IEA and PISA were used to analyze gender differences within an international perspective from 1995 to 2015. Effect sizes and variance ratios were computed. The main results are as follows. (1) Gender differences vary by content area, students...

TIMSS data in an African comparative perspective: Investigating the factors influencing achievement in mathematics and their psychometric properties

Relationships among motivational constructs from the 2011 Trends in International Mathematics and Science Study (TIMSS 2011) were investigated for eighth-graders in all five participating African countries, representing 38,806 students (49% girls). First, we investigated the psychometric properties (factor structure, reliabilities, method effect, and measurement invariance: country and...

Affective characteristics and mathematics performance in Indonesia, Malaysia, and Thailand: what can PISA 2012 data tell us?

Background: The results of the Programme for International Student Assessment (PISA) 2012 showed that Indonesia, Malaysia, and Thailand underperformed and were positioned in the bottom third out of 65 participating countries for mathematics, science, and reading literacies. The wide gap between these three countries and the top performing countries has prompted this study to...

Examining the attitude-achievement paradox in PISA using a multilevel multidimensional IRT model for extreme response style

In this paper, we consider a two-level multidimensional item response model that examines country differences in extreme response style (ERS) as a possible cause for the achievement-attitude paradox in PISA 2006. The model is an extension of Bolt & Newton (2011) that uses response data from seven attitudinal scales to assess response style and to control for its effects in...

Changes in achievement on PISA: the case of Ireland and implications for international assessment practice

The PISA 2009 results for Ireland indicated a large decline in reading literacy scores since PISA 2000 (the largest of 38 countries). The decline in mathematics scores since PISA 2003 was the second largest of 39 countries. In contrast, there was no change in science achievement since PISA 2006. These results prompted detailed investigations into possible reasons for the declines...

Migrant pupils’ scientific performance: the influence of educational system features of origin and destination countries

Background: Earlier studies using a double perspective (destination & origin) indicate that several macro-characteristics of both destination and origin countries affect the educational performance of migrant children. This paper explores the extent to which educational system features of destination and origin countries can explain these differences in educational achievement of...

An item response theory approach to longitudinal analysis with application to summer setback in preschool language/literacy

Background: As the popularity of classroom observations has increased, they have been implemented in many longitudinal studies with large probability samples. Given the complexity of longitudinal measurements, there is a need for tools to investigate both growth and the properties of the measurement scale. Methods: A practical IRT model with an embedded growth model is illustrated...

The acquisition of problem solving competence: evidence from 41 countries that math and science education matters

Background: On the basis of a ‘problem solving as an educational outcome’ point of view, we analyse the contribution of math and science competence to analytical problem-solving competence and link the acquisition of problem solving competence to the coherence between math and science education. We propose the concept of math-science coherence and explore whether society...

Nested multiple imputation in large-scale assessments

Background: In order to measure the proficiency of person populations in various domains, large-scale assessments often use marginal maximum likelihood IRT models where person proficiency is modelled as a random variable. Thus, the model does not provide proficiency estimates for any single person. A popular approach to derive these proficiency estimates is the multiple imputation...

Using response time to investigate students' test-taking behaviors in a NAEP computer-based study

Background: Large-scale survey assessments have been used for decades to monitor what students know and can do. Such assessments aim at providing group-level scores for various populations, with little or no consequence to individual students for their test performance. Students' test-taking behaviors in survey assessments, particularly the level of test-taking effort, and their...

Detecting differential item functioning using generalized logistic regression in the context of large-scale assessments

Background: When studying student performance across different countries or cultures, an important aspect for comparisons is that of score comparability. In other words, it is imperative that the latent variable (i.e., construct of interest) is understood and measured equivalently across all participating groups or countries, if our inferences regarding performance can be regarded...

Differential influences of affective factors and contextual factors on high-proficiency readers and low-proficiency readers: a multilevel analysis of PIRLS data from Hong Kong

Background: This study examined the impact of reading-related affective factors, home environment, and school environment on predicting the likelihood of students being either high-proficiency or low-proficiency readers. Data from 3,875 Hong Kong SAR Grade 4 students participating in an international comparative assessment were analyzed. Methods: Multilevel regression analysis...