Effective Learning During COVID-19: Multilevel Covariates Matching and Propensity Score Matching

Annals of Data Science
https://doi.org/10.1007/s40745-022-00392-x

Siying Guo¹ · Jianxuan Liu² · Qiu Wang³

Received: 13 January 2021 / Revised: 6 March 2022 / Accepted: 12 March 2022
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2022

Abstract

In large-scale observational data with a hierarchical structure, both clusters and interventions often have more than two levels. Popular methods in the binary treatment literature do not naturally extend to the hierarchical multilevel treatment case. For example, most K-12 schools and universities moved to an unprecedented hybrid learning model during the COVID-19 pandemic, with learning modes that include hybrid and fully remote learning, while students are clustered within classes and school regions. It is challenging to evaluate the effect of multilevel treatments on learning outcomes in hierarchically structured data. In this paper, we study a covariate matching method and develop a generalized propensity score matching method to reduce bias in estimating the intervention effect. We also propose simple algorithms to assess covariate balance for each approach. We examine the finite-sample performance of the methods via simulation studies and apply the proposed methods to analyze the effectiveness of learning modes during the COVID-19 pandemic.
Keywords COVID-19 · Generalized propensity score · Matching · Multilevel hybrid learning · Potential outcome

Corresponding author: Jianxuan Liu

¹ School of Criminal Justice and Public Administration, Kean University, Union, USA
² Department of Mathematics, Syracuse University, Syracuse, USA
³ School of Education, Syracuse University, Syracuse, USA

1 Introduction

The global impact of COVID-19 [1] has led to social and economic crises [2], further widening inequalities and exacerbating global poverty. In response to the COVID-19 pandemic, local, state, and federal agencies have implemented social distancing or lockdown measures designed to slow the spread of the disease [3]. These measures have changed daily routines and have profoundly affected the academic learning, as well as the psychological and physical health, of K-12 and college students. Students with special needs, minority students, and students from low-income families experienced additional negative impacts [4]. The UN 2020 report [5] showed that, since the outbreak of COVID-19 began, more than 1.52 billion children and youth, about 87% of the world's student population, were unable to learn in traditional classroom settings. Most K-12 schools and colleges have moved to online or new hybrid learning modes to maintain social distancing and slow the spread of the disease. The pandemic has been an extraordinarily challenging time for teachers and students, especially during the transition to new teaching and learning modes. However, it has also created an opportunity to rethink how we educate and to improve pedagogies that help students succeed. In the era of big data, numerous studies have been conducted to extract knowledge from large-scale data; see, for example, [6–10], among others.
In terms of educational policy, it is fundamentally important to evaluate the effectiveness of the unprecedented learning modes during the pandemic across a wide range of school clusters and student backgrounds. To understand these complexities, we analyze hierarchically structured data from observational studies. In large-scale observational data with a hierarchical structure, both clusters and interventions often have more than two levels [11]. The larger units are clusters [12], groups [13, 14], communities [15], or schools [16–19]. This is referred to as the cluster-randomized trial (CRT) design [12, 14]. A CRT design ensures that each cluster consists of multiple comparable individuals [20], so that their baseline characteristics do not confound the corresponding outcomes. For example, in educational studies using a cluster design, cluster sizes varied from 5 in ECLS (Early Childhood Longitudinal Study) to 60 in LSAY (Longitudinal Study of American Youth), with a mean of about 13 [21]. When clusters are assigned to educational interventions, imbalance in baseline covariates among groups often occurs [22], which results in selection bias. Selection bias is ubiquitous in observational studies when the “golden rule” of randomization fails [23], and in its presence a direct comparison of groups will not necessarily yield a causal estimate of the intervention effect [24]. However, it is not always feasible to conduct randomized trials because of cost and, more importantly, ethical concerns. When large-scale hierarchically structured data from observational studies are employed, it is crucial to remove selection bias [16, 25] so that the data can be viewed as if they were from randomized studies. As in all observational studies, the baseline characteristics of individuals in each hierarchical cluster are not guaranteed to be comparable and may therefore confound the outcome.
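To make the link between non-random assignment, covariate imbalance, and selection bias concrete, the following Python sketch simulates students nested in clusters and compares a raw difference in mean outcomes under covariate-dependent assignment and under randomized assignment. It is only an illustration under stated assumptions: the function names, the logistic assignment rule, the coefficients, and the effect size of 2.0 are hypothetical and are not taken from the study; the default cluster size of 13 merely echoes the mean cluster size cited above. The standardized mean difference is used here as one common balance diagnostic, not necessarily the one the authors propose.

```python
import math
import random
import statistics


def simulate(n_clusters=400, cluster_size=13, true_effect=2.0,
             randomized=False, seed=0):
    """Return (x, t, y) rows for students nested in clusters.

    x is a cluster-level baseline covariate, t the cluster's treatment
    indicator, and y a student outcome. All values are hypothetical.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n_clusters):
        x = rng.gauss(0.0, 1.0)
        if randomized:
            p_treat = 0.5  # assignment ignores the covariate
        else:
            # Assumed selection rule: larger x makes treatment more likely.
            p_treat = 1.0 / (1.0 + math.exp(-1.5 * x))
        t = 1 if rng.random() < p_treat else 0
        for _ in range(cluster_size):
            # x also drives the outcome, so it confounds t and y.
            y = 3.0 * x + true_effect * t + rng.gauss(0.0, 1.0)
            rows.append((x, t, y))
    return rows


def standardized_mean_difference(rows):
    """Difference in covariate means divided by the pooled standard deviation."""
    x1 = [x for x, t, _ in rows if t == 1]
    x0 = [x for x, t, _ in rows if t == 0]
    pooled_sd = math.sqrt((statistics.variance(x1) + statistics.variance(x0)) / 2.0)
    return (statistics.mean(x1) - statistics.mean(x0)) / pooled_sd


def naive_difference(rows):
    """Raw difference in mean outcomes between treated and untreated students."""
    y1 = [y for _, t, y in rows if t == 1]
    y0 = [y for _, t, y in rows if t == 0]
    return statistics.mean(y1) - statistics.mean(y0)


if __name__ == "__main__":
    for label, flag in (("observational", False), ("randomized", True)):
        data = simulate(randomized=flag)
        print(label,
              "SMD:", round(standardized_mean_difference(data), 2),
              "naive difference:", round(naive_difference(data), 2))
```

In this toy setup the raw difference overstates the assumed effect when assignment depends on the covariate, and stays close to it under randomization, which is the gap that matching and propensity score methods aim to close.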
To establish the causal effect of the intervention, a large body of literature has been developed to evaluate the average causal effect consistently. The methods can be classified as matching [26–29], stratification [26, 30], covariance adjustment [26], inverse probability weighting [31–34], and augmented inverse probability weighting, which provides double protection against model misspecification [35–38]. To estimate a causal effect using observational data, it is preferable to mimic a randomized experiment as closely as possible by balancing the covariates among the different treatment groups. It is natural to match on covariates so that the observed samples can be viewed as if they were from a randomized experiment. However, matched covariates are not always available: when there are many clusters and each cluster is small, the hierarchically structured population is sparse. Reference [26] made an important advance by introducing the propensity score to circumvent this problem. The propensity score is defined as the conditional probability of receiving a treatment given the observed covariates. It is also a balancing score in the sense that, conditional on the propensity score, the distributions of the measured covariates are the same across treatment groups. To evaluate the actual effects of instructional or educational interventions, a growing number of educational studies have used the propensity score to reduce the bias known to plague observational studies and to improve the balance between treatment and comparison groups [39–43]. However, these studies focused on either binary treatment options or several treatments in non-hierarchically structured populations. Further, binary (...truncated)