Posts by Collection



A Topical and Methodological Systematic Review of Meta-Analyses Published in the Educational Measurement Literature

Published in Educational Measurement: Issues and Practice, 2020

This systematic review investigated the topics studied and reporting practices of published meta-analyses in educational measurement. Our findings indicated that meta-analysis is not a highly utilized methodological tool in educational measurement; on average, less than one meta-analysis has been published per year over the past 30 years (28 meta-analyses were published between 1986 and 2016). Within the field, researchers have utilized meta-analysis to study three primary subject areas: test format effects, test accommodations, and predictive validity of operational testing programs. In regard to reporting practices, authors often failed to provide descriptive details of both their search strategy and sample characteristics, limiting reproducibility and generalizability of findings, respectively. Furthermore, diagnostic analyses of outliers, publication bias, and statistical power were not provided for the majority of studies, calling into question the validity of inferences made from the meta-analyses sampled. The lack of transparent and replicable practices of meta-analyses in educational measurement is a concern for generating credible research syntheses that can assist the field in improving evidence-based practices. Recommendations are provided for improving training and editorial standards of meta-analytic research.

Recommended citation: Rios, J. A., Ihlenfeldt, S. D., Dosedel, M., & Riegelman, A. (2020). A topical and methodological systematic review of meta‐analyses published in the educational measurement literature. Educational Measurement: Issues and Practice, 39(1), 71–81.

Are Accommodations for English Learners on State Accountability Assessments Evidence-Based? A Multistudy Systematic Review and Meta-Analysis

Published in Educational Measurement: Issues and Practice, 2020

The objectives of this two-part study were to: (a) investigate English learner (EL) accommodation practices on state accountability assessments of reading/English language arts and mathematics in grades 3–8, and (b) conduct a meta-analysis of EL accommodation effectiveness on improving test performance. Across all distinct testing programs, we found that at least one EL test accommodation was provided for both test content areas. The most popular accommodations provided were supplying students with word-to-word dual language dictionaries, reading aloud test directions and items in English, and allowing flexible time/scheduling. However, we found minimal evidence that testing programs provide practitioners with recommendations on how to assign relevant accommodations to EL test takers’ English proficiency level. To evaluate whether accommodations used in practice are supported with evidence of their effectiveness, a meta-analysis was conducted. On average, across 26 studies and 95 effect sizes (N = 11,069), accommodations improved test performance by .16 standard deviations. Both test content and sampling design were found to moderate accommodation effectiveness; however, none of the accommodations investigated were found to have intervention effects that were statistically different from zero. Overall, these results suggest that currently employed EL test accommodations lack evidence of their effectiveness.

Recommended citation: Rios, J. A., Ihlenfeldt, S. D., & Chavez, C. (2020). Are accommodations for English learners on state accountability assessments evidence-based? A multi-study systematic review and meta-analysis. Educational Measurement: Issues and Practice, 39(4), 65–75.

State Assessment Score Reporting Practices for English Learner Parents

Published in Educational Measurement: Issues and Practice, 2021

This study sought to investigate how states communicate results for academic achievement and English language proficiency (ELP) assessments to parents who are English learners (EL). This objective was addressed by evaluating: (a) whether score reports and interpretive guides for state academic achievement and ELP assessments in each state were translated for EL parents; and (b) if so, whether recommended score reporting guidelines were followed in practice. Results demonstrated that for state achievement tests, 29 states had translated score reports and 28 had translated interpretive guides. Nearly every state translated these materials for their ELP assessments in a wide variety of languages. Across ELP and state achievement assessments, most states were found to limit statistical jargon, utilize figures/graphics to communicate test results, and include follow-up information for parents, which represent improvements observed in prior reviews. However, states rarely provided personalization, statements on intended score use, a student’s score history, or a direct link to their interpretive guide in their score reports. Recommendations on making score reports and interpretive guides more accessible and interpretable for EL parents are discussed.

Recommended citation: Rios, J. A., & Ihlenfeldt, S. D. (2021). State assessment score reporting practices for English learner parents. Educational Measurement: Issues and Practice, 40(3), 31–41.

To What Degree Does Rapid Guessing Distort Aggregated Test Scores? A Meta-analytic Investigation

Published in Educational Assessment, 2022

The present meta-analysis sought to quantify the average degree of aggregated test score distortion due to rapid guessing (RG). Included studies group-administered a low-stakes cognitive assessment, identified RG via response times, and reported the rate of examinees engaging in RG, the percentage of RG responses observed, and/or the degree of score distortion in aggregated test scores due to RG. The final sample consisted of 25 studies and 39 independent samples comprised of 443,264 unique examinees. Results demonstrated that an average of 28.3% of examinees engaged in RG (21% were deemed to engage in RG on a nonnegligible number of items) and 6.89% of item responses were classified as rapid guesses. Across 100 effect sizes, RG was found to negatively distort aggregated test scores by an average of 0.13 standard deviations; however, this relationship was moderated by both test content area and filtering procedure.

Recommended citation: Rios, J. A., Deng, J., & Ihlenfeldt, S. D. (2022). To what degree does rapid guessing distort aggregated test scores? A meta-analytic investigation. Educational Assessment.

A meta-analysis on the predictive validity of English language proficiency assessments for college admissions

Published in Language Testing, 2022

For institutions where English is the primary language of instruction, English assessments for admissions such as the Test of English as a Foreign Language (TOEFL) and International English Language Testing System (IELTS) give admissions decision-makers a sense of a student’s skills in academic English. Despite this explicit purpose, these exams have also been used to predict academic success. In this study, we meta-analytically synthesized 132 effect sizes from 32 studies containing validity evidence of academic English assessments to determine whether different assessments (a) predicted academic success (as measured by grade point average [GPA]) and (b) did so comparably. Overall, assessments had a weak positive correlation with academic achievement (r = .231, p < .001). Additionally, no significant differences were found in the predictive power of the IELTS and TOEFL exams. No moderators were significant, indicating that these findings held true across school type, school level, and publication type. Although significant, the overall correlation was low; thus, practitioners are cautioned against using standardized English-language proficiency test scores in isolation in lieu of a holistic application review during the admissions process.

Recommended citation: Ihlenfeldt, S. D., & Rios, J. A. (2022). A meta-analysis on the predictive validity of English language proficiency assessments for college admissions. Language Testing. Advance online publication.
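
To put the pooled correlation from the abstract above (r = .231) in perspective, the short sketch below squares it to get the proportion of variance in GPA that is shared with test scores. This is only an illustrative calculation and not part of the study's analysis; the function name `shared_variance` is hypothetical, and reading r squared as "variance shared" assumes a simple linear relationship.

```python
def shared_variance(r: float) -> float:
    """Return r squared, the proportion of variance shared under a linear relationship."""
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return r ** 2


# Pooled correlation as reported in the abstract (r = .231).
pooled_r = 0.231
print(f"r = {pooled_r:.3f} -> r^2 = {shared_variance(pooled_r):.3f}")
```

Under these assumptions, roughly 5% of the variance in GPA is shared with proficiency test scores, which is consistent with the caution against relying on those scores in isolation.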


Cultivating Research Interests in Educational Measurement


Another speaker and I discussed strategies for developing research interests in the field of educational measurement. The talk had three parts: (1) discussing our own interests, (2) presenting strategies, and (3) answering audience questions.

State Assessment Score Reporting Practices for English Learner Parents


We investigated nationwide whether score reporting for state accountability and ELP assessments (e.g., WIDA) was accessible to parents of English learners (i.e., translated and following best score reporting practices). Results indicate differences between assessment types, as well as key trends across both. Implications for practice are discussed.

Evaluating Rapid Guessing Response Patterns on Multistage Assessment: A Simulation Study


Noneffortful rapid guessing (RG) on multistage assessment undermines the validity of inferences made from ability estimates. Multiple response patterns were simulated for simulees of varying ability to explore estimation accuracy based on the location of RG. An effort-moderated IRT model is explored as a potential solution. Implications are discussed.

Investigating How Emotional Affect Moderates the Relationship Between Feedback Type and Uptake


This study explores the relationship between feedback presentation, emotional affect, and feedback uptake. Participants were randomly assigned to one of three feedback conditions after taking a GRE practice test. Results indicate that both affect (positive and negative) and feedback condition are significantly associated with reported likelihood to change studying behaviors.

Individual Score Report Designs for Alternate and General Assessments: What Are Other States Doing?


State education agencies (SEAs) are required by Federal Peer Review Critical Element 6.4 to provide an individual score report (ISR) for both the general assessment and the alternate assessment (AA-ISR), and they have been doing so for many years. General assessment ISRs and AA-ISRs have come a long way since their inception, but advances in research and technology, along with ever-changing community needs, mean that these reports must continually evolve to improve critical communication between SEAs and the families they serve.