Maximizing Learning Outcomes: A Comparative Analysis of IRT- and CTT-Based Differentiated Learning Design

Dwi Rismi Ocy, Awaluddin Tjalla, Soeprijanto Soeprijanto

Abstract


Challenges in addressing diverse student abilities often hinder effective learning, particularly in complex topics such as linear equations and inequalities. This research compared the effectiveness of Item Response Theory (IRT)-based and Classical Test Theory (CTT)-based differentiated learning designs in improving student performance in linear equations and inequalities. Conducted in two secondary schools, the study involved 126 students: 61 in the IRT group and 65 in the CTT group. A quasi-experimental pretest-posttest control-group design was employed to assess learning progress. Both interventions led to significant improvements in student performance. However, the IRT-based approach, which grouped students by their estimated ability levels and tailored tasks to their proficiency, produced a significantly higher average posttest score and a very large effect size, whereas the CTT-based approach showed improvement with a smaller effect size. The findings suggest that IRT offers a more precise and effective basis for differentiating instruction, leading to better learning outcomes in complex topics such as linear equations and inequalities, and underscore its potential to enhance educational practice.
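The abstract summarizes the comparison without showing the underlying scoring models. As a rough illustration of the distinction, not taken from the study's instrument, the sketch below contrasts a CTT proficiency estimate (the proportion of items answered correctly) with an IRT ability estimate obtained by maximum likelihood under a two-parameter logistic (2PL) model. The six-item quiz, the item parameters, and the choice of the 2PL model are illustrative assumptions.

```python
# Minimal sketch (not the study's instrument): contrast a CTT proficiency
# estimate (proportion correct) with an IRT ability estimate under a
# hypothetical two-parameter logistic (2PL) model.
import numpy as np
from scipy.optimize import minimize_scalar

# Hypothetical 2PL parameters for a six-item linear-equations quiz:
# a = item discrimination, b = item difficulty (logit scale).
a = np.array([1.2, 0.8, 1.5, 1.0, 0.9, 1.3])
b = np.array([-1.0, -0.5, 0.0, 0.5, 1.0, 1.5])

def p_correct(theta):
    """2PL item response function: P(correct | theta) for each item."""
    return 1.0 / (1.0 + np.exp(-a * (theta - b)))

def ctt_score(responses):
    """CTT-style proficiency estimate: proportion of items answered correctly."""
    return float(np.mean(responses))

def irt_ability(responses):
    """Maximum-likelihood ability (theta) estimate under the 2PL model."""
    x = np.asarray(responses, dtype=float)
    def neg_log_lik(theta):
        p = p_correct(theta)
        return -np.sum(x * np.log(p) + (1 - x) * np.log(1 - p))
    return minimize_scalar(neg_log_lik, bounds=(-4, 4), method="bounded").x

# Two students with identical raw scores but different response patterns:
# their CTT estimates coincide, while their 2PL ability estimates differ
# because the items carry different difficulties and discriminations.
students = {
    "correct on easier items": [1, 1, 1, 0, 0, 0],
    "correct on harder items": [0, 0, 0, 1, 1, 1],
}
for label, resp in students.items():
    print(f"{label}: CTT = {ctt_score(resp):.2f}, IRT theta = {irt_ability(resp):+.2f}")
```

On such equal-raw-score patterns the CTT estimates are identical while the IRT estimates can differ, which illustrates why an IRT-based grouping can separate students that a raw-score grouping would treat as equivalent.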


Keywords


Differentiated Instruction, Pedagogical Strategies, Student-Centered Learning, Item Response Theory (IRT), Classical Test Theory (CTT)


DOI: https://doi.org/10.17509/pdgia.v22i3.77064

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.