Predicting Student Dropout Using Machine Learning Algorithms

Authors

DOI:

https://doi.org/10.58190/imiens.2024.103

Keywords:

Artificial Neural Networks, Decision Tree, Machine Learning, Random Forest, Student Dropout

Abstract

This article comprehensively examines the use of machine learning algorithms to predict and reduce student dropout rates. Such methods, developed to monitor and support student achievement, also aim to raise success rates and to engage students more effectively in the learning process. By predicting student movements and trends, big data analysis and machine learning models contribute to the development of strategic solutions to the problem of school dropout. This study uses a dataset of 4424 student records with 37 features. The dataset is divided into three classes according to the students' dropout status: "Dropout", "Enrolled" and "Graduate". Decision Tree (DT), Random Forest (RF) and Artificial Neural Network (ANN) algorithms, which are frequently used in such educational studies in the literature, were applied to this dataset. According to the results obtained, DT showed moderate performance with an accuracy of 70.1%. The RF algorithm performed better, with an accuracy of 75.5%. The highest accuracy, 77.3%, was achieved by the ANN algorithm. On this dataset, ANN's flexible structure produced better results than the other algorithms, reflecting its ability to classify complex datasets successfully. The article ultimately demonstrates how machine learning-based prediction models can have a significant impact on student achievement and offer a powerful tool for reducing school dropouts.
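The comparison described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the abstract does not name a library, an evaluation protocol, or any hyperparameters, so the use of scikit-learn, 5-fold cross-validation, feature scaling, and every setting below are assumptions. The data is a synthetic stand-in with the same shape as the dataset in the abstract (4424 rows, 37 features, 3 classes), so the printed accuracies are not expected to match the reported 70.1%, 75.5% and 77.3%.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.neural_network import MLPClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for the real data (assumption): same number of rows,
# features and classes as stated in the abstract, but randomly generated.
X, y = make_classification(
    n_samples=4424,
    n_features=37,
    n_informative=10,
    n_classes=3,
    random_state=0,
)

# The three model families named in the abstract. All hyperparameters are
# illustrative assumptions, not values taken from the article.
models = {
    "DT": DecisionTreeClassifier(random_state=0),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
    # Neural networks are sensitive to feature scale, so the ANN is wrapped
    # in a pipeline that standardizes the inputs first.
    "ANN": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32,), max_iter=300, random_state=0),
    ),
}

# Mean accuracy over 5 cross-validation folds for each model.
scores = {}
for name, model in models.items():
    fold_scores = cross_val_score(model, X, y, cv=5, scoring="accuracy")
    scores[name] = fold_scores.mean()
    print(f"{name}: mean accuracy = {scores[name]:.3f}")
```

To try this on the real data, `X` and `y` would be replaced by the feature matrix and the "Dropout" / "Enrolled" / "Graduate" labels loaded from the dataset; the loading step is left out here because the file layout is not described in the abstract.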

Published

2024-09-30

Issue

Section

Research Articles

How to Cite

[1] “Predicting Student Dropout Using Machine Learning Algorithms”, Intell Methods Eng Sci, vol. 3, no. 3, pp. 91–98, Sep. 2024, doi: 10.58190/imiens.2024.103.
