Using Naïve Bayes Algorithm to Predict and Classify Alcohol Addiction Severity: A Machine Learning Approach for Public Health Interventions

Francis Balazon

doi:10.48017/dj.v10i1.3131

Authors

Francis Balazon, College of Teacher Education Graduate School, Batangas State University The National Engineering University, Philippines. https://orcid.org/0000-0003-0143-2983


Keywords:

Machine learning, Naïve Bayes algorithm, K-means clustering, Alcoholism, Alcohol dependence

Abstract

Alcohol addiction has increasingly emerged as a significant global health concern, and current methods for predicting and classifying it have clear limitations. The main objective of this study was to improve the understanding of how levels of alcohol dependence can be predicted and classified, using the Naïve Bayes algorithm and K-means clustering. A comprehensive survey collected data from 500 participants, capturing factors such as the frequency of alcohol consumption and the associated negative impacts. The Naïve Bayes algorithm achieved an accuracy of 95%, a precision of 93%, a recall of 97%, and an F1 score of 95%. In parallel, K-means clustering delineated three distinct levels of addiction: less addicted, moderately addicted, and highly addicted. Compared with the existing literature and methodologies, the study's approach shows higher accuracy and a more refined classification system, offering health professionals a useful tool for identifying and addressing alcohol addiction. Possible avenues for future work include integrating other algorithms and investigating other facets of addiction.
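As a reading aid for the four reported scores, the following minimal sketch shows how accuracy, precision, recall, and F1 are conventionally derived from confusion-matrix counts. It is not the study's code: the function names and the example counts are hypothetical, chosen only for illustration. The last lines check that the reported precision (93%) and recall (97%) are consistent with the reported F1 score (95%).

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp/fp/fn/tn are true positives, false positives, false negatives,
    and true negatives. The counts used below are hypothetical.
    """
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total       # share of all predictions that are correct
    precision = tp / (tp + fp)         # share of predicted positives that are correct
    recall = tp / (tp + fn)            # share of actual positives that are found
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def f1_from(precision, recall):
    """F1 score as the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts, for illustration only (not the study's data).
example = classification_metrics(tp=4, fp=1, fn=1, tn=4)
print(example)

# Consistency check on the figures reported in the abstract.
reported_f1 = f1_from(0.93, 0.97)
print(round(reported_f1, 2))  # 0.95
```

Because F1 is the harmonic mean of precision and recall, it always lies between the two; a precision of 93% and a recall of 97% give roughly 0.9496, which rounds to the 95% stated in the abstract.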


Author Biography

Francis Balazon, College of Teacher Education Graduate School, Batangas State University The National Engineering University, Philippines. ORCID: 0000-0003-0143-2983. Email: francis.balazon@g.batstate-u.edu.ph
