Analysis of the Effect of Recursive Feature Elimination on the Performance of Early Diabetes Mellitus Prediction Models at RS PKU Muhammadiyah Bima

Authors

  • Nani Sulistianingsih, Information Systems and Technology Study Program, Universitas Muhammadiyah Mataram
  • Siti Agrippina Alodia Yusuf, Information Systems and Technology Study Program, Universitas Muhammadiyah Mataram
  • Anggreni Anggreni, Information Systems and Technology Study Program, Universitas Muhammadiyah Mataram
  • Firda Niken Sari, Information Systems and Technology Study Program, Universitas Muhammadiyah Mataram
  • M. Pandawan Juniarta, Information Systems and Technology Study Program, Universitas Muhammadiyah Mataram
  • Juryanti Permatasari, Information Systems and Technology Study Program, Universitas Muhammadiyah Mataram

DOI:

https://doi.org/10.35746/jtim.v7i3.774

Keywords:

Diabetes Mellitus, RFE, Feature Selection

Abstract

Early detection of diabetes mellitus is a crucial step in preventing chronic complications and enabling more effective disease management. This study aims to analyze the impact of the Recursive Feature Elimination (RFE) method on the performance of machine learning-based diabetes prediction models at RS PKU Muhammadiyah Bima. A quantitative approach was employed by implementing the CRISP-DM framework, encompassing data selection, preprocessing, transformation, data mining, and model evaluation. Missing values in height and weight variables were imputed using linear regression based on age and gender features. The transformation process included calculating the Body Mass Index (BMI) as a new feature relevant to diabetes risk. Evaluation was carried out on three classification algorithms—Support Vector Machine (SVM), Logistic Regression (LR), and Random Forest (RF)—both before and after the application of RFE. The results showed that all models experienced significant performance improvements following feature selection with RFE, achieving 100% in all evaluation metrics. Insulin and BMI were consistently selected features, underscoring their contribution to diabetes detection. It can be concluded that RFE effectively reduces model complexity without sacrificing accuracy, thereby supporting the efficient implementation of predictive models in clinical settings.
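The BMI transformation and the feature-elimination loop described in the abstract can be illustrated with a minimal, from-scratch sketch. It is not the authors' implementation. The records are synthetic, the helper names (bmi, standardize, fit_logistic, rfe) are hypothetical, and only logistic regression is used as the ranking model. Ranking features by the absolute value of their weights after standardization is also an assumption, since the abstract does not say how features were ranked.

```python
import math

def bmi(weight_kg, height_cm):
    """Body Mass Index: weight in kilograms divided by height in metres squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m ** 2)

def standardize(rows):
    """Scale each column to zero mean and unit variance so weights are comparable."""
    n = len(rows)
    cols = list(zip(*rows))
    means = [sum(c) / n for c in cols]
    # A constant column has zero spread; fall back to 1.0 to avoid dividing by zero.
    stds = [math.sqrt(sum((v - m) ** 2 for v in c) / n) or 1.0
            for c, m in zip(cols, means)]
    return [[(v - m) / s for v, m, s in zip(r, means, stds)] for r in rows]

def fit_logistic(rows, labels, epochs=500, lr=0.1):
    """Batch gradient descent for logistic regression; returns the feature weights."""
    n, d = len(rows), len(rows[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * d, 0.0
        for x, y in zip(rows, labels):
            z = b + sum(wi * xi for wi, xi in zip(w, x))
            err = 1.0 / (1.0 + math.exp(-z)) - y  # predicted probability minus label
            grad_b += err
            for j in range(d):
                grad_w[j] += err * x[j]
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w

def rfe(rows, labels, names, n_keep):
    """Recursive feature elimination: refit, drop the weakest feature, repeat."""
    active = list(range(len(names)))
    while len(active) > n_keep:
        subset = [[r[j] for j in active] for r in rows]
        weights = fit_logistic(subset, labels)
        weakest = min(range(len(active)), key=lambda k: abs(weights[k]))
        del active[weakest]
    return [names[j] for j in active]

# Synthetic records (assumption, not study data):
# (age, weight_kg, height_cm, insulin, has_diabetes)
# Age is distributed identically in both classes, so it carries no class signal.
records = [
    (50, 51.20, 160, 5, 0),
    (30, 53.76, 160, 6, 0),
    (30, 56.32, 160, 7, 0),
    (50, 58.88, 160, 8, 0),
    (50, 74.24, 160, 20, 1),
    (30, 76.80, 160, 21, 1),
    (30, 79.36, 160, 22, 1),
    (50, 81.92, 160, 23, 1),
]

names = ["age", "bmi", "insulin"]
features = standardize([[age, bmi(w, h), ins] for age, w, h, ins, _ in records])
labels = [r[4] for r in records]

selected = rfe(features, labels, names, n_keep=2)
print(selected)
```

In this toy data the uninformative age column is the one eliminated. A real study would also need a held-out test set or cross-validation around the whole selection step, because selecting features on the same data used for evaluation can inflate the reported metrics.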

Published

2025-07-14

Issue

Vol. 7 No. 3 (2025)

Section

Articles

How to Cite

[1] N. Sulistianingsih, S. A. A. Yusuf, A. Anggreni, F. N. Sari, M. P. Juniarta, and J. Permatasari, “Analisis Pengaruh Recursive Feature Elimination Terhadap Kinerja Model Prediksi Dini Diabetes Mellitus di RS PKU Muhammadiyah Bima,” jtim, vol. 7, no. 3, pp. 548–560, Jul. 2025, doi: 10.35746/jtim.v7i3.774.
