Deteksi Inefisiensi pada Klaim BPJS Kesehatan dengan menggunakan Machine Learning

Authors

  • Hanif Noer Rofiq, Kementerian Keuangan

DOI:

https://doi.org/10.53756/jjkn.v3i1.134

Keywords:

Machine Learning, Random Forest, Tomek Links, Inefficiency, BPJS Kesehatan

Abstract

Inefficiency in claims submitted by health facilities to BPJS Kesehatan is a problem that needs attention. As the number of claims grows, traditional methods such as manual verification cannot process the volume of data quickly enough. At the same time, claims must be settled faster with a limited number of verifiers. One option is to use Machine Learning to rapidly detect potentially inefficient transactions. This study compares several Machine Learning algorithms: Random Forest, Gradient Boosting Classifier, Decision Tree, Support Vector Machine, Naive Bayes, CatBoost, and XGBoost. Because the dataset is imbalanced, oversampling and undersampling methods are also applied. The best results were obtained with the Random Forest + Tomek Links model, which produced an F1 score of 19.53. The five most influential variables were the location of the health facility, the participant's age, the first-level health facility diagnosis, the participant's primary diagnosis, and the health facility type.
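
To make the two central ideas in the abstract concrete, the following is a minimal sketch of Tomek Links undersampling and of the F1 score, written in plain Python on made-up toy data. It is not the study's implementation: the function names, the toy points, and the brute-force neighbour search are all assumptions for illustration, and only binary labels are handled. A Tomek link is a pair of samples from different classes that are each other's nearest neighbour; removing the majority-class member of each pair cleans the class boundary before a classifier such as Random Forest is trained.

```python
from math import dist


def tomek_links(X, y):
    """Return index pairs (i, j), i < j, that are mutual nearest
    neighbours and carry different labels (hypothetical helper)."""
    n = len(X)
    # Brute-force nearest neighbour of every point; fine for toy data only.
    nn = [min((j for j in range(n) if j != i), key=lambda j: dist(X[i], X[j]))
          for i in range(n)]
    return [(i, j) for i, j in enumerate(nn)
            if i < j and nn[j] == i and y[i] != y[j]]


def remove_majority_in_links(X, y, majority_label):
    """Drop the majority-class member of each Tomek link (binary labels assumed)."""
    drop = set()
    for i, j in tomek_links(X, y):
        drop.add(i if y[i] == majority_label else j)
    keep = [k for k in range(len(X)) if k not in drop]
    return [X[k] for k in keep], [y[k] for k in keep]


def f1_score(y_true, y_pred, positive=1):
    """Harmonic mean of precision and recall for the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy data (invented): label 1 stands for a rare "inefficient" claim.
X = [(0.0, 0.0), (0.1, 0.0), (1.0, 1.0), (1.05, 1.0), (3.0, 3.0), (3.2, 3.0)]
y = [0, 0, 0, 1, 0, 0]

links = tomek_links(X, y)  # points 2 and 3 form the only cross-class mutual pair
X_res, y_res = remove_majority_in_links(X, y, majority_label=0)
print(links, y_res)
```

In practice this step would more likely rely on existing libraries, for example `TomekLinks` from imbalanced-learn and `RandomForestClassifier` from scikit-learn. The F1 score is used rather than accuracy because, on an imbalanced dataset, a model that labels every claim as efficient can still reach high accuracy while detecting nothing.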

Published

30-06-2023

How to Cite

Rofiq, H. N. (2023). Deteksi Inefisiensi pada Klaim BPJS Kesehatan dengan menggunakan Machine Learning. Jurnal Jaminan Kesehatan Nasional, 3(1). https://doi.org/10.53756/jjkn.v3i1.134