Bridging Predictive Performance and Transparency: A Multi-Model Framework for Small-Business Loan Default Segmentation
Abstract
Purpose: This research develops a practical, interpretable modeling framework that bridges predictive performance and transparency for SME loan default segmentation. The authors examine whether a calibrated LightGBM champion model, paired with an Explainable Boosting Machine (EBM) challenger, can maximize predictive accuracy while meeting lenders' transparency and compliance needs. The goal is to classify loans accurately into risk tiers (e.g., low, medium, and high risk of default) while keeping decision-making clear.
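The champion–challenger comparison described above reduces to scoring both models on the same held-out loans and comparing discrimination. Below is a minimal, dependency-free sketch of the ROC AUC computation used for such a comparison; the variable names and toy scores are illustrative and not taken from the paper.

```python
def roc_auc(labels, scores):
    """ROC AUC as the probability that a randomly chosen defaulted loan
    scores higher than a randomly chosen repaid loan (ties count as 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Toy held-out sample: 1 = defaulted, 0 = repaid (illustrative numbers only).
labels = [0, 0, 1, 0, 1, 1]
champion_scores = [0.05, 0.65, 0.70, 0.10, 0.90, 0.60]    # e.g., calibrated LightGBM
challenger_scores = [0.10, 0.55, 0.50, 0.15, 0.85, 0.52]  # e.g., EBM
champion_auc = roc_auc(labels, champion_scores)
challenger_auc = roc_auc(labels, challenger_scores)
```

On this toy sample the champion edges out the challenger, mirroring the near-tie the paper reports (0.969 vs. 0.963); in practice both models would be scored on the same unseen test set.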
Methods: Model selection and hyperparameter tuning were performed via cross-validation on the training set, with the LightGBM and EBM models optimized primarily for ROC AUC. Calibration plots were also monitored to ensure the LightGBM + isotonic pipeline yielded well-calibrated probabilities. For the multi-tier classification, the authors report confusion matrices and tier-wise performance (e.g., what fraction of actual defaults fell into the High versus Medium tier); because the Medium tier is a derived category, they focus primarily on the calibrated probabilities and their alignment with observed outcomes rather than treating it as a separate ground-truth class. All results reported in the next section are on an unseen test set, simulating how the models would perform on new loan applications.
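The isotonic step of the LightGBM + isotonic pipeline can be sketched without dependencies via the pool-adjacent-violators algorithm, which fits a non-decreasing map from raw model scores to default probabilities. The function names below are illustrative; in practice one would use scikit-learn's isotonic calibration rather than this hand-rolled version.

```python
def isotonic_calibrate(scores, labels):
    """Pool-adjacent-violators: fit a non-decreasing step function mapping
    raw scores (sorted ascending) to calibrated default probabilities."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    xs = [scores[i] for i in order]
    ys = [float(labels[i]) for i in order]
    blocks = []  # each block: [mean outcome, weight]
    for y in ys:
        blocks.append([y, 1.0])
        # Merge adjacent blocks while monotonicity is violated.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2 = blocks.pop()
            m1, w1 = blocks.pop()
            w = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / w, w])
    # Expand block means back to one fitted probability per observation.
    fitted = []
    for mean, w in blocks:
        fitted.extend([mean] * int(round(w)))
    return xs, fitted

def predict_calibrated(x, xs, fitted):
    """Step-function lookup: probability at the nearest lower breakpoint."""
    p = fitted[0]
    for xi, fi in zip(xs, fitted):
        if xi <= x:
            p = fi
        else:
            break
    return p
```

The fitted step function is monotone by construction, so calibration preserves the ranking (and hence the ROC AUC) of the underlying LightGBM scores while aligning their magnitudes with observed default frequencies.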
Findings: A calibrated Light Gradient Boosting Machine (LightGBM) achieves the highest performance (ROC AUC 0.969), while an Explainable Boosting Machine (EBM) offers nearly equal accuracy (ROC AUC 0.963) with full transparency. Calibrated LightGBM probabilities are used to assign loans to Low, Medium, and High risk tiers, with observed default rates of 2.5%, 48.8%, and 89.7%, respectively. The results show that modern ensemble methods significantly outperform traditional models and, when paired with inherently interpretable alternatives such as EBM, provide both superior predictive power and regulatory-compliant explainability.
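Tier assignment from calibrated probabilities then reduces to two cut-offs. The thresholds below are hypothetical, since the abstract does not state the published tier boundaries.

```python
# Hypothetical cut-offs on the calibrated default probability; the paper's
# actual boundaries are not given in the abstract.
LOW_MAX = 0.10
MEDIUM_MAX = 0.60

def assign_tier(p_default):
    """Map a calibrated default probability to a risk tier."""
    if p_default < LOW_MAX:
        return "Low"
    if p_default < MEDIUM_MAX:
        return "Medium"
    return "High"
```

In practice the cut-offs would be chosen on the training data so that the resulting tiers show the desired default-rate separation (2.5%, 48.8%, and 89.7% on the paper's test set).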
Implications: Testing the framework on different datasets (e.g., LendingClub data, mortgage portfolios, or non-U.S. SME loans) would help establish its robustness. Broader validation would strengthen confidence that the LightGBM–EBM approach generalizes across credit contexts, or highlight what adjustments are needed (e.g., different hyperparameter tuning or calibration).
Originality: This LightGBM–EBM champion–challenger stack offers a practical blueprint for SME credit risk management on commodity hardware, delivering state-of-the-art accuracy, interpretable insights, and capital-efficient risk segmentation.
-
Page Number: 1-17
-
Published Date: 2025-12-24
-
Keywords
Loan prediction, Gradient boosting, Explainable boosting machine, Probability calibration, Machine learning, Feature selection
-
DOI Number
10.15415/jtmge/2025.162001
-
Authors
Minh Nguyen Hoang, Thota Sai Karthikeya, and Thota Sree Mallikharjuna Rao