Health Insurance Cost Prediction Using Machine Learning Based Regression
Author(s):
Ramesh Prasad Bhatta
Journal:
International Journal of Advances in Engineering and Computer Science
Abstract
Accurate prediction of health insurance costs is essential for effective financial planning and policy formulation in the healthcare sector. With the growing availability of healthcare-related data, machine learning methods have become increasingly useful for estimating medical expenses from individual characteristics. This study compares the performance of four regression-based machine learning models, namely K-Nearest Neighbors (KNN), Ridge Regression, Lasso Regression, and Extreme Gradient Boosting (XGBoost), using the Medical Cost Personal dataset obtained from Kaggle. Model performance was evaluated using the R² score and the Root Mean Square Error (RMSE). XGBoost achieved the highest prediction accuracy, with an R² score of 0.88 and an RMSE of 3217.53. In comparison, KNN, Ridge, and Lasso achieved R² scores of 0.76, 0.74, and 0.72, respectively, with higher RMSE values. The results indicate that XGBoost is more effective at capturing complex relationships within insurance cost data, leading to improved prediction accuracy, and they highlight it as the most effective of the four models for predicting health insurance costs. A model that predicts medical costs accurately can considerably benefit insurance firms in risk assessment and premium computation.
Keywords:
Machine Learning, Regression, RMSE, XGBoost, KNN.
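
For readers unfamiliar with the two evaluation metrics named in the abstract, the following is a minimal sketch of how the R² score and RMSE are conventionally computed. It uses only the Python standard library; the function names and the sample values are illustrative assumptions and are not taken from the study or its dataset.

```python
import math


def rmse(y_true, y_pred):
    """Root Mean Square Error: square root of the mean squared residual."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(y_true))


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical insurance charges, for illustration only (not study data).
actual = [1000.0, 2000.0, 3000.0, 4000.0]
predicted = [1100.0, 1900.0, 3200.0, 3800.0]

print(f"RMSE: {rmse(actual, predicted):.2f}")     # RMSE: 158.11
print(f"R2:   {r2_score(actual, predicted):.2f}")  # R2:   0.98
```

A lower RMSE means smaller typical errors in the units of the target (here, cost), while an R² closer to 1 means the model explains more of the variance in the observed costs.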