Sinkhole Susceptibility Analysis Using Machine Learning for West Central Florida

Olanrewaju Muili


In this study, I compared the predictive capability of five machine learning methods: logistic regression (LR), multilayer perceptron neural network (MLP), support vector machine (SVM), k-nearest neighbor (KNN), and random forest (RF). The best-performing model was then used to construct a sinkhole susceptibility map (SSM) for west central Florida. Nine layers were extracted from the collected data and employed as conditional factors for the correlation analysis. Factors that contributed negligibly to prediction quality, as measured by the information gain ratio technique, were discarded. Validation of the machine learning models, performed using several statistical indices and receiver operating characteristic (ROC) curves, showed that the RF model had the highest predictive performance, so it was used for the subsequent analysis. The SSM was divided into two levels, high susceptibility (H) and low susceptibility (L), and the result was verified using the root mean squared error (RMSE) and a confusion matrix (CM).
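
To make the factor-screening step concrete, the following is a minimal sketch of how an information gain ratio could be computed for a discretized factor. It is an illustration under stated assumptions, not the implementation used in this study: the function names (entropy, gain_ratio), the factor names, the toy data, and the 0.01 cutoff are all hypothetical. The gain ratio is taken here in its usual form, the information gain of the factor divided by the entropy of the factor itself.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * log2(n / total) for n in Counter(values).values())


def gain_ratio(factor, labels):
    """Information gain ratio of a discretized factor with respect to class labels."""
    total = len(labels)
    # Group the class labels by the factor class each sample falls into.
    groups = {}
    for value, label in zip(factor, labels):
        groups.setdefault(value, []).append(label)
    # Expected label entropy remaining after splitting on the factor.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(factor)  # intrinsic information of the split
    return info_gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data: 1 = sinkhole present, 0 = sinkhole absent.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
informative = ["thin", "thin", "thin", "thin", "thick", "thick", "thick", "thick"]
uninformative = ["a", "b", "a", "b", "a", "b", "a", "b"]

# Factor names are placeholders, not the layers used in the study.
ratios = {
    "overburden_thickness": gain_ratio(informative, labels),
    "unrelated_factor": gain_ratio(uninformative, labels),
}

# Keep only factors whose gain ratio exceeds an assumed cutoff.
kept = [name for name, ratio in ratios.items() if ratio > 0.01]
print(ratios, kept)
```

In this toy case the first factor separates the two classes perfectly and the second carries no information about them, so only the first survives the screening. With real data, continuous layers would first need to be binned into classes, and the cutoff would be a modeling choice.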