Maximizing Interpretability and Cost-Effectiveness of Surgical Site Infection (SSI) Predictive Models Using Feature-Specific Regularized Logistic Regression on Preoperative Temporal Data
Author: Kocbek, Primoz; Fijacko, Nino; Soguero-Ruiz, Cristina; Mikalsen, Karl Øyvind; Maver, Uros; Brzan, Petra Povalej; Stozer, Andraz; Jenssen, Robert; Skrøvseth, Stein Olav; Stiglic, Gregor
This study describes a novel approach to the surgical site infection (SSI) classification problem. Feature engineering has traditionally been one of the most important steps in solving complex classification problems, especially those involving temporal data. The approach is based on an abstraction of temporal data recorded in three temporal windows. Penalized logistic regression with maximum likelihood L1-norm (lasso) regularization was used to predict the occurrence of SSI from the patient blood test results available up to the day of surgery. Prior knowledge about the predictors (blood tests) was integrated into the modelling through penalty factors that depend on blood test prices and through an early stopping parameter that limits the maximum number of features selected for predictive modelling. Finally, solutions resulting in higher interpretability and cost-effectiveness were demonstrated. Using repeated holdout cross-validation, the baseline C-reactive protein (CRP) classifier achieved a mean AUC of 0.801, whereas our best full lasso model achieved a mean AUC of 0.956. The best testing results were achieved by the full lasso model with the maximum number of features limited to 20, with an AUC of 0.967. The presented models showed the potential not only to support domain experts in their decision making but also to prove invaluable for improving the prediction of SSI occurrence, which may even help in setting new guidelines in the field of preoperative SSI prevention and surveillance.
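The combination of price-dependent penalty factors and a cap on the number of selected features can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the penalty takes the weighted form lambda * sum_j p_j * |beta_j|, fits it with plain proximal gradient descent, and applies the cap by walking a decreasing lambda path until too many features become nonzero. The function names, the toy data, and the penalty factor values are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(value, threshold):
    """Proximal operator of the L1 norm: shrink value toward zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_weighted_lasso_logistic(X, y, penalty_factors, lam, step=0.5, iters=500):
    """Minimise mean log-loss + lam * sum_j penalty_factors[j] * |beta_j|.

    The intercept is left unpenalised. Hypothetical helper, not from the paper.
    """
    n, d = len(X), len(X[0])
    beta, intercept = [0.0] * d, 0.0
    for _ in range(iters):
        grad, grad0 = [0.0] * d, 0.0
        for row, target in zip(X, y):
            err = sigmoid(intercept + sum(b * x for b, x in zip(beta, row))) - target
            grad0 += err
            for j in range(d):
                grad[j] += err * row[j]
        intercept -= step * grad0 / n
        # Gradient step followed by a feature-specific shrinkage.
        beta = [
            soft_threshold(beta[j] - step * grad[j] / n, step * lam * penalty_factors[j])
            for j in range(d)
        ]
    return intercept, beta


def fit_with_feature_cap(X, y, penalty_factors, lambdas, max_features):
    """Walk lambda from strong to weak; keep the last model within the cap.

    Returns None if even the strongest penalty selects too many features.
    """
    best = None
    for lam in sorted(lambdas, reverse=True):
        intercept, beta = fit_weighted_lasso_logistic(X, y, penalty_factors, lam)
        selected = [j for j, b in enumerate(beta) if b != 0.0]
        if len(selected) > max_features:
            break  # early stop: the cap on selected features is exceeded
        best = (lam, intercept, beta, selected)
    return best


# Toy data (assumed): feature 0 is a cheap test, feature 1 is an expensive test
# carrying identical information, feature 2 is uninformative alternating noise.
grid = [(i - 19.5) / 20 for i in range(40)]
X = [[x, x, 1.0 if i % 2 == 0 else -1.0] for i, x in enumerate(grid)]
y = [1 if x > 0 else 0 for x in grid]

penalty_factors = [1.0, 10.0, 1.0]  # hypothetical price-derived weights
result = fit_with_feature_cap(
    X, y, penalty_factors, lambdas=[1.0, 0.3, 0.1, 0.05], max_features=1
)
lam, intercept, beta, selected = result
print("lambda:", lam, "selected features:", selected)
```

Because the two informative columns are identical, the larger penalty factor is the only thing that separates them, so the cheaper test is the one that enters the model. Scaling the weights by actual test prices would follow the same pattern.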
Source at https://doi.org/10.1155/2019/2059851.