Abstract

The present work evaluates the effectiveness of various supervised machine learning (ML) methods for attrition modeling using real-world employee data, which include self-reported, HRIS, and performance-related features. Seven algorithms, namely tree-based methods (CART, random forest), regularized regression (elastic net, LASSO, ridge regression), and a hybrid method incorporating decision trees and regularized regression (XGBoost), are compared to the traditional logistic regression model across two sample sizes (500, 1000) and seven months (April through October). The two ensemble methods, XGBoost and random forest, demonstrated the best classification performance and performed equally well at both sample sizes. These methods relied on information from all three data sources to make predictions, with tenure and pay being the most important features. By comparing predictive models with traditional methods, the study seeks to inform the use of ML in attrition modeling and to provide a foundation for future explanatory research. It is hoped that these results will help practitioners with model selection and will guide additional research in this growing area of inquiry.
