Here’s a selection of my data science and analytics projects. Each one represents a real-world problem, a creative solution, and a story told through data. Click “View on GitHub” to explore the full code, datasets, and documentation.
Problem Statement:
In this project, I built a machine learning model to predict which employees are likely to leave the company (Attrition = 1). Early prediction of attrition helps HR teams implement proactive retention strategies and reduce the risk of losing key talent.
Objective:
To analyze patterns in employee data and develop a predictive model that identifies employees at high risk of attrition using advanced machine learning techniques.
Tools & Techniques Used:
• Python (Pandas, NumPy, Seaborn, Scikit-learn, XGBoost)
• Data Cleaning & Preprocessing
• Exploratory Data Analysis (EDA)
• Feature Engineering
• Class Rebalancing: Oversampling (SMOTE) & Undersampling
• Classification Models: Decision Tree, Random Forest, AdaBoost, Gradient Boosting, and XGBoost
• Model Tuning & Evaluation using F1 Score and Confusion Matrix
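The rebalance, train, and evaluate steps listed above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's actual code: the data is synthetic (generated with make_classification), random oversampling stands in for SMOTE, and scikit-learn's GradientBoostingClassifier stands in for the tuned XGBoost model.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import confusion_matrix, f1_score
from sklearn.model_selection import train_test_split
from sklearn.utils import resample

# Synthetic stand-in for the employee data: roughly 16% positives (Attrition = 1).
X, y = make_classification(
    n_samples=1470, n_features=10, weights=[0.84, 0.16], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Rebalance the training split only, so the test split keeps its original class mix.
# Random oversampling is a simple stand-in for SMOTE here (an assumption).
majority = X_train[y_train == 0]
minority = X_train[y_train == 1]
minority_up = resample(
    minority, replace=True, n_samples=len(majority), random_state=42
)
X_bal = np.vstack([majority, minority_up])
y_bal = np.concatenate(
    [np.zeros(len(majority), dtype=int), np.ones(len(minority_up), dtype=int)]
)

# Train one of the listed model families on the balanced data.
model = GradientBoostingClassifier(random_state=42)
model.fit(X_bal, y_bal)
y_pred = model.predict(X_test)

# Rows are actual classes (0, 1); columns are predicted classes (0, 1).
cm = confusion_matrix(y_test, y_pred)
f1 = f1_score(y_test, y_pred)  # F1 for the positive class (Attrition = 1)
print(cm)
print(f"F1 on the attrition class: {f1:.3f}")
```

Rebalancing only the training split is a deliberate choice in this sketch: evaluating on resampled data would overstate how well the model finds the minority class.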
Best Model: Tuned XGBoost Classifier
• Accuracy: 94.56%
• Precision: 90.08%
• Recall: 77.12%
• F1 Score: 83.10%
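As a quick consistency check, the F1 score is the harmonic mean of precision and recall, so it can be recomputed from the two figures reported above. The helper name f1_from is just an illustrative choice.

```python
def f1_from(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


precision, recall = 0.9008, 0.7712  # values reported above
f1 = f1_from(precision, recall)
print(f"F1 = {f1:.2%}")  # F1 = 83.10%
```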
Insights Discovered:
• Employees who worked overtime, traveled frequently for business, or earned a lower monthly income showed higher attrition.
• Attrition was higher in the Sales and R&D departments.
• Environment satisfaction and job role had a visible influence on attrition.
• About 16.1% of the workforce (237 of 1,470 employees) were at risk of leaving.
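The 16.1% figure also helps explain why F1 is a more informative metric here than accuracy. The short calculation below assumes the 237 employees form the positive class (Attrition = 1) and shows the accuracy a trivial model would reach by predicting that nobody leaves.

```python
total_employees = 1470
at_risk = 237  # assumed to be the positive class (Attrition = 1)

attrition_rate = at_risk / total_employees
# A model that always predicts "stays" is right for every non-leaver.
baseline_accuracy = 1 - attrition_rate

print(f"Attrition rate: {attrition_rate:.1%}")                  # Attrition rate: 16.1%
print(f"Always-stays baseline accuracy: {baseline_accuracy:.1%}")  # 83.9%
```

Such a baseline scores well on accuracy yet has zero recall for leavers, which is why the evaluation leans on F1 and the confusion matrix.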
Conclusion:
The tuned XGBoost model offered the most balanced and accurate predictions of employee attrition. With this tool, organizations can take data-driven action to improve retention and optimize workforce management.
Hi, I’m Godwin — Welcome to My World of Data Science & Authored Works
Step inside a space where data meets creativity, and every project tells a story worth hearing.
© 2025 Godwin Data Story • All Rights Reserved