Projects

Turning Data Into Insight, and Insight Into Action

Here’s a selection of my data science and analytics projects. Each one represents a real-world problem, a creative solution, and a story told through data. Click "View on GitHub" to explore the full code, datasets, and documentation.

Project Title: Employee Attrition Prediction

Problem Statement:
In this project, I built a machine learning model to predict which employees are likely to leave the company (Attrition = 1). Early prediction of attrition helps HR teams implement proactive retention strategies and reduce the risk of losing key talent.

Objective:
To analyze patterns in employee data and develop a predictive model that identifies employees at high risk of attrition using advanced machine learning techniques.

Tools & Techniques Used:
• Python (Pandas, NumPy, Seaborn, Scikit-learn, XGBoost)
• Data Cleaning & Preprocessing
• Exploratory Data Analysis (EDA)
• Feature Engineering
• Class Imbalance Handling: Oversampling (SMOTE) & Undersampling
• Classification Models: Decision Tree, Random Forest, AdaBoost, Gradient Boosting, and XGBoost
• Model Tuning & Evaluation using F1 Score and Confusion Matrix
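
To make the oversampling step above more concrete, here is a minimal sketch of the idea behind SMOTE: each synthetic row is placed at a random point between a minority-class row and one of its nearest minority-class neighbors. This is a simplified illustration, not the project's code. The function name `smote_sketch`, its parameters, and the sample rows are hypothetical, and it assumes all features are numeric and already scaled. In practice a library implementation (for example, `SMOTE` from imbalanced-learn) would be the more likely choice.

```python
import math
import random


def smote_sketch(minority_rows, n_new, k=5, seed=0):
    """Return n_new synthetic rows interpolated between minority-class neighbors."""
    if len(minority_rows) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority_rows) - 1)  # cannot use more neighbors than other rows
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority_rows))
        base = minority_rows[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority_rows) if j != i]
        neighbors = sorted(others, key=lambda row: math.dist(base, row))[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random point on the segment between the base row and that neighbor.
        gap = rng.random()
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbor)])
    return synthetic


# Hypothetical numeric features for four minority-class (Attrition = 1) employees.
minority_rows = [[1.0, 2.0], [1.5, 1.8], [2.0, 2.2], [1.2, 2.5]]
new_rows = smote_sketch(minority_rows, n_new=6, k=2)
print(len(new_rows))
```

Oversampling of this kind should be applied to the training split only, so that synthetic rows never leak into the data used for evaluation.
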

Best Model: Tuned XGBoost Classifier
• Accuracy: 94.56%
• Precision: 90.08%
• Recall: 77.12%
• F1 Score: 83.10%
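
The F1 score is the harmonic mean of precision and recall, so the figures above can be cross-checked against each other. The short sketch below does that with the reported values. The helper name `f1_from` is my own and is only illustrative.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Reported metrics for the tuned XGBoost classifier.
precision = 0.9008
recall = 0.7712

f1 = f1_from(precision, recall)
print(f"F1 = {f1:.2%}")  # prints "F1 = 83.10%"
```

Because recall (77.12%) is lower than precision (90.08%), the model misses more leavers than it falsely flags, and F1 reflects that trade-off in a way that accuracy alone does not.
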

Insights Discovered:
• Employees who worked overtime, traveled frequently for business, or earned a lower monthly income showed higher attrition.
• The Sales and R&D departments showed a higher tendency toward attrition.
• Environment satisfaction and job role had a visible influence on attrition.
• Around 16.1% of the workforce (237 out of 1,470 employees) were at risk of leaving.
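
The last figure also shows why the models were judged on F1 score rather than accuracy alone. As a quick sketch, assuming the 237 employees correspond to the Attrition = 1 class, a model that always predicts "stays" would already look accurate while catching nobody:

```python
total_employees = 1470
attrition_cases = 237  # assumed to be the Attrition = 1 class

attrition_rate = attrition_cases / total_employees
# Accuracy of a naive model that predicts "stays" for every employee.
majority_baseline_accuracy = 1 - attrition_rate

print(f"Attrition rate: {attrition_rate:.1%}")  # prints "Attrition rate: 16.1%"
print(f"Always-'stays' accuracy: {majority_baseline_accuracy:.1%}")  # prints "Always-'stays' accuracy: 83.9%"
```

That naive baseline has a recall of zero for the attrition class, which is the imbalance the resampling step is meant to address.
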

Conclusion:
The tuned XGBoost model offered the most balanced and accurate predictions of employee attrition among the classifiers compared. With this tool, organizations can take data-driven actions to improve retention and optimize workforce management.