Analyze employee performance and departmental productivity using SQL (SQLite) for KPI aggregation and Python for analytics and visualization.
This project transforms HR data into actionable business insights, identifying high performers, efficiency trends, and departmental KPIs.
This project demonstrates how to:
- Use SQL for feature engineering and KPI calculation
- Use Python for data analysis and visualization
- Deliver data-driven HR insights in an end-to-end workflow
- Compute department-level performance indicators
- Assess employee efficiency and attendance
- Visualize relationships between workload, tasks, and ratings
- Generate clean and ready-to-use summary reports
employee-performance-analytics/
├── README.md
├── requirements.txt
├── data/
│ └── employees.csv
├── src/
│ ├── create_db.py
│ ├── queries.sql
│ ├── analyze_performance.py
│ └── utils.py
└── outputs/
├── department_kpis.csv
├── performance_summary.csv
└── charts/
├── avg_rating_by_department.png
├── performance_vs_hours.png
└── task_completion_rate.png
| Column | Description |
|---|---|
employee_id |
Unique employee identifier |
name |
Employee name |
department |
Department name (Engineering, Sales, etc.) |
role |
Role title |
date |
Record date (YYYY-MM-DD) |
tasks_completed |
Number of tasks completed |
hours_worked |
Hours worked on that day |
rating |
Daily performance rating (1–5) |
projects |
Active projects |
absences |
1 if absent, else 0 |
The dataset (
employees.csv) is synthetic, generated with realistic departmental trends and biases.
The SQL script creates views and extracts three analytical datasets:
-
department_kpis– Department-level KPIs:- Average rating
- Tasks per department
- Total hours
- Absence rates
-
employee_summary– Individual performance summaries:- Total tasks, hours, projects, absences
- Average ratings
- Tasks per hour (efficiency)
-
daily_productivity– Day-wise workload and productivity data.
Chart: outputs/charts/avg_rating_by_department.png
Insight:
- Clear variation between departments (Finance & Engineering higher, Support & Sales lower)
- Indicates which departments maintain strong consistency and performance culture.
Chart: outputs/charts/performance_vs_hours.png
Insight:
- Positive correlation: higher hours → more tasks (up to a plateau)
- Clusters show standard workloads; outliers can reveal inefficiency or exceptional performers.
Chart: outputs/charts/task_completion_rate.png
Insight:
- Most employees average around 1 task/hour
- High performers exceed 1.4, low performers under 0.8
- Useful for spotting training needs or recognizing excellence.
python -m venv .venv
# Windows
.venv\Scripts\activate
# macOS/Linux
source .venv/bin/activatepip install -r requirements.txtpython src/create_db.py --csv data/employees.csv --db hr.dbpython src/analyze_performance.py --db hr.db --sql src/queries.sql --outdir outputsAll CSV reports and charts will be saved in the outputs/ directory.
- Department-level performance gaps can reveal resource or leadership factors.
- Efficiency distribution (tasks/hour) identifies both low performers and power users.
- Workload-to-performance trends help balance effort vs productivity.
- Absence tracking adds HR alignment to the analysis.
| Tool | Purpose |
|---|---|
| Python (pandas, matplotlib) | Data manipulation & visualization |
| SQLite | Querying & KPI computation |
| SQL | Feature engineering & aggregation |
| Jupyter / VS Code | Development & presentation |
This project demonstrates:
- SQL + Python integration for analytics
- Data storytelling and visualization
- Human Resource analytics capability
- Clean, modular, and reproducible data science workflow