Research Project
Empirical Software Engineering: Identifying Bug-Prone Java Files via Metrics
Description
- Implemented scripts to extract 18 source-code metrics (e.g., coupling, complexity via CKJM) and 7 change metrics (e.g., LOC added/deleted) from 835+ SVN diffs.
- Trained scikit-learn classifiers that flagged bug-prone files with >91% accuracy; the predictions guided Agile policy updates and reduced post-release defects.
- Automated the metrics pipeline and addressed multicollinearity, improving reproducibility and model reliability across iterations.
Stack: Java, Python, scikit-learn, SVN
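
Illustrative sketch (not the project's code): one common way to address multicollinearity among metrics is a greedy pairwise-correlation filter, shown here in plain Python. The 0.9 threshold, the function names `pearson` and `drop_correlated`, and the sample values are all assumptions made for illustration; the project may have used a different technique, such as variance inflation factors.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant column as uncorrelated
    return cov / sqrt(var_x * var_y)


def drop_correlated(columns, threshold=0.9):
    """Greedily keep a metric only if |r| with every kept metric is <= threshold.

    `columns` maps a metric name to its per-file values; earlier entries
    take priority over later ones.
    """
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) <= threshold for k in kept):
            kept.append(name)
    return kept


# Hypothetical per-file metric values (illustrative, not project data).
metrics = {
    "loc_added":   [10, 40, 25, 80, 5],
    "loc_deleted": [12, 44, 27, 85, 6],  # moves almost in lockstep with loc_added
    "coupling":    [3, 1, 7, 2, 5],
}
selected = drop_correlated(metrics)
print(selected)
```

In this toy data, `loc_deleted` is dropped because it is almost perfectly correlated with `loc_added`, while `coupling` is retained. The surviving columns would then be passed to the classifier.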
