+ . . PROJECT DETAIL . . +
DRIVER RISK SCORING SYSTEM
[ ACTIVE ] ML pipeline for large-scale driver risk scoring using vehicle telemetry.
Developed a driver behavior risk scoring pipeline that processes large-scale vehicle telemetry and trip datasets to generate normalized driver risk scores. The system extracts behavioral driving signals and applies gradient boosting models to estimate driver risk levels.
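As an illustration of what "normalized" can mean here, the sketch below rescales raw model outputs to a 0-100 range with min-max scaling. The scaling method, the `normalize_scores` helper, and the driver IDs are assumptions made for this example; the production normalization is not described on this page and may differ.

```python
def normalize_scores(raw, lo=0.0, hi=100.0):
    """Min-max scale raw model outputs into the [lo, hi] range.

    `raw` maps a driver ID to a raw model output (for example, a
    predicted probability). This helper is hypothetical.
    """
    if not raw:
        return {}
    values = list(raw.values())
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # All drivers scored the same: place everyone at the midpoint.
        midpoint = (lo + hi) / 2
        return {driver: midpoint for driver in raw}
    scale = (hi - lo) / (vmax - vmin)
    return {driver: lo + (value - vmin) * scale for driver, value in raw.items()}


# Made-up raw outputs for three drivers.
raw_scores = {"driver_a": 0.12, "driver_b": 0.47, "driver_c": 0.82}
normalized = normalize_scores(raw_scores)
```

Min-max scaling keeps the ordering of drivers intact but is sensitive to outliers, so a percentile-based or calibrated mapping could be a reasonable alternative.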
Key Contributions
- Implemented a high-performance data ingestion and preprocessing layer in Rust for transforming large trip and event datasets.
- Built feature engineering pipelines in Python (Pandas) to derive behavioral metrics including harsh braking frequency, rapid acceleration events, speed variance, trip duration, and driving pattern statistics.
- Trained and evaluated gradient boosting models (XGBoost / LightGBM) to classify driver risk levels and generate normalized risk scores.
- Improved processing efficiency by converting large raw telemetry datasets from CSV to Parquet columnar storage.
- Designed the pipeline for integration with downstream analytics systems and database-backed scoring workflows.
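The feature engineering step can be sketched roughly as follows: per-driver aggregation of event counts, speed variance, and trip counts with Pandas. The column names (`driver_id`, `trip_id`, `event_type`, `speed_kmh`), the event labels, and the sample rows are assumptions for illustration and do not reflect the actual schema.

```python
import pandas as pd

# Made-up telemetry events; one row per recorded sample.
events = pd.DataFrame(
    {
        "driver_id": ["d1", "d1", "d1", "d2", "d2"],
        "trip_id": ["t1", "t1", "t2", "t3", "t3"],
        "event_type": ["harsh_brake", "rapid_accel", "harsh_brake", "none", "rapid_accel"],
        "speed_kmh": [62.0, 80.0, 45.0, 50.0, 54.0],
    }
)


def build_driver_features(events: pd.DataFrame) -> pd.DataFrame:
    """Aggregate raw events into one feature row per driver."""
    flagged = events.assign(
        is_harsh_brake=events["event_type"].eq("harsh_brake"),
        is_rapid_accel=events["event_type"].eq("rapid_accel"),
    )
    features = flagged.groupby("driver_id").agg(
        harsh_brake_count=("is_harsh_brake", "sum"),
        rapid_accel_count=("is_rapid_accel", "sum"),
        speed_variance=("speed_kmh", "var"),  # sample variance (ddof=1)
        trip_count=("trip_id", "nunique"),
    )
    # Frequency expressed per trip so drivers with more trips are comparable.
    features["harsh_brakes_per_trip"] = (
        features["harsh_brake_count"] / features["trip_count"]
    )
    return features


features = build_driver_features(events)
```

A table of this shape, one row per driver, is the kind of input a gradient boosting classifier such as XGBoost or LightGBM would consume.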
Tech Stack
Rust, Python, Pandas, XGBoost, LightGBM, PostgreSQL, Parquet
Links
Private – Developed for employer