Lead Machine Learning Engineer | NLP & LLM Systems | Computer Vision | Recommendation Systems
Lead Machine Learning Engineer and Data Scientist with 6+ years building production ML systems end-to-end, from problem framing and experimentation to deployment, monitoring, and iteration. I specialize in NLP, computer vision, recommendation systems, and predictive analytics.
Currently, I serve as Lead Data Scientist at Training Arc, where I build player-facing analytics using telemetry data and video-based computer vision pipelines. I am also an Adjunct Professor in the Faculty of Land and Food Systems at the University of British Columbia (UBC), where I teach production ML workflows, validation strategies, and data analytics.
I received my MSc in Computer Science from the University of Manitoba, where my thesis focused on disentangled variational autoencoders for unsupervised anomaly detection. My research interests include anomaly detection, representation learning, and probabilistic modeling.
ML/DL Frameworks: PyTorch, TensorFlow, scikit-learn, XGBoost, LightGBM, Hugging Face Transformers
NLP & LLM Systems: RAG pipelines, embedding models, semantic search, LangChain, text classification, summarization, prompt engineering
Computer Vision: YOLO (fine-tuning & deployment), OpenCV, object detection, segmentation, keypoint localization, real-time inference optimization
Recommendation Systems: Two-tower retrieval, learning to rank, sequence transformers, embedding-based personalization, evaluation frameworks (NDCG, Recall@K, MAP)
Production Stack: FastAPI, Flask, REST APIs, Docker, Kubernetes, CI/CD pipelines, microservices, observability, incident triage
Cloud & Infrastructure: AWS, Azure, GCP, containerized deployments, batch/streaming inference, drift and data quality monitoring
Data Engineering: Snowflake, PostgreSQL, PySpark, Airflow, dbt, pandas, NumPy
Visualization & BI: Power BI, Tableau, Matplotlib, Seaborn, Plotly
Languages: Python, SQL, Bash, R, C++
Real-Time Video Analytics Pipeline
Built an end-to-end computer vision pipeline for frame extraction, object detection using YOLO-based models, temporal event tracking, and metric computation. Implemented timestamp alignment, config-driven execution, and structured output formats for downstream analytics.
Ranking and Personalization System
Designed a ranking service using gradient-boosted models and embedding features, served through a low-latency API with caching and feature hydration. Built an offline evaluation framework (NDCG, Precision@K) with drift monitoring and regression detection.
Customer Lifecycle Prediction
Developed predictive models for user behavior using engagement and transaction signals. Implemented feature pipelines with backtesting support, SHAP-based explainability for stakeholder reporting, and automated recalibration workflows.
Retrieval-Augmented Generation Assistant
Built a retrieval-based Q&A system using embedding pipelines and vector search with citation-style outputs. Developed a rubric-driven evaluation harness with test suites, edge-case probes, and quality guardrails for production deployment.
Generative Models for Anomaly Detection
Implemented a disentangled conditional variational autoencoder (DCVAE) that combines conditional structure with total-correlation regularization for unsupervised anomaly detection. Published benchmarking results across multiple datasets with reproducible training pipelines.
KPI Layer and Experimentation Platform
Built standardized metric definitions and A/B analysis workflows backed by a warehouse-first architecture. Delivered executive dashboards with automated refresh, data quality gates, and anomaly alerting.
- [August 2025] Led a week-long MFRE bootcamp and workshop series on Python and R covering data access, visualization, and coding for economic analysis.
- [April 2025] Supervised graduate students in the UBC MFRE Summer Program.
- [November 2024] Paper titled "Disentangled Conditional Variational Autoencoder for Unsupervised Anomaly Detection" accepted at IEEE BigData 2024. IEEE Xplore
- [July 2024] Paper titled "A Comprehensive Study of Auto-Encoders for Anomaly Detection: Efficiency and Trade-offs" published in Machine Learning with Applications. ScienceDirect
- [June 2024] Received the Research Dissemination Present and Publish Grant from Douglas College.
- [December 2023] Joined Douglas College as a full-time regular faculty member.
See my Google Scholar for the complete list.
Disentangled Conditional Variational Autoencoder for Unsupervised Anomaly Detection
IEEE BigData 2024 | Paper | Code
A Comprehensive Study of Auto-Encoders for Anomaly Detection: Efficiency and Trade-offs
Machine Learning with Applications, 2024 | Paper | Code
Feature Extraction and Prediction of Combined Text and Survey Data using Two-Staged Modeling
IEEE ICDM Workshop, 2022 | Paper | Code
- Data Scaler Selector: Open-source library to select appropriate data scalers for ML models.
- Image to Sketch: Convert color or B&W images to pencil sketches.
- Data Preparer (In Progress): Clean and prepare datasets before applying ML models.




