I'm an ex-Apple ML engineer building production AI systems, from on-device NLP features used by 65M+ users to model evaluation infrastructure, RAG tooling, and multimodal data pipelines.
Interests
What I Build
ML systems end-to-end: model training, evaluation pipelines, RAG tooling, on-device inference, and the CI/CD infrastructure that gets it all to production reliably. At Apple, I shipped production ML across multiple iOS releases and built Python-based CI/CD systems for dataset validation, model regression testing, and release quality gates.
AI Evaluation + Multimodal Systems
- Multi-turn LLM safety evaluation, co-advised by Prof. Rosanna Bellini and Prof. Damon McCoy: built an automated harness to collect and evaluate chatbot behavior across thousands of multi-turn conversations, using LLM-as-judge scoring validated at Cohen's Kappa 0.80–0.85.
- Multimodal video and data pipelines for robotic policy learning with Prof. Lerrel Pinto, NYU CILVR: 3× dataset generation via diffusion-based augmentation, cross-modal grounding with CLIP and VLMs, imitation learning workflows in JAX.
How I Work
I care about the full path from model behavior to user impact: where latency hides, how failures are caught, and what it takes to ship AI outside a notebook.
Education
NYU Tandon
Master's, Computer Science
May 2026 · New York
NIT Surathkal
Bachelor's, Computer Science & Engineering
May 2021 · India
Outside the Code
Basketball, table tennis, and photography, preferably in a city I've never been to before.

Languages
ML / AI
Infra / Tools
Up for talking ML systems, AI evaluation, or interesting research problems.
Contact Me