I am pursuing an MSc in Computing (Artificial Intelligence and Machine Learning) at Imperial College London, after completing a B.Tech in Computer Science and Engineering at IIT Jodhpur. My work focuses on multimodal AI, including vision-language models, audio-visual speech recognition, and speech tokenization. I have published at EMNLP and WACV, and I have experience with RL-based post-training, reproducible PyTorch pipelines, and applied model deployment. Before graduate study, I worked as a Research Engineer at MetaFusion, where I built and deployed vision-language systems for traffic analytics and automated monitoring.

Research Interests

  • Multimodal Learning
  • Reasoning Faithfulness in LLMs
  • Audio-Visual Speech Modeling
  • Efficient Speech Tokenization

Key Milestones

2025 - Present

Imperial College London

MSc in Computing (AI & ML)

2024 - 2025

Research Engineer - MetaFusion

Built and deployed vision-language systems for traffic and surveillance applications

2023

Mitacs Globalink Research Internship

Research internship at the University of Ottawa on imbalanced learning methods

2020 - 2024

IIT Jodhpur

B.Tech in Computer Science and Engineering

Recent Highlights

  • Aug 2025: Paper accepted at EMNLP 2025: "When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient VQA"
  • May 2024: Joined MetaFusion as Computer Vision Research Engineer
  • Nov 2023: Paper accepted at WACV 2024: "Hybrid Sample Synthesis-Based Debiasing in Limited Data Settings"

Education

Imperial College London

MSc Computing (Artificial Intelligence and Machine Learning)
2025 - Present | London, UK | Term 1 Grade: Distinction

Indian Institute of Technology Jodhpur

B.Tech Computer Science and Engineering
2020 - 2024 | Jodhpur, India | CGPA: 8.88/10
  • All India Rank 3316 in IIT-JEE 2020 among 1.3 million applicants
  • Selected for Mitacs Globalink Research Internship (2023)
  • Capstone project published at WACV 2024
  • Ranked among the top 5 students in the cohort