Research Experience
Improving Chain-of-Thought Faithfulness in LLMs
Oct 2025 - Feb 2026
Imperial College London
Supervisor: Dr. Oana-Maria Camburu
Developed CIRL-Faith, a concept-intervention-based training algorithm that exposes and corrects reasoning chains misaligned with a model's internal computations. Formulated RL-based post-training using Group Relative Policy Optimization (GRPO) with intervention- and process-driven reward signals. Evaluated on Qwen3-4B, Qwen3-8B, and Llama-3.1-8B using established faithfulness metrics, achieving gains of up to 30 percent over the respective base models.
- PyTorch
- GRPO
- LLM Evaluation
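The GRPO post-training above hinges on group-relative advantages: each sampled completion's reward is normalized against its own group, with no learned value network. A minimal sketch, assuming per-completion scalar rewards; the function name and epsilon are illustrative:

```python
# Group-relative advantage computation at the core of GRPO: normalize each
# reward against the mean and std of its sampled group (illustrative sketch).
from statistics import mean, pstdev

def grpo_advantages(group_rewards, eps=1e-6):
    """Return per-completion advantages normalized within one sampled group."""
    mu = mean(group_rewards)
    sigma = pstdev(group_rewards)
    return [(r - mu) / (sigma + eps) for r in group_rewards]

# Four completions for one prompt, scored by a reward signal.
advs = grpo_advantages([1.0, 0.0, 0.5, 0.5])
```

Advantages within a group sum to zero, so above-average completions are reinforced and below-average ones suppressed without a critic.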
VIB-AVSR for Noise-Robust LLM-Based AVSR
Nov 2025 - Mar 2026
Imperial College London
Supervisors: Dr. Umberto Cappellazzo, Dr. Stavros Petridis, Dr. Maja Pantic
Developed VIB-AVSR, an LLM-based audio-visual speech recognition model that connects pre-trained audio-visual encoders to an LLM backbone. Integrated Variational Information Bottleneck (VIB) layers at targeted positions within the LLM backbone to regularize intermediate representations. Reduced performance degradation across multiple SNR levels and noise types without additional training data or architectural changes beyond the lightweight VIB layers.
- PyTorch
- Audio-Visual Modeling
- Speech Recognition
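A VIB layer maps a hidden state to a Gaussian posterior, samples it with the reparameterization trick, and penalizes the KL divergence to a standard normal prior. A minimal NumPy sketch; the shapes, weight parameterization, and placement inside the backbone are illustrative assumptions:

```python
# Sketch of a Variational Information Bottleneck layer: sample a stochastic
# bottleneck code z ~ N(mu, diag(exp(logvar))) and compute KL(q || N(0, I)).
import numpy as np

rng = np.random.default_rng(0)

def vib_layer(h, w_mu, w_logvar):
    """Return a sampled bottleneck code and its KL penalty for a batch h."""
    mu = h @ w_mu
    logvar = h @ w_logvar
    z = mu + np.exp(0.5 * logvar) * rng.standard_normal(mu.shape)  # reparameterize
    kl = 0.5 * np.sum(np.exp(logvar) + mu**2 - 1.0 - logvar)       # KL to N(0, I)
    return z, kl

h = rng.standard_normal((4, 16))            # batch of hidden states
w_mu = rng.standard_normal((16, 8)) * 0.1   # illustrative projections
w_logvar = rng.standard_normal((16, 8)) * 0.1
z, kl = vib_layer(h, w_mu, w_logvar)
```

The KL term is added to the task loss, pressuring the code to discard noise-specific detail while keeping speech content.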
StyleLip: Cross-Identity Speaking Style Transfer
Oct 2025 - Mar 2026
Imperial College London
Supervisors: Dr. Antoni Bigata, Dr. Stavros Petridis, Dr. Maja Pantic
Developed StyleLip, a few-shot framework for speaking-style transfer in audio-driven lip synchronization that captures person-specific, idiosyncratic lip movements across identities using LoRA layers. Regularized training with synthetically generated multi-identity samples to prevent identity leakage. Introduced SSCS, a quantitative metric for measuring speaking-style transfer fidelity.
- LoRA
- Lip-Sync
- Style Transfer
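LoRA adapts a frozen weight matrix W by adding a trainable low-rank product B A, scaled by alpha / r. A minimal sketch of the forward pass; the ranks, shapes, and scaling value are illustrative, not the project's actual configuration:

```python
# LoRA forward pass: y = x W^T + (alpha / r) * x (B A)^T, with W frozen and
# only the low-rank factors A (down) and B (up) trained for the new style.
import numpy as np

def lora_forward(x, W, A, B, alpha=8.0):
    """Apply a frozen linear layer W plus a scaled rank-r LoRA update."""
    r = A.shape[0]
    return x @ W.T + (alpha / r) * (x @ (B @ A).T)

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 16))
W = rng.standard_normal((32, 16))  # frozen base weight
A = rng.standard_normal((4, 16))   # rank-4 down-projection
B = np.zeros((32, 4))              # up-projection, zero-initialized
y = lora_forward(x, W, A, B)
```

Zero-initializing B makes the adapted layer reproduce the base model exactly at the start of training, which is the standard LoRA initialization.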
When Big Models Train Small Ones
Dec 2024 - Mar 2025
IIT Jodhpur (VL2G Lab)
Supervisor: Dr. Anand Mishra
Designed Model Parity Alignment (MPA) for Visual Question Answering, training small VLMs (SmolVLM-500M, TinyLLaVA-2B, InternVL2-2B/4B) from large VLMs (Qwen2-VL-7B, InternVL2-8B, GPT-4o) using unlabeled data. Developed disparity-aware training that mitigates hallucinations via automated QA-pair generation and filtering. Achieved consistent accuracy gains of up to 6 percent on VQA benchmarks including TextVQA, ST-VQA, OK-VQA, MedicalVQA, and ChartQA.
- PyTorch
- VLMs
- VQA
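The disparity-aware idea can be illustrated with a generic filter: teacher-generated QA pairs are kept for training only where the small model currently disagrees with the large one, concentrating supervision on the parity gap. The data structures and keep criterion here are illustrative assumptions, not MPA's exact rules:

```python
# Illustrative disparity-aware filter: keep teacher QA pairs the student
# model currently answers differently, discarding already-mastered pairs.
def disparity_filter(qa_pairs, student_answer):
    """Return (question, teacher_answer) pairs on which the student disagrees."""
    kept = []
    for question, teacher_answer in qa_pairs:
        if student_answer(question) != teacher_answer:
            kept.append((question, teacher_answer))
    return kept

pairs = [("What word is on the sign?", "STOP"),
         ("What color is the bus?", "yellow")]
# Stand-in for the small VLM's current predictions.
student = lambda q: "STOP" if "sign" in q else "red"
kept = disparity_filter(pairs, student)
```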
Hybrid Sample Synthesis-Based Debiasing in Limited Data Settings
Dec 2022 - Mar 2023
IIT Jodhpur
Supervisor: Dr. Pratik Mazumder
Designed a hybrid sample synthesis method for low-data regimes that enables ResNet classifiers to generalize under unknown biases. Built a dual-model framework to create bias-conflicting samples without relying on explicit bias annotations. Achieved accuracy improvements of up to 10 percent over prior methods on benchmarks such as Corrupted CIFAR-10, Colored MNIST, and BFFHQ.
- PyTorch
- ResNet
- Debiasing
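The core of bias-conflicting synthesis is combining content from one class with the bias cue of another, so the label and the spurious feature no longer co-occur. A convex blend is one illustrative mixing scheme, not necessarily the method's exact one:

```python
# Illustrative hybrid synthesis: blend a bias-aligned sample with a sample
# from another class so the label follows x_target while the bias cue leaks
# from x_other, producing a bias-conflicting training example.
import numpy as np

def synthesize_conflicting(x_target, x_other, lam=0.7):
    """Convex blend of two samples; the synthesized sample keeps x_target's label."""
    return lam * x_target + (1.0 - lam) * x_other

# Toy stand-ins, e.g. Colored MNIST: a digit with its usual color cue vs.
# a different digit carrying a different color cue.
x_red_zero = np.full(4, 1.0)
x_blue_one = np.full(4, -1.0)
x_mix = synthesize_conflicting(x_red_zero, x_blue_one)
```

Training on such blends weakens the classifier's reliance on the color shortcut because the shortcut no longer predicts the label.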