Publications

* denotes equal contribution

When Big Models Train Small Ones: Label-Free Model Parity Alignment for Efficient Visual Question Answering with Small VLMs

A. Subramanyam*, N. Singh*, P. Arora*, A. Mishra

EMNLP 2025

Hybrid Sample Synthesis-Based Debiasing of Classifier in Limited Data Setting

P. Arora*, P. Mazumder*

WACV 2024

BLADE: Bias-Linked Adaptive DEbiasing

P. Arora*, N. Singh*, V. Diwan*, P. Mazumder

Under Review

Research Experience

Improving Chain-of-Thought Faithfulness in LLMs

Oct 2025 - Feb 2026

Imperial College London
Supervisor: Dr. Oana-Maria Camburu

Developed CIRL-Faith, a concept-intervention-based training algorithm that exposes and corrects reasoning chains misaligned with the model's internal computations. Formulated RL-based post-training using Group Relative Policy Optimization (GRPO) with intervention- and process-driven reward signals. Evaluated across Qwen3-4B, Qwen3-8B, and Llama-3.1-8B using established faithfulness metrics, achieving gains of up to 30 percent over the respective base models.

  • PyTorch
  • GRPO
  • LLM Evaluation

VIB-AVSR for Noise-Robust LLM-Based AVSR

Nov 2025 - Mar 2026

Imperial College London
Supervisors: Dr. Umberto Cappellazzo, Dr. Stavros Petridis, Dr. Maja Pantic

Developed VIB-AVSR, an LLM-based audio-visual speech recognition model connecting pre-trained audio-visual encoders to an LLM backbone. Integrated Variational Information Bottleneck (VIB) layers at targeted positions within the LLM backbone to regularize intermediate representations. Reduced performance degradation across multiple SNR levels and noise types without further architectural modifications or additional training data.

  • PyTorch
  • Audio-Visual Modeling
  • Speech Recognition

StyleLip: Cross-Identity Speaking Style Transfer

Oct 2025 - Mar 2026

Imperial College London
Supervisors: Dr. Antoni Bigata, Dr. Stavros Petridis, Dr. Maja Pantic

Developed StyleLip, a few-shot framework for speaking-style transfer in audio-driven lip synchronization, capturing person-specific, idiosyncratic lip movements across identities using LoRA layers. Regularized training with synthetically generated multi-identity samples to prevent identity leakage. Introduced SSCS, a quantitative metric for measuring speaking-style transfer fidelity.

  • LoRA
  • Lip-Sync
  • Style Transfer

When Big Models Train Small Ones

Dec 2024 - Mar 2025

IIT Jodhpur (VL2G Lab)
Supervisor: Dr. Anand Mishra

Designed Model Parity Alignment (MPA), a label-free method for Visual Question Answering that trains small VLMs (SmolVLM-500M, TinyLLaVA-2B, InternVL2-2B/4B) from large VLMs (Qwen2VL-7B, InternVL2-8B, GPT-4o) using only unlabeled data. Developed disparity-aware training that mitigates hallucinations via automated QA-pair generation and filtering. Achieved consistent accuracy gains of up to 6 percent on VQA benchmarks including TextVQA, ST-VQA, OK-VQA, MedicalVQA, and ChartQA.

  • PyTorch
  • VLMs
  • VQA

Hybrid Sample Synthesis-Based Debiasing in Limited Data Settings

Dec 2022 - Mar 2023

IIT Jodhpur
Supervisor: Dr. Pratik Mazumder

Designed a hybrid sample-synthesis method for low-data regimes, enabling ResNet classifiers to generalize under unknown biases. Built a dual-model framework that creates bias-conflicting samples without relying on explicit bias annotations. Achieved up to 10 percent accuracy improvement over prior methods on benchmarks such as Corrupted CIFAR-10, Colored MNIST, and BFFHQ.

  • PyTorch
  • ResNet
  • Debiasing