Profile
I am currently pursuing an MSc in Computing (Artificial Intelligence and Machine Learning) at Imperial College London. I completed my B.Tech in Computer Science and Engineering at IIT Jodhpur. My research background spans multimodal learning, vision-language modeling, and audio-visual speech systems.
I have published work at EMNLP and WACV, and I have contributed to research currently under review on model debiasing, chain-of-thought faithfulness, and speaking style transfer. My work combines methodological research with reproducible engineering in PyTorch on GPU-based environments.
Current Focus
My current research at Imperial College London includes:
- CIRL-Faith: RL-based post-training with concept interventions to improve alignment between generated reasoning chains and internal model behavior.
- VIB-AVSR: LLM-based audio-visual speech recognition with variational information bottleneck regularization for stronger noise robustness.
- Matryoshka Neural Codec: Adaptive speech tokenization with binary spherical quantization for flexible rate-quality trade-offs.
- StyleLip: Few-shot speaking style transfer for lip synchronization using LoRA-based adaptation and a style-consistency metric.
Technical Focus
I work primarily with Python, C++, SQL, and Bash, and I use PyTorch, TensorFlow, Hugging Face Transformers, OpenCV, and scikit-learn. I also have experience with Docker, Kubernetes, and Linux-based deployment workflows.
Get in Touch
I welcome collaboration opportunities in multimodal AI, language model alignment, and speech technology. Please feel free to contact me by email or through the links below.
Email: piyush.arora25@imperial.ac.uk
Links:
Google Scholar · LinkedIn · GitHub