Research Philosophy
My research focuses on multimodal learning: developing algorithms and architectures that effectively integrate multiple modalities. I specifically address modality bias and preference, the tendency of models to over-rely on certain modalities while neglecting others. My work explores architectural designs and training strategies that promote balanced multimodal fusion, ensuring systems leverage complementary strengths across all input modalities.
I am equally committed to interpretability, fairness, and safety in multimodal AI. My research investigates methods to make model decisions transparent across modalities, identify and mitigate biases in multimodal representations, and develop robust fairness evaluation frameworks. I aim to advance multimodal systems that are technically proficient, trustworthy, and aligned with responsible deployment principles.
Current Focus
At Imperial College London, I'm exploring:
- Chain-of-Thought Faithfulness: Developing evaluation metrics and reinforcement learning approaches to improve alignment between reasoning traces and decision-making in large language models.
- Optimal Transport for Multimodal Fusion: Investigating transport-based algorithms to fuse audio and visual modalities in LLMs, with emphasis on robustness under varying acoustic conditions.
- Diffusion-Based Lip-Sync Personalization: Applying layer-specific LoRA fine-tuning to capture speaker-specific articulation patterns and exploring cross-lingual transfer of lip-sync traits across identities and languages.
Beyond Research
In my free time, I enjoy playing Clash Royale, chess, and FIFA.
Get in Touch
I'm always interested in research collaborations, new ideas, and opportunities to contribute to impactful AI research. Feel free to reach out!
Email: piyush.arora25@imperial.ac.uk
Links: Google Scholar · LinkedIn · GitHub