Hi! I am a Senior Research Scientist at NVIDIA. I currently work on multimodal language models (MLLMs) for tasks such as data selection and generation, on making MLLMs robust and hallucination-free, and on perception tasks more broadly. I received my Ph.D. from Carnegie Mellon University, where I was advised by Martial Hebert and Michael Tarr. My thesis focused on computer vision and language models that can operate with real-world data distributions and applications.
Prior to my Ph.D., I completed my Master's at CMU, where I was advised by Abhinav Gupta.