Ciao! I am Roberto Amoroso, a Senior Research Engineer at NVIDIA in Munich, Germany 🇩🇪, where I pre-train and fine-tune Vision-Language Models (VLMs) for video understanding and large-scale multimodal retrieval, with a focus on Autonomous Vehicle applications.
My work centers on VLM-based retrieval architectures that align language with visual signals at scale, enabling users and AI agents to surface relevant moments, objects, and scenes across massive image and video collections.
Currently, I am training VLMs on fleet-scale video data to produce rich scene descriptions and retrieval-quality embeddings, enabling both search over large video collections and downstream agentic systems that reason about dynamic scenes.
I completed my PhD in AI and Computer Vision through the ELLIS program and the International Doctorate in ICT at the AImageLab research group of the University of Modena and Reggio Emilia (UNIMORE) 🇮🇹, under the supervision of Prof. Rita Cucchiara and Prof. Lorenzo Baraldi. My thesis explored multimodal attentive architectures for visual-semantic understanding.
During my PhD, I also completed an internship at LMU Munich, Germany 🇩🇪, working on Multimodal LLMs for Video Question Answering and Open-vocabulary Segmentation under the co-supervision of Prof. Volker Tresp.
Earlier, I was a Research Scholar at the Networking Research Group in Saint Louis, USA 🇺🇸, working on super-resolution techniques applied to Internet traffic.
Beyond my current focus, my past research has covered the pre-training and optimization of Transformer architectures, image classification, self-supervised learning, detection of deepfake and synthetic images, and image watermarking.
Feel free to reach out if you have any questions! :)
ELLIS PhD in AI and Computer Vision, 2024
UNIMORE, Italy 🇮🇹 | LMU, Germany 🇩🇪 | NVIDIA, Germany 🇩🇪
MS in Artificial Intelligence, 2020
UNIMORE, Italy 🇮🇹 | AGH, Poland 🇵🇱 | Saint Louis University, USA 🇺🇸
BS in Computer Engineering, 2018
UNIMORE, Italy 🇮🇹
HumanE-AI-NET project, funded by the EU Framework Programme for Research and Innovation Horizon 2020.