Mingxuan Liu

Google Scholar ID: egL5-LsAAAAJ

University of Trento

Vision-Language, Open-vocabulary Recognition, Novel Class Discovery, Continuous Learning

Citations & Impact

All-time citations: 118

Resume (English only)

Academic Achievements

- Paper 'UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos' released as a preprint on arXiv
- Paper 'Organizing Unstructured Image Collections using Natural Language' released as a preprint on arXiv
- Selected as Outstanding Reviewer for CVPR 2025
- Received $5,000 research funding from OpenAI
- Paper 'Incremental Novel Class Discovery with Large Scale Pre-trained Models' accepted as an Oral paper at ICPR 2024
- Filed first US Patent: 'A Method for Using Semantic Hierarchy Trees to Increase the Robustness of Open-vocabulary Object Detection Models'
- Paper 'Open-vocabulary Object Detection with Semantic Hierarchy' accepted as a Highlight paper at CVPR 2024 (2.8% acceptance rate)
- Paper 'Discovering Fine-grained Semantic Concepts with LLMs' accepted to ICLR 2024

Research Experience

- PhD student at University of Trento, researching deep learning and computer vision
- Visiting Researcher at UCLA, working on automatic urban simulation scene creation, advised by Prof. Bolei ZHOU
- Visiting Researcher at NAVER LABS Europe, exploring open-vocabulary object detection, supervised by Gabriela CSURKA, Riccardo VOLPI, and Tyler L. HAYES, in a team led by Diane LARLUS
- Innovation Engineer at SIEMENS Smart Infrastructure Division, designing IoT-based automation solutions

Education

- PhD student at University of Trento, supervised by Prof. Elisa RICCI and Prof. Zhun ZHONG
- Master's degree in Intelligent Autonomous Systems from KTH Royal Institute of Technology (Sweden), Summa Cum Laude
- Master's degree in Mechatronics Engineering from University of Trento (Italy), Summa Cum Laude

Background

- PhD student in Deep Learning and Computer Vision
- Research interests: Training open-world machines to see, understand, and reason about our chaotic visual, semantic, and physical world
- Professional fields: Knowledge discovery, open-vocabulary recognition, vision and language, urban embodied AI simulation

Miscellany