DB-3DME: From Dataset to Benchmark for Human-aligned Automatic 3D Mesh Evaluation

📅 2026-06-08

🤖 AI Summary

This work addresses the lack of efficient, scalable, and human-aligned automatic evaluation methods for 3D assets. The authors introduce DB-3DME, a dataset of 2,619 synthetic 3D meshes paired with human ratings on Geometry and Prompt Adherence, and use it as a benchmark for human-aligned automatic assessment of 3D meshes. By systematically benchmarking state-of-the-art vision-language models (VLMs), they identify the visual encoding of 3D representations as a key factor in how well a model agrees with human raters. Building on this finding, they fine-tune Qwen-2.5-VL-7B by adapting its visual encoder while keeping the language model frozen. The authors report that the fine-tuned model substantially outperforms existing pre-trained VLMs across multiple evaluation dimensions. The benchmark dataset is publicly released on GitHub and Hugging Face to support future research.
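The "adapt the visual encoder, freeze the language model" setup can be illustrated with a minimal, framework-free sketch. Everything here is an assumption for illustration: the `Param` stand-in class, the `freeze_all_but` helper, the parameter names, and the `visual.` prefix are hypothetical and are not taken from the paper or from the real Qwen-2.5-VL checkpoint. In an actual training framework the same idea is usually applied by toggling a per-parameter trainable flag.

```python
from dataclasses import dataclass


@dataclass
class Param:
    """Hypothetical stand-in for a framework parameter; only the trainable flag is modeled."""
    name: str
    requires_grad: bool = True


def freeze_all_but(params, trainable_prefix):
    """Mark parameters whose name starts with trainable_prefix as trainable, freeze the rest.

    Returns the names of the parameters left trainable.
    """
    trainable = []
    for p in params:
        p.requires_grad = p.name.startswith(trainable_prefix)
        if p.requires_grad:
            trainable.append(p.name)
    return trainable


# Hypothetical parameter names; a real checkpoint may use different prefixes.
params = [
    Param("visual.blocks.0.attn.qkv.weight"),
    Param("visual.merger.mlp.0.weight"),
    Param("model.layers.0.self_attn.q_proj.weight"),
    Param("lm_head.weight"),
]

trainable = freeze_all_but(params, "visual.")
print(trainable)
```

Selecting by name prefix keeps the rule in one place, which makes it easy to check which parts of the model will receive gradient updates before training starts.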

📝 Abstract

Recent advances in 3D generation have led to substantial improvements in realism, controllability, and efficiency, yet the evaluation of 3D assets remains underexplored. Existing evaluation paradigms, including human evaluation, learned metrics, and vision-language models (VLMs) as judges, suffer from limitations in cost, scalability, resolution handling, or task-specific alignment. In this work, we focus on 3D mesh evaluation and introduce DB-3DME, the Dataset and Benchmark for 3D Mesh Evaluation. DB-3DME contains 2,619 synthetic 3D meshes paired with human ratings on Geometry and Prompt Adherence. Using this dataset, we systematically benchmark state-of-the-art VLMs and identify visual encoding of 3D representations as a key factor for human-aligned evaluation performance. Motivated by this finding, we fine-tune an open-weight VLM, Qwen-2.5-VL-7B, for 3D mesh evaluation by adapting the visual encoder while freezing the language model. The fine-tuned model substantially outperforms existing pre-trained VLMs across multiple evaluation dimensions, establishing a new benchmark for automatic 3D mesh evaluation. We publicly release the benchmark dataset on GitHub and Hugging Face to facilitate future research.
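One common way to quantify how closely automatic scores track human ratings is a rank correlation. The sketch below computes Spearman's rank correlation in plain Python. This is an assumption about how alignment could be measured: the abstract does not name the metric used in DB-3DME, and the function names `average_ranks` and `spearman` are illustrative only.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman(human, model):
    """Spearman correlation: Pearson correlation of the two rank vectors."""
    if len(human) != len(model) or len(human) < 2:
        raise ValueError("need two equally long lists with at least 2 items")
    rh, rm = average_ranks(human), average_ranks(model)
    mean_h = sum(rh) / len(rh)
    mean_m = sum(rm) / len(rm)
    cov = sum((a - mean_h) * (b - mean_m) for a, b in zip(rh, rm))
    var_h = sum((a - mean_h) ** 2 for a in rh)
    var_m = sum((b - mean_m) ** 2 for b in rm)
    if var_h == 0 or var_m == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / sqrt(var_h * var_m)
```

For example, `spearman(human_geometry_ratings, model_geometry_scores)` would be computed once per evaluation dimension (such as Geometry and Prompt Adherence), with values near 1 indicating that the model orders meshes the way human raters do.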

Problem

Research questions and friction points this paper is trying to address.

3D mesh evaluation

human-aligned evaluation

automatic evaluation

benchmark

3D generation

Innovation

Methods, ideas, or system contributions that make the work stand out.

3D mesh evaluation

vision-language models

human-aligned benchmark

visual encoder adaptation

automatic evaluation
