Naming Practices of Pre-Trained Models in Hugging Face

📅 2023-10-02

📈 Citations: 6

✨ Influential: 0

🤖 AI Summary

Problem: Inconsistent and nonstandard naming conventions for pre-trained models (PTMs) impede model discovery and reliable reuse, yet PTM naming practices have not been studied systematically.

Method: This paper presents the first large-scale analysis of PTM naming on Hugging Face, grounded in a survey of 108 Hugging Face users that reveals significant mismatches between current naming practices and engineering requirements. It proposes DARA, an automated framework that uses deep neural network architectural metadata (such as layer count and number of attention heads) to assess whether a model's name is consistent with its actual architecture. DARA further employs naming pattern mining and anomaly detection to identify structural errors and semantically misleading naming patterns.

Contribution/Results: The paper introduces the first reusable, empirically grounded framework for assessing PTM naming quality, with the aim of improving model retrieval accuracy and cross-project reusability. The framework is open-sourced, providing both actionable insights and practical tooling for the ML community.

📝 Abstract

As innovation in deep learning continues, many engineers seek to adopt Pre-Trained Models (PTMs) as components in computer systems. Researchers publish PTMs, which engineers adapt for quality or performance prior to deployment. PTM authors should choose appropriate names for their PTMs, which would facilitate model discovery and reuse. However, prior research has reported that model names are not always well chosen and are sometimes erroneous. The naming of PTM packages has not been systematically studied. In this paper, we frame and conduct the first empirical investigation of PTM naming practices in the Hugging Face PTM registry. We began our study with a survey of 108 Hugging Face users to understand current practices in PTM naming. From our survey analysis, we highlight discrepancies from traditional software package naming and present findings on naming practices. Our findings indicate a significant mismatch between engineers' preferences and actual PTM naming practices. We also present practices for detecting naming anomalies and introduce a novel automated DNN ARchitecture Assessment technique (DARA), capable of detecting PTM naming anomalies. We envision future work on leveraging meta-features of PTMs to improve model reuse and trustworthiness.

Problem

Research questions and friction points this paper is trying to address.

Studying pre-trained model naming conventions and inconsistencies

Detecting model naming inaccuracies using architectural information

Improving model discovery, reuse, and supply chain security

Innovation

Methods, ideas, or system contributions that make the work stand out.

Empirical study of PTM naming conventions

DARA automated architecture assessment technique

Detects inconsistencies using architectural information
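
The idea of checking a model's name against its architectural information can be illustrated with a small sketch. This is not DARA itself, whose internals are not described on this page; it is a simplified, rule-based stand-in. The names ModelMetadata, KNOWN_ARCHITECTURES, and find_naming_anomalies are hypothetical, and the name patterns it recognizes (an architecture token such as "bert", plus "L-12" for layers and "A-12" for attention heads) are assumptions chosen for illustration. In practice the metadata might be read from fields such as model_type, num_hidden_layers, and num_attention_heads in a model's config.json.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical, deliberately tiny list of architecture tokens to look for.
KNOWN_ARCHITECTURES = ("bert", "roberta", "gpt2", "t5")


@dataclass
class ModelMetadata:
    """Architectural facts about a model, e.g. taken from its config file."""
    architecture: str
    num_layers: int
    num_attention_heads: int


def _repo_name(model_name: str) -> str:
    # Drop the "owner/" prefix and normalize case.
    return model_name.split("/")[-1].lower()


def claimed_architecture(model_name: str) -> Optional[str]:
    # Return the first name segment that exactly matches a known architecture.
    for token in re.split(r"[-_.]", _repo_name(model_name)):
        if token in KNOWN_ARCHITECTURES:
            return token
    return None


def _claimed_number(model_name: str, letter: str) -> Optional[int]:
    # Match segments such as "l-12" or "l12" (assumed naming pattern).
    pattern = rf"(?:^|[_-]){letter}-?(\d+)(?=$|[_-])"
    match = re.search(pattern, _repo_name(model_name))
    return int(match.group(1)) if match else None


def find_naming_anomalies(model_name: str, meta: ModelMetadata) -> List[str]:
    """Compare what the name claims with what the metadata says."""
    anomalies = []
    arch = claimed_architecture(model_name)
    if arch is not None and arch != meta.architecture.lower():
        anomalies.append(f"name says '{arch}', metadata says '{meta.architecture}'")
    layers = _claimed_number(model_name, "l")
    if layers is not None and layers != meta.num_layers:
        anomalies.append(f"name says {layers} layers, metadata says {meta.num_layers}")
    heads = _claimed_number(model_name, "a")
    if heads is not None and heads != meta.num_attention_heads:
        anomalies.append(
            f"name says {heads} heads, metadata says {meta.num_attention_heads}"
        )
    return anomalies


if __name__ == "__main__":
    # Hypothetical model whose name overstates its depth and misstates its family.
    meta = ModelMetadata(architecture="roberta", num_layers=6, num_attention_heads=12)
    for issue in find_naming_anomalies("someone/bert_uncased_L-12_H-768_A-12", meta):
        print(issue)
```

A name that makes no architectural claim produces no anomalies under this sketch, so it only flags contradictions and says nothing about names that are merely uninformative.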

🔎 Similar Papers

Towards Semantic Versioning of Open Pre-trained Language Model Releases on Hugging Face