Hierarchical Multi-Label Generation with Probabilistic Level-Constraint

๐Ÿ“… 2025-04-30
๐Ÿ“ˆ Citations: 0
โœจ Influential: 0
๐Ÿ“„ PDF
๐Ÿค– AI Summary
Hierarchical Extreme Multi-Label Classification (HEMLC) faces significant challenges due to the complexity and scale of label taxonomies. To address this, we propose Hierarchical Multi-label Generation (HMG), a novel paradigm that reformulates HEMLC as end-to-end generation of cross-level relevant labels within a given taxonomy. We introduce the first Probabilistic Level Constraint (PLC) mechanism, explicitly controlling the number of generated labels, path length, and hierarchical depthโ€”enabling strong controllability without relying on clustering or other preprocessing steps. Our method jointly leverages taxonomy structural priors and a PLC-guided probabilistic loss, augmented by a taxonomy-aware decoding strategy. Evaluated on standard HEMLC benchmarks, HMG achieves new state-of-the-art performance, improving hierarchical compliance rate by 23.6% over prior methods while demonstrating superior controllability and generation quality.

Technology Category

Application Category

๐Ÿ“ Abstract
Hierarchical Extreme Multi-Label Classification poses greater difficulties compared to traditional multi-label classification because of the intricate hierarchical connections of labels within a domain-specific taxonomy and the substantial number of labels. Some of the prior research endeavors centered on classifying text through several ancillary stages such as the cluster algorithm and multiphase classification. Others made attempts to leverage the assistance of generative methods yet were unable to properly control the output of the generative model. We redefine the task from hierarchical multi-Label classification to Hierarchical Multi-Label Generation (HMG) and employ a generative framework with Probabilistic Level Constraints (PLC) to generate hierarchical labels within a specific taxonomy that have complex hierarchical relationships. The approach we proposed in this paper enables the framework to generate all relevant labels across levels for each document without relying on preliminary operations like clustering. Meanwhile, it can control the model output precisely in terms of count, length, and level aspects. Experiments demonstrate that our approach not only achieves a new SOTA performance in the HMG task, but also has a much better performance in constrained the output of model than previous research work.
Problem

Research questions and friction points this paper is trying to address.

Handling complex hierarchical label relationships in classification
Generating multi-label outputs without preliminary clustering steps
Precisely controlling model output count, length, and level
Innovation

Methods, ideas, or system contributions that make the work stand out.

Generative framework with Probabilistic Level Constraints
Generates hierarchical labels without clustering
Precisely controls output count, length, level
๐Ÿ”Ž Similar Papers
No similar papers found.
Linqing Chen
Linqing Chen
Patsnap
W
Weilei Wang
PatSnap Co., LTD.
W
Wentao Wu
PatSnap Co., LTD.
H
Hanmeng Zhong
PatSnap Co., LTD.