🤖 AI Summary
Problem: Solution recommendation for service-industry trouble tickets (e.g., in telecom billing systems) faces four key challenges: data drift, feature sparsity, scarce historical resolution annotations, and semantically redundant, overlapping resolutions caused by free-text input.
Method: We propose an end-to-end framework integrating unsupervised and few-shot learning. It couples LDA-based topic modeling with a Siamese network for semantic clustering; employs one-shot learning and index embedding to mitigate annotation scarcity; and deploys a lightweight NLP encoder tailored for short-ticket texts. The system is deployed on Kubernetes for high availability and includes a real-time dashboard.
Results: Evaluated on the open-source Bitext customer-service dataset and proprietary telecom ticket data, our approach achieves significantly higher accuracy than baselines, demonstrating robustness in few-shot and dynamically evolving data scenarios, as well as strong engineering deployability.
📝 Abstract
Resolving incidents or problem tickets is a common task across service industries, including billing and charging systems in the telecom domain. Machine learning can help identify patterns in historical ticket data and suggest resolutions for new tickets. However, this process is complicated by phenomena such as data drift and by issues such as missing data, a lack of data on how past incidents were resolved, and a proliferation of near-duplicate resolutions caused by free-text entry. This paper proposes a robust ML-driven solution employing clustering, supervised learning, and advanced NLP models to tackle these challenges. Building on previous work, we demonstrate clustering-based resolution identification, supervised classification with LDA, Siamese networks, one-shot learning, and index embedding. Additionally, we present a real-time dashboard and a highly available Kubernetes-based production deployment. Our experiments with both the open-source Bitext customer-support dataset and proprietary telecom datasets demonstrate high prediction accuracy.
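To make the idea of clustering-based resolution identification concrete, the following is a minimal sketch and not the authors' implementation. It assumes a plain bag-of-words cosine similarity in place of the LDA and Siamese-network components named in the abstract, a greedy single-pass clustering rule, and a nearest-neighbor lookup for recommendation. The function names, the threshold of 0.5, and the four sample tickets are all hypothetical and chosen only for illustration.

```python
import math
import re
from collections import Counter


def bow(text):
    """Lowercased bag-of-words vector as a Counter (stand-in for a learned encoder)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def cluster_resolutions(resolutions, threshold=0.5):
    """Greedy single-pass clustering of free-text resolutions.

    Each text joins the first cluster whose representative (the cluster's
    first member) is at least `threshold` similar; otherwise it starts a
    new cluster. The threshold value is an illustrative assumption.
    """
    representatives = []
    labels = []
    for text in resolutions:
        vec = bow(text)
        for cluster_id, rep in enumerate(representatives):
            if cosine(vec, rep) >= threshold:
                labels.append(cluster_id)
                break
        else:
            representatives.append(vec)
            labels.append(len(representatives) - 1)
    return labels


def recommend(ticket, history, labels):
    """Return the resolution cluster of the most similar historical ticket."""
    vec = bow(ticket)
    best = max(range(len(history)), key=lambda i: cosine(vec, bow(history[i][0])))
    return labels[best]


# Hypothetical (ticket text, resolution text) pairs, not taken from any dataset.
history = [
    ("Customer charged twice for monthly plan", "Reversed the duplicate charge on the invoice"),
    ("Duplicate charge on invoice for data pack", "Reversed duplicate charge on invoice"),
    ("Cannot log in to billing portal", "Reset the portal password"),
    ("Billing portal login fails after update", "Reset portal password for the user"),
]

labels = cluster_resolutions([resolution for _, resolution in history])
print(labels)
print(recommend("Charged twice on my invoice", history, labels))
```

In this sketch the differently worded resolutions collapse into two clusters, so a new ticket is mapped to a canonical resolution group rather than to one of many similar-sounding free-text strings. A production system would replace `bow` with learned embeddings and would need a strategy for re-clustering as the data drifts.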