🤖 AI Summary
Problem: Existing evaluations lack comprehensive benchmarks for assessing large language models (LLMs) on low-resource, non-English languages like Telugu, hindering understanding of their real-world interactive capabilities.
Method: We introduce the first Telugu-specific benchmark comprising 20 questions spanning greetings, morphology, vocabulary, daily expressions, task completion, and situational reasoning. Using human annotation, qualitative analysis, and quantitative metrics, we conduct a controlled comparative evaluation of ChatGPT and Gemini.
Contribution/Results: Our analysis reveals significant disparities in multidimensional linguistic competence: Gemini demonstrates superior morphological accuracy and localization adaptability, whereas ChatGPT exhibits greater stability in long-horizon logical reasoning. Neither model is robust at inference in dynamic situations. This work establishes a reproducible evaluation framework grounded in empirical evidence, advancing rigorous, locale-aware assessment of LLMs beyond English.
📝 Abstract
The growing prominence of large language models (LLMs) necessitates the exploration of their capabilities beyond English. This research investigates the Telugu language proficiency of ChatGPT and Gemini, two leading LLMs. Using a purpose-designed set of 20 questions covering greetings, grammar, vocabulary, common phrases, task completion, and situational reasoning, the study examines their strengths and weaknesses in handling Telugu. The analysis aims to identify which LLM demonstrates a deeper understanding of Telugu grammatical structures, possesses a broader vocabulary, and performs better in tasks such as writing and reasoning. By comparing their ability to comprehend and use everyday Telugu expressions, the research sheds light on their suitability for real-world language interaction. Furthermore, the evaluation of adaptability and reasoning capabilities provides insight into how each LLM uses Telugu to respond to dynamic situations. This comparative analysis contributes to the ongoing discussion of multilingual capabilities in AI and paves the way for future research on LLMs that can integrate seamlessly with Telugu-speaking communities.