Beyond Text-to-Text: An Overview of Multimodal and Generative Artificial Intelligence for Education Using Topic Modeling

📅 2024-09-24

🏛️ arXiv.org

📈 Citations: 2

✨ Influential: 0

🤖 AI Summary

Current educational research on generative AI is heavily skewed toward text-to-text models (e.g., ChatGPT), while multimodal applications, such as text-to-image and text-to-audio, are markedly underexplored. Method: Drawing on 4,175 publications from the Dimensions database, this study employs a dual topic-modeling approach that combines Latent Dirichlet Allocation (LDA) and Non-negative Matrix Factorization (NMF), together with systematic text preprocessing and human-validated interpretability checks, to identify 38 fine-grained topics grouped into 14 overarching thematic areas. Results: Text-to-text research constitutes over 68% of the corpus, whereas multimodal educational applications account for less than 5%. The study quantifies pronounced imbalances across modalities and educational levels (K–12 through higher education) and proposes the first empirically grounded, multimodal, and education-level-inclusive research framework, offering actionable evidence for policy formulation and AI development in education.
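The summary names NMF as one of the two topic models but does not describe how it was run. As a rough illustration of the general technique only, the dependency-free Python sketch below factors a tiny document-term count matrix with the standard multiplicative-update rules and lists the highest-weighted terms per topic. The toy corpus, the helper names (`doc_term_matrix`, `nmf`, `top_terms`), the choice of two topics, and the iteration count are all assumptions made for this example; none of them come from the study.

```python
import random
from collections import Counter

# Toy corpus, invented for illustration (not drawn from the study's data).
DOCS = [
    "chatbot essay writing feedback students",
    "essay feedback chatbot writing assessment",
    "image generation art classroom design",
    "art image design generation creativity",
]

def doc_term_matrix(docs):
    """Return (vocabulary, count matrix) with one row per document."""
    vocab = sorted({w for d in docs for w in d.split()})
    rows = []
    for d in docs:
        counts = Counter(d.split())
        rows.append([float(counts[w]) for w in vocab])
    return vocab, rows

def matmul(a, b):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Approximate V (docs x terms) as W (docs x k) times H (k x terms).

    Uses multiplicative updates for the squared-error objective, which keep
    every entry of W and H non-negative when the starting values are positive.
    """
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # H <- H * (W^T V) / (W^T W H), element-wise
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)] for i in range(k)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)] for i in range(n)]
    return w, h

def top_terms(h, vocab, n=3):
    """For each topic (row of H), return the n terms with the largest weights."""
    return [[vocab[j] for j in sorted(range(len(vocab)), key=lambda j: -row[j])[:n]]
            for row in h]

vocab, V = doc_term_matrix(DOCS)
W, H = nmf(V, k=2)
topics = top_terms(H, vocab)
for i, terms in enumerate(topics):
    print(f"topic {i}: {', '.join(terms)}")
```

In practice a corpus of several thousand abstracts would more likely go through a library implementation with TF-IDF weighting (scikit-learn, for instance, provides `NMF` and `LatentDirichletAllocation` classes), and the number of topics would be chosen by inspection or coherence scores rather than fixed in advance.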

📝 Abstract

Generative artificial intelligence (GenAI) can reshape education and learning. While large language models (LLMs) like ChatGPT dominate current educational research, multimodal capabilities, such as text-to-speech and text-to-image, are less explored. This study uses topic modeling to map the research landscape of multimodal and generative AI in education. An extensive literature search using Dimensions yielded 4175 articles. Employing a topic modeling approach, latent topics were extracted, resulting in 38 interpretable topics organized into 14 thematic areas. Findings indicate a predominant focus on text-to-text models in educational contexts, with other modalities underexplored, overlooking the broader potential of multimodal approaches. The results suggest a research gap, stressing the importance of more balanced attention across different AI modalities and educational levels. In summary, this research provides an overview of current trends in generative AI for education, underlining opportunities for future exploration of multimodal technologies to fully realize the transformative potential of artificial intelligence in education.

Problem

Research questions and friction points this paper is trying to address.

Exploring multimodal GenAI's understudied role in education

Mapping research gaps in non-text AI education applications

Identifying imbalance in educational AI modality focus

Innovation

Methods, ideas, or system contributions that make the work stand out.

Topic modeling maps multimodal AI research

Analyzes 4175 articles for latent topics

Highlights underexplored non-text AI modalities
