🤖 AI Summary
Problem: Generating high-quality, context-aware scientific ideas during early-stage discovery remains challenging. Method: This paper proposes SCI-IDEA, a multi-round iterative framework for research idea generation. It integrates token-level and sentence-level embeddings and introduces an “Aha Moment” detection mechanism. A four-dimensional automated evaluation system assesses novelty, excitement, feasibility, and effectiveness, and is complemented by ethical constraints and human-AI collaboration protocols. Technically, the framework builds on large language models (e.g., GPT-4o, GPT-4.5, DeepSeek-32B, and DeepSeek-70B) and combines few-shot prompting, chain-of-thought reasoning, and multi-granularity semantic modeling. Contribution/Results: Experiments show that the framework achieves scores averaging about 6.86 (on a 1–10 scale) across the four core metrics, significantly outperforming baseline methods and supporting its effectiveness in generating high-quality, actionable scientific ideas.
📝 Abstract
Every scientific discovery starts with an idea inspired by prior work, interdisciplinary concepts, and emerging challenges. Recent advancements in large language models (LLMs) trained on scientific corpora have driven interest in AI-supported idea generation. However, generating context-aware, high-quality, and innovative ideas remains challenging. We introduce SCI-IDEA, a framework that uses LLM prompting strategies and Aha Moment detection for iterative idea refinement. SCI-IDEA extracts essential facets from research publications, assessing generated ideas on novelty, excitement, feasibility, and effectiveness. Comprehensive experiments validate SCI-IDEA's effectiveness, achieving average scores of 6.84, 6.86, 6.89, and 6.84 (on a 1-10 scale) across novelty, excitement, feasibility, and effectiveness, respectively. Evaluations employed GPT-4o, GPT-4.5, DeepSeek-32B (each under 2-shot prompting), and DeepSeek-70B (3-shot prompting), with token-level embeddings used for Aha Moment detection. Similarly, it achieves scores of 6.87, 6.86, 6.83, and 6.87 using GPT-4o under 5-shot prompting, GPT-4.5 under 3-shot prompting, DeepSeek-32B under zero-shot chain-of-thought prompting, and DeepSeek-70B under 5-shot prompting with sentence-level embeddings. We also address ethical considerations such as intellectual credit, potential misuse, and balancing human creativity with AI-driven ideation. Our results highlight SCI-IDEA's potential to facilitate the structured and flexible exploration of context-aware scientific ideas, supporting innovation while maintaining ethical standards.
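As a quick arithmetic check on the figures in the abstract, the minimal Python sketch below averages the four scores reported for each embedding setting. Only the numbers come from the abstract; the variable names are illustrative assumptions, and the sketch says nothing about how SCI-IDEA itself computes or aggregates scores.

```python
from statistics import mean

# Scores copied from the abstract (1-10 scale).
# Variable names are illustrative, not taken from the paper.
token_level_scores = [6.84, 6.86, 6.89, 6.84]      # token-level Aha Moment detection
sentence_level_scores = [6.87, 6.86, 6.83, 6.87]   # sentence-level embeddings

# Simple unweighted mean of the four reported values in each setting.
token_level_mean = mean(token_level_scores)
sentence_level_mean = mean(sentence_level_scores)

print(f"token-level mean:    {token_level_mean:.4f}")
print(f"sentence-level mean: {sentence_level_mean:.4f}")
```

Both settings sum to 27.43 over four values, so the unweighted means coincide at roughly 6.86, which suggests that the choice between token-level and sentence-level embeddings makes little difference to the headline average.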