🤖 AI Summary
Empirical code review research is hindered by high technical barriers to cross-platform (e.g., GitHub/GitLab) data acquisition and analysis, heavy reliance on custom scripts, and poor reproducibility. To address these challenges, we propose the first LLM-integrated code review mining framework. It enables natural-language interactive querying, automatic API endpoint discovery, multi-source authentication management, and joint parsing of structured and unstructured review artifacts, including comments, patches, and metadata. The framework supports both quantitative statistics and qualitative analysis in a single pipeline, substantially reducing the need for manual script development. We implement and evaluate a prototype system across multiple platforms, demonstrating its feasibility for efficient, low-barrier empirical software engineering studies. Results show improved reproducibility and broader accessibility of code review research.
📝 Abstract
Empirical research on code review processes is increasingly central to understanding software quality and collaboration. However, collecting and analyzing review data remains a time-consuming and technically demanding task. Most researchers follow similar workflows: writing ad hoc scripts to extract, filter, and analyze review data from platforms such as GitHub and GitLab. This paper introduces RevMine, a conceptual tool that streamlines the entire code review mining pipeline using large language models (LLMs). RevMine guides users through authentication, endpoint discovery, and natural-language data collection, significantly reducing the need for manual scripting. After retrieving review data, it supports both quantitative and qualitative analysis based on user-defined filters or LLM-inferred patterns. This poster outlines the tool's architecture, use cases, and research potential. By lowering the barrier to entry, RevMine aims to democratize code review mining and enable a broader range of empirical software engineering studies.
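For illustration, the kind of ad hoc filter-and-count script that such workflows typically require, and that RevMine would generate from a natural-language query, can be sketched as follows. This is a hypothetical minimal example: the record fields and sample data are invented, modeled loosely on the shape of review-comment payloads from forge APIs such as GitHub's, and the fetching/authentication step is omitted.

```python
from collections import Counter

# Hypothetical review-comment records, loosely mimicking fields of a
# pull-request review-comments API response (author, body, file path).
COMMENTS = [
    {"author": "alice", "body": "Please add a unit test.", "path": "src/app.py"},
    {"author": "bob",   "body": "LGTM",                    "path": "src/app.py"},
    {"author": "alice", "body": "Nit: rename this var.",   "path": "docs/readme.md"},
]

def filter_comments(comments, keyword):
    """User-defined filter: keep comments whose body mentions the keyword
    (case-insensitive) -- the qualitative slicing step."""
    return [c for c in comments if keyword.lower() in c["body"].lower()]

def comments_per_author(comments):
    """Quantitative summary: number of review comments per author."""
    return Counter(c["author"] for c in comments)

print(comments_per_author(COMMENTS))        # per-author comment counts
print(filter_comments(COMMENTS, "test"))    # comments mentioning "test"
```

A researcher today writes and re-writes variations of this boilerplate per study and per platform; the point of an LLM-driven pipeline is to produce such filters and aggregations from a natural-language request instead.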