A Blueprint for Trustworthy Code Annotation at Scale: An LLM-Powered Pipeline for Industrial Software Analytics
In modern software ecosystems like Kotlin/Android, data-driven quality assurance is often stalled by a critical bottleneck: the scarcity of high-quality, expertly labeled datasets. Manual annotation is economically unviable and does not scale. This presentation introduces a validated, production-grade blueprint for automating code annotation, transforming noisy commit histories into high-confidence data for software analytics.
We present a practical two-stage pipeline acting as a "classifier-as-a-service." First, we employ MSR-driven mining to filter candidate pools from over 75,000 industrial commits. Second, we apply an ensemble of large language models (each exceeding 20 billion parameters) acting as a virtual expert panel. By leveraging Chain-of-Thought prompting and enforcing consensus logic, our approach mitigates the hallucinations and biases of individual models.
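The consensus logic of the virtual expert panel can be sketched as a simple voting rule: a commit receives a label only when enough of the ensemble's models agree, and disagreements are routed elsewhere (e.g., to human review). The function name, labels, and threshold below are illustrative assumptions, not the pipeline's actual implementation.

```python
from collections import Counter

def consensus_label(model_labels, min_agreement=1.0):
    """Return the panel's label only when agreement meets the threshold.

    model_labels: labels independently emitted by each LLM in the ensemble
    min_agreement: fraction of models that must agree (1.0 = unanimous)
    """
    if not model_labels:
        return None
    # Take the most common label and check whether it clears the bar.
    label, votes = Counter(model_labels).most_common(1)[0]
    if votes / len(model_labels) >= min_agreement:
        return label
    return None  # no consensus: abstain rather than emit a noisy label

# A unanimous panel keeps the label; a split panel abstains.
print(consensus_label(["bug-fix", "bug-fix", "bug-fix"]))  # bug-fix
print(consensus_label(["bug-fix", "refactor", "bug-fix"]))  # None
```

Requiring unanimity (the default here) trades recall for precision, which matters when the labels feed downstream analytics; relaxing `min_agreement` to a majority threshold recovers more commits at the cost of noisier annotations.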
Key contributions include:
- Empirical Reliability: Validation demonstrating 86.84% unanimous agreement between our LLM consensus and senior human experts.
- Actionable Guidelines: Identification of a 20B-parameter performance threshold necessary for nuanced code analysis.
- Reusable Methodology: A strategic engineering asset that organizations can adapt to build their own scalable software analytics capabilities.
This session provides MSR attendees with a reliable path to overcome the data bottleneck, moving beyond theoretical discussions to deliver a deployable solution for industrial challenges.