MSR 2026
Mon 13 - Tue 14 April 2026 Rio de Janeiro, Brazil
co-located with ICSE 2026

This program is tentative and subject to change.

Tue 14 Apr 2026 14:00 - 14:10 at Oceania V - Session 2-A: Ecosystems & Methods

Scientific Workflow Management Systems (SWfMSs) such as Nextflow have become essential software frameworks for conducting reproducible, scalable, and portable computational analyses in data-intensive fields like genomics, transcriptomics, and proteomics. Building on Nextflow, the nf-core community curates standardized, peer-reviewed pipelines that follow strict testing, documentation, and governance guidelines. Despite nf-core's broad adoption, little is known about the challenges users face during the development and maintenance of these pipelines. This paper presents an empirical study of 25,173 issues and pull requests from these pipelines to uncover recurring challenges, management practices, and perceived difficulties. Using BERTopic modeling, we identify 13 key challenges, including pipeline development and integration, bug fixing, integrating genomic data, managing CI configurations, and handling version updates. We then examine issue resolution dynamics, showing that 89.38% of issues and pull requests are eventually closed, with half resolved within three days. Statistical analysis reveals that the presence of labels (large effect, $\delta$ = 0.94) and of code snippets (medium effect, $\delta$ = 0.50) significantly improves resolution likelihood. Further analysis reveals that tool development and repository maintenance pose the most significant challenges, followed by testing pipelines and CI configurations, and debugging containerized pipelines. Overall, this study provides actionable insights into the collaborative development and maintenance of nf-core pipelines, highlighting opportunities to enhance their usability, sustainability, and reproducibility.

Tue 14 Apr

Displayed time zone: Brasilia, Distrito Federal, Brazil

14:00 - 15:30
14:00
10m
Talk
Analyzing GitHub Issues and Pull Requests in nf-core Pipelines: Insights into nf-core Pipeline Repositories
Technical Papers
Khairul Alam University of Saskatchewan, Banani Roy University of Saskatchewan
14:10
10m
Talk
Modeling Sampling Workflows for Code Repositories
Technical Papers
Romain Lefeuvre University of Rennes, Maiwenn Le Goasteller University of Rennes, Inria, CNRS, IRISA, Jessie Galasso-Carbonnel McGill University, Benoit Combemale Inria, Univ Rennes, CNRS, IRISA, Quentin Perez INSA Rennes, Houari Sahraoui DIRO, Université de Montréal
Pre-print
14:20
10m
Talk
Quantifying Competitive Relationships Among Open-Source Software Projects
Technical Papers
Yuki Takei Japan Advanced Institute of Science and Technology, Toshiaki Aoki JAIST, Chaiyong Ragkhitwetsagul Mahidol University, Thailand
Pre-print
14:30
10m
Talk
Role of CI Adoption in Mobile App Success: An Empirical Study of Open-Source Android Projects
Technical Papers
Xiaoxin Zhou University of Toronto, Taher A. Ghaleb Trent University, Safwat Hassan University of Toronto
Pre-print
14:40
10m
Talk
ML in a Box: Analyzing Containerization Practices in Open Source ML Projects
Technical Papers
Faten Jebari Grand Valley State University, Emna Ksontini University of North Carolina Wilmington, Amine Barrak Oakland University, USA, Wael Kessentini DePaul University
14:50
10m
Talk
An Empirical Study of Policy as Code: Adoption, Purpose, and Maintenance
Technical Papers
Ruben Opdebeeck Vrije Universiteit Brussel, Mahmoud Alfadel University of Calgary, Akond Rahman Auburn University, Yutaro Kashiwa Nara Institute of Science and Technology, João F. Ferreira Faculty of Engineering, University of Porto & INESC-ID, Raula Gaikovina Kula The University of Osaka, Coen De Roover Vrije Universiteit Brussel
Pre-print
15:00
10m
Talk
Tracing Stereotypes in Pre-trained Transformers: From Biased Neurons to Fairer Models
Technical Papers
Gianmario Voria University of Salerno, Moses Openja Polytechnique Montréal, Foutse Khomh Polytechnique Montréal, Gemma Catolino University of Salerno, Fabio Palomba University of Salerno
Pre-print
15:10
5m
Industry talk
Can Data Mining Help to Survive the Annual Compiler Upgrade?
Industry Track
Gunnar Kudrjavets Amazon Web Services, USA, Aditya Kumar Google, Piotr Przymus Nicolaus Copernicus University in Toruń, Poland
Pre-print
15:15
5m
Talk
Underutilization in Research GPU Clusters: SE Challenges
Industry Track
Krzysztof Kaczmarski Warsaw University of Technology, Jakub Narębski Nicolaus Copernicus University in Toruń, Piotr Przymus Nicolaus Copernicus University in Toruń, Poland
15:20
10m
Talk
LILA: Decentralized Build Reproducibility Monitoring for the Functional Package Management Model
Data and Tool Showcase Track
Julien Malka LTCI, Télécom Paris, Institut Polytechnique de Paris, France, Arnout Engelen Independent