Call for Papers
The MSR Data and Tool Showcase Track aims to actively promote and recognize the creation of reusable datasets and tools that are designed and built not only for a specific research project but for the MSR community as a whole. These datasets and tools should enable other practitioners and researchers to jumpstart their research efforts and allow the reproducibility of earlier work. Data and Tool Showcase papers may describe datasets or tools built by the authors for use by other practitioners or researchers, and/or the use of tools built by others to obtain specific research results.
The MSR’26 Data and Tool Showcase Track will accept two types of submissions:
- Data showcase submissions are expected to include:
  - a description of the data source,
  - a description of the methodology used to gather the data (including provenance and the tool used to create/generate/gather the data, if any),
  - a description of the storage mechanism, including a schema if applicable,
  - if the data has been used by the authors or others, a description of how this was done, including references to previously published papers,
  - a description of the originality of the dataset (that is, even if the dataset has been used in a published paper, its complete description must be unpublished) and of similar existing datasets (if any),
  - ideas for future research questions that could be answered using the dataset,
  - ideas for further improvements that could be made to the dataset, and
  - any limitations and/or challenges in creating or using the dataset.
- Reusable Tool showcase submissions are expected to include:
  - a description of the tool, including its background, motivation, novelty, overall architecture, detailed design, and preliminary evaluation, as well as a link to download or access it,
  - a description of the tool’s design and of how to use it in practice, including clear installation instructions and example datasets that allow the reviewers to run the tool,
  - if the tool has been used by the authors or others, a description of how the tool was used, including references to previously published papers,
  - ideas for future reusability or extensions of the tool, and
  - any limitations or constraints that might exist when using the tool.
The dataset or tool should be made available at the time of submission of the paper for review, but it will be considered confidential until publication of the paper. The dataset or tool should include detailed instructions on how to set up the environment (e.g., by means of a requirements.txt file) and how to use the dataset or tool (e.g., how to import the data or access it once it has been imported, or how to use the tool with a running example).
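For instance, a Python-based tool could pin its dependencies in a minimal requirements.txt so that reviewers can recreate the environment with pip install -r requirements.txt; the package names and versions below are purely illustrative, not a prescribed set:
pandas==2.2.2
requests==2.32.3
GitPython==3.1.43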
At a minimum, upon publication of the paper, the authors should archive the data or tool on a persistent, non-commercial repository that can provide a digital object identifier (DOI), such as osf.io, zenodo.org, figshare.com, or archive.org. In addition, the DOI-based citation of the dataset or the tool should be included in the camera-ready version of the paper. GitHub provides an easy way to make source code citable (via third-party tools such as Zenodo’s GitHub integration, and via a CITATION.cff file).
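As an illustration, a dataset archived on Zenodo could be cited with a BibTeX entry along the following lines (the entry key, authors, title, and DOI are placeholders, not a real record; with biblatex, the dedicated @dataset entry type can be used instead of @misc):
@misc{doe2026dataset,
  author    = {Jane Doe and John Smith},
  title     = {Example MSR Dataset},
  year      = {2026},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.0000000},
  url       = {https://doi.org/10.5281/zenodo.0000000}
}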
Data and Tool Showcase submissions are neither empirical studies, nor datasets based on poorly explained or untrustworthy heuristics for data collection, nor results of the trivial application of generic tools.
If custom tools have been used to create the dataset, the paper should be accompanied by the source code of those tools, along with clear documentation on how to run them to recreate the dataset. The tools should be open source, accompanied by an appropriate open source license; the source code should be citable, i.e., it should refer to a specific release and have a DOI. If you cannot provide the source code, or if this requirement does not apply (e.g., because the dataset consists of qualitative data), please briefly explain why.
Evaluation Criteria
The review criteria for Data and Tool Showcase submissions are as follows:
- Value, usefulness, and reusability of the datasets or tools.
- Quality of the presentation, including clarity and correctness of language.
- Clarity of the relation to related work and relevance to mining software repositories.
- Availability and accessibility of the datasets or tools.
Important Dates
- Abstract Deadline: Wednesday, November 5, 2025
- Paper Deadline: Monday, November 10, 2025
- Acceptance Notification: Monday, January 5, 2026
- Camera Ready Deadline: Monday, January 23, 2026
Awards
The best dataset/tool paper(s) will be awarded a Distinguished Paper Award.
Submission
Submit your paper (maximum 4 pages, plus 1 additional page of references) via the HotCRP submission site: https://msr2026-data-tool.hotcrp.com/.
Submissions will undergo single-anonymous peer review (i.e., authors’ names should be listed on the manuscript, as opposed to the double-anonymous peer review of the main track). This is because of the requirement above to describe how the data or tool has been used in previous studies, including bibliographic references to those studies; such references are likely to disclose the authors’ identity.
To make research datasets and research software accessible and citable, we further expect authors to adhere to the FAIR principles, i.e., data should be Findable, Accessible, Interoperable, and Reusable.
Submissions must strictly conform to the ACM conference proceedings formatting instructions, using the official “ACM Primary Article Template”, which can be obtained from the ACM Proceedings Template page. Alterations of spacing, font sizes, and other deviations from the instructions may result in desk rejection without further review. LaTeX users should use the sigconf option, as well as the review option (to produce line numbers for easy reference by the reviewers). To that end, the following line can be placed at the start of the LaTeX document:
\documentclass[sigconf,review]{acmart}
Papers submitted for consideration must not have been published elsewhere and must not be under review or submitted for review elsewhere for the duration of the review process, i.e., until decision notification. ACM plagiarism policies and procedures shall be followed for cases of double submission. The submission must also comply with the IEEE Policy on Authorship. Please read the ACM Policy on Plagiarism, Misrepresentation, and Falsification and the IEEE Introduction to the Guidelines for Handling Plagiarism Complaints before submitting. Submissions that do not comply with these instructions may be desk rejected without further review.
Upon notification of acceptance, all authors of accepted papers will be asked to complete a copyright form and will receive further instructions for preparing their camera-ready versions. At least one author of each paper must register and present the results at the MSR 2026 conference. All accepted contributions will be published in the conference’s electronic proceedings.