MSR 2026
Mon 13 - Tue 14 April 2026 Rio de Janeiro, Brazil
co-located with ICSE 2026

Understanding the causes of software defects is essential for reliable software maintenance and ecosystem stability. However, existing bug datasets do not distinguish issues originating within a project from those caused by external dependencies or environmental factors. In this paper, we present InEx-Bug, a manually annotated dataset of 377 GitHub issues from 103 NPM repositories, categorizing issues as Intrinsic (internal defect), Extrinsic (dependency/environment issue), Non-bug, or Unknown. Beyond labels, the dataset includes rich temporal and behavioral metadata such as maintainer participation, code changes, and reopening patterns. Analyses show that Intrinsic bugs resolve faster (median 8.9 vs 10.2 days), are closed more often (92% vs 78%), and require code changes more frequently (57% vs 28%) than Extrinsic bugs, while Extrinsic bugs exhibit higher reopen rates (12% vs 4%) and delayed recurrence (median 157 vs 87 days). The dataset provides a foundation for further study of Intrinsic and Extrinsic defects in the NPM ecosystem.