An Empirical Study on Line-Level Software Defect Prediction
This program is tentative and subject to change.
Line-level software defect prediction (LLDP) is crucial for locating and modifying defective code and has drawn increasing attention from both academic and industrial communities. Recently, a series of deep learning-based LLDP models has been proposed, but no systematic comparison of these models exists. In this paper, we compare the performance of these models, evaluate their consistency across scenarios, and investigate whether fusing handcrafted features can improve prediction performance. To this end, we conducted a comprehensive empirical study on eight recent state-of-the-art models using 32 benchmark datasets drawn from releases of nine software projects. We evaluated the models in both cross-version and cross-project scenarios with four widely used performance metrics: AUC, Recall@Top20%LOC, Effort@Top20%Recall, and IFA. Moreover, we investigated the impact of handcrafted features on LLDP models using two different fusion strategies. The results show that 1) the differences among these models are statistically significant; 2) no model is superior on every performance metric, but Bugsplorer generally performs best on all metrics except IFA; 3) model performance rankings are inconsistent between cross-version and cross-project scenarios, revealing that a model's effectiveness is highly scenario-dependent; 4) fusing handcrafted features significantly improves prediction performance, and the choice of fusion strategy also matters. In conclusion, the selection of an LLDP model should be guided by the specific scenario and performance metrics, and a hybrid model that combines deep learning representations with handcrafted features is a promising alternative for LLDP.
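The four metrics named in the abstract can be illustrated with a minimal sketch. It assumes the definitions commonly used in the line-level defect prediction literature (lines ranked by descending predicted score; IFA counted as the number of clean lines inspected before the first defective one); the paper's exact definitions may differ. All function names below are hypothetical and chosen for illustration only.

```python
import math
from typing import List, Sequence


def rank_labels(scores: Sequence[float], labels: Sequence[int]) -> List[int]:
    """Return the labels ordered by descending predicted defect score."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    return [labels[i] for i in order]


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Pairwise AUC: share of (defective, clean) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one defective and one clean line")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def recall_at_top_loc(scores: Sequence[float], labels: Sequence[int], frac: float = 0.2) -> float:
    """Share of defective lines found within the top `frac` of ranked lines."""
    ranked = rank_labels(scores, labels)
    total_defective = sum(ranked)
    if total_defective == 0:
        return 0.0
    k = int(len(ranked) * frac)
    return sum(ranked[:k]) / total_defective


def effort_at_top_recall(scores: Sequence[float], labels: Sequence[int], frac: float = 0.2) -> float:
    """Share of lines that must be inspected to find `frac` of the defective lines."""
    ranked = rank_labels(scores, labels)
    total_defective = sum(ranked)
    if total_defective == 0:
        return 0.0
    target = math.ceil(total_defective * frac)
    found = 0
    for inspected, label in enumerate(ranked, start=1):
        found += label
        if found >= target:
            return inspected / len(ranked)
    return 1.0


def ifa(scores: Sequence[float], labels: Sequence[int]) -> int:
    """Initial False Alarm: clean lines inspected before the first defective line."""
    ranked = rank_labels(scores, labels)
    for position, label in enumerate(ranked):
        if label == 1:
            return position
    return len(ranked)


# Toy file with 10 lines; label 1 marks a defective line (made-up data).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(auc(scores, labels))                   # 0.75
print(recall_at_top_loc(scores, labels))     # 0.5
print(effort_at_top_recall(scores, labels))  # 0.2
print(ifa(scores, labels))                   # 1
```

Higher is better for AUC and Recall@Top20%LOC, while lower is better for Effort@Top20%Recall and IFA, which is one reason a model can rank first on some metrics and not on others.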
Mon 13 Apr
Displayed time zone: Brasilia, Distrito Federal, Brazil
11:00 - 12:30

| Time | Duration | Type | Title | Track | Authors |
| --- | --- | --- | --- | --- | --- |
| 11:00 | 10m | Research paper | Where Do Smart Contract Security Analyzers Fall Short? | Technical Papers | |
| 11:10 | 10m | Talk | An Empirical Study of Vulnerabilities in Python Packages and Their Detection | Technical Papers | Haowei Quan (Monash University), Junjie Wang (Tianjin University), Xinzhe Li (College of Intelligence and Computing, Tianjin University), Terry Yue Zhuo (Monash University and CSIRO's Data61), Xiao Chen (University of Newcastle), Xiaoning Du (Monash University) |
| 11:20 | 10m | Talk | Does Programming Language Matter? An Empirical Study of Fuzzing Bug Detection | Technical Papers | Tatsuya Shirai (Nara Institute of Science and Technology), Olivier Nourry (The University of Osaka), Yutaro Kashiwa (Nara Institute of Science and Technology), Kenji Fujiwara (Nara Women’s University), Hajimu Iida (Nara Institute of Science and Technology) |
| 11:30 | 10m | Talk | An Empirical Study on Line-Level Software Defect Prediction | Technical Papers | Enci Zhang (Beijing Jiaotong University), Yutong Jiang (Beijing Jiaotong University), Tianmeng Zhang (Beijing Jiaotong University), Haonan Tong (Beijing Jiaotong University) |
| 11:40 | 10m | Talk | Characterizing and Modeling the GitHub Security Advisories Review Pipeline | Technical Papers | Claudio Segal (UFF), Paulo Segal (UFF), Carlos Eduardo de Schuller Banjar (UFRJ), Felipe Paixão (Federal University of Bahia (UFBA)), Hudson Silva Borges (UFMS), Paulo Silveira Neto (Federal University Rural of Pernambuco), Eduardo Santana de Almeida (Federal University of Bahia), Joanna C. S. Santos (University of Notre Dame), Anton Kocheturov (Siemens Technology), Gaurav Kumar Srivastava (Siemens), Daniel Sadoc Menasche (UFRJ, Brazil) |
| 11:50 | 10m | Talk | Linux Kernel Recency Matters, CVE Severity Doesn’t, and History Fades | Technical Papers | Piotr Przymus (Nicolaus Copernicus University in Toruń, Poland), Witold Weiner (Nicolaus Copernicus University in Toruń and Adtran Networks Sp. z o.o), Krzysztof Rykaczewski (Nicolaus Copernicus University in Toruń, Poland), Gunnar Kudrjavets (Amazon Web Services, USA) |
| 12:00 | 10m | Talk | Beyond Single Code Changes: An Empirical Study of Topic-Based Code Review Practices in Gerrit for OpenStack | Technical Papers | Moataz Chouchen (Concordia University), Mahi Begoug (ETS Montreal), Ali Ouni (Ecole de Technologie Superieure (ETS)) |
| 12:10 | 10m | Talk | LogSieve: Task-Aware CI Log Reduction for Sustainable LLM-Based Analysis | Technical Papers | Marcus Barnes (University of Toronto), Taher A. Ghaleb (Trent University), Safwat Hassan (University of Toronto) |
| 12:20 | 5m | Talk | Finding Important Stack Frames in Large Systems | Industry Track | Aleksandr Khvorov (JetBrains; Constructor University Bremen), Yaroslav Golubev (JetBrains Research), Denis Sushentsev (JetBrains) |
| 12:25 | 5m | Talk | Stop Comparing Apples and Oranges: Matching for Better Results in Mining Software Repositories Studies | Technical Papers | Sabato Nocera (University of Salerno), Nyyti Saarimäki (University of Luxembourg), Valentina Lenarduzzi (University of Southern Denmark), Davide Taibi (University of Southern Denmark and University of Oulu), Sira Vegas (Universidad Politecnica de Madrid) |