Tracing Stereotypes in Pre-trained Transformers: From Biased Neurons to Fairer Models
This program is tentative and subject to change.
The advent of transformer-based language models has reshaped how AI systems process and generate text. In software engineering (SE), these models now support diverse activities, accelerating automation and decision-making. Yet, evidence shows that these models can reproduce or amplify social biases, raising fairness concerns. Recent work on neuron editing has shown that internal activations in pre-trained transformers can be traced and modified to alter model behavior. Building on the concept of knowledge neurons (neurons that encode factual information), we hypothesize the existence of biased neurons that capture stereotypical associations within pre-trained transformers.
To test this hypothesis, we build a dataset of biased relations, i.e., triplets encoding stereotypes across nine bias types, and adapt neuron attribution strategies to trace and suppress biased neurons in BERT models. We then assess the impact of suppression on SE tasks. Our findings show that biased knowledge is localized within small neuron subsets, and suppressing them substantially reduces bias with minimal performance loss. This demonstrates that bias in transformers can be traced and mitigated at the neuron level, offering an interpretable approach to fairness in SE.
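The trace-then-suppress idea described above can be illustrated with a toy sketch. This is not the authors' method: it uses a hand-written four-neuron feed-forward layer instead of BERT, and a simple ablation-based attribution (score drop when one neuron is zeroed) as a stand-in for the attribution strategies the abstract mentions. All weights and names (W_IN, W_OUT, PROMPT, attribute, top_k) are hypothetical.

```python
# Toy illustration only: a hand-written feed-forward layer, not BERT.
# The weights, names, and ablation-based attribution are all assumptions.


def relu(values):
    return [max(0.0, v) for v in values]


def matvec(matrix, vector):
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def forward(x, w_in, w_out, suppressed=frozenset()):
    """Score of the stereotypical completion; neurons in `suppressed` are zeroed."""
    hidden = relu(matvec(w_in, x))
    hidden = [0.0 if i in suppressed else h for i, h in enumerate(hidden)]
    return sum(w * h for w, h in zip(w_out, hidden))


def attribute(x, w_in, w_out):
    """Ablation attribution: score drop observed when a single neuron is zeroed."""
    base = forward(x, w_in, w_out)
    return [base - forward(x, w_in, w_out, frozenset({i}))
            for i in range(len(w_in))]


def top_k(scores, k):
    """Indices of the k neurons with the largest attribution scores."""
    return sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]


# Hypothetical layer: 3 input features, 4 hidden neurons, 1 output score.
W_IN = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
W_OUT = [0.1, 2.0, 0.2, 0.1]   # neuron 1 dominates the output score
PROMPT = [1.0, 1.0, 1.0]       # stand-in for an encoded biased-relation prompt

scores = attribute(PROMPT, W_IN, W_OUT)
biased = top_k(scores, 1)
before = forward(PROMPT, W_IN, W_OUT)
after = forward(PROMPT, W_IN, W_OUT, frozenset(biased))

print(f"suppressed neurons: {biased}")
print(f"score before: {before:.2f}, after: {after:.2f}")
```

In this toy setting a single neuron carries most of the score, so zeroing it removes most of the stereotypical association while leaving the other neurons untouched. Whether real models behave this way is the empirical question the talk addresses.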
Tue 14 Apr. Displayed time zone: Brasilia, Distrito Federal, Brazil
14:00 - 15:30
14:00 10m Talk | Analyzing GitHub Issues and Pull Requests in nf-core Pipelines: Insights into nf-core Pipeline Repositories | Technical Papers
14:10 10m Talk | Modeling Sampling Workflows for Code Repositories | Technical Papers | Romain Lefeuvre University of Rennes, Maiwenn Le Goasteller University of Rennes, Inria, CNRS, IRISA, Jessie Galasso-Carbonnel McGill University, Benoit Combemale University of Rennes, Inria, CNRS, IRISA, Quentin Perez INSA Rennes, Houari Sahraoui DIRO, Université de Montréal
14:20 10m Talk | Quantifying Competitive Relationships Among Open-Source Software Projects | Technical Papers | Yuki Takei Japan Advanced Institute of Science and Technology, Toshiaki Aoki JAIST, Chaiyong Rakhitwetsagul Mahidol University, Thailand | Pre-print
14:30 10m Talk | Role of CI Adoption in Mobile App Success: An Empirical Study of Open-Source Android Projects | Technical Papers | Xiaoxin Zhou University of Toronto, Taher A. Ghaleb Trent University, Safwat Hassan University of Toronto | Pre-print
14:40 10m Talk | ML in a Box: Analyzing Containerization Practices in Open Source ML Projects | Technical Papers | Faten Jebari Grand Valley State University, Emna Ksontini University of North Carolina Wilmington, Amine Barrak Oakland University, USA, Wael Kessentini DePaul University
14:50 10m Talk | An Empirical Study of Policy as Code: Adoption, Purpose, and Maintenance | Technical Papers | Ruben Opdebeeck Vrije Universiteit Brussel, Mahmoud Alfadel University of Calgary, Akond Rahman Auburn University, Yutaro Kashiwa Nara Institute of Science and Technology, João F. Ferreira Faculty of Engineering, University of Porto & INESC-ID, Raula Gaikovina Kula The University of Osaka, Coen De Roover Vrije Universiteit Brussel | Pre-print
15:00 10m Talk | Tracing Stereotypes in Pre-trained Transformers: From Biased Neurons to Fairer Models | Technical Papers | Gianmario Voria University of Salerno, Moses Openja Polytechnique Montréal, Foutse Khomh Polytechnique Montréal, Gemma Catolino University of Salerno, Fabio Palomba University of Salerno | Pre-print
15:10 5m Industry talk | Can Data Mining Help to Survive the Annual Compiler Upgrade? | Industry Track | Gunnar Kudrjavets Amazon Web Services, USA, Aditya Kumar Google, Piotr Przymus Nicolaus Copernicus University in Toruń, Poland | Pre-print
15:15 5m Talk | Underutilization in Research GPU Clusters: SE Challenges | Industry Track | Krzysztof Kaczmarski Warsaw University of Technology, Jakub Narębski Nicolaus Copernicus University in Toruń, Piotr Przymus Nicolaus Copernicus University in Toruń, Poland