Confirmation Bias as a Human Aspect in Software Engineering

Dr. Gul Calikli, Data Science Lab, Department of Mechanical and Industrial Engineering, Ryerson University, Canada

Tuesday Mar. 26th, 12-1:30pm
Lassonde Building 3033 – LAS3033

Data mining methods are used in empirical software engineering research to predict, diagnose, and plan for various tasks during the software development process. Such prediction models enhance managerial decision-making. So far, these techniques have relied on product- and process-related metrics to build predictive models.

Aims: Software is designed, implemented and tested by people. Therefore, it is important to gain insight into people’s thought processes and their problem-solving skills in order to improve software quality. While solving problems during any phase of the Software Development Life Cycle (SDLC), software engineers employ heuristics. These heuristics may result in “cognitive biases”, which are defined as patterned deviations of human thought from the laws of logic and mathematics. In this research, we focused on a specific cognitive bias called “confirmation bias”, which is defined as the tendency of people to seek evidence that verifies a hypothesis rather than evidence that falsifies it.

Method: We defined a methodology to measure the confirmation biases of software engineers, drawing on established theories from the cognitive psychology literature. The result is a “confirmation bias metrics set”.

Results: Our empirical results demonstrated that developers’ confirmation biases have a significant impact on the defect proneness of software. Using developers’ confirmation bias metric values as input, we built learning-based models to predict defective parts of software, in addition to models learned from static code and churn metrics. The performance of defect prediction models built using only confirmation bias metrics was found to be comparable to that of models using static code and/or churn metrics. Using confirmation bias metrics, we also built models to predict the defect rates of developer groups that did not contribute to past or current releases of a software product but are likely to appear in upcoming releases. In the long run, an enhanced form of our models may guide project managers in task assignment (i.e., which developers should touch the same source code files and which developers should not, so that software defect rates are minimized).

Conclusions: We believe that the next generation of empirical research in software engineering will bring more value to practice through a better understanding of developer characteristics. Tool support is also necessary to measure, store and analyze such characteristics.