On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation (pdf)

https://link.springer.com/content/pdf/10.1007%2Fs10664-017-9535-z.pdf

Empir Software Eng (2018) 23:1188–1221
DOI 10.1007/s10664-017-9535-z

On the diffuseness and the impact on maintainability of code smells: a large scale empirical investigation

Fabio Palomba¹ · Gabriele Bavota² · Massimiliano Di Penta³ · Fausto Fasano⁴ · Rocco Oliveto⁴ · Andrea De Lucia⁵

1 Delft University of Technology, Delft, The Netherlands
2 Università della Svizzera italiana (USI), Lugano, Switzerland
3 University of Sannio, Benevento, Italy
4 University of Molise, Campobasso, Italy
5 University of Salerno, Fisciano, Italy

Published online: 7 August 2017
© The Author(s) 2017. This article is an open access publication
Communicated by: Ahmed Hassan

Abstract Code smells are symptoms of poor design and implementation choices that may hinder code comprehensibility and maintainability. Despite the effort devoted by the research community in studying code smells, the extent to which code smells in software systems affect software maintainability remains still unclear. In this paper we present a large scale empirical investigation on the diffuseness of code smells and their impact on code change- and fault-proneness. The study was conducted across a total of 395 releases of 30 open source projects and considering 17,350 manually validated instances of 13 different code smell kinds. The results show that smells characterized by long and/or complex code (e.g., Complex Class) are highly diffused, and that smelly classes have a higher change- and fault-proneness than smell-free classes.

Keywords Code smells · Empirical studies · Mining software repositories

1 Introduction

Bad code smells (also known as “code smells” or “smells”) were defined as symptoms of poor design and implementation choices applied by programmers during the development of a software project (Fowler 1999).
As a form of technical debt (Cunningham 1993), they could hinder the comprehensibility and maintainability of software systems (Kruchten et al. 2012). An example of a code smell is the God Class, a large and complex class that centralizes the behavior of a portion of a system and only uses other classes as data holders. God Classes can rapidly grow out of control, making it harder and harder for developers to understand them, to fix bugs, and to add new features.

The research community has been studying code smells from different perspectives. On the one side, researchers developed methods and tools to detect code smells. Such tools exploit different types of approaches, including metrics-based detection (Lanza and Marinescu 2010; Moha et al. 2010; Marinescu 2004; Munro 2005), graph-based techniques (Tsantalis and Chatzigeorgiou 2009), mining of code changes (Palomba et al. 2015a), textual analysis of source code (Palomba et al. 2016b), or search-based optimization techniques (Kessentini et al. 2010; Sahin et al. 2014).

On the other side, researchers investigated how relevant code smells are for developers (Yamashita and Moonen 2013; Palomba et al. 2014), when and why they are introduced (Tufano et al. 2015), how they evolve over time (Arcoverde et al. 2011; Chatzigeorgiou and Manakos 2010; Lozano et al. 2007; Ratiu et al. 2004; Tufano et al. 2017), and whether they impact software quality properties, such as program comprehensibility (Abbes et al. 2011), fault- and change-proneness (Khomh et al. 2012; Khomh et al. 2009a; D’Ambros et al. 2010), and code maintainability (Yamashita and Moonen 2012, 2013; Deligiannis et al. 2004; Li and Shatnawi 2007; Sjoberg et al. 2013).

Similarly to some previous work (Khomh et al. 2012; Li and Shatnawi 2007; Olbrich et al. 2010; Gatrell and Counsell 2015), this paper investigates the relationship between the occurrence of code smells in software projects and software change- and fault-proneness.
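
To make the God Class definition and the notion of metrics-based detection mentioned above more concrete, the following sketch flags a class that is complex, reads many attributes of other classes, and has low cohesion. It is only an illustration under stated assumptions and not the procedure used in this study, which relies on manually validated smell instances: the `ClassMetrics` record, the function `looks_like_god_class`, the sample classes, and all threshold values are hypothetical and chosen for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClassMetrics:
    """Hypothetical per-class metrics; the fields are illustrative."""
    name: str
    wmc: int     # weighted methods per class (sum of method complexities)
    atfd: int    # number of accesses to attributes of other classes
    tcc: float   # tight class cohesion, a value in [0, 1]


# Illustrative thresholds only (assumptions); a real detector would
# calibrate them on the systems under analysis.
WMC_VERY_HIGH = 47
ATFD_FEW = 5
TCC_ONE_THIRD = 1 / 3


def looks_like_god_class(m: ClassMetrics) -> bool:
    """Return True only when all three symptoms hold at the same time."""
    is_complex = m.wmc >= WMC_VERY_HIGH        # large and complex class
    uses_foreign_data = m.atfd > ATFD_FEW      # other classes act as data holders
    is_not_cohesive = m.tcc < TCC_ONE_THIRD    # methods share little state
    return is_complex and uses_foreign_data and is_not_cohesive


if __name__ == "__main__":
    # Made-up example classes, not data from the study.
    candidates = [
        ClassMetrics("OrderManager", wmc=120, atfd=14, tcc=0.10),
        ClassMetrics("Money", wmc=9, atfd=0, tcc=0.80),
    ]
    for c in candidates:
        print(c.name, looks_like_god_class(c))
```

A rule of this kind only produces candidates, and its output depends heavily on the chosen thresholds.
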
Specifically, while previous work shows a significant correlation between smells and code change/fault-proneness, the empirical evidence provided so far is still limited because of:

– Limited size of previous studies: the study by Khomh et al. (2012) was conducted on four open source systems, while the study by D’Ambros et al. (2010) was performed on seven systems. Furthermore, the studies by Li and Shatnawi (2007), Olbrich et al. (2010), and Gatrell and Counsell (2015) were conducted considering the change history of only one software project.

– Detected smells vs. manually validated smells: previous work studying the impact of code smells on change- and fault-proneness, including the one by Khomh et al. (2012), relied on data obtained from automatic smell detectors. Although such smell detectors are often able to achieve a good level of accuracy, it is still possible that their intrinsic imprecision affects the results of the study.

– Lack of analysis of the magnitude of the observed phenomenon: previous work indicated that some smells can be more harmful than others, but the analysis did not take into account the magnitude of the observed phenomenon. For example, even if a specific smell type may be considered harmful when analyzing its impact on maintainability, this may not be relevant in practice if such a smell type occurs only rarely in software projects.

– Lack of analysis of the magnitude of the effect: previous work indicated that classes affected by code smells have more chances to exhibit defects (or to undergo changes) than other classes. However, no study has observed the magnitude of such changes and defects, i.e., no study addressed the question: how many defects does a class affected by a code smell exhibit on average, compared to a class affected by a different kind of smell or not affected by any smell at all?
– Lack of within-artifact analysis: sometimes, a class has intrinsically a very high change-proneness and/or fault-proneness, e.g., because it plays a core role in the system or because it implements a very complex feature. Hence, the class may be intrinsically “smelly”. Conversely, there may be classes that become smelly during their lifetime because of maintenance activities (Tufano et al. 2017), or classes from which the smell was removed, possibly because of refactoring activities (Bavota et al. 2015). For such classes, it is of paramount importance to analyze the change- and fault-proneness of the class during its evolution, in order to better relate the cause (presence of smell) with the possible effect (change- or fault-proneness).

– Lack of a temporal relation analysis between smell presence and fault introduction: while previous work correlated the presence of code smells with high fault- and change-proneness, one may wonder whether the artifact was smelly when the fault was introduced, or whether the fault was introduced before the class became smelly.

To cope with the aforementione (...truncated)