Empirical Software Engineering

http://link.springer.com/journal/10664

List of Papers (Total 54)

Managing the requirements flow from strategy to release in large-scale agile development: a case study at Ericsson

In a large organization, informal communication and simple backlogs are not sufficient for the management of requirements and development work. Many large organizations are struggling to successfully adopt agile methods, but there is still little scientific knowledge on requirements management in large-scale agile development organizations. We present an in-depth study of an ...

On negative results when using sentiment analysis tools for software engineering research

Recent years have seen increasing attention to social aspects of software engineering, including studies of emotions and sentiments experienced and expressed by the software developers. Most of these studies reuse existing sentiment analysis tools such as SentiStrength and NLTK. However, these tools have been trained on product reviews and movie reviews and, therefore, their ...

The last line effect explained

Micro-clones are tiny duplicated pieces of code; they typically comprise only a few statements or lines. In this paper, we study the “Last Line Effect,” the phenomenon that the last line or statement in a micro-clone is much more likely to contain an error than the previous lines or statements. We do this by analyzing 219 open source projects and reporting on 263 faulty micro-clones ...

An initial analysis of software engineers’ attitudes towards organizational change

Employees’ attitudes towards organizational change are a critical determinant in the change process. Researchers have therefore tried to determine which underlying concepts affect them. These extensive efforts have resulted in the identification of several antecedents. However, no studies have been conducted in a software engineering context and the research has provided little ...

Recurring opinions or productive improvements—what agile teams actually discuss in retrospectives

Team-level retrospectives are widely used in agile and lean software development, yet little is known about what is actually discussed during retrospectives or their outcomes. In this paper, we synthesise the outcomes of sprint retrospectives in a large, distributed, agile software development organisation. This longitudinal case study analyses data from 37 team-level ...

Global vs. local models for cross-project defect prediction

Although researchers have invested significant effort, the performance of defect prediction in a cross-project setting, i.e., with data that does not come from the same project, is still unsatisfactory. A recent proposal for the improvement of defect prediction is using local models. With local models, the available data is first clustered into homogeneous regions and afterwards ...

fine-GRAPE: fine-grained APi usage extractor – an approach and dataset to investigate API usage

An Application Programming Interface (API) provides a set of functionalities to a developer with the aim of enabling reuse. APIs have been investigated from different angles, such as popularity, usage, and evolution, to get a better understanding of their various characteristics. For such studies, software repositories are mined for API usage examples. However, many of the mining ...

Robust Statistical Methods for Empirical Software Engineering

There have been many changes in statistical theory in the past 30 years, including increased evidence that non-robust methods may fail to detect important results. The statistical advice available to software engineering researchers needs to be updated to address these issues. This paper aims both to explain the new results in the area of robust analysis methods and to provide a ...

An experimental search-based approach to cohesion metric evaluation

In spite of several decades of software metrics research and practice, there is little understanding of how software metrics relate to one another, nor is there any established methodology for comparing them. We propose a novel experimental technique, based on search-based refactoring, to ‘animate’ metrics and observe their behaviour in a practical setting. Our aim is to promote ...

A detailed investigation of the effectiveness of whole test suite generation

A common application of search-based software testing is to generate test cases for all goals defined by a coverage criterion (e.g., lines, branches, mutants). Rather than generating one test case at a time for each of these goals individually, whole test suite generation optimizes entire test suites towards satisfying all goals at the same time. There is evidence that the overall ...

On the detection of custom memory allocators in C binaries

Many reverse engineering techniques for data structures rely on the knowledge of memory allocation routines. Typically, they interpose on the system’s malloc and free functions, and track each chunk of memory thus allocated as a data structure. However, many performance-critical applications implement their own custom memory allocators. Examples include webservers, database ...

Scalable data structure detection and classification for C/C++ binaries

Many existing techniques for reversing data structures in C/C++ binaries are limited to low-level programming constructs, such as individual variables or structs. Unfortunately, without detailed information about a program's pointer structures, forensics and reverse engineering are exceedingly hard. To fill this gap, we propose MemPick, a tool that detects and classifies ...

HAZOP-based identification of events in use cases

Completeness is one of the main quality attributes of requirements specifications. If functional requirements are expressed as use cases, one can be interested in event completeness. A use case is event complete if it contains a description of all the events that can happen when executing the use case. Missing events in any use case can lead to higher project costs. Thus, the ...

Classification model for code clones based on machine learning

Results from code clone detectors may contain many useless code clones, but whether a given code clone is useful varies from user to user, depending on the user’s purpose for the clone. In this research, we propose a classification model that applies machine learning to the judgments of each individual user regarding the code clones. To evaluate the proposed model, 32 ...