Global vs. local models for cross-project defect prediction

Steffen Herbold, Alexander Trautsch, Jens Grabowski

Abstract

Although researchers have invested significant effort, the performance of defect prediction in a cross-project setting, i.e., with data that does not come from the same project, is still unsatisfactory. A recent proposal for improving defect prediction is the use of local models. With local models, the available data is first clustered into homogeneous regions, and separate classifiers are then trained for each region. Since the main problem of cross-project defect prediction is data heterogeneity, the idea of local models is promising. Therefore, we perform a conceptual replication of the previous studies on local models with a focus on cross-project defect prediction. In a large case study, we evaluate the performance of local models and investigate their advantages and drawbacks for cross-project predictions. To this end, we also compare their performance with that of a global model and of a transfer learning technique designed for cross-project defect prediction. Our findings show that local models make only a minor difference in comparison to global models and transfer learning for cross-project defect prediction. While these results are negative, they provide valuable knowledge about the limitations of local models and increase the validity of previously gained research results.

Keywords:

Defect prediction, Cross-project, Local models

Document Type:

Journal Articles

Publisher:

Springer

Journal:

Empirical Software Engineering

Pages:

1866-1902

Year:

2017

URL:

https://doi.org/10.1007/s10664-016-9468-y

DOI:

10.1007/s10664-016-9468-y

File:

paper_final.pdf
