An Analysis of the Differences between Unit and Integration Tests

Fabian Trautsch


Context: In software testing, several concepts have been established over the years, including unit and integration testing. These concepts are defined in standards and used in software testing certifications, which underlines their importance for research and industry. However, these concepts are decades old, and there is currently no evidence that they still apply to modern software systems. Objective: The purpose of this thesis is to evaluate whether the differences between unit and integration testing are still valid today. To this end, we analyze the defined differences between these test levels to provide evidence on whether they still hold in modern software. Method: We performed quantitative and qualitative analyses of the differences between unit and integration tests. The quantitative analysis was performed via a case study including 27 Java and Python projects with more than 49000 tests. During this analysis, we classified tests into unit and integration tests according to the definitions of the Institute of Electrical and Electronics Engineers (IEEE) and the International Software Testing Qualifications Board (ISTQB) and calculated several metrics for those tests. We then used these metrics to assess three differences between these levels. For the qualitative analysis, we searched for relevant research literature, developer comments, and further information regarding differences between unit and integration tests. We evaluated the resources found to gain an understanding of the research and industrial perspectives on the differences, i.e., whether they exist and to what extent. Results: We found that most projects contain more integration tests than unit tests when tests are classified according to the definitions of the IEEE and ISTQB. However, the exact numbers differ between these definitions. Based on the developers' classification of tests, there is no significant difference in the number of unit and integration tests.
Our quantitative analysis highlights that several of the defined differences no longer exist. We found that the defect types detected by the two test types do not differ and that there are no significant differences in their execution time. However, we confirmed that unit tests are better able to pinpoint the source of a defect. Our qualitative analysis of the research and industrial perspectives shows that both test types are executed automatically, that their test objectives mostly differ, and that practitioners have experienced integration tests as more costly than unit tests. Conclusions: Our results suggest that the current definitions of unit and integration tests are outdated and need to be reconsidered, as most of the differences are vanishing. One reason for this could be technological advancements in software testing and software engineering. However, this needs further investigation.
unit testing; integration testing; empirical software engineering; mining software repositories
Document Type: Ph.D. Theses
2024 © Software Engineering For Distributed Systems Group
