An experimental evaluation of imbalanced learning and time-series validation in the context of CI/CD prediction
Liu, Bohan, Zhang, He, Yang, Lanxin, Dong, Liming, Shen, Haifeng and Song, Kaiwen. (2020) An experimental evaluation of imbalanced learning and time-series validation in the context of CI/CD prediction. EASE 2020, April 15-17, 2020, Trondheim, Norway. Norway: Association for Computing Machinery. pp. 21 - 30 https://doi.org/10.1145/3383219.3383222
|Authors||Liu, Bohan, Zhang, He, Yang, Lanxin, Dong, Liming, Shen, Haifeng and Song, Kaiwen|
Background: Machine Learning (ML) has been widely used as a powerful tool to support Software Engineering (SE). The fundamental assumptions about data characteristics required by specific ML methods have to be carefully considered before those methods are applied in SE. Within the context of Continuous Integration (CI) and Continuous Deployment (CD) practices, there are two vital data characteristics whose associated assumptions are prone to be violated in SE research. First, the logs generated during CI/CD and used for training are imbalanced data, which is contrary to the principles of common balanced classifiers; second, these logs are also time-series data, which violates the assumption of cross-validation. Objective: We aim to systematically study the two data characteristics and further provide a comprehensive evaluation for predictive CI/CD with data from real projects. Method: We conduct an experimental study that evaluates 67 CI/CD predictive models using both cross-validation and time-series-validation. Results: Our evaluation shows that cross-validation makes the evaluation of the models optimistic in most cases, although there are a few counter-examples as well. The performance of the top 10 imbalanced models is better than that of the balanced models in the prediction of failed builds, even for balanced data. The degree of data imbalance has a negative impact on prediction performance. Conclusion: In research and practice, the assumptions of the various ML methods should be seriously considered for the validity of research. Even if it is used only to compare the relative performance of models, cross-validation may not be applicable to problems with time-series features. The research community needs to revisit the evaluation results reported in some existing research.
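The difference between the two validation schemes named in the abstract can be sketched in a few lines. This is a minimal illustration, not the paper's experimental protocol: the function names (`k_fold_splits`, `forward_chaining_splits`, `leaks_future`), the fold sizes, and the 12-build example are all assumptions made here for demonstration. Builds are assumed to be indexed in chronological order.

```python
from typing import Iterator, List, Tuple

Split = Tuple[List[int], List[int]]  # (train indices, test indices)


def k_fold_splits(n: int, k: int) -> Iterator[Split]:
    """Plain k-fold: each contiguous block is tested once, all other builds train."""
    fold = n // k
    for i in range(k):
        start = i * fold
        stop = n if i == k - 1 else start + fold
        test = list(range(start, stop))
        train = [j for j in range(n) if j < start or j >= stop]
        yield train, test


def forward_chaining_splits(n: int, k: int) -> Iterator[Split]:
    """Time-series validation: train only on builds that precede the test block."""
    fold = n // (k + 1)
    for i in range(1, k + 1):
        start = i * fold
        stop = n if i == k else start + fold
        yield list(range(start)), list(range(start, stop))


def leaks_future(train: List[int], test: List[int]) -> bool:
    """True if any training build is newer than the oldest test build."""
    return max(train) > min(test)


# Hypothetical project history of 12 chronologically ordered builds.
n_builds = 12
cv_leaks = [leaks_future(tr, te) for tr, te in k_fold_splits(n_builds, 4)]
ts_leaks = [leaks_future(tr, te) for tr, te in forward_chaining_splits(n_builds, 4)]

print("k-fold folds trained on future builds:", cv_leaks)
print("forward-chaining folds trained on future builds:", ts_leaks)
```

In this sketch, most k-fold folds train on builds that occurred after the builds being predicted, whereas forward chaining never does. That is one plausible reading of why cross-validation can give optimistic estimates on CI/CD logs.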
|Keywords||continuous integration; continuous deployment; time-series-validation; cross-validation; imbalanced learning|
|Journal||EASE '20: Proceedings of the Evaluation and Assessment in Software Engineering|
|Publisher||Association for Computing Machinery|
|Digital Object Identifier (DOI)||https://doi.org/10.1145/3383219.3383222|
|Open access||Open access|
|Page range||21 - 30|
|Research Group||Peter Faber Business School|
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from email@example.com.
|Place of publication||Norway|