Wasif Afzal, PhD Student, BTH
Name: Wasif Afzal
Topic: Empirical evaluation of search-based software fault count prediction
PhD start: November 2007
Licentiate: Planned for mid 2009
PhD: Planned for 2011
Company contacts: Sauer-Danfoss, Rymdbolaget, ABB
Industry relevance:
Fault prediction models have attracted considerable interest in both research and practice. From a research point of view, new fault prediction methods are regularly proposed and their predictive ability assessed at varying levels of detail, which improves our understanding. In practice, such models have strong implications for the quality of a software project: the information they provide helps project managers make better decisions in uncertain situations.
The number of faults in a software module, or in a particular release of a software system, is a quantitative measure of software quality. A fault prediction model uses previous software quality data, in the form of metrics, to predict the number of faults in a module or release. Fault predictions for a software release are fundamental to efforts to quantify software quality.
A fault prediction model helps a software development team prioritize the effort to be spent on a software project. Since quality improvement activities consume resources, allocating those resources properly can yield considerable savings. The development of large software systems is costly, so even small gains in prediction accuracy are worthwhile. Apart from the efficiency gains, architectural improvements can be made by better designing high-risk segments of the system.
Research description:
The project started with a series of experiments using genetic programming for fault count prediction. We wanted to generalize previous results on applying genetic programming to fault count prediction. We also wanted to show how the problem could be represented differently to suit the application of genetic programming to fault count prediction in new situations. This led us to experiments on cross-release prediction of fault count data in multi-release projects, this time using data sets from both industrial and open source projects. The basic purpose is to help develop empirical knowledge about innovative ways of predicting fault count data and to apply the resulting models in a manner suited to current trends in software development.
After these studies within one application area of search-based techniques, we wanted to investigate their use in an interesting but not extensively explored application area, namely search-based testing for non-functional system properties, while staying within the broader domain of software verification and validation. We undertook a systematic review for this purpose, which showed how metaheuristic search techniques have been applied in testing different non-functional system properties. Our initial search results also included studies that applied search-based techniques but were not related to test data generation, for example reliability modelling and test planning. These additional studies are more closely related to our earlier series of experiments using genetic programming for fault count prediction.
RQ.1: What is the prediction precision, general applicability and adaptability (goodness of fit) of genetic programming in modelling fault count data?
RQ.2: Is there evidence in the existing literature suggesting that symbolic regression can be an effective method for prediction and estimation in comparison with regression and machine learning models for software projects?
RQ.3: What is the current state of research into testing of non-functional system properties using search-based techniques?
RQ.4: What are the potential issues in introducing evolutionary computation in industry?
RQ.5: What are the efficiency gains if we introduce evolutionary computation in industry?
Publications:
W. Afzal, R. Torkar, R. Feldt. A systematic mapping study on non-functional search-based software testing, 20th International Conference on Software Engineering and Knowledge Engineering (SEKE'08)
W. Afzal, R. Torkar. A comparative evaluation of using genetic programming for predicting fault count data, The 3rd IEEE International Conference on Software Engineering Advances (ICSEA'08)
W. Afzal, R. Torkar. Suitability of genetic programming for software reliability growth modelling, The 2008 IEEE International Symposium on Computer Science and its Applications (CSA'08)
W. Afzal, R. Torkar, R. Feldt. Prediction of fault count data using genetic programming, The 12th IEEE International Multitopic Conference (INMIC'08)
W. Afzal, R. Torkar. Lessons from applying experimentation in software engineering prediction systems, The 2nd International Workshop on Software Productivity Analysis and Cost Estimation (SPACE'08), Collocated with 15th Asia-Pacific Software Engineering Conference (APSEC'08)
W. Afzal, R. Torkar. Incorporating Metrics in an Organizational Test Strategy, International Software Testing Standard Workshop, Collocated with 1st International Conference on Software Testing, Verification and Validation (ICST'08)
R. Feldt, R. Torkar, T. Gorschek, W. Afzal. Searching for cognitively diverse tests: towards universal test diversity metrics, 1st International Workshop on Search-based Software Testing, Collocated with 1st International Conference on Software Testing, Verification and Validation (ICST'08)
W. Afzal, R. Torkar, R. Feldt. A systematic review of search-based testing for non-functional system properties, In press, Information and Software Technology.