A Preliminary Study on Using Text- and Image-Based Machine Learning to Predict Software Maintainability

Authors: Markus Schnappinger, Simon Zachau, Arnaud Fietzke and Alexander Pretschner

Machine learning has emerged as a useful tool to aid software quality control. It can support identifying problematic code snippets or predicting maintenance efforts. The majority of these approaches rely on code metrics as input. However, evidence suggests great potential for text- and image-based approaches to predict code quality as well. Using a manually labeled dataset, this preliminary study examines the use of five text-based and two image-based algorithms to predict the readability, understandability, and complexity of source code. While the overall performance can still be improved, we find that Support Vector Machines (SVMs) outperform sophisticated text transformer models and image-based neural networks. Furthermore, text-based SVMs tend to perform well at predicting the readability and understandability of code, while image-based SVMs predict code complexity more accurately. Our study both shows the potential of text- and image-based algorithms for software quality prediction and outlines their weaknesses as a starting point for further research.
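To make the idea of a text-based classifier concrete, the following is a minimal sketch and not the authors' pipeline. It assumes a bag-of-tokens representation of each snippet and a linear SVM trained by subgradient descent on the hinge loss, written in plain Python so that it runs without external libraries. The snippets, the labels (+1 for "readable", -1 for "hard to read"), and all function names and hyperparameters are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter

# Hypothetical toy data: +1 = "readable", -1 = "hard to read".
# These labels are invented for illustration, not taken from the study.
SNIPPETS = [
    "total = price + tax",
    "for item in items: print(item)",
    "x=a if b else c if d else e if f else g",
    "r=[[i*j for i in q if i%2] for j in p if j%3]",
]
LABELS = [1, 1, -1, -1]

# Identifiers, integer literals, or any single non-space symbol.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|\S")


def tokenize(code):
    """Split a code snippet into identifiers, numbers, and symbols."""
    return TOKEN_RE.findall(code)


def build_vocabulary(snippets):
    """Map every token seen in the snippets to a feature index."""
    tokens = sorted({tok for s in snippets for tok in tokenize(s)})
    return {tok: i for i, tok in enumerate(tokens)}


def vectorize(code, vocab):
    """Turn a snippet into a vector of token counts (unknown tokens are ignored)."""
    vec = [0.0] * len(vocab)
    for tok, n in Counter(tokenize(code)).items():
        if tok in vocab:
            vec[vocab[tok]] = float(n)
    return vec


def score(x, w, b):
    """Signed distance-like score of x under the linear model (w, b)."""
    return sum(wj * xj for wj, xj in zip(w, x)) + b


def train_linear_svm(X, y, epochs=200, lr=0.01, reg=0.001):
    """Fit a linear SVM by subgradient descent on the regularized hinge loss."""
    w, b = [0.0] * len(X[0]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            violated = yi * score(xi, w, b) < 1  # inside the margin or misclassified
            for j in range(len(w)):
                grad = reg * w[j] - (yi * xi[j] if violated else 0.0)
                w[j] -= lr * grad
            if violated:
                b += lr * yi
    return w, b


def predict(code, vocab, w, b):
    """Return +1 or -1 for a raw code snippet."""
    return 1 if score(vectorize(code, vocab), w, b) >= 0 else -1


if __name__ == "__main__":
    vocab = build_vocabulary(SNIPPETS)
    X = [vectorize(s, vocab) for s in SNIPPETS]
    w, b = train_linear_svm(X, LABELS)
    for s in SNIPPETS:
        print(predict(s, vocab, w, b), s)
```

With only four training snippets the model simply memorizes the toy data; the sketch illustrates the flow from raw source text to a prediction and says nothing about real-world accuracy. An image-based variant would replace `vectorize` with pixel features extracted from a rendering of the code.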

Presented by: Markus Schnappinger, Simon Zachau
Company: Technische Universität München

Talk language: English
Level: Advanced
Target group:
