Automating Documentation of Complex Data Processing Flows with Large Language Models

Short description

Modern software systems increasingly consist of complex, multi-stage data processing flows that integrate artifacts such as code, database queries, and parameters. Keeping the documentation of such systems accurate and up to date is challenging because the systems are dynamic and evolve continuously. This article presents an LLM-based framework for automating the documentation of complex data processing flows. The prototype system uses modular agents and multi-level caching to generate both task-level and process-level documentation. An evaluation using the LLM-as-a-judge approach demonstrates accurate and coherent results on real-world data processing specifications.

Talk language: English
Level: Scientific
Target group:

Company:
Software Competence Center Hagenberg GmbH

Presented by:
DI(FH) MSc. Mario Winterer
