Fine-Tuning Large Language Models in RAG Architecture: An Applied Approach for Failure Analysis

Tuesday, October 6, 2026: 1:30 PM
Mr. Markus Kofler, Universität Klagenfurt, Klagenfurt, Austria
Dr. Konstantin Schekotihin, Universität Klagenfurt, Klagenfurt, Carinthia, Austria

Summary:

Failure Analysis (FA) engineers must diagnose product malfunctions and identify root causes from large volumes of heterogeneous information, including prior debugging procedures and product documentation. This paper examines domain-specific fine-tuning of Large Language Models within a Retrieval-Augmented Generation (RAG) architecture for FA. We fine-tune both the embedding model used for retrieval and the completion model used for answer generation. Our pipeline has three components: synthetic dataset generation from FA data sources and an internal FA ontology; embedding optimization via Multiple Negatives Ranking Loss; and completion-model adaptation through Parameter-Efficient Fine-Tuning and Group Relative Policy Optimization. After fine-tuning, both models show significant improvements in measured performance and reach results comparable to those of much larger models.
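
For readers unfamiliar with the two training objectives named above, the following is a minimal NumPy sketch of their core computations. It is not the authors' implementation. The function names, the `scale` value of 20, and the `eps` constant are illustrative assumptions. The first function computes a Multiple Negatives Ranking Loss over a batch of (query, passage) embedding pairs, treating every other passage in the batch as a negative. The second computes the group-relative advantages that Group Relative Policy Optimization derives from the rewards of several completions sampled for the same prompt.

```python
import numpy as np


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """In-batch ranking loss: row i of `positives` is the match for row i of
    `anchors`; all other rows act as negatives. `scale` is an assumed value."""
    # Normalize rows so the dot product below is cosine similarity.
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    scores = scale * (a @ p.T)  # shape: (batch, batch)
    # Numerically stable log-softmax over each row.
    scores = scores - scores.max(axis=1, keepdims=True)
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    # Cross-entropy with the diagonal entries as the correct classes.
    return float(-np.mean(np.diag(log_probs)))


def group_relative_advantages(rewards, eps=1e-8):
    """Standardize rewards within one group of sampled completions.
    `eps` (assumed) guards against division by zero when all rewards match."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)


if __name__ == "__main__":
    queries = np.eye(3)                          # toy query embeddings
    aligned = np.eye(3)                          # passage i matches query i
    shuffled = np.roll(np.eye(3), 1, axis=0)     # passages deliberately mismatched
    print(multiple_negatives_ranking_loss(queries, aligned))
    print(multiple_negatives_ranking_loss(queries, shuffled))
    print(group_relative_advantages([1.0, 0.0, 0.0, 1.0]))
```

In this toy run the loss for the aligned pairs is much lower than for the shuffled pairs, which is the signal that pulls matching query and passage embeddings together during retrieval fine-tuning. The advantages are centered on zero within the group, so completions are rewarded only relative to their siblings.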