Towards certification: A complete statistical validation pipeline for supervised learning in industry

Lacasa, Lucas; Pardo, Abel; Arbelo, Pablo; Sánchez, Miguel; Bascones, Noelia; Yeste, Pablo; Martínez-Cavas, Alejandro; Rubio, Gonzalo; Gómez, Ignacio; Valero, Eusebio; De Vicente, Javier
Expert Systems With Applications (2025)

Machine and deep learning methods are gradually being integrated into industrial operations, although at different speeds across industries. The aerospace and aeronautical industries have recently been pushing to develop a roadmap for design assurance concepts and for the integration of neural network-related technologies in the aeronautical sector. Within such a roadmap, there is currently no clear definition of what would constitute a certifiable validation protocol for an industrial surrogate model. This paper aims to bridge this gap and contribute to the paradigm of AI-based certification in the context of supervised learning by proposing a practical and comprehensible validation pipeline for a surrogate model that integrates concepts and methods from deep learning, mathematical optimization and statistical data science. The pipeline is represented as a directed graph of ten logical steps (nodes), incorporating both linear-sequential procedures and feedback loops to iteratively refine the surrogate model. Each of these logical steps solves specific machine learning problems by merging previously scattered key statistical methods with some novel algorithmic solutions. We provide details and best practices for each step, and we illustrate the application of the pipeline to a realistic supervised problem arising in aerostructural design: predicting the likelihood of different stress-related failure modes during different flight maneuvers from a (large) set of features characterizing the aircraft internal loads and geometric parameters.

