Machine Learning Regression-Based Prediction for Improving Performance and Energy Consumption in HPC Platforms

No Thumbnail Available
Date
2025
Authors
André Martins Pereira
Journal Title
Journal ISSN
Volume Title
Publisher
Abstract
High-performance computing is pivotal for processing large datasets and executing complex simulations, ensuring faster and more accurate results. Improving the performance of software and scientific workflows in such environments requires careful analysis of their computational behavior and energy consumption. Therefore, maximizing computational throughput in these environments, through adequate software configuration and resource allocation, is essential for improving performance. The work presented in this paper focuses on leveraging regression-based machine learning and decision trees to analyze and optimize resource allocation in high-performance computing environments based on application's performance and energy metrics. Applied to a bioinformatics case study, these models enable informed decision-making by selecting the appropriate computing resources to enhance the performance of a phylogenomics software. Our contribution is to better explore and understand the efficient resource management of supercomputers, namely Santos Dumont. We show that the predictions for application's execution time using the proposed method are accurate for various amounts of computing nodes, while energy consumption predictions are less precise. The application parameters most relevant for this work are identified and the relative importance of each application parameter to the accuracy of the prediction is analysed.
Description
Keywords
Citation