JOPT2025
HEC Montréal, May 12-14, 2025

Machine Learning
May 14, 2025, 13h20 – 15h00
Room: Luc-Poirier (Verte)
Chaired by Hassan Dehghan Shoorkand
4 presentations
-
13h20 - 13h45
Training Set Reconstruction from Differentially Private Forests: How Effective is DP?
Recent research has shown that machine learning models are vulnerable to privacy attacks targeting their training data.
Differential privacy (DP) has become a widely adopted countermeasure, as it offers rigorous privacy protections.
In this work, we introduce a reconstruction attack targeting state-of-the-art DP random forests. By leveraging a constraint programming model that incorporates knowledge of the forest's structure and DP mechanism characteristics, our approach formally reconstructs the most likely dataset that could have produced a given forest. Through extensive computational experiments, we examine the interplay between model utility, privacy guarantees, and reconstruction accuracy across various configurations. Our results reveal that random forests trained with meaningful DP guarantees can still leak substantial portions of their training data. Specifically, while DP reduces the success of reconstruction attacks, the only forests fully robust to our attack exhibit predictive performance no better than a constant classifier. Building on these insights, we provide practical recommendations for the construction of DP random forests that are more resilient to reconstruction attacks and maintain non-trivial predictive performance.
-
13h45 - 14h10
Comparative Study of LSTM and XGBoost for Remaining Useful Life Prediction Using the C-MAPSS Dataset
Remaining useful life (RUL) prediction is an essential component of modern predictive maintenance strategies. In this project, we developed and compared two machine learning models, LSTM (Long Short-Term Memory) and XGBoost, to estimate the RUL of jet engines operating under varying usage conditions and operational environments, using the C-MAPSS-2 dataset published by NASA in 2021.
Our objective was to test how well these models generalize on sensor time series that simulate the degradation of a jet engine under variable conditions. The comparative analysis relies mainly on the RMSE metric, on which the LSTM model scored 7.02, outperforming XGBoost (RMSE = 9.34). These results are consistent with those reported in the literature, where models based on recurrent neural networks are often better suited to the sequential nature of system health data.
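For readers unfamiliar with the metric, the following is a minimal sketch of how RMSE is computed for RUL predictions. The function name `rmse` and the sample values are illustrative assumptions and are not taken from the study.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical RUL values, for illustration only (not from the study).
true_rul = [100.0, 80.0, 60.0]
pred_rul = [97.0, 84.0, 60.0]
score = rmse(true_rul, pred_rul)  # sqrt((9 + 16 + 0) / 3)
```

A lower RMSE means the predicted RUL values are closer on average to the true ones, with large errors penalized more heavily than small ones.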
This work is part of an approach oriented toward optimizing industrial maintenance, which makes it possible to reduce downtime costs while improving the reliability of critical systems. We also discuss ways to improve performance, such as hybrid architectures and the integration of attention mechanisms.
-
14h10 - 14h35
Optimization models for Group Counterfactual Analysis
Counterfactual analysis has proven to be a powerful tool in the growing field of Explainable Artificial Intelligence. In supervised classification, the goal is to associate each record with a counterfactual explanation: an instance that is close (according to a given metric) to the original record but has a high probability of being classified into the opposite class by a given classifier. Finding counterfactual explanations is equivalent to solving an optimization model, the structure of which depends on several factors. This talk will illustrate various such models, including those designed for groups of instances, functional data, benchmarking models, and other extensions.
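As an illustration of the simplest single-instance case, the sketch below assumes a linear logistic classifier and Euclidean distance; these choices, and the names `predict_proba` and `counterfactual`, are assumptions made here and are not the models presented in the talk. Under these assumptions the problem "minimize the squared distance to x0 subject to the predicted probability being at least a target" reduces to projecting x0 onto a half-space, which has a closed-form solution.

```python
import math


def predict_proba(w, b, x):
    """Probability of the positive class under a logistic model."""
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 / (1.0 + math.exp(-z))


def counterfactual(w, b, x0, target_prob=0.9):
    """Closest point to x0 (Euclidean) with P(positive) >= target_prob.

    Solves  min ||x - x0||^2  s.t.  w.x + b >= logit(target_prob).
    Because the score is linear, this is a projection onto a half-space.
    """
    threshold = math.log(target_prob / (1.0 - target_prob))
    score = sum(wi * xi for wi, xi in zip(w, x0)) + b
    if score >= threshold:
        return list(x0)  # x0 already satisfies the constraint
    norm_sq = sum(wi * wi for wi in w)
    step = (threshold - score) / norm_sq
    return [xi + step * wi for xi, wi in zip(x0, w)]


# Hypothetical classifier and record, for illustration only.
w, b = [2.0, -1.0], 0.0
x0 = [-1.0, 1.0]  # classified as negative
x_cf = counterfactual(w, b, x0, target_prob=0.9)
```

With other metrics, nonlinear classifiers, or groups of records sharing explanations, no closed form is generally available and the problem is handed to an optimization solver.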
-
14h35 - 15h00
FI-LSTM prediction model for integrated production planning and predictive maintenance
In industrial operations, the integration of production planning and machine condition assessment can significantly enhance efficiency and productivity. With the emergence of Industry X.0 concepts, traditional model-based approaches are increasingly being replaced by data-driven approaches. Recent studies have explored Deep Learning (DL) techniques to predict the remaining useful life (RUL) of the machines involved and develop cost-effective production and predictive maintenance (PdM) plans. However, despite their substantial contributions, these models often fail to address the black-box nature of prediction algorithms. This paper addresses this issue by proposing an integrated DL-mathematical model that combines production and PdM planning and leverages a counterfactual reasoning approach. The proposed model not only makes the black-box prediction process interpretable but also identifies and emphasizes the importance of each feature in assessing system conditions. The integrated DL-mathematical model is validated using a publicly available NASA dataset. Sensitivity analysis demonstrates the effectiveness of the model, highlighting its potential to drive cost-efficient and reliable decisions.