Marius Vuille

CryptoDB

Publications and invited talks

Year

Venue

Title

2025

CIC

XorSHAP: Privacy-Preserving Explainable AI for Decision Tree Models

Dimitar Jetchev, Marius Vuille

Explainable AI (XAI) refers to the development of AI systems and machine learning models in a way that lets humans understand, interpret and trust the predictions, decisions and outputs of these models. A common approach to explainability is feature importance, that is, determining which input features of the model have the most significant impact on the model prediction. Two major techniques for computing feature importance are LIME (Local Interpretable Model-agnostic Explanations) and SHAP (SHapley Additive exPlanations). While very generic, these methods are computationally expensive even when the data is not encrypted. Applying them in the privacy-preserving setting, when part or all of the input data is private, is therefore a major computational challenge. In this paper, we present XorSHAP, the first practical data-oblivious algorithm for computing SHAP values for decision tree ensemble models. The algorithm is applicable in various privacy-preserving settings such as SMPC, FHE and differential privacy. Our algorithm has complexity $O(T \widetilde{M} D 2^D)$, where $T$ is the number of decision trees in the ensemble, $D$ is the depth of the decision trees and $\widetilde{M}$ is the maximum of the number of features $M$ and $2^D$ (the number of leaf nodes of a tree), and scales to real-world datasets. We implement the algorithm in the semi-honest Secure Multiparty Computation (SMPC) setting with full threshold using Inpher's Manticore framework. Our implementation simultaneously computes the SHAP values for 100 samples for an ensemble of $T = 60$ trees of depth $D = 4$ and $M = 100$ features in just 7.5 minutes, meaning that the SHAP values for a single prediction are computed in just 4.5 seconds for the same decision tree ensemble model. Additionally, it is parallelization-friendly, thus enabling future work on massive hardware acceleration with GPUs.
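As a rough illustration of the figures quoted in this abstract, the following Python sketch evaluates the stated bound $T \widetilde{M} D 2^D$ for the benchmark parameters and checks the per-sample timing. The function names are hypothetical and not taken from the paper, and the bound is evaluated without the constant factors hidden by the $O(\cdot)$ notation, so the result is an order-of-magnitude indication rather than a measured operation count.

```python
def xorshap_cost_bound(num_trees: int, depth: int, num_features: int) -> int:
    """Evaluate T * M~ * D * 2^D with M~ = max(M, 2^D), ignoring constants."""
    leaves = 2 ** depth                   # 2^D leaf nodes per tree
    m_tilde = max(num_features, leaves)   # M~ as defined in the abstract
    return num_trees * m_tilde * depth * leaves


def seconds_per_sample(total_minutes: float, num_samples: int) -> float:
    """Average wall-clock seconds per sample for a batched run."""
    return total_minutes * 60.0 / num_samples


# Benchmark parameters quoted in the abstract: T = 60, D = 4, M = 100.
bound = xorshap_cost_bound(num_trees=60, depth=4, num_features=100)
per_sample = seconds_per_sample(total_minutes=7.5, num_samples=100)

print(bound)       # 384000
print(per_sample)  # 4.5
```

With $D = 4$ a tree has only 16 leaves, so $\widetilde{M}$ equals the feature count $M = 100$ here; for deeper trees the $2^D$ term would dominate instead.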

2023

JOFC

Manticore: A Framework for Efficient Multiparty Computation Supporting Real Number and Boolean Arithmetic

Mariya Georgieva Belorgey, Sergiu Carpov, Kevin Deforth, Dimitar Jetchev, Abson Sae-Tang, Marius Vuille, Nicolas Gama, Jon Katz, Iraklis Leontiadis, Mohsen Mohammadi

We propose a novel framework, Manticore, for multiparty computations, with full threshold and semi-honest security model, supporting a combination of real number arithmetic (arithmetic shares), Boolean arithmetic (Boolean shares) and garbled circuits (Yao shares). In contrast to prior work (Mohassel and Zhang, in 2017 IEEE symposium on security and privacy (SP), 2017; Mohassel and Rindal, in Proceedings of the 2018 ACM SIGSAC conference on computer and communications security, 2018), Manticore mitigates overflows, which is of paramount importance for machine learning applications, without compromising efficiency or security. Compared to other recent overflow-free techniques such as MP-SPDZ (Escudero et al., in 40th annual international cryptology conference, CRYPTO. Lecture notes in computer science, 2020) that convert arithmetic to Boolean shares, Manticore uses an efficient modular lifting/truncation method that allows for scalable high numerical precision computations with optimal numerical windows and hence highly efficient online phases. We adapt basic MPC operations such as real-valued polynomial evaluation, division, logarithms, exponentials, Fourier series evaluations and oblivious comparisons to Manticore by employing our modular lift in combination with existing efficient conversions between arithmetic, Boolean and Yao shares. We also describe highly scalable computation of logistic regression models with real-world training data sizes and high numerical precision through PCA and blockwise variants (for memory and runtime optimizations) based on second-order optimization techniques. On a dataset of 50 M samples and 50 features distributed among two players, the online phase completes in 14.5 h with at least 10 decimal digits of precision compared to plaintext training.
The setup phase of Manticore is supported in both the trusted dealer and the interactive models, allowing for tradeoffs between efficiency and stronger security. The highly efficient online phase makes the framework particularly suitable for MPC applications where the output of the setup phase is part of the input of the protocol (such as MPC-in-the-head or Prio).
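To make the terms "arithmetic shares" and "Boolean shares" concrete, here is a minimal Python sketch of textbook full-threshold secret sharing: additive sharing of fixed-point numbers modulo $2^{64}$ and XOR sharing of bit strings. This is not Manticore's protocol. The modulus, the 20 fractional bits and all function names are assumptions chosen for illustration, and the sketch does not include the modular lifting/truncation that the abstract credits with mitigating overflows.

```python
import secrets

MODULUS_BITS = 64            # assumed ring size, for illustration only
MODULUS = 1 << MODULUS_BITS
FRACTION_BITS = 20           # assumed fixed-point precision


def encode(x: float) -> int:
    """Map a real number to a fixed-point element of Z_{2^64}."""
    return round(x * (1 << FRACTION_BITS)) % MODULUS


def decode(v: int) -> float:
    """Interpret a ring element as a signed fixed-point number."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / (1 << FRACTION_BITS)


def share_arithmetic(value: int, parties: int) -> list:
    """Additive shares: all but the last are uniform, and they sum to value."""
    shares = [secrets.randbelow(MODULUS) for _ in range(parties - 1)]
    shares.append((value - sum(shares)) % MODULUS)
    return shares


def reconstruct_arithmetic(shares: list) -> int:
    return sum(shares) % MODULUS


def share_boolean(value: int, parties: int) -> list:
    """XOR shares: all but the last are uniform, and they XOR to value."""
    shares = [secrets.randbits(MODULUS_BITS) for _ in range(parties - 1)]
    last = value
    for s in shares:
        last ^= s
    shares.append(last)
    return shares


def reconstruct_boolean(shares: list) -> int:
    out = 0
    for s in shares:
        out ^= s
    return out


def add_shares(a: list, b: list) -> list:
    """Each party adds its two shares locally; no communication is needed."""
    return [(x + y) % MODULUS for x, y in zip(a, b)]


# Three parties add 1.5 and -2.25 without any party seeing either input.
a = share_arithmetic(encode(1.5), parties=3)
b = share_arithmetic(encode(-2.25), parties=3)
total = decode(reconstruct_arithmetic(add_shares(a, b)))
print(total)  # -0.75
```

"Full threshold" here means that all shares are required to reconstruct: any proper subset of the shares is uniformly distributed and reveals nothing about the secret. In this naive fixed-point encoding, results that exceed the ring's range wrap around silently, which is the kind of overflow the abstract says Manticore is designed to avoid.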