Publicaciones

Marcelo Arenas

RL2, Publisher: arXiv, Link>

AUTHORS

Marcelo Arenas, Mikaël Monet, Pablo Barceló, Leopoldo Bertossi

ABSTRACT

In Machine Learning, the SHAP-score is a version of the Shapley value that is used to explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is an intractable problem, we prove a strong positive result stating that the SHAP-score can be computed in polynomial time over deterministic and decomposable Boolean circuits. Such circuits are studied in the field of Knowledge Compilation and generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees and Ordered Binary Decision Diagrams (OBDDs). We also establish the computational limits of the SHAP-score by observing that computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider. It also implies that computing SHAP-scores is intractable as well over the class of propositional formulas in DNF. Based on this negative result, we look for the existence of fully-polynomial randomized approximation schemes (FPRAS) for computing SHAP-scores over such class. In contrast to the model counting problem for DNF formulas, which admits an FPRAS, we prove that no such FPRAS exists for the computation of SHAP-scores. Surprisingly, this negative result holds even for the class of monotone formulas in DNF. These techniques can be further extended to prove another strong negative result: Under widely believed complexity assumptions, there is no polynomial-time algorithm that checks, given a monotone DNF formula φ and features x,y, whether the SHAP-score of x in φ is smaller than the SHAP-score of y in φ.


288 visualizaciones Ir a la publicación

RL2, Publisher: https://github.com/pdm-book/community

AUTHORS

Wim Martens, Marcelo Arenas, Pablo Barceló, Andreas Pieris, Leonid Libkin

Link>

AUTHORS

Wim Martens, Marcelo Arenas, Pablo Barceló, Andreas Pieris, Leonid Libkin

ABSTRACT

This is a release of parts 1, 2, and 4 of the upcoming book “Principles of Databases”, which will be about the foundational and mathematical principles of databases in its various forms. The first two parts focus on an overview of the relational model, and on processing some of the most commonly occurring relational queries. Forthcoming parts will focus on additional aspects of the relational model and will cover tree-structured and graph-structured data as well.


219 visualizaciones Ir a la publicación

RL2, Publisher: Proceedings of the AAAI Conference on Artificial Intelligence, Link>

AUTHORS

Mikaël Monet, Leopoldo Bertossi, Pablo Barceló, Marcelo Arenas

ABSTRACT

Scores based on Shapley values are widely used for providing explanations to classification results over machine learning models. A prime example of this is the influential SHAP-score, a version of the Shapley value that can help explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is a computationally intractable problem, it has recently been claimed that the SHAP-score can be computed in polynomial time over the class of decision trees. In this paper, we provide a proof of a stronger result over Boolean models: the SHAP-score can be computed in polynomial time over deterministic and decomposable Boolean circuits. Such circuits, also known as tractable Boolean circuits, generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees, Ordered Binary Decision Diagrams (OBDDs) and Free Binary Decision Diagrams (FBDDs). We also establish the computational limits of the notion of SHAP-score by observing that, under a mild condition, computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider, as removing one or the other renders the problem of computing the SHAP-score intractable (namely, #P

AUTHORS

Mikaël Monet, Leopoldo Bertossi, Pablo Barceló, Marcelo Arenas

-hard).


281 visualizaciones Ir a la publicación

RL2, Publisher: Advances in Neural Information Processing Systems, Link>

AUTHORS

Marcelo Arenas, Pablo Barceló, Bernardo Subercaseaux, Jorge Pérez, Daniel Baez

ABSTRACT

Several queries and scores have recently been proposed to explain individual predictions over ML models. Examples include queries based on “anchors”, which are parts of an instance that are sufficient to justify its classification, and “feature-perturbation” scores such as SHAP. Given the need for flexible, reliable, and easy-to-apply interpretability methods for ML models, we foresee the need for developing declarative languages to naturally specify different explainability queries. We do this in a principled way by rooting such a language in a logic called FOIL, which allows for expressing many simple but important explainability queries, and might serve as a core for more expressive interpretability languages. We study the computational complexity of FOIL queries over two classes of ML models often deemed to be easily interpretable: decision trees and more general decision diagrams. Since the number of possible inputs for an ML model is exponential in its dimension, tractability of the FOIL evaluation problem is delicate but can be achieved by either restricting the structure of the models, or the fragment of FOIL being evaluated. We also present a prototype implementation of FOIL wrapped in a high-level declarative language and perform experiments showing that such a language can be used in practice.


134 visualizaciones Ir a la publicación

RL2, Publisher: Iscience, Link>

AUTHORS

Marcelo Arenas, Martín Ugarte, Nicolás Salas, Pablo Barceló, Pedro Bahamondes, Pedro Saa, Sandra Solari, Susana Eyheramendy, Luis Méndez, Andrés Finkelstein-Kulka, avier Pizarro-Berdichevsky, Eduardo Agosin, Eduardo A Undurraga, Carolina López, Carlos Valencia

ABSTRACT

The sudden loss of smell is among the earliest and most prevalent symptoms of COVID-19 when measured with a clinical psychophysical test. Research has shown the potential impact of frequent screening for olfactory dysfunction, but existing tests are expensive and time consuming. We developed a low-cost ($0.50/test) rapid psychophysical olfactory test (KOR) for frequent testing and a model-based COVID-19 screening framework using a Bayes Network symptoms model. We trained and validated the model on two samples: suspected COVID-19 cases in five healthcare centers (n = 926; 33% prevalence, 309 RT-PCR confirmed) and healthy miners (n = 1,365; 1.1% prevalence, 15 RT-PCR confirmed). The model predicted COVID-19 status with 76% and 96% accuracy in the healthcare and miners samples, respectively (healthcare: AUC = 0.79 [0.75–0.82], sensitivity: 59%, specificity: 87%; miners: AUC = 0.71 [0.63–0.79], sensitivity: 40%, specificity: 97%, at 0.50 infection probability threshold). Our results highlight the potential for low-cost, frequent, accessible, routine COVID-19 testing to support society's reopening.


119 visualizaciones Ir a la publicación

RL2, Publisher: Proceedings of the AAAI Conference on Artificial Intelligence, Link>

AUTHORS

Marcelo Arenas, Mikaël Monet, Pablo Barceló, Leopoldo Bertossi

ABSTRACT

Scores based on Shapley values are widely used for providing explanations to classification results over machine learning models. A prime example of this is the influential SHAP-score, a version of the Shapley value that can help explain the result of a learned model on a specific entity by assigning a score to every feature. While in general computing Shapley values is a computationally intractable problem, it has recently been claimed that the SHAP-score can be computed in polynomial time over the class of decision trees. In this paper, we provide a proof of a stronger result over Boolean models: the SHAP-score can be computed in polynomial time over deterministic and decomposable Boolean circuits. Such circuits, also known as tractable Boolean circuits, generalize a wide range of Boolean circuits and binary decision diagrams classes, including binary decision trees, Ordered Binary Decision Diagrams (OBDDs) and Free Binary Decision Diagrams (FBDDs). We also establish the computational limits of the notion of SHAP-score by observing that, under a mild condition, computing it over a class of Boolean models is always polynomially as hard as the model counting problem for that class. This implies that both determinism and decomposability are essential properties for the circuits that we consider, as removing one or the other renders the problem of computing the SHAP-score intractable (namely, #P

AUTHORS

Marcelo Arenas, Mikaël Monet, Pablo Barceló, Leopoldo Bertossi

-hard).


92 visualizaciones Ir a la publicación