Jorge Pérez

RL2, Publisher: Journal of Machine Learning Research, Link>


Pablo Barceló, Javier Marinkovic, Jorge Pérez


Alternatives to recurrent neural networks, in particular, architectures based on self-attention, are gaining momentum for processing input sequences. In spite of their relevance, the computational properties of such networks have not yet been fully explored. We study the computational power of the Transformer, one of the most paradigmatic architectures exemplifying self-attention. We show that the Transformer with hard-attention is Turing complete exclusively based on their capacity to compute and access internal dense representations of the data. Our study also reveals some minimal sets of elements needed to obtain this completeness result.

141 visualizaciones Ir a la publicación

RL2, Publisher: Advances in Neural Information Processing Systems, Link>


Daniel Baez, Marcelo Arenas, Pablo Barceló, Bernardo Subercaseaux, Jorge Pérez


Several queries and scores have recently been proposed to explain individual predictions over ML models. Examples include queries based on “anchors”, which are parts of an instance that are sufficient to justify its classification, and “feature-perturbation” scores such as SHAP. Given the need for flexible, reliable, and easy-to-apply interpretability methods for ML models, we foresee the need for developing declarative languages to naturally specify different explainability queries. We do this in a principled way by rooting such a language in a logic called FOIL, which allows for expressing many simple but important explainability queries, and might serve as a core for more expressive interpretability languages. We study the computational complexity of FOIL queries over two classes of ML models often deemed to be easily interpretable: decision trees and more general decision diagrams. Since the number of possible inputs for an ML model is exponential in its dimension, tractability of the FOIL evaluation problem is delicate but can be achieved by either restricting the structure of the models, or the fragment of FOIL being evaluated. We also present a prototype implementation of FOIL wrapped in a high-level declarative language and perform experiments showing that such a language can be used in practice.

72 visualizaciones Ir a la publicación

RL5, Publisher: arXiv, Link>


Aymé Arango, Barbara Poblete, Jorge Pérez


Automatic hate speech detection in online social networks is an important open problem in Natural Language Processing (NLP). Hate speech is a multidimensional issue, strongly dependant on language and cultural factors. Despite its relevance, research on this topic has been almost exclusively devoted to English. Most supervised learning resources, such as labeled datasets and NLP tools, have been created for this same language. Considering that a large portion of users worldwide speak in languages other than English, there is an important need for creating efficient approaches for multilingual hate speech detection. In this work we propose to address the problem of multilingual hate speech detection from the perspective of transfer learning. Our goal is to determine if knowledge from one particular language can be used to classify other language, and to determine effective ways to achieve this. We propose a hate specific data representation and evaluate its effectiveness against general-purpose universal representations most of which, unlike our proposed model, have been trained on massive amounts of data. We focus on a cross-lingual setting, in which one needs to classify hate speech in one language without having access to any labeled data for that language. We show that the use of our simple yet specific multilingual hate representations improves classification results. We explain this with a qualitative analysis showing that our specific representation is able to capture some common patterns in how hate speech presents itself in different languages. Our proposal constitutes, to the best of our knowledge, the first attempt for constructing multilingual specific-task representations. Despite its simplicity, our model outperformed the previous approaches for most of the experimental setups. Our findings can orient future solutions toward the use of domain-specific representations.

70 visualizaciones Ir a la publicación