Hans Löbel

Specialty: Machine learning, visual recognition, intelligent urban systems.
Hans Löbel holds a PhD in Engineering Sciences from the Pontificia Universidad Católica de Chile, awarded in 2016. He has been a Co-Investigator on Fondecyt #1171049, “High-Resolution Analysis of Fleet Operations in Home-delivery”, and on Fondecyt #1251758, which models public transport behavior. He participated as a Researcher in Fondef Minería #IT17M10011, a Big Data analysis system for electrolytic refineries. He was Principal Investigator on two Fondecyt projects: the Initiation project #11181152, on incremental task learning for deep neural networks, and #1241772, focused on learning transferable representations of multi-source, multi-modal urban data with AI. He led CORFO project 24IAT-271908, which developed a multimodal intelligent assistant for Viña Concha y Toro. In addition, he was a CENIA Researcher on the Contrato Tecnológico CENIA Desafíos Públicos Subsecretaría de Transportes #RP22I60007, an intelligent monitoring platform for urban mobility.

PUBLICATIONS

Publisher: Elsevier, Data in Brief

ABSTRACT

The COVID-19 pandemic has underlined the need for reliable information for clinical decision-making and public health policies. As such, evidence-based medicine (EBM) is essential in identifying and evaluating scientific documents pertinent to novel diseases, and the accurate classification of biomedical text is integral to this process. Given this context, we introduce a comprehensive, curated dataset composed of COVID-19-related documents.

This dataset includes 20,047 labeled documents that were meticulously classified into five distinct categories: systematic reviews (SR), primary study randomized controlled trials (PS-RCT), primary study non-randomized controlled trials (PS-NRCT), broad synthesis (BS), and excluded (EXC). The documents, labeled by collaborators from the Epistemonikos Foundation, incorporate information such as document type, title, abstract, and metadata, including PubMed id, authors, journal, and publication date.

Uniquely, this dataset has been curated by the Epistemonikos Foundation and is not readily accessible through conventional web-scraping methods, thereby attesting to its distinctive value in this field of research. In addition to this, the dataset also includes a vast evidence repository comprising 427,870 non-COVID-19 documents, also categorized into SR, PS-RCT, PS-NRCT, BS, and EXC. This additional collection can serve as a valuable benchmark for subsequent research. The comprehensive nature of this open-access dataset and its accompanying resources is poised to significantly advance evidence-based medicine and facilitate further research in the domain.


We introduce Quasiconformal Neural Networks (QNNs), a novel framework that integrates quasiconformal maps into neural architectures, providing a rigorous mathematical basis for handling non-Euclidean data. QNNs control geometric distortions using bounded maximal dilatation across network layers, preserving essential data structures. We present theoretical results that guarantee the stability and geometric consistency of QNNs. This work opens new avenues in geometric deep learning, particularly for applications involving complex topologies, with significant implications for fields such as image registration and medical imaging.
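As a rough illustration of the quantity being bounded, the sketch below computes the dilatation of a single orientation-preserving linear map of the plane, which equals the ratio of its largest to smallest stretch. It is not the QNN construction from the paper; the function name `dilatation` and the 2x2 matrix parameterization are assumptions made only for this example.

```python
import math

def dilatation(a, b, c, d):
    """Dilatation K of the linear map [[a, b], [c, d]] acting on the plane.

    The map is written as f(z) = alpha*z + beta*conj(z); then
    K = (|alpha| + |beta|) / (|alpha| - |beta|), with K = 1 for conformal maps.
    Hypothetical helper for illustration only.
    """
    alpha = complex(a + d, c - b) / 2
    beta = complex(a - d, c + b) / 2
    if abs(alpha) <= abs(beta):
        raise ValueError("map must be invertible and orientation-preserving")
    return (abs(alpha) + abs(beta)) / (abs(alpha) - abs(beta))

# A rotation is conformal (K = 1); stretching one axis by 2 gives K = 2.
theta = 0.7
k_rotation = dilatation(math.cos(theta), -math.sin(theta),
                        math.sin(theta), math.cos(theta))
k_stretch = dilatation(2.0, 0.0, 0.0, 1.0)
```

Because dilatations multiply at most under composition, bounding K per layer bounds the distortion of the whole stack of layers.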

Publisher: Proceedings of Machine Learning Research

ABSTRACT

During the last few years, the field of dynamical systems has been developing innovative tools to study the asymptotic behavior of different optimizers in the context of neural networks. In this work, we redefine an extensively studied optimizer, employing classical techniques from hyperbolic geometry. This new definition is linked to a non-linear differential equation as a continuous limit. Additionally, by utilizing Lyapunov stability concepts, we analyze the asymptotic behavior of its critical points.

The rapid growth of e-commerce and the increasing need for logistical optimization in highly congested urban environments require advanced models for vehicle speed prediction. Traditional models often overlook the influence of the geographic environment and rely solely on historical speed data, limiting their accuracy in dynamic scenarios. In addition, most approaches use square grid structures, which introduce spatial distortions and fail to capture the connectivity of road networks effectively. In this work, we propose a multimodal model that integrates spatio-temporal information from GPS sensors with satellite imagery, leveraging HexConvLSTM and MLP neural networks to enhance predictive robustness. Unlike conventional methods, our approach uses a hexagonal grid representation, which provides a more uniform spatial structure and a neighborhood representation that aligns better with road topology than square grids when modeling multidirectional traffic dynamics. This paper presents the implementation and evaluation of the model, highlighting its effectiveness in improving the accuracy of route planning for freight transportation in Santiago Centro. The results show that the multimodal approach reduces the mean absolute error (MAE) to 2.296 on the test dataset, outperforming a baseline model based solely on spatiotemporal data by 8.3%. This research validates the benefits of incorporating visual data and hexagonal grid-based spatial modeling into traffic prediction and suggests exploring its applicability in other urban settings.
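The uniformity argument for hexagonal grids can be checked with a small sketch. It assumes axial coordinates (q, r) and a pointy-top layout, which are conventions chosen for this example and not taken from the paper; the helper names are hypothetical.

```python
import math

# The six axial-coordinate offsets of a hexagonal cell's neighbors.
AXIAL_DIRECTIONS = [(1, 0), (-1, 0), (0, 1), (0, -1), (1, -1), (-1, 1)]

def hex_neighbors(q, r):
    """Return the six cells adjacent to cell (q, r)."""
    return [(q + dq, r + dr) for dq, dr in AXIAL_DIRECTIONS]

def hex_center(q, r, size=1.0):
    """Cartesian center of cell (q, r) for a pointy-top layout."""
    x = size * math.sqrt(3) * (q + r / 2)
    y = size * 1.5 * r
    return x, y

def center_distance(a, b, size=1.0):
    """Euclidean distance between the centers of two cells."""
    (xa, ya), (xb, yb) = hex_center(*a, size), hex_center(*b, size)
    return math.hypot(xa - xb, ya - yb)

cell = (2, -1)
distances = [center_distance(cell, n) for n in hex_neighbors(*cell)]
```

Every entry of `distances` is the same value, whereas the eight neighbors of a square cell sit at two different distances (edge and diagonal).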

Publisher: Advances in Information Retrieval

ABSTRACT

News media outlets disseminate information across various platforms. Often, these posts present complementary content and perspectives on the same news story. However, to compile a set of related news articles, users must thoroughly scour multiple sources and platforms, manually identifying which publications pertain to the same story. This tedious process hinders the speed at which journalists can perform essential tasks, notably fact-checking. To tackle this problem, we created a dataset containing both related and unrelated news pairs. This dataset allows us to develop information retrieval models grounded in the principle of binary relevance. Recognizing that many Transformer-based models might be suited for this task but could overemphasize relationships based on lexical connections, we tailored a dataset to fine-tune these models to focus on semantically relevant connections in the news domain. To craft this dataset, we introduced a methodology to identify pairs of news stories that are lexically similar yet refer to different events and pairs that discuss the same event but have distinct lexical structures. This design compels Transformers to recognize semantic connections between stories, even when their lexical similarities might be absent. Following a human-annotation assessment, we reveal that BERT outperformed other techniques, excelling even in challenging test cases. To ensure the reproducibility of our approach, we have made the dataset and top-performing models publicly available.
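The idea of selecting pairs whose lexical overlap disagrees with their event label can be sketched as follows. This is only an illustration: the Jaccard measure, the whitespace tokenizer, and the thresholds `high` and `low` are assumptions, not the methodology reported in the paper.

```python
def jaccard(text_a, text_b):
    """Lexical overlap between two texts as Jaccard similarity of token sets."""
    tokens_a = set(text_a.lower().split())
    tokens_b = set(text_b.lower().split())
    if not tokens_a and not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)

def is_hard_pair(text_a, text_b, same_event, high=0.5, low=0.2):
    """Flag pairs where lexical overlap and the event label disagree.

    Hard negative: different events but high overlap.
    Hard positive: same event but low overlap.
    The thresholds are hypothetical values chosen for this sketch.
    """
    sim = jaccard(text_a, text_b)
    if same_event:
        return sim <= low
    return sim >= high

a = "mayor opens new bridge in city"
b = "mayor closes new bridge in city"
overlap = jaccard(a, b)               # 5 shared tokens out of 7 distinct
hard_negative = is_hard_pair(a, b, same_event=False)
```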

Publisher: IEEE Access

ABSTRACT

Continuous learning occurs naturally in human beings. However, Deep Learning methods suffer from a problem known as Catastrophic Forgetting (CF) that consists of a model drastically decreasing its performance on previously learned tasks when it is sequentially trained on new tasks. This situation, known as task interference, occurs when a network modifies relevant weight values as it learns a new task. In this work, we propose two main strategies to face the problem of task interference in convolutional neural networks. First, we use a sparse coding technique to adaptively allocate model capacity to different tasks avoiding interference between them. Specifically, we use a strategy based on group sparse regularization to specialize groups of parameters to learn each task. Afterward, by adding binary masks, we can freeze these groups of parameters, using the rest of the network to learn new tasks. Second, we use a meta learning technique to foster knowledge transfer among tasks, encouraging weight reusability instead of overwriting. Specifically, we use an optimization strategy based on episodic training to foster learning weights that are expected to be useful to solve future tasks. Together, these two strategies help us to avoid interference by preserving compatibility with previous and future weight values. Using this approach, we achieve state-of-the-art results on popular benchmarks used to test techniques to avoid CF. In particular, we conduct an ablation study to identify the contribution of each component of the proposed method, demonstrating its ability to avoid retroactive interference with previous tasks and to promote knowledge transfer to future tasks.
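The first strategy can be sketched in a few lines of plain Python. This is a minimal illustration under simplifying assumptions (weights stored as lists of groups, a plain SGD step, and hypothetical function names), not the implementation used in the paper.

```python
import math

def group_norm(group):
    """L2 norm of one group of weights."""
    return math.sqrt(sum(w * w for w in group))

def group_sparse_penalty(groups, lam=0.01):
    """Group sparse regularizer: lam times the sum of per-group L2 norms.

    Penalizing whole-group norms pushes entire groups toward zero, so each
    task ends up using only a subset of the groups.
    """
    return lam * sum(group_norm(g) for g in groups)

def active_mask(groups, tol=1e-6):
    """Binary mask: 1 for groups the current task uses, 0 for unused ones."""
    return [1 if group_norm(g) > tol else 0 for g in groups]

def masked_sgd_step(groups, grads, frozen, lr=0.1):
    """SGD step that leaves frozen groups untouched and updates the rest."""
    return [
        list(g) if f else [w - lr * dw for w, dw in zip(g, dg)]
        for g, dg, f in zip(groups, grads, frozen)
    ]

weights = [[3.0, 4.0], [0.0, 0.0]]        # group 0 was used by task 1
mask = active_mask(weights)               # freeze what task 1 relies on
grads = [[1.0, 1.0], [1.0, -2.0]]         # gradients from a new task
updated = masked_sgd_step(weights, grads, frozen=mask)
```

After the step, the first group keeps its task-1 values while the second, previously unused group absorbs the update for the new task.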


Publisher: CLEF2021 Working Notes, CEUR Workshop Proceedings

ABSTRACT

This article describes the PUC Chile team’s participation in the Caption Prediction task of the ImageCLEFmedical 2021 challenge, which the team won. We first show how a very simple approach based on statistical analysis of captions, without relying on images, yields a competitive baseline score. We then describe how to improve on this preliminary submission by encoding the medical images with a ResNet CNN, pre-trained on ImageNet and later fine-tuned on the challenge dataset. Finally, we use this visual encoding as the input to a multi-label classification approach for caption prediction.


Publisher: CEUR Workshop Proceedings

ABSTRACT

This article describes the PUC Chile team’s participation in the Concept Detection task of the ImageCLEFmedical 2021 challenge, in which the team earned fourth place. We made two submissions: the first, based on a naive approach, obtained an F1 score of 0.141, and an improved version, which leveraged the perceptual similarity among images, obtained a final F1 score of 0.360. We describe in detail our data analysis and our different approaches, and conclude by discussing some ideas for future work.
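For readers unfamiliar with the metric, one common way to compute an F1 score for multi-label concept detection is to score each image's predicted concept set against its ground-truth set and average over images. The sketch below assumes that convention, including the choice to score two empty sets as 1.0; the challenge's official evaluation script may differ, and the concept identifiers are made up.

```python
def set_f1(predicted, truth):
    """F1 between a predicted and a ground-truth set of concept labels."""
    predicted, truth = set(predicted), set(truth)
    if not predicted and not truth:
        return 1.0  # assumption: nothing to predict, nothing predicted
    overlap = len(predicted & truth)
    return 2 * overlap / (len(predicted) + len(truth))

def mean_f1(predictions, truths):
    """Average per-image F1 over a list of images."""
    scores = [set_f1(p, t) for p, t in zip(predictions, truths)]
    return sum(scores) / len(scores)

# Hypothetical concept identifiers for two images.
preds = [{"c1", "c2"}, {"c3"}]
golds = [{"c2", "c4"}, {"c3"}]
score = mean_f1(preds, golds)   # (0.5 + 1.0) / 2
```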


Vision Language Models (VLMs) are designed to extend Large Language Models (LLMs) with visual capabilities, yet in this work we observe a surprising phenomenon: VLMs can outperform their underlying LLMs on purely text-only tasks, particularly in long-context information retrieval. To investigate this effect, we build a controlled synthetic retrieval task and find that a transformer trained only on text achieves perfect in-distribution accuracy but fails to generalize out of distribution, while subsequent training on an image-tokenized version of the same task nearly doubles text-only OOD performance. Mechanistic interpretability reveals that visual training changes the model's internal binding strategy: text-only training encourages positional shortcuts, whereas image-based training disrupts them through spatial translation invariance, forcing the model to adopt a more robust symbolic binding mechanism that persists even after text-only examples are reintroduced. We further characterize how binding strategies vary across training regimes, visual encoders, and initializations, and show that analogous shifts occur during pretrained LLM-to-VLM transitions. Our findings suggest that cross-modal training can enhance reasoning and generalization even for tasks grounded in a single modality.

Recognizing variable stars is a task of interest to the astronomy community, and it has recently benefited from deep learning algorithms. However, these algorithms require large amounts of data to achieve high precision. In this work, we propose self-supervised learning to improve the classification of variable stars with recurrent networks when only a reduced amount of data is available. Experiments on the Gaia dataset show that the proposed approach improves performance over traditional initialization schemes by up to 7% and 13% on real databases in semi-supervised learning scenarios. As future work, we propose experiments with other variable star databases.
