Mircea Petrache

Specialty: Geometric deep learning, generalization bounds, equivariant neural networks, topological data analysis, data geometry.
Mircea Petrache holds a PhD in Science from ETH Zurich, awarded in 2013. In the Equivariant Neural Networks project, he characterized the trade-off between generalization gains and approximation when symmetries are imposed, implemented these constraints, and studied approximate symmetries. In Information Geometry in Probabilistic Learning, he applied Flow Matching models and measured the curvature of spaces of measures to inform optimization. In Frequency Bias in Artificial Neural Networks, he used PDEs to model the difference in the speed at which different frequencies are learned. In Compositional Learning, he studied signaling-game models with iterated learning. In modeling learning in the brain and in infants, he modeled the role of randomness in information processing by neural networks and learning based on energy homeostasis, and carried out statistical modeling of data from experiments with infants. In Computational Chemistry with Deep Learning, he applied equivariant networks to replace quantum-mechanical calculations and studied the geometric expressivity of graph-based networks for modeling molecules.

PUBLICATIONS

A key question in the analysis of discrete models for material defects, such as vortices in spin systems and superconductors or isolated dislocations in metals, is whether information on the boundary energy of a domain is sufficient to control the number of defects in the interior. We present a general combinatorial dipole-removal argument for a large class of discrete models, including XY systems and screw dislocation models, which allows us to prove sharp conditions under which controlled flux and boundary energy guarantee the existence of minimizers with zero or one charge in the interior. The argument uses the max-flow min-cut theorem in combination with an ad hoc duality for planar graphs, and is robust with respect to changes of the function defining the interaction energies.
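As a small illustration of the max-flow min-cut theorem mentioned in this abstract, the sketch below computes a maximum flow on a toy directed graph with the Edmonds-Karp method and reads a minimum cut off the final residual graph. It is not the paper's dipole-removal argument: the function name `max_flow_min_cut`, the graph, and its capacities are all hypothetical and chosen only for illustration.

```python
from collections import deque


def max_flow_min_cut(capacity, source, sink):
    """Return (max flow value, min cut edges) for a dict {(u, v): capacity}.

    Hypothetical helper: plain Edmonds-Karp (shortest augmenting paths via BFS).
    """
    residual = {}
    neighbors = {}
    for (u, v), c in capacity.items():
        residual[(u, v)] = residual.get((u, v), 0) + c
        residual.setdefault((v, u), 0)  # reverse edge starts with zero capacity
        neighbors.setdefault(u, set()).add(v)
        neighbors.setdefault(v, set()).add(u)

    def bfs_parents():
        # Nodes reachable from the source through edges with spare capacity.
        parent = {source: None}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in neighbors.get(u, ()):
                if v not in parent and residual[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    flow = 0
    while True:
        parent = bfs_parents()
        if sink not in parent:
            break  # no augmenting path left: the flow is maximal
        path = []
        v = sink
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        bottleneck = min(residual[e] for e in path)
        for (u, v) in path:
            residual[(u, v)] -= bottleneck
            residual[(v, u)] += bottleneck
        flow += bottleneck

    # Original edges leaving the source side of the last search form a min cut.
    reachable = set(parent)
    cut = [(u, v) for (u, v) in capacity if u in reachable and v not in reachable]
    return flow, cut


# Toy graph (assumed for illustration only).
toy = {("s", "a"): 3, ("s", "b"): 2, ("a", "b"): 1, ("a", "t"): 2, ("b", "t"): 3}
value, cut_edges = max_flow_min_cut(toy, "s", "t")
print(value, sum(toy[e] for e in cut_edges))
```

The two printed numbers coincide, which is the content of the theorem: the largest flow that can be routed from source to sink equals the smallest total capacity of edges whose removal separates them.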

We investigate a new multi-marginal optimal transport problem arising from a dissociation model in the Strong Interaction Limit of Density Functional Theory. In this short note, we introduce this dissociation model and the corresponding optimal transport problem, and show preliminary results on the existence and uniqueness of Monge solutions, assuming absolute continuity of at least two of the marginals. Finally, we show that such marginal regularity conditions are necessary for the existence of a unique Monge solution.

Generative modeling over discrete data has recently seen numerous success stories, with applications spanning language modeling, biological sequence design, and graph-structured molecular data. The predominant generative modeling paradigm for discrete data is still autoregressive, with more recent alternatives based on diffusion or flow-matching falling short of their impressive performance in continuous data settings, such as image or video generation. In this work, we introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data as points residing on a statistical manifold equipped with its natural Riemannian metric: the Fisher-Rao metric. As a result, we demonstrate that discrete data itself can be continuously reparameterised to points on the positive orthant of the d-hypersphere S^d_+, which allows us to define flows that map any source distribution to a target in a principled manner by transporting mass along (closed-form) geodesics of S^d_+. Furthermore, the learned flows in Fisher-Flow can be further bootstrapped by leveraging Riemannian optimal transport, leading to improved training dynamics. We prove that the gradient flow induced by Fisher-Flow is optimal in reducing the forward KL divergence. We evaluate Fisher-Flow on an array of synthetic and diverse real-world benchmarks, including designing DNA promoter and DNA enhancer sequences. Empirically, we find that Fisher-Flow improves over prior diffusion and flow-matching models on these benchmarks.
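The reparameterisation described in this abstract can be sketched in a few lines, assuming the standard square-root map that sends a categorical distribution p to the point with coordinates sqrt(p_i) on the positive orthant of the unit sphere. The helper names below (`to_sphere`, `to_simplex`, `geodesic`, `fisher_rao_distance`) are hypothetical and are not taken from the Fisher-Flow code; the sketch only shows the closed-form geodesic between two distributions, not the flow-matching training itself.

```python
import math


def to_sphere(p):
    """Square-root map: categorical distribution -> positive orthant of the unit sphere."""
    return [math.sqrt(pi) for pi in p]


def to_simplex(x):
    """Inverse map: square each coordinate to recover probabilities."""
    return [xi * xi for xi in x]


def sphere_angle(p, q):
    """Angle between the sphere images of p and q (clamped for numerical safety)."""
    dot = sum(a * b for a, b in zip(to_sphere(p), to_sphere(q)))
    return math.acos(min(1.0, max(-1.0, dot)))


def fisher_rao_distance(p, q):
    """Fisher-Rao distance between categorical distributions: twice the sphere angle."""
    return 2.0 * sphere_angle(p, q)


def geodesic(p, q, t):
    """Point at time t in [0, 1] on the great-circle arc from p to q, mapped back to the simplex."""
    theta = sphere_angle(p, q)
    if theta < 1e-12:
        return list(p)  # the two distributions coincide
    x, y = to_sphere(p), to_sphere(q)
    w0 = math.sin((1.0 - t) * theta) / math.sin(theta)
    w1 = math.sin(t * theta) / math.sin(theta)
    return to_simplex([w0 * a + w1 * b for a, b in zip(x, y)])


# Interpolate between two one-hot distributions over three categories.
source = [1.0, 0.0, 0.0]
target = [0.0, 1.0, 0.0]
print(geodesic(source, target, 0.5))
```

Because both endpoints lie in the positive orthant and the angle between them is at most a right angle, every intermediate point keeps non-negative coordinates of unit norm, so squaring always returns a valid probability vector.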

We consider the problem of optimal approximation of a target measure by an atomic measure with atoms in branched optimal transport distance. This is a new branched transport version of optimal quantization problems. New difficulties arise, since in classical semidiscrete optimal transport with Wasserstein distance, the interfaces between cells associated with neighboring atoms have Voronoi structure and satisfy an explicit description. This description is missing for our problem, in which the cell interfaces are thought to have fractal boundary. We study the asymptotic behavior of optimal quantizers for absolutely continuous measures as the number of atoms grows to infinity. We compute the limit distribution of the corresponding point clouds and show in particular a branched transport version of Zador’s theorem. Moreover, we establish uniformity bounds of optimal quantizers in terms of separation distance and covering radius of the atoms, when the measure is -Ahlfors regular. A crucial technical tool is the uniform in Hölder regularity of the landscape function, a branched transport analogue to Kantorovich potentials in classical optimal transport.

Equivariant neural networks exploit underlying task symmetries to improve generalization, but strict equivariance constraints can induce more complex optimization dynamics that can hinder learning. Prior work addresses these limitations by relaxing strict equivariance during training, but typically relies on prespecified, explicit, or implicit target levels of relaxation for each network layer, which are task-dependent and costly to tune. We propose Recurrent Equivariant Constraint Modulation (RECM), a layer-wise constraint modulation mechanism that learns appropriate relaxation levels solely from the training signal and the symmetry properties of each layer's input-target distribution, without requiring any prior knowledge about the task-dependent target relaxation level. We demonstrate that under the proposed RECM update, the relaxation level of each layer provably converges to a value upper-bounded by its symmetry gap, namely the degree to which its input-target distribution deviates from exact symmetry. Consequently, layers processing symmetric distributions recover full equivariance, while those with approximate symmetries retain sufficient flexibility to learn non-symmetric solutions when warranted by the data. Empirically, RECM outperforms prior methods across diverse exact and approximate equivariant tasks, including the challenging molecular conformer generation on the GEOM-Drugs dataset.

How informative are preschoolers’ speech vocalizations? Preschoolers’ speech is often imprecise, highly variable, and hard for humans and machines to interpret; consequently, its predictive value for later developmental outcomes remains largely underexplored. Here, we analyzed 6,595 brief vocalizations (0.5–5 s) from 127 preschoolers aged 3–4 years, including 74 children with diagnosed language delay, recorded in naturalistic environments. The vocalization models robustly distinguished children with and without language delay (ROC-AUC: 0.90), beyond the acoustic properties of the recordings (ROC-AUC: 0.62), and outperformed similar models analyzing metadata that the literature reports as predictive factors for early language development (ROC-AUC: < 0.69 [95% CI: 0.08 - 0.15 to 0.48 - 0.73], P < 0.001). This indicates that neural networks applied to foundation-model audio vectorizations can extract meaningful developmental markers from brief samples of immature speech to classify speech status, offering a promising, scalable approach to early screening of language abilities.

Agencia Nacional de Investigación y Desarrollo
Edificio de Innovación UC, Piso 2
Vicuña Mackenna 4860
Macul, Chile