Mircea Petrache

PUBLICATIONS

A key question in the analysis of discrete models for material defects, such as vortices in spin systems and superconductors or isolated dislocations in metals, is whether information on the boundary energy of a domain suffices to control the number of defects in the interior. We present a general combinatorial dipole-removal argument for a large class of discrete models, including XY systems and screw dislocation models, which allows us to prove sharp conditions under which controlled flux and boundary energy guarantee the existence of minimizers with zero or one charge in the interior. The argument uses the max-flow min-cut theorem in combination with an ad hoc duality for planar graphs, and is robust with respect to changes of the function defining the interaction energies.
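The max-flow min-cut theorem invoked above can be illustrated on a small example. The following is a minimal sketch (not the paper's construction): a standard Edmonds-Karp max-flow computation on a tiny directed graph, whose value equals the capacity of the minimum cut separating source from sink; the graph and capacities are hypothetical.

```python
from collections import deque

def max_flow(cap, s, t):
    """Edmonds-Karp: augment along shortest residual paths until none remain."""
    n = len(cap)
    res = [row[:] for row in cap]  # residual capacities
    flow = 0
    while True:
        # BFS for an augmenting path in the residual graph
        parent = [-1] * n
        parent[s] = s
        q = deque([s])
        while q:
            u = q.popleft()
            for v in range(n):
                if parent[v] == -1 and res[u][v] > 0:
                    parent[v] = u
                    q.append(v)
        if parent[t] == -1:  # sink unreachable: flow is maximal
            return flow
        # find the bottleneck capacity along the path
        v, aug = t, float("inf")
        while v != s:
            aug = min(aug, res[parent[v]][v])
            v = parent[v]
        # push the bottleneck amount, updating residual capacities
        v = t
        while v != s:
            u = parent[v]
            res[u][v] -= aug
            res[v][u] += aug
            v = u
        flow += aug

# toy graph: source 0, sink 3; the cut {0,1,2} | {3} has capacity 2+2 = 4
cap = [
    [0, 3, 2, 0],
    [0, 0, 1, 2],
    [0, 0, 0, 2],
    [0, 0, 0, 0],
]
print(max_flow(cap, 0, 3))  # prints 4, matching the min-cut capacity
```

By the theorem, the returned flow value coincides with the minimum total capacity of edges whose removal disconnects the sink from the source.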

Generative modeling over discrete data has recently seen numerous success stories, with applications spanning language modeling, biological sequence design, and graph-structured molecular data. The predominant generative modeling paradigm for discrete data remains autoregressive, with more recent alternatives based on diffusion or flow matching falling short of their impressive performance in continuous settings such as image or video generation. In this work, we introduce Fisher-Flow, a novel flow-matching model for discrete data. Fisher-Flow takes a manifestly geometric perspective by considering categorical distributions over discrete data as points on a statistical manifold equipped with its natural Riemannian metric: the Fisher-Rao metric. As a result, we demonstrate that discrete data itself can be continuously reparameterised to points on the positive orthant of the d-hypersphere S^d_+, which allows us to define flows mapping any source distribution to a target in a principled manner by transporting mass along the closed-form geodesics of S^d_+. Furthermore, the flows learned by Fisher-Flow can be bootstrapped using Riemannian optimal transport, leading to improved training dynamics. We prove that the gradient flow induced by Fisher-Flow is optimal in reducing the forward KL divergence. We evaluate Fisher-Flow on an array of synthetic and diverse real-world benchmarks, including the design of DNA promoter and DNA enhancer sequences. Empirically, we find that Fisher-Flow improves over prior diffusion and flow-matching models on these benchmarks.
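The sphere reparameterisation described above rests on a classical fact: the square-root map sends the probability simplex isometrically (up to a factor) onto the positive orthant of the unit sphere, where Fisher-Rao geodesics become great-circle arcs with a closed form. The following is a minimal sketch of that map and the geodesic interpolation, not the paper's implementation; the function names are illustrative.

```python
import numpy as np

def to_sphere(p):
    # square-root map: categorical distribution -> positive orthant of the unit sphere
    return np.sqrt(p)

def geodesic(u, v, t):
    # closed-form great-circle interpolation between unit vectors u and v
    theta = np.arccos(np.clip(np.dot(u, v), -1.0, 1.0))
    if np.isclose(theta, 0.0):
        return u
    return (np.sin((1 - t) * theta) * u + np.sin(t * theta) * v) / np.sin(theta)

# interpolate halfway between two categorical distributions along the
# Fisher-Rao geodesic, then map back to the simplex by squaring
p = np.array([0.7, 0.2, 0.1])
q = np.array([0.1, 0.1, 0.8])
mid = geodesic(to_sphere(p), to_sphere(q), 0.5) ** 2
```

Because the interpolant stays on the unit sphere, squaring it returns a valid probability vector at every t, which is what makes mass transport along these geodesics well defined.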
