MoLF: Mixture-of-Latent-Flow

Pan-Cancer Spatial Gene Expression Prediction from Histology

1National Center for Tumor Diseases Dresden (NCT), Germany    2German Cancer Research Center (DKFZ)    3Dresden University of Technology, Germany

TL;DR

We introduce MoLF (Mixture-of-Latent-Flow), a generative model for scalable pan-cancer histogenomics. By leveraging a Mixture-of-Experts (MoE) velocity field within a conditional Flow Matching framework, MoLF effectively decouples diverse tissue patterns. It establishes new state-of-the-art performance and demonstrates robust zero-shot generalization to cross-species data.

Overview of Framework

MoLF (Mixture-of-Latent-Flow) Overview

Motivation & Method

Experiments & Results

State-of-the-Art Performance

We evaluated MoLF on the HEST-1k pan-cancer benchmark. Our results show that MoLF consistently outperforms current state-of-the-art methods:

  • Highly Variable Genes (HVG): Achieves the highest predictive correlation across distinct cancer types.
  • Hallmark Pathways: Superior performance across low-, mid-, and high-variance gene tiers.
SOTA Performance Table
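The predictive-correlation metric above is the per-gene Pearson correlation between predicted and measured expression across spots. A minimal sketch of that computation (array names, shapes, and the synthetic data are illustrative assumptions, not the benchmark's evaluation code):

```python
import numpy as np

def per_gene_pearson(pred, truth):
    """Pearson correlation for each gene.

    pred, truth: (n_spots, n_genes) arrays of predicted and
    measured expression. Returns an (n_genes,) array of r values.
    """
    pred_c = pred - pred.mean(axis=0)
    truth_c = truth - truth.mean(axis=0)
    num = (pred_c * truth_c).sum(axis=0)
    denom = np.sqrt((pred_c ** 2).sum(axis=0) * (truth_c ** 2).sum(axis=0))
    return num / np.maximum(denom, 1e-12)  # guard against zero-variance genes

rng = np.random.default_rng(0)
truth = rng.normal(size=(100, 5))                    # 100 spots, 5 genes
pred = truth + 0.1 * rng.normal(size=(100, 5))       # nearly perfect predictions
r = per_gene_pearson(pred, truth)
```

In practice this is typically restricted to the HVG subset and averaged per slide before aggregating over a cancer type.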

Mechanism Validation: Synthetic 8-Gaussian

To isolate inductive bias, we compared MoLF against a dense baseline on a synthetic 8-Gaussian distribution.

The Dense Baseline (Right) suffers from severe averaging artifacts due to shared parameterization.

In contrast, MoLF (Left) mitigates these artifacts by routing inputs to specialized experts, achieving significantly sharper mode separation.

Synthetic Gaussian Experiment

MoLF (Sharper Modes) vs. Dense Baseline (Artifacts)
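The mechanism under test can be sketched compactly: conditional Flow Matching pairs noise x0 with data x1 along the straight path x_t = (1 − t)·x0 + t·x1 (regression target u = x1 − x0), and the MoE velocity field answers with a gate-weighted sum of expert predictions. The following numpy sketch uses random linear experts and a linear gate purely for illustration; it is not the paper's architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, n = 2, 4, 8

# Toy linear experts and gating network (random weights; in MoLF
# these would be learned neural networks over latent codes).
W_exp = rng.normal(size=(n_experts, d, d))
W_gate = rng.normal(size=(d, n_experts))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def moe_velocity(x_t):
    """Gate-weighted mixture of per-expert velocity predictions."""
    gates = softmax(x_t @ W_gate)                     # (n, n_experts)
    expert_v = np.einsum('eij,nj->nei', W_exp, x_t)   # (n, n_experts, d)
    return np.einsum('ne,ned->nd', gates, expert_v)   # (n, d)

# Conditional Flow Matching pairing: straight path from noise to data.
x0 = rng.normal(size=(n, d))          # Gaussian noise
x1 = rng.normal(size=(n, d)) + 3.0    # one synthetic mode
t = rng.uniform(size=(n, 1))
x_t = (1 - t) * x0 + t * x1
u_target = x1 - x0                    # CFM regression target
loss = np.mean((moe_velocity(x_t) - u_target) ** 2)
```

The averaging artifact of a dense baseline corresponds to a single shared map being regressed toward all modes at once; routing lets each expert fit a subset of the transport.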

Geometric Analysis of Generative Transport

We first visualize the global manifold alignment. The initial Gaussian noise (red, left) is transported in a single ODE step to a structured manifold (red, right) that aligns well with the ground-truth gene expression distribution (blue). This confirms that the model successfully approximates the target distribution.

Macro-scale Manifold Alignment


At a finer granularity, tracing individual patch trajectories demonstrates micro-scale precision: the model learns specific paths that transport random noise to the correct target gene expression, effectively solving the conditional optimal-transport problem.

Micro-scale Patch Trajectories

Micro-scale Trajectory Analysis
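Trajectories like these are obtained by numerically integrating the learned ODE dx/dt = v(x, t) from noise at t = 0 to the expression code at t = 1. A minimal Euler-integration sketch, with a hand-picked velocity field standing in for the trained network (the field and target vector are illustrative assumptions):

```python
import numpy as np

def trace_trajectory(x0, velocity, n_steps=100):
    """Euler-integrate dx/dt = v(x, t) from t=0 to t=1,
    recording the full path as an (n_steps + 1, d) array."""
    dt = 1.0 / n_steps
    path = [x0]
    x = x0
    for k in range(n_steps):
        t = k * dt
        x = x + dt * velocity(x, t)
        path.append(x)
    return np.stack(path)

target = np.array([2.0, -1.0])  # stand-in for one patch's gene-expression code

def velocity(x, t):
    # On the straight CFM path the conditional velocity points from the
    # current state toward the target: (target - x) / (1 - t), clipped near t=1.
    return (target - x) / max(1.0 - t, 1e-3)

path = trace_trajectory(np.zeros(2), velocity, n_steps=200)
```

Plotting `path` for many starting points reproduces the kind of noise-to-target trajectories shown in the figure.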

Expert Specialization Analysis

The UMAP visualization below shows latent gene embeddings colored by cancer type (Left) and by the active Expert (Right).

Notably, experts are not segregated by cancer type (e.g., there is no single "Lung Cancer Expert"). Instead, they work collaboratively, with multiple experts contributing to different regions of the latent space across all tissues. This lets MoLF learn shared biological motifs rather than overfit to specific datasets. Crucially, such collaborative experts adapt to new cancer types better than rigid cancer specialists would, because they can flexibly recombine learned biological primitives to characterize unseen tissues.

Expert Specialization UMAP

Distributed Expert Strategy
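One way to quantify this distributed routing is a usage table: for each cancer type, the fraction of spots routed to each expert. A minimal sketch with synthetic top-1 assignments (the arrays are made-up placeholders mimicking the observed non-exclusive pattern, not real routing data):

```python
import numpy as np

rng = np.random.default_rng(0)
n_spots, n_experts, n_types = 1000, 4, 3

cancer_type = rng.integers(0, n_types, size=n_spots)
# Synthetic top-1 expert assignments drawn uniformly, so that
# no expert is exclusive to any single cancer type.
expert = rng.integers(0, n_experts, size=n_spots)

# usage[c, e] = fraction of type-c spots routed to expert e
usage = np.zeros((n_types, n_experts))
for c in range(n_types):
    mask = cancer_type == c
    usage[c] = np.bincount(expert[mask], minlength=n_experts) / mask.sum()
```

A "Lung Cancer Expert" would show up as a row concentrated on one column; the distributed strategy instead yields strictly positive mass in every cell.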

Zero-shot Cross-species Generalization

We evaluated MoLF on a zero-shot task: training on human data and testing on the HEST-1k Mouse Melanoma dataset.

The Pan-Cancer MoLF significantly outperforms the tissue-aligned baseline. This validates our core hypothesis: the diversity of the pan-cancer dataset forces the Mixture-of-Experts to decompose visual signals into fundamental biological primitives. By learning these reusable, invariant motifs, MoLF maintains predictive stability even across species barriers.


Zero-shot Cross-species Generalization

BibTeX

@article{hu2026molf,
  title   = {MoLF: Mixture-of-Latent-Flow for Pan-Cancer Spatial Gene Expression Prediction from Histology},
  author  = {Hu, Susu and Speidel, Stefanie},
  journal = {arXiv preprint arXiv:2602.02282},
  url     = {https://arxiv.org/abs/2602.02282},
  year    = {2026}
}