Speech Enhancement Based on Drifting Models

IEEE Interspeech 2026 (under review)
¹ School of Engineering and Computer Science, Victoria University of Wellington, New Zealand
² GN Audio A/S, Denmark

We present DriftSE (Speech Enhancement based on Drifting Models), a novel generative framework that formulates denoising as an equilibrium problem. Rather than relying on iterative sampling, DriftSE natively achieves one-step inference by evolving the pushforward distribution of a mapping function until it matches the clean speech distribution. This evolution is driven by a Drifting Field, a learned correction vector that guides samples toward the high-density regions of the clean distribution. Because the objective matches distributions rather than paired samples, the framework naturally supports training on unpaired data. We investigate the framework under two formulations: a direct mapping from the noisy observation, and a stochastic conditional generative model that starts from a Gaussian prior. Experiments on the VoiceBank-DEMAND benchmark demonstrate that DriftSE achieves high-fidelity enhancement in a single step, outperforming multi-step diffusion baselines and establishing a new paradigm for speech enhancement.
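The abstract does not say how the Drifting Field is computed or trained. As a rough illustration of the equilibrium idea only, the NumPy sketch below assumes a kernel-weighted mean-shift form: each sample is attracted toward nearby clean samples and repelled from nearby generated samples, so the field vanishes once the two sample sets coincide. The function name `drifting_field`, the Gaussian kernel, the `bandwidth` parameter, and the 2-D toy data are all assumptions made for illustration, not the DriftSE implementation.

```python
import numpy as np


def drifting_field(x, clean, generated, bandwidth=1.0):
    """Hypothetical drift: attraction to clean samples minus attraction to generated ones.

    x:         (n, d) points at which the field is evaluated
    clean:     (m, d) samples from the target (clean) distribution
    generated: (k, d) samples from the current pushforward distribution
    """

    def mean_shift(points):
        # Squared Euclidean distance between every query and every point.
        d2 = ((x[:, None, :] - points[None, :, :]) ** 2).sum(axis=-1)
        # Gaussian-kernel weights, normalised per query (a softmax over points).
        logits = -d2 / (2.0 * bandwidth**2)
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        weights = np.exp(logits)
        weights /= weights.sum(axis=1, keepdims=True)
        # Vector from each query to its kernel-weighted mean of the points.
        return weights @ points - x

    return mean_shift(clean) - mean_shift(generated)


rng = np.random.default_rng(0)
clean = rng.normal(0.0, 0.1, size=(64, 2))            # toy "clean" cloud near the origin
generated = rng.normal(0.0, 0.1, size=(64, 2)) + 3.0  # toy model outputs, far from the target

# Away from equilibrium the field points from the generated cloud back toward the clean one.
v_far = drifting_field(generated, clean, generated)

# At equilibrium (generated samples identical to the clean samples) the field is zero.
v_eq = drifting_field(clean, clean, clean)
```

In a training loop built on this kind of field, one would typically regress the mapping's output toward a fixed target such as `x + v`, so that the drift shrinks as the two distributions align. Whether DriftSE uses exactly this objective is not stated in the abstract.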

Overview of the proposed DriftSE

Summary of Results Using DriftSE

Summary of results using DriftSE on the VoiceBank-DEMAND (VB-DMD) dataset.
Summary of results using DriftSE on the DNS Challenge 2020 blind test set.
Distributional convergence analysis.

VB-DMD Samples

Sample 1

Noisy Input

Clean Reference

SGMSE+ (30 steps)

ROSE-CD

DriftSE

Sample 2

Noisy Input

Clean Reference

SGMSE+ (30 steps)

ROSE-CD

DriftSE

Sample 3

Noisy Input

Clean Reference

SGMSE+ (30 steps)

ROSE-CD

DriftSE

Unpaired Samples

Sample 1

Noisy Input

DriftSE (paired)

DriftSE (unpaired, map to VB-Female)

Sample 2

Noisy Input

DriftSE (paired)

DriftSE (unpaired, map to VB-Female)

Sample 3

Noisy Input

DriftSE (paired)

DriftSE (unpaired, map to VB-Female)

BibTeX

@article{xu2026driftse,
  author  = {Xu, Liang and Caviedes-Nozal, Diego and Kleijn, W. Bastiaan and Yan, Longfei Felix and Olsson, Rasmus Kongsgaard},
  title   = {Speech Enhancement Based on Drifting Models},
  journal = {arXiv preprint},
  year    = {2026},
  note    = {Under review}
}