Robust One-step Speech Enhancement via Consistency Distillation (ROSE-CD)

School of Engineering and Computer Science, Victoria University of Wellington, New Zealand

We present ROSE-CD: Robust One-step Speech Enhancement via Consistency Distillation, a novel approach for distilling a one-step consistency model. Specifically, we introduce a randomized learning trajectory to improve the model’s robustness to noise. Furthermore, we jointly optimize the one-step model with two time-domain auxiliary losses, enabling it to recover from teacher-induced errors and surpass the teacher model in overall performance. This is the first pure one-step consistency distillation model for diffusion-based speech enhancement, achieving 54 times faster inference and better performance than its 30-step teacher model. Experiments on the VoiceBank-DEMAND dataset demonstrate that the proposed model achieves state-of-the-art performance in terms of speech quality. Moreover, its generalization ability is validated on both an out-of-domain dataset and real-world noisy recordings.
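The sample labels further down this page ("PESQ only", "SI-SDR only") suggest that the two time-domain auxiliary losses are based on PESQ and SI-SDR; that reading is an inference from the labels, not something the abstract states. As a rough illustration of the SI-SDR side only, the sketch below computes the standard scale-invariant signal-to-distortion ratio between an enhanced waveform and a clean reference, and negates it to obtain a quantity that could be minimized. The function names `si_sdr` and `si_sdr_loss`, the `eps` stabilizer, and the mean removal step are assumptions made for this example and are not taken from the ROSE-CD implementation.

```python
import math


def si_sdr(estimate, reference, eps=1e-8):
    """Scale-invariant SDR in dB between two equal-length waveforms.

    Hypothetical helper for illustration; uses the textbook definition:
    project the estimate onto the reference, then compare the energy of
    that projection with the energy of what is left over.
    """
    if len(estimate) != len(reference):
        raise ValueError("estimate and reference must have the same length")
    n = len(reference)

    # Remove the mean so that a constant offset does not affect the score.
    est_mean = sum(estimate) / n
    ref_mean = sum(reference) / n
    est = [x - est_mean for x in estimate]
    ref = [x - ref_mean for x in reference]

    # Optimal scaling of the reference that best explains the estimate.
    ref_energy = sum(r * r for r in ref)
    scale = sum(e * r for e, r in zip(est, ref)) / (ref_energy + eps)

    # Target component and residual distortion.
    target_energy = sum((scale * r) ** 2 for r in ref)
    residual_energy = sum((e - scale * r) ** 2 for e, r in zip(est, ref))

    return 10.0 * math.log10((target_energy + eps) / (residual_energy + eps))


def si_sdr_loss(estimate, reference):
    """Negative SI-SDR, so that lower values mean better enhancement."""
    return -si_sdr(estimate, reference)


if __name__ == "__main__":
    # Toy signals: a sinusoid as "clean speech" plus a small disturbance.
    clean = [math.sin(0.05 * i) for i in range(2000)]
    enhanced = [c + 0.1 * math.cos(1.3 * i) for i, c in enumerate(clean)]
    print(f"SI-SDR: {si_sdr(enhanced, clean):.2f} dB")
```

Because the estimate is projected onto the reference before the ratio is taken, rescaling the enhanced waveform leaves the score unchanged, which is the property that makes SI-SDR a common time-domain objective. In a training setup the same computation would be written with a differentiable tensor library rather than plain Python lists.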

Overview of the proposed robust consistency distillation (RCD)

Summary of Results Using ROSE-CD

Summary of results using ROSE-CD on VB-DMD dataset.
Summary of results using ROSE-CD on TIMIT+NOISE92 dataset.
Summary of results using ROSE-CD on DNS Challenge 2020 dataset.

VB-DMD Samples

Sample 1

Noisy Input

Clean Reference

SGMSE

ROSE-CD (PESQ only)

ROSE-CD (SI-SDR only)

ROSE-CD (Final)

Sample 2

Noisy Input

Clean Reference

SGMSE

ROSE-CD (PESQ only)

ROSE-CD (SI-SDR only)

ROSE-CD (Final)

Sample 3

Noisy Input

Clean Reference

SGMSE

ROSE-CD (PESQ only)

ROSE-CD (SI-SDR only)

ROSE-CD (Final)

DNS300 Samples

Sample 1

Noisy Input

SGMSE

ROSE-CD (PESQ only)

ROSE-CD (SI-SDR only)

ROSE-CD (Final)

Sample 2

Noisy Input

SGMSE

ROSE-CD (PESQ only)

ROSE-CD (SI-SDR only)

ROSE-CD (Final)

Sample 3

Noisy Input

SGMSE

ROSE-CD (PESQ only)

ROSE-CD (SI-SDR only)

ROSE-CD (Final)

BibTeX

@inproceedings{xu2025rosecd,
  title        = {Robust One-step Speech Enhancement via Consistency Distillation},
  author       = {Liang Xu and Longfei Felix Yan and W. Bastiaan Kleijn},
  booktitle    = {IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA)},
  year         = {2025},
  organization = {IEEE}
}