Annie Chu¹ ², Hugo Flores-García², Oriol Nieto¹, Justin Salamon¹, Bryan Pardo², Prem Seetharaman¹
¹ Adobe Research ² Northwestern University
This is the supplementary page containing listening examples for the ICASSP 2026 paper submission “Mix2Morph: Learning Sound Morphing from Noisy Mixes”.
We introduce Mix2Morph, a text-to-audio diffusion model fine-tuned to perform sound morphing without a dedicated morph dataset. By fine-tuning on noisy surrogate mixes at higher diffusion timesteps, Mix2Morph yields stable, perceptually coherent morphs that convincingly integrate qualities of both sources. We specifically target sound infusion, a practically and perceptually motivated subclass of morphing in which one sound acts as the dominant primary source, providing overall temporal and structural behavior, while a secondary sound is infused throughout, enriching its timbral and textural qualities. Objective evaluations and listening tests show that Mix2Morph outperforms prior baselines and produces high-quality sound infusions across diverse categories, representing a step toward more controllable and creative tools for sound design.
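To make the idea of fine-tuning on noisy surrogate mixes at higher diffusion timesteps more concrete, here is a minimal sketch. It is not the paper's implementation. The function names, the cutoff `t_min=0.5`, the linear noising rule, and the use of plain Python lists in place of audio latents are all assumptions chosen for illustration.

```python
import random


def sample_high_timestep(t_min=0.5, t_max=1.0, rng=random):
    """Draw a timestep from the noisier part of a [0, 1] diffusion schedule.

    t_min is a hypothetical cutoff, not a value taken from the paper.
    """
    if not 0.0 <= t_min <= t_max <= 1.0:
        raise ValueError("expected 0 <= t_min <= t_max <= 1")
    return rng.uniform(t_min, t_max)


def add_noise(latent, t, rng=random):
    """Blend a latent toward Gaussian noise: t=0 keeps it clean, t=1 is pure noise.

    Linear interpolation is an assumed noising rule, used here for brevity.
    """
    return [(1.0 - t) * x + t * rng.gauss(0.0, 1.0) for x in latent]


def surrogate_training_example(primary, secondary, rng=random):
    """Build one (noisy input, timestep, clean target) triple from a simple mix."""
    # The plain sum of the two sources stands in for an unavailable "true" morph.
    mix = [a + b for a, b in zip(primary, secondary)]
    # Only high-noise timesteps are sampled for this surrogate data.
    t = sample_high_timestep(rng=rng)
    return add_noise(mix, t, rng), t, mix
```

Under these assumptions, the model only ever sees the mix through heavy noise, so one would expect it to learn coarse structure from the mix rather than its fine detail.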

Source sounds: wav1 (balls bouncing), wav2 (808s bass)
Mix variants: None (Simple Mix), +RMS-only, +Spectral-only, +Both
Mix audio: MIXtrickspec_mixed_eq_and_rms.wav
Generated Infusion: morph_viz_morph_ballboucinglike808s.wav
<aside> 🗣
Prompts follow the format “behavior of [primary source] with timbre like [secondary source]”.
Listener Likert ratings are shown below each example where applicable.
</aside>
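The prompt format above can be captured in a small helper. This is an illustrative sketch only: `infusion_prompt` and `parse_infusion_prompt` are hypothetical names, not part of any released Mix2Morph code.

```python
import re

# Mirrors the template "behavior of [primary source] with timbre like [secondary source]".
_PROMPT_RE = re.compile(
    r"^behavior of (?P<primary>.+?) with timbre like (?P<secondary>.+)$"
)


def infusion_prompt(primary: str, secondary: str) -> str:
    """Format an infusion prompt from a primary and a secondary sound description."""
    return f"behavior of {primary} with timbre like {secondary}"


def parse_infusion_prompt(prompt: str) -> tuple[str, str]:
    """Recover (primary, secondary) from a prompt written in the template above."""
    match = _PROMPT_RE.match(prompt.strip())
    if match is None:
        raise ValueError(f"not an infusion prompt: {prompt!r}")
    return match.group("primary"), match.group("secondary")
```

For example, `infusion_prompt("monster growling", "motorcycle revving")` yields the first prompt in the comparison below.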
| Infusion Prompt | Mix2Morph (ours) | Latent Granular ReSynthesis (LGrS) | MorphFader | Simple Mixing | SoundMorpher | Base Model (no fine-tuning) |
| --- | --- | --- | --- | --- | --- | --- |
| behavior of monster growling with timbre like motorcycle revving | c2likec1_monster_growling_like_motorcycle_revving.wav (4.6/5) | c2likec1_c1_motorcycle_revving_c2_monster_growling.wav (2.1/5) | c2likec1_monster_growling_like_motorcycle_revving.wav (2.0/5) | c2likec1_c1_motorcycle_revving_c2_monster_growling.wav (1.9/5) | c1likec2_midpoint_0.7060546875.wav | c2likec1_monster_growling_like_motorcycle_revving.wav |
behavior of water dripping with timbre like cowbell clang