[Submitted on 20 Oct 2025]
Abstract: We propose an extension of the decoder Transformer that conditions its generative process on random latent variables, which are learned without supervision through a variational procedure. Experimental evaluations show that this conditioning translates into substantial improvements on downstream tasks.

Submission history
From: Francois Fleuret
[v1] Mon, 20 Oct 2025 14:05:30 UTC (502 KB)