




2022 DOPE

Neural Information Processing Systems

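At each step $h \in [H]$ in an episode $k$, the algorithm observes state $s_{h,k}$, selects action $a_{h,k} \sim \pi_{h,k}(s_{h,k}, \cdot)$, and incurs costs $r_h(s_{h,k}, a_{h,k})$ and $c_h(s_{h,k}, a_{h,k})$. We will also show that $\pi_k$ from (10) (once it becomes feasible) will indeed be a safe policy (see Proposition 5).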




VoiceMixer: Adversarial Voice Style Mixup

Neural Information Processing Systems

In this paper, we present VoiceMixer, which can effectively decompose and transfer voice style through a novel information bottleneck and adversarial feedback.





ZeroS: Zero-Sum Linear Attention for Efficient Transformers

arXiv.org Machine Learning

Linear attention methods offer Transformers $O(N)$ complexity but typically underperform standard softmax attention. We identify two fundamental limitations affecting these approaches: the restriction to convex combinations that only permits additive information blending, and uniform accumulated weight bias that dilutes attention in long contexts. We propose Zero-Sum Linear Attention (ZeroS), which addresses these limitations by removing the constant zero-order term $1/t$ and reweighting the remaining zero-sum softmax residuals. This modification creates mathematically stable weights, enabling both positive and negative values and allowing a single attention layer to perform contrastive operations. While maintaining $O(N)$ complexity, ZeroS theoretically expands the set of representable functions compared to convex combinations. Empirically, it matches or exceeds standard softmax attention across various sequence modeling benchmarks.
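The zero-sum property mentioned in the abstract can be illustrated with a minimal sketch. It assumes that the "residual" is the softmax weight at each of $t$ positions minus the uniform term $1/t$; that reading, the helper names, and the example logits are assumptions for illustration. The sketch does not reproduce the paper's reweighting step or its $O(N)$ computation.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_sum_residuals(scores):
    """Softmax weights with the uniform term 1/t subtracted (assumed reading).

    Because softmax weights sum to 1 and t copies of 1/t also sum to 1,
    the residuals sum to 0 and can take both signs.
    """
    t = len(scores)
    return [w - 1.0 / t for w in softmax(scores)]

# Hypothetical attention logits for t = 4 positions.
scores = [2.0, 0.5, -1.0, 0.0]
weights = softmax(scores)
residuals = zero_sum_residuals(scores)

print("softmax weights:", [round(w, 3) for w in weights])
print("zero-sum residuals:", [round(r, 3) for r in residuals])
print("residual sum:", round(sum(residuals), 12))
```

Ordinary softmax weights are all non-negative, so they can only form convex combinations of the values. The residuals contain both positive and negative entries, which is what allows a weighted sum to subtract one value from another.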