Efficient Neural Music Generation
Recent progress in music generation has been remarkably advanced by the state-of-the-art MusicLM, which comprises a hierarchy of three LMs for semantic, coarse acoustic, and fine acoustic modeling, respectively. Yet, sampling with MusicLM requires processing through these LMs one by one to obtain the fine-grained acoustic tokens, making it computationally expensive and prohibitive for real-time generation. Efficient music generation with quality on par with MusicLM remains a significant challenge. In this paper, we present MeLoDy (M for music; L for LM; D for diffusion), an LM-guided diffusion model that generates music audio of state-of-the-art quality while reducing 95.7% to 99.6% of the forward passes in MusicLM for sampling 10s to 30s of music, respectively. MeLoDy inherits the highest-level LM from MusicLM for semantic modeling, and applies a novel dual-path diffusion (DPD) model and an audio VAE-GAN to efficiently decode the conditioning semantic tokens into waveform. DPD is proposed to simultaneously model the coarse and fine acoustics by effectively incorporating the semantic information into segments of latents via cross-attention at each denoising step. Our experimental results suggest the superiority of MeLoDy, not only in its practical advantages of sampling speed and infinitely continuable generation, but also in its state-of-the-art musicality, audio quality, and text correlation.
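As a rough illustration of the conditioning mechanism described in the abstract, the sketch below shows a single cross-attention read in which a segment of diffusion latents queries the semantic tokens. All shapes, projection matrices, and the residual update are hypothetical choices for illustration, not MeLoDy's actual architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def cross_attend(latents, semantic, Wq, Wk, Wv):
    """Latent queries attend over semantic-token keys/values (residual update)."""
    q = latents @ Wq                                  # (L, d) queries
    k = semantic @ Wk                                 # (S, d) keys
    v = semantic @ Wv                                 # (S, d) values
    attn = softmax(q @ k.T / np.sqrt(q.shape[-1]))    # (L, S) attention weights
    return latents + attn @ v                         # conditioned latents

rng = np.random.default_rng(0)
d = 8
latents = rng.normal(size=(16, d))    # one segment of diffusion latents
semantic = rng.normal(size=(50, d))   # semantic tokens from the LM
Wq, Wk, Wv = (rng.normal(size=(d, d)) for _ in range(3))
out = cross_attend(latents, semantic, Wq, Wk, Wv)   # same shape as latents
```

Each denoising step would apply such a read so that every latent segment can pull in the relevant semantic context.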
Corruption-Robust Linear Bandits: Minimax Optimality and Gap-Dependent Misspecification
In linear bandits, how can a learner effectively learn when facing corrupted rewards? While significant work has explored this question, a holistic understanding across different adversarial models and corruption measures is lacking, as is a full characterization of the minimax regret bounds. In this work, we compare two types of corruptions commonly considered: strong corruption, where the corruption level depends on the learner's chosen action, and weak corruption, where the corruption level does not depend on the learner's chosen action. We provide a unified framework to analyze these corruptions. For stochastic linear bandits, we fully characterize the gap between the minimax regret under strong and weak corruptions.
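The distinction between the two adversarial models can be made concrete with a toy sketch. The reward parameter `theta`, arm set, and corruption rules below are hypothetical and purely illustrative:

```python
import numpy as np

theta = np.array([1.0, -0.5])   # unknown reward parameter (hypothetical)

def corruption(x, kind, level, round_seed):
    """Corruption added to the mean reward <x, theta> at a given round.

    weak:   committed before the learner acts -- depends only on the round.
    strong: chosen after seeing the action x -- may target the chosen arm.
    """
    r = np.random.default_rng(round_seed)
    if kind == "weak":
        return level * r.choice([-1.0, 1.0])    # same value for every action
    if kind == "strong":
        return -level * np.sign(x @ theta)      # pushes the chosen arm's reward down
    return 0.0

arms = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]
# weak corruption is identical across arms within the same round:
cw = [corruption(x, "weak", 1.0, round_seed=7) for x in arms]
# strong corruption can differ depending on the action actually played:
cs = [corruption(x, "strong", 1.0, round_seed=7) for x in arms]
```

Within one round, `cw` is the same for both arms while `cs` differs, which is exactly the action-dependence separating the two models.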
Neural Hybrid Automata: Supplementary Material
A.1 Neural Hybrid Automata: Modules and Hyperparameters
A.2 Gradient Pathologies
A.1 Neural Hybrid Automata: Modules and Hyperparameters
We provide a notation and summary table for Neural Hybrid Automata (NHA). The table serves as a quick reference for the core concepts introduced in the main text. The only NHA hyperparameter beyond module architectural choices is m, the number of latent modes provided to the model at initialization. Performance effects of changing m have been explored in Section 5.2 and Appendix B.2. Appendix B.2 further analyzes potential techniques to prune additional modes.

A.2 Gradient Pathologies
We provide some theoretical insights on the phenomenon of gradient pathologies with the simple example of a one-dimensional linear hybrid system with two modes and one timed jump,

    ẋ = a x,  t < t₁
    ẋ = b x,  t ≥ t₁        (A.1)

If the solver misplaces the jump time t₁, mode b is applied on part of the interval where mode a should be active. This, in turn, affects the gradients for b, which turn out different from 0 despite the fact that b, from (A.1), should not affect the solution at points t < t₁. In nonlinear systems with multiple events (including stochasticity), these effects can have a great empirical effect on a training procedure.
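A minimal numerical sketch of this pathology, assuming a two-mode system ẋ = a x for t < t₁ and ẋ = b x for t ≥ t₁, and using a smoothed (sigmoid-blended) mode switch as a stand-in for imprecise event handling: with a hard switch, the solution before t₁ is exactly independent of b, while the relaxed switch leaks a spurious nonzero gradient for b.

```python
import numpy as np

a, b, t1 = -1.0, 2.0, 0.5   # hypothetical mode parameters and jump time
dt, T = 0.01, 1.0

def solve(b, soft=False, temp=10.0):
    """Forward-Euler solution of the two-mode system; soft=True blends
    the modes with a sigmoid gate instead of switching exactly at t1."""
    x, t, traj = 1.0, 0.0, []
    while t < T:
        if soft:
            g = 1.0 / (1.0 + np.exp(-temp * (t - t1)))  # relaxed event
            f = (1 - g) * a * x + g * b * x
        else:
            f = (a if t < t1 else b) * x                # exact event
        x += dt * f
        traj.append((t, x))
        t += dt
    return traj

def x_at(b, soft, t_query=0.25):
    """Solution value at a time strictly before the jump."""
    traj = solve(b, soft)
    return min(traj, key=lambda p: abs(p[0] - t_query))[1]

# finite-difference sensitivity of x(0.25) w.r.t. b
eps = 1e-4
grad_hard = (x_at(b + eps, False) - x_at(b - eps, False)) / (2 * eps)
grad_soft = (x_at(b + eps, True) - x_at(b - eps, True)) / (2 * eps)
# grad_hard is exactly 0; grad_soft is spuriously nonzero
```

The hard-switch gradient vanishes because b never enters the dynamics before t₁; the relaxed switch makes b affect every step, producing the pathological nonzero gradient described above.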
Causal Discovery from Event Sequences by Local Cause-Effect Attribution
Sequences of events, such as crashes in the stock market or outages in a network, contain strong temporal dependencies, whose understanding is crucial to react to and influence future events. In this paper, we study the problem of discovering the underlying causal structure from event sequences. To this end, we introduce a new causal model, where individual events of the cause trigger events of the effect with dynamic delays. We show that in contrast to existing methods based on Granger causality, our model is identifiable for both instant and delayed effects. We base our approach on the Algorithmic Markov Condition, by which we identify the true causal network as the one that minimizes the Kolmogorov complexity. As the Kolmogorov complexity is not computable, we instantiate our model using Minimum Description Length and show that the resulting score identifies the causal direction.
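A toy sketch of the MDL idea: generate effect events by delaying cause events, then compare the code length of the back-tracked delays in both directions. The Gaussian delay model and the two-part code below are hypothetical simplifications for illustration, not the paper's actual score:

```python
import numpy as np

rng = np.random.default_rng(1)

# toy generator: each cause event triggers one effect event after a dynamic delay
cause = np.arange(60) * 5.0 + rng.uniform(0, 1, 60)   # well-separated cause events
delays = rng.normal(2.0, 0.1, size=60)                # tightly concentrated delays
effect = np.sort(cause + delays)

def nearest_prev_delay(src, dst):
    """Delay from each dst event back to its nearest preceding src event."""
    idx = np.searchsorted(src, dst, side="right") - 1
    ok = idx >= 0
    return dst[ok] - src[idx[ok]]

def gaussian_code_length(d):
    """Bits to encode delays under a fitted Gaussian (two-part MDL proxy;
    differential, so the value can be negative)."""
    var = d.var() + 1e-12
    nll = 0.5 * d.size * (np.log2(2 * np.pi * var) + 1 / np.log(2))
    return nll + np.log2(d.size)   # + cost of the two parameters

L_cd = gaussian_code_length(nearest_prev_delay(cause, effect))  # cause -> effect
L_dc = gaussian_code_length(nearest_prev_delay(effect, cause))  # effect -> cause
```

In the true direction the delays are tightly concentrated and compress well, so `L_cd < L_dc` and the shorter code identifies the causal direction, mirroring the score-based identification in the abstract.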
Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in LLMs
In the face of uncertainty, the ability to seek information is of fundamental importance. In many practical applications, such as medical diagnosis and troubleshooting, the information needed to solve the task is not initially given, and has to be actively sought by asking follow-up questions (for example, a doctor asking a patient for more details about their symptoms). In this work, we introduce Uncertainty of Thoughts (UoT), an algorithm to augment large language models with the ability to actively seek information by asking effective questions. UoT combines 1) an uncertainty-aware simulation approach, which enables the model to simulate possible future scenarios and how likely they are to occur, 2) uncertainty-based rewards motivated by information gain, which incentivize the model to seek information, and 3) a reward propagation scheme to select the optimal question to ask in a way that maximizes the expected reward. In experiments on medical diagnosis, troubleshooting, and the '20 Questions' game, UoT achieves an average performance improvement of 38.1% in the rate of successful task completion across multiple LLMs compared with direct prompting, and also improves efficiency (i.e., the number of questions needed to complete the task).
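The uncertainty-based reward in step 2) is motivated by information gain. A minimal sketch of expected entropy reduction for a yes/no question over a set of candidate hypotheses (a hypothetical interface, not UoT's full simulation or reward propagation):

```python
import math

def entropy(p):
    """Shannon entropy in bits of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def expected_info_gain(prior, p_yes_given_hyp):
    """Expected entropy reduction from asking one yes/no question.

    prior: P(hypothesis); p_yes_given_hyp: P(answer = yes | hypothesis).
    """
    h0 = entropy(prior)
    gain = 0.0
    for ans in (True, False):
        joint = [p * (a if ans else 1 - a) for p, a in zip(prior, p_yes_given_hyp)]
        pz = sum(joint)                     # marginal probability of this answer
        if pz == 0:
            continue
        post = [j / pz for j in joint]      # posterior over hypotheses
        gain += pz * (h0 - entropy(post))   # weighted entropy reduction
    return gain

# a question that splits 4 equally likely hypotheses in half yields 1 bit
prior = [0.25] * 4
split_question = [1.0, 1.0, 0.0, 0.0]
gain = expected_info_gain(prior, split_question)   # -> 1.0 bit
```

Selecting the question with the highest such expected gain is the information-seeking intuition the abstract describes; UoT additionally propagates these rewards through simulated future scenarios.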