conditioner
Generating Separated Singing Vocals Using a Diffusion Model Conditioned on Music Mixtures
Plaja-Roglans, Genís, Hung, Yun-Ning, Serra, Xavier, Pereira, Igor
Separating the individual elements in a musical mixture is an essential process for music analysis and practice. While this is generally addressed using neural networks optimized to mask or transform the time-frequency representation of a mixture to extract the target sources, the flexibility and generalization capabilities of generative diffusion models are giving rise to a novel class of solutions for this complicated task. In this work, we explore singing voice separation from real music recordings using a diffusion model which is trained to generate the solo vocals conditioned on the corresponding mixture. Our approach improves upon prior generative systems and achieves competitive objective scores against non-generative baselines when trained with supplementary data. The iterative nature of diffusion sampling enables the user to control the quality-efficiency trade-off, and also refine the output when needed. We present an ablation study of the sampling algorithm, highlighting the effects of the user-configurable parameters.
- Media (0.67)
- Leisure & Entertainment (0.67)
- Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)
- Asia > Philippines > Luzon > National Capital Region > City of Manila (0.04)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
Local Mechanisms of Compositional Generalization in Conditional Diffusion
Conditional diffusion models appear capable of compositional generalization, i.e., generating convincing samples for out-of-distribution combinations of conditioners, but the mechanisms underlying this ability remain unclear. To make this concrete, we study length generalization, the ability to generate images with more objects than seen during training. In a controlled CLEVR setting (Johnson et al., 2017), we find that length generalization is achievable in some cases but not others, suggesting that models only sometimes learn the underlying compositional structure. We then investigate locality as a structural mechanism for compositional generalization. Prior works proposed score locality as a mechanism for creativity in unconditional diffusion models (Kamb & Ganguli, 2024; Niedoba et al., 2024), but did not address flexible conditioning or compositional generalization. In this paper, we prove an exact equivalence between a specific compositional structure ("conditional projective composition") (Bradley et al., 2025) and scores with sparse dependencies on both pixels and conditioners ("local conditional scores"). This theory also extends to feature-space compositionality. We validate our theory empirically: CLEVR models that succeed at length generalization exhibit local conditional scores, while those that fail do not. Furthermore, we show that a causal intervention explicitly enforcing local conditional scores restores length generalization in a previously failing model. Finally, we investigate feature-space compositionality in color-conditioned CLEVR, and find preliminary evidence of compositional structure in SDXL.
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
- North America > United States > California > Santa Clara County > Cupertino (0.04)
The hottest deals on air conditioners to help you keep cool this summer
Cool off your home with one of these air conditioners. Summer weather is getting unbearable in many parts of the U.S., with record-high temperatures all over the country. To beat the heat, investing in an air conditioner is a must. We've lined up some top air conditioner deals, from budget-friendly options to portable solutions and high-tech, smart AC units. Cool large rooms with this portable option.
A Proposal for Networks Capable of Continual Learning
Erden, Zeki Doruk, Faltings, Boi
We analyze the ability of computational units to retain past responses after parameter updates, a key property for system-wide continual learning. Neural networks trained with gradient descent lack this capability, prompting us to propose Modelleyen, an alternative approach with inherent response preservation. We demonstrate through experiments on modeling the dynamics of a simple environment and on MNIST that, despite increased computational complexity and some representational limitations at its current stage, Modelleyen achieves continual learning without relying on sample replay or predefined task boundaries.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)
NeuMC -- a package for neural sampling for lattice field theories
Bialas, Piotr, Korcyl, Piotr, Stebel, Tomasz, Zapolski, Dawid
We present the NeuMC software package, based on PyTorch, aimed at facilitating research on neural samplers in lattice field theories. Neural samplers based on normalizing flows are becoming increasingly popular in the context of Monte-Carlo simulations as they can effectively approximate target probability distributions, possibly alleviating some shortcomings of the Markov chain Monte-Carlo methods. Our package provides tools to create such samplers for two-dimensional field theories.
- Europe > Poland > Lesser Poland Province > Kraków (0.04)
- North America > United States > Illinois > Cook County > Chicago (0.04)
Agential AI for Integrated Continual Learning, Deliberative Behavior, and Comprehensible Models
Erden, Zeki Doruk, Faltings, Boi
The contemporary machine learning paradigm excels at statistical data analysis, solving problems that classical AI could not. However, it faces key limitations, such as a lack of integration with planning, incomprehensible internal structure, and inability to learn continually. We present the initial design for an AI system, Agential AI (AAI), in principle operating independently or on top of statistical methods, designed to overcome these issues. AAI's core is a learning method that models temporal dynamics with guarantees of completeness, minimality, and continual learning, using component-level variation and selection to learn the structure of the environment. It integrates this with a behavior algorithm that plans on a learned model and encapsulates high-level behavior patterns. Preliminary experiments on a simple environment show AAI's effectiveness and potential.
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.68)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
CA-SSLR: Condition-Aware Self-Supervised Learning Representation for Generalized Speech Processing
Lu, Yen-Ju, Liu, Jing, Thebaud, Thomas, Moro-Velazquez, Laureano, Rastrow, Ariya, Dehak, Najim, Villalba, Jesus
We introduce Condition-Aware Self-Supervised Learning Representation (CA-SSLR), a generalist conditioning model broadly applicable to various speech-processing tasks. Compared to standard fine-tuning methods that optimize for downstream models, CA-SSLR integrates language and speaker embeddings from earlier layers, making the SSL model aware of the current language and speaker context. This approach reduces the reliance on input audio features while preserving the integrity of the base SSLR. CA-SSLR improves the model's capabilities and demonstrates its generality on unseen tasks with minimal task-specific tuning. Our method employs linear modulation to dynamically adjust internal representations, enabling fine-grained adaptability without significantly altering the original model behavior. Experiments show that CA-SSLR reduces the number of trainable parameters, mitigates overfitting, and excels in under-resourced and unseen tasks. Specifically, CA-SSLR achieves a 10% relative reduction in LID errors, a 37% improvement in ASR CER on the ML-SUPERB benchmark, and a 27% decrease in SV EER on VoxCeleb-1, demonstrating its effectiveness.
- Research Report > New Finding (0.93)
- Research Report > Promising Solution (0.66)
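The "linear modulation" mentioned in the CA-SSLR abstract can be illustrated with a generic feature-wise scale-and-shift. This is a minimal sketch, not the authors' implementation: the function names `film` and `condition_to_film`, the weight shapes, and the identity-at-zero initialisation are all assumptions made for illustration.

```python
def condition_to_film(cond, w_gamma, w_beta):
    """Map a condition embedding (C floats) to per-channel scale and shift.

    w_gamma and w_beta are hypothetical C x D weight matrices (lists of rows).
    The scale is offset by 1.0 so that all-zero weights leave features unchanged.
    """
    dim = len(w_gamma[0])
    gamma = [1.0 + sum(c * w_gamma[i][d] for i, c in enumerate(cond)) for d in range(dim)]
    beta = [sum(c * w_beta[i][d] for i, c in enumerate(cond)) for d in range(dim)]
    return gamma, beta


def film(features, gamma, beta):
    """Apply gamma * x + beta channel-wise to every frame (each a list of D floats)."""
    return [[g * x + b for x, g, b in zip(frame, gamma, beta)] for frame in features]


# Toy example: two frames of 3-dimensional features, a 2-dimensional condition.
frames = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
cond = [0.5, -1.0]  # stand-in for a language or speaker embedding (assumed)

# With zero modulation weights the base representation passes through untouched.
zero = [[0.0] * 3 for _ in cond]
gamma, beta = condition_to_film(cond, zero, zero)
unchanged = film(frames, gamma, beta)

# A non-zero shift weight moves only the first channel.
w_beta = [[1.0, 0.0, 0.0], [0.0, 0.0, 0.0]]
gamma2, beta2 = condition_to_film(cond, zero, w_beta)
shifted = film(frames, gamma2, beta2)
```

Starting the modulation at the identity is one plausible way to adapt representations without significantly altering the base model's behavior, as the abstract describes; the paper itself may do this differently.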
Audio Classification Using Deep Learning
This dataset contains 8732 labeled sound excerpts (<=4s) of urban sounds from 10 classes: air_conditioner, car_horn, children_playing, dog_bark, drilling, engine_idling, gun_shot, jackhammer, siren, and street_music. The classes are drawn from the urban sound taxonomy. All excerpts are taken from field recordings uploaded to www.freesound.org. The files are pre-sorted into ten folds (folders named fold1-fold10) to help in the reproduction of and comparison with the automatic classification results reported in the article above. AUDIO FILES INCLUDED: 8732 audio files of urban sounds (see description above) in WAV format.
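The pre-sorted folds lend themselves to leave-one-fold-out evaluation. The sketch below only enumerates the train/test folder splits; the folder names fold1-fold10 come from the description above, while the helper name `fold_splits` is hypothetical, and no audio loading or model training is shown.

```python
def fold_splits(n_folds=10):
    """Yield (train_folds, test_fold) pairs, holding out one predefined fold at a time."""
    folds = [f"fold{i}" for i in range(1, n_folds + 1)]
    for held_out in folds:
        train = [f for f in folds if f != held_out]
        yield train, held_out


# Enumerate all ten splits; each uses nine folders for training and one for testing.
splits = list(fold_splits())
for train_folds, test_fold in splits:
    print(f"test on {test_fold}, train on {len(train_folds)} folds")
```

Keeping the predefined folds intact, rather than reshuffling the files, is what makes scores comparable with previously reported results.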