AITopics | remix


Continual Memorization of Factoids in Large Language Models

Chen, Howard, Geng, Jiayi, Bhaskar, Adithya, Friedman, Dan, Chen, Danqi

arXiv.org Artificial Intelligence, Nov-11-2024

Large language models can absorb a massive amount of knowledge through pretraining, but pretraining is inefficient for acquiring long-tailed or specialized facts. Therefore, fine-tuning on specialized or new knowledge that reflects changes in the world has become popular, though it risks disrupting the model's original capabilities. We study this fragility in the context of continual memorization, where the model is trained on a small set of long-tail factoids (factual associations) and must retain these factoids after multiple stages of subsequent training on other datasets. Through extensive experiments, we show that LLMs suffer from forgetting across a wide range of subsequent tasks, and simple replay techniques do not fully prevent forgetting, especially when the factoid datasets are trained in the later stages. We posit that there are two ways to alleviate forgetting: 1) protect the memorization process as the model learns the factoids, or 2) reduce interference from training in later stages. With this insight, we develop an effective mitigation strategy: REMIX (Random and Generic Data Mixing). REMIX prevents forgetting by mixing generic data sampled from pretraining corpora or even randomly generated word sequences during each stage, despite being unrelated to the memorized factoids in the first stage. REMIX can recover performance from severe forgetting, often outperforming replay-based methods that have access to the factoids from the first stage. We then analyze how REMIX alters the learning process and find that successful forgetting prevention is associated with a pattern: the model stores factoids in earlier layers than usual and diversifies the set of layers that store these factoids. The efficacy of REMIX invites further investigation into the underlying dynamics of memorization and forgetting, opening exciting possibilities for future research.
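The data-mixing idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `mix_ratio` knob, the toy vocabulary, and the placeholder factoids are all assumptions made for illustration, and the abstract does not state how much generic or random data is mixed in.

```python
import random


def random_word_sequence(vocab, length, rng):
    """Return a space-joined sequence of words drawn uniformly from vocab."""
    return " ".join(rng.choice(vocab) for _ in range(length))


def mix_stage_data(stage_examples, generic_pool, mix_ratio, rng):
    """Mix unrelated generic examples into the data for one training stage.

    mix_ratio is the number of generic examples added per stage example
    (a hypothetical knob; the abstract does not give a ratio).
    """
    n_generic = int(len(stage_examples) * mix_ratio)
    extra = [rng.choice(generic_pool) for _ in range(n_generic)]
    mixed = list(stage_examples) + extra
    rng.shuffle(mixed)
    return mixed


rng = random.Random(0)
vocab = ["alpha", "river", "seven", "quartz", "lamp", "orbit"]  # toy vocabulary
random_pool = [random_word_sequence(vocab, 8, rng) for _ in range(100)]

# Placeholder factoids standing in for the long-tail facts to memorize.
factoids = [f"Q: Where was person_{i} born? A: city_{i}" for i in range(4)]
mixed_stage = mix_stage_data(factoids, random_pool, mix_ratio=1.0, rng=rng)
```

The same `mix_stage_data` call could be applied to the data of each later stage, with `generic_pool` holding either random word sequences as above or text sampled from a pretraining corpus.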

large language model, machine learning, natural language, (18 more...)

2411.07175

Country:

North America > United States > Kansas (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.82)

Industry: Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning > Rote Learning (1.00)

ReMix: Training Generalized Person Re-identification on a Mixture of Data

Mamedov, Timur, Konushin, Anton, Konushin, Vadim

arXiv.org Artificial Intelligence, Oct-29-2024

Modern person re-identification (Re-ID) methods have a weak generalization ability and experience a major accuracy drop when capturing environments change. This is because existing multi-camera Re-ID datasets are limited in size and diversity, since such data is difficult to obtain. At the same time, enormous volumes of unlabeled single-camera records are available. Such data can be easily collected, and therefore, it is more diverse. Currently, single-camera data is used only for self-supervised pre-training of Re-ID methods. However, the diversity of single-camera data is suppressed by fine-tuning on limited multi-camera data after pre-training. In this paper, we propose ReMix, a generalized Re-ID method jointly trained on a mixture of limited labeled multi-camera and large unlabeled single-camera data. Effective training of our method is achieved through a novel data sampling strategy and new loss functions that are adapted for joint use with both types of data. Experiments show that ReMix has a high generalization ability and outperforms state-of-the-art methods in generalizable person Re-ID. To the best of our knowledge, this is the first work that explores joint training on a mixture of multi-camera and single-camera data in person Re-ID.

person re-identification, remix, single-camera data, (13 more...)

2410.21938

Country:

Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Asia > Russia (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

The first Cadenza challenges: using machine learning competitions to improve music for listeners with a hearing loss

Dabike, Gerardo Roa, Akeroyd, Michael A., Bannister, Scott, Barker, Jon P., Cox, Trevor J., Fazenda, Bruno, Firth, Jennifer, Graetzer, Simone, Greasley, Alinka, Vos, Rebecca R., Whitmer, William M.

arXiv.org Artificial Intelligence, Sep-8-2024

It is well established that listening to music is an issue for those with hearing loss, and hearing aids are not a universal solution. How can machine learning be used to address this? This paper details the first application of the open challenge methodology to use machine learning to improve audio quality of music for those with hearing loss. The first challenge was a stand-alone competition (CAD1) and had 9 entrants. The second was a 2024 ICASSP grand challenge (ICASSP24) and attracted 17 entrants. The challenge tasks concerned demixing and remixing pop/rock music to allow a personalised rebalancing of the instruments in the mix, along with amplification to correct for raised hearing thresholds. The software baselines provided for entrants to build upon used two state-of-the-art demix algorithms: Hybrid Demucs and Open-Unmix. Evaluation of systems was done using the objective metric HAAQI, the Hearing-Aid Audio Quality Index. No entrants improved on the best baseline in CAD1 because there was insufficient room for improvement. Consequently, for ICASSP24 the scenario was made more difficult by using loudspeaker reproduction and specified gains to be applied before remixing. This also made the scenario more useful for listening through hearing aids. Nine entrants scored better than the best ICASSP24 baseline. Most entrants used a refined version of Hybrid Demucs and NAL-R amplification. The highest scoring system combined the outputs of several demixing algorithms in an ensemble approach. These challenges are now open benchmarks for future research with the software and data being freely available.
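The "apply gains, then remix" step mentioned in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the stem names, the sample values, and the choice to express gains in decibels are hypothetical, and neither the source separation itself nor NAL-R amplification is modeled.

```python
def remix(stems, gains_db):
    """Sum separated stems sample by sample after a per-stem gain in dB.

    stems maps a stem name to a list of samples of equal length.
    Stems missing from gains_db are passed through at 0 dB.
    """
    length = len(next(iter(stems.values())))
    out = [0.0] * length
    for name, samples in stems.items():
        gain = 10 ** (gains_db.get(name, 0.0) / 20)  # dB to linear amplitude
        for i, sample in enumerate(samples):
            out[i] += gain * sample
    return out


# Hypothetical stems, as if produced by a demixing model.
stems = {"vocals": [0.1, 0.2, -0.1], "drums": [0.05, -0.05, 0.0]}
gains_db = {"vocals": 6.0, "drums": -6.0}  # boost vocals, attenuate drums
mix = remix(stems, gains_db)
```

With an empty `gains_db`, `remix` reduces to a plain sum of the stems, which is a convenient sanity check for the rebalancing step.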

hearing aids, hearing loss, music, (15 more...)

2409.05095

Country:

Europe > United Kingdom > England > Nottinghamshire > Nottingham (0.14)
North America > United States > Rhode Island (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(8 more...)

Genre:

Research Report > New Finding (0.94)
Research Report > Experimental Study (0.68)

Industry: Health & Medicine > Therapeutic Area > Otolaryngology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

No, Drake's Cover of 'Hey There Delilah' Isn't AI

WIRED, Jun-7-2024, 11:00:00 GMT

As if he didn't have enough to deal with amid his beef with Kendrick Lamar (or perhaps to distract from it), Drake showed up on a remix of parody rapper Snowd4y's cover of Plain White T's "Hey There Delilah," called "Wah Gwan Delilah," that has everyone … perplexed? Let's walk through this together, it's a mess. It had what appeared to be Drake joining the comedian in a series of quips about women and name-checks of Toronto landmarks like the Yonge-Dundas Square mall. As the track spread, it made its way to the Plain White T's themselves, who posted a video on X and TikTok with the caption "too stunned to speak." Frontman Tom Higgenson also says "it's crazy that everybody thinks that it's real," seemingly referencing early rumors that Drake's lyrics were generated using artificial intelligence.

artificial intelligence, drake, social media, (5 more...)

Country: North America > Canada > Ontario > Toronto (0.28)

Industry:

Media > Music (0.74)
Leisure & Entertainment (0.74)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Music Enhancement with Deep Filters: A Technical Report for The ICASSP 2024 Cadenza Challenge

Shao, Keren, Chen, Ke, Dubnov, Shlomo

arXiv.org Artificial Intelligence, Apr-17-2024

In this challenge, we disentangle the deep filters from the original DeepfilterNet and incorporate them into our Spec-UNet-based network to further improve a hybrid Demucs (hdemucs) based remixing pipeline. The motivation behind the use of the deep filter component lies in its potential to better handle temporal fine structures. We demonstrate an incremental improvement in both the Signal-to-Distortion Ratio (SDR) and the Hearing Aid Audio Quality Index (HAAQI) metrics when comparing the performance of hdemucs against different versions of our model.
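For readers unfamiliar with the SDR metric named above, here is a minimal sketch of one common definition: ten times the base-10 logarithm of reference energy over error energy. The report may use a different SDR variant, and the function name, the `eps` guard, and the sample values are assumptions made for illustration.

```python
import math


def sdr_db(reference, estimate, eps=1e-12):
    """Signal-to-Distortion Ratio in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    eps guards against division by zero and log of zero.
    """
    signal = sum(r * r for r in reference)
    error = sum((r - e) ** 2 for r, e in zip(reference, estimate))
    return 10 * math.log10((signal + eps) / (error + eps))


reference = [1.0, 0.0, -1.0, 0.0]
estimate = [0.9, 0.0, -0.9, 0.0]  # estimate with a 10% amplitude error
score = sdr_db(reference, estimate)
```

A higher value means the estimate is closer to the reference; an amplitude error of 10% on every sample corresponds to roughly 20 dB under this definition.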

cadenza challenge, icassp 2024, remix, (12 more...)

2404.11116

Country: North America > United States > California > San Diego County > San Diego (0.05)

Genre: Research Report (0.65)

Industry: Health & Medicine > Therapeutic Area (0.41)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.48)

From SMOTE to Mixup for Deep Imbalanced Classification

Cheng, Wei-Chao, Mai, Tan-Ha, Lin, Hsuan-Tien

arXiv.org Artificial Intelligence, Nov-3-2023

Given imbalanced data, it is hard to train a good classifier using deep learning because of the poor generalization of minority classes. Traditionally, the well-known synthetic minority oversampling technique (SMOTE) for data augmentation, a data mining approach for imbalanced learning, has been used to improve this generalization. However, it is unclear whether SMOTE also benefits deep learning. In this work, we study why the original SMOTE is insufficient for deep learning, and enhance SMOTE using soft labels. Connecting the resulting soft SMOTE with Mixup, a modern data augmentation technique, leads to a unified framework that puts traditional and modern data augmentation techniques under the same umbrella. A careful study within this framework shows that Mixup improves generalization by implicitly achieving uneven margins between majority and minority classes. We then propose a novel margin-aware Mixup technique that more explicitly achieves uneven margins. Extensive experimental results demonstrate that our proposed technique yields state-of-the-art performance on deep imbalanced classification while achieving superior performance on extremely imbalanced data. The code is open-sourced in our developed package https://github.com/ntucllab/imbalanced-DL to foster future research in this direction.
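The link between SMOTE and Mixup that this abstract draws can be made concrete with a small sketch of the two textbook operations. It does not reproduce the paper's soft SMOTE or its margin-aware Mixup; the function names and sample values are hypothetical, and the nearest-neighbor search that SMOTE normally uses to pick `neighbor` is omitted.

```python
import random


def interpolate(a, b, lam):
    """Convex combination lam * a + (1 - lam) * b of two equal-length vectors."""
    return [lam * x + (1 - lam) * y for x, y in zip(a, b)]


def smote_sample(x, neighbor, rng):
    """SMOTE-style synthetic point between a minority sample and a
    same-class neighbor; the hard class label is left unchanged."""
    return interpolate(x, neighbor, rng.random())


def mixup_sample(x_i, y_i, x_j, y_j, alpha, rng):
    """Mixup: interpolate both inputs and one-hot labels with
    lam drawn from Beta(alpha, alpha), which yields a soft label."""
    lam = rng.betavariate(alpha, alpha)
    return interpolate(x_i, x_j, lam), interpolate(y_i, y_j, lam), lam


rng = random.Random(0)
synthetic = smote_sample([0.0, 0.0], [2.0, 2.0], rng)
x_mix, y_mix, lam = mixup_sample(
    [0.0, 0.0], [1.0, 0.0],  # sample and one-hot label of class 0
    [2.0, 4.0], [0.0, 1.0],  # sample and one-hot label of class 1
    alpha=1.0, rng=rng,
)
```

Both operations share the same interpolation of inputs. The difference in this sketch is that Mixup also interpolates the labels, which is where soft labels enter.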

minority class, mixup, smote, (14 more...)

2308.15457

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Taiwan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > France (0.04)

Genre: Research Report (0.84)

Industry: Law Enforcement & Public Safety > Fraud (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Moûsai: Text-to-Music Generation with Long-Context Latent Diffusion

Schneider, Flavio, Kamal, Ojasv, Jin, Zhijing, Schölkopf, Bernhard

arXiv.org Artificial Intelligence, Oct-23-2023

Recent years have seen the rapid development of large generative models for text; however, much less research has explored the connection between text and another "language" of communication -- music. Music, much like text, can convey emotions, stories, and ideas, and has its own unique structure and syntax. In our work, we bridge text and music via a text-to-music generation model that is highly efficient, expressive, and can handle long-term structure. Specifically, we develop Moûsai, a cascading two-stage latent diffusion model that can generate multiple minutes of high-quality stereo music at 48kHz from textual descriptions. Moreover, our model features high efficiency, which enables real-time inference on a single consumer GPU with a reasonable speed. Through experiments and property analyses, we show our model's competence over a variety of criteria compared with existing music generation models. Lastly, to promote the open-source culture, we provide a collection of open-source libraries with the hope of facilitating future work in the field. We open-source the following: Codes: https://github.com/archinetai/audio-diffusion-pytorch; music samples for this paper: http://bit.ly/44ozWDH; all music samples for all models: https://bit.ly/audio-diffusion.

deluxe, diffusion model, music, (15 more...)

2301.11757

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Austria (0.04)
(13 more...)

Genre: Research Report (0.50)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Semi-supervised Relation Extraction via Data Augmentation and Consistency-training

Teru, Komal K.

arXiv.org Artificial Intelligence, Jun-16-2023

Due to the semantic complexity of the relation extraction (RE) task, obtaining high-quality human labelled data is an expensive and noisy process. To improve the sample efficiency of the models, semi-supervised learning (SSL) methods aim to leverage unlabelled data in addition to learning from limited labelled data points. Recently, strong data augmentation combined with consistency-based semi-supervised learning methods has advanced the state of the art in several SSL tasks. However, adapting these methods to the RE task has been challenging due to the difficulty of data augmentation for RE. In this work, we leverage the recent advances in controlled text generation to perform high quality data augmentation for the RE task. We further introduce small but significant changes to model architecture that allow for generation of more training data by interpolating different data points in their latent space. These data augmentations, along with consistency training, yield very competitive results for semi-supervised relation extraction on four benchmark datasets.

artificial intelligence, machine learning, natural language, (19 more...)

2306.10153

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Dominican Republic (0.04)
Europe > Italy > Tuscany > Florence (0.04)
(6 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Towards Understanding How Data Augmentation Works with Imbalanced Data

Dablain, Damien A., Chawla, Nitesh V.

arXiv.org Artificial Intelligence, Apr-12-2023

Data augmentation forms the cornerstone of many modern machine learning training pipelines; yet, the mechanisms by which it works are not clearly understood. Much of the research on data augmentation (DA) has focused on improving existing techniques, examining its regularization effects in the context of neural network over-fitting, or investigating its impact on features. Here, we undertake a holistic examination of the effect of DA on three different classifiers: convolutional neural networks, support vector machines, and logistic regression models, which are commonly used in supervised classification of imbalanced data. We support our examination with testing on three image and five tabular datasets. Our research indicates that DA, when applied to imbalanced data, produces substantial changes in model weights, support vectors and feature selection, even though it may only yield relatively modest changes to global metrics, such as balanced accuracy or F1 measure. We hypothesize that DA works by facilitating variances in data, so that machine learning models can associate changes in the data with labels. By diversifying the range of feature amplitudes that a model must recognize to predict a label, DA improves a model's capacity to generalize when learning with imbalanced data.
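As a reminder of what the "balanced accuracy" metric mentioned above measures, here is a minimal sketch that computes it as the mean of per-class recall. The function name and the toy labels are assumptions made for illustration and are not taken from the paper.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally
    regardless of how many samples it has."""
    recalls = []
    for cls in sorted(set(y_true)):
        indices = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(1 for i in indices if y_pred[i] == cls)
        recalls.append(hits / len(indices))
    return sum(recalls) / len(recalls)


# Toy imbalanced data: 8 majority samples (class 0), 2 minority samples (class 1).
y_true = [0] * 8 + [1] * 2
y_pred = [0] * 8 + [0, 1]  # one of the two minority samples is missed
score = balanced_accuracy(y_true, y_pred)
plain_accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
```

On this toy example plain accuracy is dominated by the majority class, while balanced accuracy is pulled down by the missed minority sample.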

artificial intelligence, deep learning, machine learning, (16 more...)

2304.05895

Country:

North America > United States > Nevada > Clark County > Las Vegas (0.04)
North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Discord is adding an AI chatbot, moderator, and art

PCWorld, Mar-9-2023, 16:00:00 GMT

Discord is adding AI to its platform in the form of ChatGPT and generative art, which will manifest as a chatbot and options to manage chats and create custom avatar profiles. Discord plans to roll out a public ChatGPT-powered chatbot named "Clyde" beginning next week, alongside a new technology to summarize Discord chats in a sidebar, called conversation summaries. This Friday, Discord will update its AutoMod automatic moderation bot to include AI-powered moderation, examining the content of moderated chats to determine if a server's rules are being followed. All three are considered public experiments, with updated, further rollouts to come later. Discord also showed off early progress in two new features it hopes to add later: the ability to "remix" Discord avatars, as well as an updated real-time whiteboard feature that can take sketches and transform them into generative AI art, via a prompt.

chatbot, clyde, discord, (15 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)
