Mistake-bounded online learning with operation caps

Geneson, Jesse, Li, Meien, Tang, Linus

arXiv.org Artificial Intelligence

We investigate the mistake-bound model of online learning with caps on the number of arithmetic operations per round. We prove general bounds on the minimum number of arithmetic operations per round that are necessary to learn an arbitrary family of functions with finitely many mistakes. We solve a problem on agnostic mistake-bounded online learning with bandit feedback from (Filmus et al., 2024) and (Geneson & Tang, 2024). We also extend this result to the setting of operation caps.
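The mistake-bound model the abstract studies can be illustrated with the classic halving algorithm, which learns any finite function family F with at most log2(|F|) mistakes. This is a textbook sketch, not the paper's operation-capped construction; the function names are mine.

```python
import math

def halving_learner(F, stream):
    """F: list of candidate functions; stream: iterable of (x, true_label) pairs.
    Predicts by majority vote over the version space, then discards every
    hypothesis inconsistent with the revealed label. Each mistake at least
    halves the version space, so mistakes <= log2(|F|)."""
    version_space = list(F)
    mistakes = 0
    for x, y in stream:
        votes = [f(x) for f in version_space]
        pred = max(set(votes), key=votes.count)  # majority prediction
        if pred != y:
            mistakes += 1
        version_space = [f for f in version_space if f(x) == y]
    return mistakes
```

Note that each round costs on the order of |F| function evaluations, which is exactly the kind of per-round arithmetic cost the paper places caps on.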


ChordPrompt: Orchestrating Cross-Modal Prompt Synergy for Multi-Domain Incremental Learning in CLIP

Wang, Zhiyuan, Chen, Bokui

arXiv.org Artificial Intelligence

Continual learning (CL) empowers pre-trained vision-language models to adapt effectively to novel or previously underrepresented data distributions without comprehensive retraining, enhancing their adaptability and efficiency. While vision-language models like CLIP show great promise, they struggle to maintain performance across domains in incremental learning scenarios. Existing prompt learning methods face two main limitations: 1) they primarily focus on class-incremental learning scenarios, lacking specific strategies for multi-domain task incremental learning; 2) most current approaches employ single-modal prompts, neglecting the potential benefits of cross-modal information exchange. To address these challenges, we propose the ChordPrompt framework, which facilitates a harmonious interplay between visual and textual prompts. ChordPrompt introduces cross-modal prompts to leverage interactions between visual and textual information. Our approach also employs domain-adaptive text prompts to select appropriate prompts for continual adaptation across multiple domains. Comprehensive experiments on multi-domain incremental learning benchmarks demonstrate that ChordPrompt outperforms state-of-the-art methods in zero-shot generalization and downstream task performance.
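The domain-adaptive prompt selection idea can be sketched as a nearest-prototype lookup: pick the stored prompt whose domain prototype best matches the incoming image feature. The names and mechanism below are my assumptions for illustration, not ChordPrompt's actual architecture.

```python
import numpy as np

def select_domain_prompt(image_feat, domain_prototypes, domain_prompts):
    """Pick the prompt whose domain prototype is most similar to the image.

    image_feat: (d,) embedding; domain_prototypes: (k, d) one row per domain;
    domain_prompts: list of k prompt objects (here, just strings).
    """
    # cosine similarity between the image feature and each domain prototype
    a = image_feat / np.linalg.norm(image_feat)
    B = domain_prototypes / np.linalg.norm(domain_prototypes, axis=1, keepdims=True)
    sims = B @ a
    return domain_prompts[int(np.argmax(sims))]
```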


MatWheel: Addressing Data Scarcity in Materials Science Through Synthetic Data

Li, Wentao, Chen, Yizhe, Qiu, Jiangjie, Wang, Xiaonan

arXiv.org Artificial Intelligence

Data scarcity and the high cost of annotation have long been persistent challenges in materials science. Inspired by the success of synthetic data in fields like computer vision, we propose the MatWheel framework, which trains a material property prediction model on synthetic data generated by a conditional generative model. We explore two scenarios: fully-supervised and semi-supervised learning. Using CGCNN for property prediction and Con-CDVAE as the conditional generative model, we conduct experiments on two data-scarce material property datasets from the Matminer database. Results show that synthetic data has potential in extreme data-scarce scenarios, achieving performance close to or exceeding that of real samples on both tasks. We also find that pseudo-labels have little impact on generated data quality. Future work will integrate advanced models and optimize generation conditions to boost the effectiveness of the materials data flywheel.
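The shape of the flywheel loop can be sketched with toy linear stand-ins: a ridge regressor in place of CGCNN, and a hypothetical conditional generator in place of Con-CDVAE that invents feature vectors matching requested property values. Everything here is illustrative, not the paper's models.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_ridge(X, y, lam=1e-3):
    # closed-form ridge regression as a stand-in property predictor
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

def synthesize(w_gen, targets, noise=0.05):
    # hypothetical conditional generator: shift random features along w_gen
    # so each sample's property (under w_gen) matches its target condition
    X = rng.normal(size=(len(targets), len(w_gen)))
    X += np.outer((targets - X @ w_gen) / (w_gen @ w_gen), w_gen)
    return X + noise * rng.normal(size=X.shape), targets

# scarce real data from a hidden linear "property"
w_true = np.array([1.0, -2.0, 0.5])
X_real = rng.normal(size=(8, 3))
y_real = X_real @ w_true

# one flywheel turn: fit, generate conditioned synthetic data, retrain on both
w0 = fit_ridge(X_real, y_real)
X_syn, y_syn = synthesize(w0, targets=rng.uniform(-2, 2, size=40))
w1 = fit_ridge(np.vstack([X_real, X_syn]), np.concatenate([y_real, y_syn]))
```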


Continual Learning for Encoder-only Language Models via a Discrete Key-Value Bottleneck

Diera, Andor, Galke, Lukas, Karl, Fabian, Scherp, Ansgar

arXiv.org Artificial Intelligence

Continual learning remains challenging across various natural language understanding tasks. When models are updated with new training data, they risk catastrophic forgetting of prior knowledge. In the present work, we introduce a discrete key-value bottleneck for encoder-only language models, allowing for efficient continual learning by requiring only localized updates. Inspired by the success of a discrete key-value bottleneck in vision, we address new and NLP-specific challenges. We experiment with different bottleneck architectures to find the variants best suited to language, and present a generic, task-independent discrete key initialization technique for NLP. We evaluate the discrete key-value bottleneck in four continual learning NLP scenarios and demonstrate that it alleviates catastrophic forgetting. We showcase that it offers competitive performance to other popular continual learning methods, with lower computational costs.
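The core mechanism, localized updates through discrete keys, can be sketched as follows: an encoder feature is quantized to its nearest frozen key, and learning only touches that key's value vector. This is a minimal sketch of the general idea, not the paper's exact module.

```python
import numpy as np

class DiscreteKVBottleneck:
    """Toy discrete key-value bottleneck: quantize an encoder feature to its
    nearest key and output that key's value. Learning modifies only the
    selected value row, which is what makes updates localized."""

    def __init__(self, num_pairs, dim, seed=0):
        rng = np.random.default_rng(seed)
        self.keys = rng.normal(size=(num_pairs, dim))   # frozen after init
        self.values = np.zeros((num_pairs, dim))        # trainable

    def forward(self, h):
        idx = int(np.argmin(np.linalg.norm(self.keys - h, axis=1)))
        return idx, self.values[idx]

    def local_update(self, h, grad, lr=0.1):
        idx, _ = self.forward(h)
        self.values[idx] -= lr * grad   # only one row of `values` changes
```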


Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data?

Fang, Qingkai, Zhang, Shaolei, Ma, Zhengrui, Zhang, Min, Feng, Yang

arXiv.org Artificial Intelligence

Recently proposed two-pass direct speech-to-speech translation (S2ST) models decompose the task into speech-to-text translation (S2TT) and text-to-speech (TTS) within an end-to-end model, yielding promising results. However, the training of these models still relies on parallel speech data, which is extremely challenging to collect. In contrast, S2TT and TTS have accumulated a large amount of data and pretrained models, which have not been fully utilized in the development of S2ST models. Inspired by this, in this paper, we first introduce a composite S2ST model named ComSpeech, which can seamlessly integrate any pretrained S2TT and TTS models into a direct S2ST model. Furthermore, to eliminate the reliance on parallel speech data, we propose a novel training method ComSpeech-ZS that solely utilizes S2TT and TTS data. It aligns representations in the latent space through contrastive learning, enabling the speech synthesis capability learned from the TTS data to generalize to S2ST in a zero-shot manner. Experimental results on the CVSS dataset show that when parallel speech data is available, ComSpeech surpasses previous two-pass models like UnitY and Translatotron 2 in both translation quality and decoding speed. When there is no parallel speech data, ComSpeech-ZS lags behind ComSpeech by only 0.7 ASR-BLEU and outperforms the cascaded models.
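The contrastive alignment step can be illustrated with a standard symmetric InfoNCE-style loss over paired speech/text representations in a batch; matching pairs share an index and every other row serves as a negative. This is a generic sketch of contrastive representation alignment, not ComSpeech-ZS's exact objective.

```python
import numpy as np

def info_nce(speech_reps, text_reps, temperature=0.1):
    """Contrastive loss pulling paired speech/text representations together.

    speech_reps, text_reps: (batch, dim) arrays; row i of each is a pair.
    """
    S = speech_reps / np.linalg.norm(speech_reps, axis=1, keepdims=True)
    T = text_reps / np.linalg.norm(text_reps, axis=1, keepdims=True)
    logits = S @ T.T / temperature
    # row-wise log-softmax; the diagonal holds the positive pairs
    logits -= logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))
```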


Personalized LoRA for Human-Centered Text Understanding

Zhang, You, Wang, Jin, Yu, Liang-Chih, Xu, Dan, Zhang, Xuejie

arXiv.org Artificial Intelligence

Effectively and efficiently adapting a pre-trained language model (PLM) for human-centered text understanding (HCTU) is challenging because user tokens number in the millions in most personalized applications and carry no concrete explicit semantics. A standard parameter-efficient approach (e.g., LoRA) would require memorizing a separate suite of adapters for each user. In this work, we introduce a personalized LoRA (PLoRA) with a plug-and-play (PnP) framework for the HCTU task. PLoRA is effective, parameter-efficient, and can be dynamically deployed in PLMs. Moreover, personalized dropout and mutual information maximization strategies are adopted, so the proposed PLoRA adapts well to few/zero-shot learning scenarios and the cold-start issue. Experiments conducted on four benchmark datasets show that the proposed method outperforms existing methods in full/few/zero-shot learning scenarios for the HCTU task, despite having fewer trainable parameters. For reproducibility, the code for this paper is available at: https://github.com/yoyo-yun/PLoRA.
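Standard LoRA adds a low-rank update B @ A to a frozen weight; a personalized variant can let a user embedding modulate that shared update so one adapter serves many users. The gating mechanism below is my assumption for illustration, not the paper's exact formulation.

```python
import numpy as np

def lora_forward(x, W0, A, B, alpha=1.0):
    """Standard LoRA: frozen weight W0 (out, in) plus low-rank update B @ A,
    with A (r, in) and B (out, r)."""
    return x @ (W0 + alpha * (B @ A)).T

def plora_forward(x, W0, A, B, user_vec, alpha=1.0):
    """Hypothetical personalized variant: a per-user embedding (r,) gates
    the ranks of the shared low-rank update, avoiding one adapter suite
    per user (sketch only)."""
    gated_A = A * user_vec[:, None]   # per-rank gate from the user embedding
    return x @ (W0 + alpha * (B @ gated_A)).T
```

With a all-ones user embedding the personalized forward reduces exactly to plain LoRA, which makes the gating easy to sanity-check.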


Successive Model-Agnostic Meta-Learning for Few-Shot Fault Time Series Prognosis

Su, Hai, Hu, Jiajun, Yu, Songsen

arXiv.org Artificial Intelligence

Fault prediction in time series data is a vital machine learning task with extensive industrial applications, yet it faces challenges such as data scarcity and frequency mismatch. Meta-learning has emerged as a promising approach to address these issues, leveraging cross-task similarities and differences to effectively adapt to novel time series fault prediction tasks. It empowers deep learning models to rapidly adjust to new time series data with few or even no samples, capitalizing on the similarities and differences among time series data from various domains and scenarios to enhance generalization capabilities ([35], [2]). Meta-learning enables a machine learning algorithm to 'learn to learn', enhancing the universality and adaptability of knowledge. In the realm of time series fault prediction, the efficacy of meta-learning hinges on the nuanced calibration of several task-distribution-dependent factors, of which researchers identify four key aspects: data representation, meta-learner design, meta-learning algorithms, and pseudo meta-task division. It is noteworthy that the first three aspects require different adjustments based on the specific task distribution, whereas the division of pseudo meta-tasks is not dependent on task distribution [24]. Therefore, to enhance the adaptability of meta-learning in fault prediction, this paper primarily refines the division method of pseudo meta-tasks.
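Pseudo meta-task division can be sketched as slicing one long series into (support, query) window pairs that a meta-learner then treats as distinct few-shot tasks. The windowing scheme below is a generic sketch under my own assumptions, not the refined division method the paper proposes.

```python
def divide_pseudo_meta_tasks(series, window, support, query, stride=None):
    """Slice a series into pseudo meta-tasks, each a (support, query) pair
    of consecutive non-overlapping windows.

    series: sequence of observations; window: samples per window;
    support/query: number of windows in each split; stride: step between tasks.
    """
    stride = stride or window
    span = window * (support + query)          # samples consumed per task
    tasks = []
    for start in range(0, len(series) - span + 1, stride):
        windows = [series[start + i * window: start + (i + 1) * window]
                   for i in range(support + query)]
        tasks.append((windows[:support], windows[support:]))
    return tasks
```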


Improving Performance in Continual Learning Tasks using Bio-Inspired Architectures

Madireddy, Sandeep, Yanguas-Gil, Angel, Balaprakash, Prasanna

arXiv.org Artificial Intelligence

The ability to learn continuously from an incoming data stream without catastrophic forgetting is critical to designing intelligent systems. Many approaches to continual learning rely on stochastic gradient descent and its variants that employ global error updates, and hence need to adopt strategies such as memory buffers or replay to circumvent its stability, greed, and short-term memory limitations. To address this limitation, we have developed a biologically inspired lightweight neural network architecture that incorporates synaptic plasticity mechanisms and neuromodulation and hence learns through local error signals to enable online continual learning without stochastic gradient descent. Our approach leads to superior online continual learning performance on Split-MNIST, Split-CIFAR-10, and Split-CIFAR-100 datasets compared to other memory-constrained learning approaches and matches that of the state-of-the-art memory-intensive replay-based approaches. We further demonstrate the effectiveness of our approach by integrating key design concepts into other backpropagation-based continual learning algorithms, significantly improving their accuracy. Our results provide compelling evidence for the importance of incorporating biological principles into machine learning models and offer insights into how we can leverage them to design more efficient and robust systems for online continual learning. Online continual learning addresses the scenario where a system has to learn and process data that are continuously streamed, often without restrictions in terms of the distribution of data within and across tasks and without clearly identified task boundaries Mai et al. (2021); Chen et al. (2020); Aljundi et al. (2019a). Online continual learning algorithms seek to mitigate catastrophic forgetting at both the data-instance and task level Chen et al. (2020). 
In some cases, however, such as on-chip learning at the edge, additional considerations such as resource limitations in the hardware, data privacy, or data security are also important for online continual learning. A key challenge of online continual learning is that it runs counter to the optimal conditions required for optimization using stochastic gradient descent (SGD) Parisi et al. (2019), which struggles with non-stationary data streams Lindsey & Litwin-Kumar (2020). On the contrary, biological systems excel at online continual learning. Inspired by the structure and functionality of the mammalian brain, several approaches have adopted replay strategies to counteract catastrophic forgetting during non-stationary tasks.
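Learning through local error signals, as opposed to backpropagated gradients, is often formalized as a three-factor rule: pre-synaptic activity, post-synaptic activity, and a scalar neuromodulatory signal jointly gate each weight change. The rule below is a generic textbook sketch of that idea, not the architecture the paper develops.

```python
import numpy as np

def modulated_hebbian_step(W, x, y, modulator, lr=0.01, decay=0.001):
    """Three-factor local update: pre-activity x (in,), post-activity y (out,),
    and a scalar neuromodulatory signal (e.g. reward or error) gate the
    Hebbian weight change. No gradients are backpropagated; every quantity
    used is local to the synapse."""
    dW = lr * modulator * np.outer(y, x) - decay * W
    return W + dW
```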


Laplacian Score for Feature Selection

Neural Information Processing Systems

In supervised learning scenarios, feature selection has been studied widely in the literature. Selecting features in unsupervised learning scenarios is a much harder problem, due to the absence of class labels that would guide the search for relevant information. Moreover, almost all previous unsupervised feature selection methods are "wrapper" techniques that require a learning algorithm to evaluate candidate feature subsets. In this paper, we propose a "filter" method for feature selection which is independent of any learning algorithm. Our method can be performed in either supervised or unsupervised fashion.
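The Laplacian Score itself is a short computation: given an affinity matrix S over samples, score each feature by how smoothly it varies along the sample graph, normalized by its weighted variance. The sketch below follows the standard formulation (lower score = better locality preservation); the affinity matrix construction is left to the caller.

```python
import numpy as np

def laplacian_score(X, S):
    """Laplacian Score for each feature (column of X).

    X: (n_samples, n_features); S: (n_samples, n_samples) affinity matrix.
    Returns one score per feature; lower scores indicate features that
    better respect the local graph structure.
    """
    D = np.diag(S.sum(axis=1))
    L = D - S                                  # graph Laplacian
    d = np.diag(D)
    scores = []
    for r in range(X.shape[1]):
        f = X[:, r].astype(float)
        f = f - (f @ d) / d.sum()              # remove the weighted mean
        scores.append((f @ L @ f) / (f @ D @ f))
    return np.array(scores)
```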


A Cyber Threat Intelligence Sharing Scheme based on Federated Learning for Network Intrusion Detection

Sarhan, Mohanad, Layeghy, Siamak, Moustafa, Nour, Portmann, Marius

arXiv.org Artificial Intelligence

The use of Machine Learning (ML) in the detection of network attacks has been effective when designed and evaluated within a single organisation. However, it has been very challenging to design an ML-based detection system using heterogeneous network data samples originating from several sources, mainly due to privacy concerns and the lack of a universal dataset format. In this paper, we propose a collaborative federated learning scheme to address these issues. The proposed framework allows multiple organisations to join forces in the design, training, and evaluation of a robust ML-based network intrusion detection system. The threat intelligence scheme relies on two critical aspects: first, the availability of network traffic data in a common format to allow the extraction of meaningful patterns across data sources; second, the adoption of a federated learning mechanism that avoids sharing sensitive users' information between organisations. As a result, each organisation benefits from the cyber threat intelligence of other organisations while keeping its data private. The model is trained locally and only the updated weights are shared with the remaining participants in the federated averaging process. The framework is designed and evaluated in this paper using two key datasets in NetFlow format, known as NF-UNSW-NB15-v2 and NF-BoT-IoT-v2. Two other common scenarios are considered in the evaluation process: a centralised training method where local data samples are shared with other organisations, and a localised training method where no threat intelligence is shared. The results demonstrate the efficiency and effectiveness of the proposed framework, with a universal ML model effectively classifying benign and intrusive traffic originating from multiple organisations without the need for local data exchange.
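The federated averaging step the abstract describes, sharing only locally trained weights and aggregating them centrally, is a short computation: a sample-count-weighted mean of the client weight vectors. A minimal sketch:

```python
import numpy as np

def fedavg(client_weights, client_sizes):
    """Federated averaging: combine locally trained weight vectors, weighting
    each client by its number of training samples. Only weights cross
    organisational boundaries; raw network traffic never does."""
    sizes = np.asarray(client_sizes, dtype=float)
    W = np.stack(client_weights)
    return (sizes[:, None] * W).sum(axis=0) / sizes.sum()
```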