AITopics

Industry: Education > Educational Setting (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.42)

arXiv.org Artificial IntelligenceDec-1-2025

A transfer learning approach for automatic conflicts detection in software requirement sentence pairs based on dual encoders

Wang, Yizheng, Jiang, Tao, Bai, Jinyan, Zou, Zhengbin, Xue, Tiancheng, Zhang, Nan, Luan, Jie

Software Requirement Document (RD) typically contain tens of thousands of individual requirements, and ensuring consistency among these requirements is critical for the success of software engineering projects. Automated detection methods can significantly enhance efficiency and reduce costs; however, existing approaches still face several challenges, including low detection accuracy on imbalanced data, limited semantic extraction due to the use of a single encoder, and suboptimal performance in cross-domain transfer learning. To address these issues, this paper proposes a Transferable Software Requirement Conflict Detection Framework based on SBERT and SimCSE, termed TSRCDF-SS. First, the framework employs two independent encoders, Sentence-BERT (SBERT) and Simple Contrastive Sentence Embedding (SimCSE), to generate sentence embeddings for requirement pairs, followed by a six-element concatenation strategy. Furthermore, the classifier is enhanced by a two-layer fully connected feedforward neural network (FFNN) with a hybrid loss optimization strategy that integrates a variant of Focal Loss, domain-specific constraints, and a confidence-based penalty term. Finally, the framework synergistically integrates sequential and cross-domain transfer learning. Experimental results demonstrate that the proposed framework achieves a 10.4% improvement in both macro-F1 and weighted-F1 scores in in-domain settings, and an 11.4% increase in macro-F1 in cross-domain scenarios.

large language model, machine learning, natural language, (23 more...)

2511.23007

Country:

Europe > Croatia (0.28)
Asia (0.28)

Genre: Research Report > New Finding (0.88)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)

Neural Information Processing SystemsSep-30-2025, 11:02:06 GMT

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

artificial intelligence, data mining, machine learning, (6 more...)

Industry: Education > Educational Setting (0.62)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.78)

Mohammad Gheshlaghi azar, Alessandro Lazaric, Emma Brunskill

Sequential Transfer in Multi-armed Bandit with Finite Set of Models

Neural Information Processing SystemsOct-6-2024, 10:56:40 GMT

Learning from prior tasks and transferring that experience to improve future performance is critical for building lifelong learning agents. Although results in supervised and reinforcement learning show that transfer may significantly improve the learning performance, most of the literature on transfer is focused on batch learning tasks. In this paper we study the problem of sequential transfer in online learning, notably in the multi-armed bandit framework, where the objective is to minimize the total regret over a sequence of tasks by transferring knowledge from prior tasks. We introduce a novel bandit algorithm based on a method-of-moments approach for estimating the possible tasks and derive regret bounds for it.

algorithm, knowledge, umucb, (13 more...)

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Workflow (0.48)
Research Report (0.46)
Instructional Material (0.34)

Industry: Education > Educational Setting > Online (0.48)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Dukić, David, Gashteovski, Kiril, Glavaš, Goran, Šnajder, Jan

Leveraging Open Information Extraction for Improving Few-Shot Trigger Detection Domain Transfer

arXiv.org Artificial IntelligenceMay-23-2023

Event detection is a crucial information extraction task in many domains, such as Wikipedia or news. The task typically relies on trigger detection (TD) -- identifying token spans in the text that evoke specific events. While the notion of triggers should ideally be universal across domains, domain transfer for TD from high- to low-resource domains results in significant performance drops. We address the problem of negative transfer for TD by coupling triggers between domains using subject-object relations obtained from a rule-based open information extraction (OIE) system. We demonstrate that relations injected through multi-task training can act as mediators between triggers in different domains, enhancing zero- and few-shot TD domain transfer and reducing negative transfer, in particular when transferring from a high-resource source Wikipedia domain to a low-resource target news domain. Additionally, we combine the extracted relations with masked language modeling on the target domain and obtain further TD performance gains. Finally, we demonstrate that the results are robust to the choice of the OIE system.

artificial intelligence, natural language, relation, (16 more...)

2305.14163

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(10 more...)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)

Paulsen, Sean, Casey, Michael

Sequential Transfer Learning to Decode Heard and Imagined Timbre from fMRI Data

arXiv.org Artificial IntelligenceMay-22-2023

We present a sequential transfer learning framework for transformers on functional Magnetic Resonance Imaging (fMRI) data and demonstrate its significant benefits for decoding musical timbre. In the first of two phases, we pre-train our stacked-encoder transformer architecture on Next Thought Prediction, a self-supervised task of predicting whether or not one sequence of fMRI data follows another. This phase imparts a general understanding of the temporal and spatial dynamics of neural activity, and can be applied to any fMRI dataset. In the second phase, we fine-tune the pre-trained models and train additional fresh models on the supervised task of predicting whether or not two sequences of fMRI data were recorded while listening to the same musical timbre. The fine-tuned models achieve significantly higher accuracy with shorter training times than the fresh models, demonstrating the efficacy of our framework for facilitating transfer learning on fMRI data. Additionally, our fine-tuning task achieves a level of classification granularity beyond standard methods. This work contributes to the growing literature on transformer architectures for sequential transfer learning on fMRI data, and provides evidence that our framework is an improvement over current methods for decoding timbre.

artificial intelligence, deep learning, machine learning, (16 more...)

2305.13226

Country:

North America > United States > New Hampshire > Grafton County > Hanover (0.04)
Europe > United Kingdom > England (0.04)
Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.04)
(4 more...)

Genre: Research Report (0.84)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology > Attention Deficit/Hyperactivity Disorder (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Neural Information Processing SystemsApr-6-2023, 14:48:49 GMT

Transfer Learning using Kolmogorov Complexity: Basic Theory and Empirical Evaluations

In transfer learning we aim to solve new problems using fewer examples using information gained from solving related problems. Transfer learning has been successful in practice, and extensive PAC analysis of these methods has been de- veloped. However it is not yet clear how to define relatedness between tasks. This is considered as a major problem as it is conceptually troubling and it makes it unclear how much information to transfer and when and how to transfer it. In this paper we propose to measure the amount of information one task contains about another using conditional Kolmogorov complexity between the tasks.

basic theory and empirical evaluation, information, kolmogorov complexity, (3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory > Minimum Complexity Machines (0.70)

Namazifar, Mahdi, Papangelis, Alexandros, Tur, Gokhan, Hakkani-Tür, Dilek

Language Model is All You Need: Natural Language Understanding as Question Answering

arXiv.org Artificial IntelligenceNov-5-2020

Different flavors of transfer learning have shown tremendous impact in advancing research and applications of machine learning. In this work we study the use of a specific family of transfer learning, where the target domain is mapped to the source domain. Specifically we map Natural Language Understanding (NLU) problems to QuestionAnswering (QA) problems and we show that in low data regimes this approach offers significant improvements compared to other approaches to NLU. Moreover we show that these gains could be increased through sequential transfer learning across NLU problems from different domains. We show that our approach could reduce the amount of required data for the same performance by up to a factor of 10.

qanlu, question and answer, squad2, (17 more...)

2011.03023

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Understanding (0.61)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.51)

arXiv.org Machine LearningJul-1-2020

A Survey on Self-supervised Pre-training for Sequential Transfer Learning in Neural Networks

Mao, Huanru Henry

Deep neural networks are typically trained under a supervised learning framework where a model learns a single task using labeled data. Instead of relying solely on labeled data, practitioners can harness unlabeled or related data to improve model performance, which is often more accessible and ubiquitous. Self-supervised pre-training for transfer learning is becoming an increasingly popular technique to improve state-of-the-art results using unlabeled data. It involves first pre-training a model on a large amount of unlabeled data, then adapting the model to target tasks of interest. In this review, we survey self-supervised learning methods and their applications within the sequential transfer learning framework. We provide an overview of the taxonomy for self-supervised learning and transfer learning, and highlight some prominent methods for designing pre-training tasks across different domains. Finally, we discuss recent trends and suggest areas for future investigation.

artificial intelligence, inductive learning, machine learning, (15 more...)

arXiv.org Machine Learning

2007.008

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > Netherlands > North Holland > Amsterdam (0.04)
(9 more...)

Genre: Overview (1.00)

Industry: Leisure & Entertainment (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)

Tirinzoni, Andrea, Poiani, Riccardo, Restelli, Marcello

Sequential Transfer in Reinforcement Learning with a Generative Model

arXiv.org Machine LearningJul-1-2020

We are interested in how to design reinforcement learning agents that provably reduce the sample complexity for learning new tasks by transferring knowledge from previously-solved ones. The availability of solutions to related problems poses a fundamental trade-off: whether to seek policies that are expected to achieve high (yet sub-optimal) performance in the new task immediately or whether to seek information to quickly identify an optimal solution, potentially at the cost of poor initial behavior. In this work, we focus on the second objective when the agent has access to a generative model of state-action pairs. First, given a set of solved tasks containing an approximation of the target one, we design an algorithm that quickly identifies an accurate solution by seeking the state-action pairs that are most informative for this purpose. We derive PAC bounds on its sample complexity which clearly demonstrate the benefits of using this kind of prior knowledge. Then, we show how to learn these approximate tasks sequentially by reducing our transfer setting to a hidden Markov model and employing spectral methods to recover its parameters. Finally, we empirically verify our theoretical findings in simple simulated domains.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Machine Learning

2007.00722

Country:

Europe > Austria > Vienna (0.14)
Europe > Sweden > Stockholm > Stockholm (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > Italy > Lombardy > Milan (0.04)

Genre:

Research Report (1.00)
Workflow (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)