AITopics | Supervised Learning

Collaborating Authors

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Learn it or Leave it: Module Composition and Pruning for Continual Learning

Wang, Mingyang, Adel, Heike, Lange, Lukas, Strötgen, Jannik, Schütze, Hinrich

arXiv.org Artificial IntelligenceJun-26-2024

In real-world environments, continual learning is essential for machine learning models, as they need to acquire new knowledge incrementally without forgetting what they have already learned. While pretrained language models have shown impressive capabilities on various static tasks, applying them to continual learning poses significant challenges, including avoiding catastrophic forgetting, facilitating knowledge transfer, and maintaining parameter efficiency. In this paper, we introduce MoCL-P, a novel lightweight continual learning method that addresses these challenges simultaneously. Unlike traditional approaches that continuously expand parameters for newly arriving tasks, MoCL-P integrates task representation-guided module composition with adaptive pruning, effectively balancing knowledge integration and computational overhead. Our evaluation across three continual learning benchmarks with up to 176 tasks shows that MoCL-P achieves state-of-the-art performance and improves parameter efficiency by up to three times, demonstrating its potential for practical applications where resource requirements are constrained.

benchmark, continual learning, module, (12 more...)

arXiv.org Artificial Intelligence

2406.18708

Country:

North America > Canada > Ontario > Toronto (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.31)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.30)

Add feedback

DefSent+: Improving sentence embeddings of language models by projecting definition sentences into a quasi-isotropic or isotropic vector space of unlimited dictionary entries

Liu, Xiaodong

arXiv.org Artificial IntelligenceJun-19-2024

This paper presents a significant improvement on the previous conference paper known as DefSent. The prior study seeks to improve sentence embeddings of language models by projecting definition sentences into the vector space of dictionary entries. We discover that this approach is not fully explored due to the methodological limitation of using word embeddings of language models to represent dictionary entries. This leads to two hindrances. First, dictionary entries are constrained by the single-word vocabulary, and thus cannot be fully exploited. Second, semantic representations of language models are known to be anisotropic, but pre-processing word embeddings for DefSent is not allowed because its weight is frozen during training and tied to the prediction layer. In this paper, we propose a novel method to progressively build entry embeddings not subject to the limitations. As a result, definition sentences can be projected into a quasi-isotropic or isotropic vector space of unlimited dictionary entries, so that sentence embeddings of noticeably better quality are attainable. We abbreviate our approach as DefSent+ (a plus version of DefSent), involving the following strengths: 1) the task performance on measuring sentence similarities is significantly improved compared to DefSent; 2) when DefSent+ is used to further train data-augmented models like SIMCSE, SNCSE, and SynCSE, state-of-the-art performance on measuring sentence similarities can be achieved among the approaches without using manually labeled datasets; 3) DefSent+ is also competitive in feature-based transfer for NLP downstream tasks.

computational linguistic, encoder, vector space, (14 more...)

arXiv.org Artificial Intelligence

2405.16153

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(20 more...)

Genre: Research Report > Promising Solution (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.86)

Add feedback

Structured Prediction in Online Learning

Boudart, Pierre, Rudi, Alessandro, Gaillard, Pierre

arXiv.org Machine LearningJun-18-2024

We study a theoretical and algorithmic framework for structured prediction in the online learning setting. The problem of structured prediction, i.e. estimating function where the output space lacks a vectorial structure, is well studied in the literature of supervised statistical learning. We show that our algorithm is a generalisation of optimal algorithms from the supervised learning setting, and achieves the same excess risk upper bound also when data are not i.i.d. Moreover, we consider a second algorithm designed especially for non-stationary data distributions, including adversarial data. We bound its stochastic regret in function of the variation of the data distributions.

algorithm, eff, preprint, (16 more...)

arXiv.org Machine Learning

2406.12366

Country:

Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Middle East > Jordan (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.81)

Add feedback

Gram2Vec: An Interpretable Document Vectorizer

Zeng, Peter, Sclafani, Eric, Rambow, Owen

arXiv.org Artificial IntelligenceJun-17-2024

We present Gram2Vec, a grammatical style embedding algorithm that embeds documents into a higher dimensional space by extracting the normalized relative frequencies of grammatical features present in the text. Compared to neural approaches, Gram2Vec offers inherent interpretability based on how the feature vectors are generated. In our demo, we present a way to visualize a mapping of authors to documents based on their Gram2Vec vectors and highlight the ability to drop or add features to view which authors make certain linguistic choices. Next, we use authorship attribution as an application to show how Gram2Vec can explain why a document is attributed to a certain author, using cosine similarities between the Gram2Vec feature vectors to calculate the distances between candidate documents and a query document.

author 1, author 2, text message, (15 more...)

arXiv.org Artificial Intelligence

2406.12131

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
North America > Dominican Republic (0.04)
(3 more...)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.34)

Add feedback

Optimizing Automatic Speech Assessment: W-RankSim Regularization and Hybrid Feature Fusion Strategies

Wu, Chung-Wen, Chen, Berlin

arXiv.org Artificial IntelligenceJun-16-2024

Automatic Speech Assessment (ASA) has seen notable advancements with the utilization of self-supervised features (SSL) in recent research. However, a key challenge in ASA lies in the imbalanced distribution of data, particularly evident in English test datasets. To address this challenge, we approach ASA as an ordinal classification task, introducing Weighted Vectors Ranking Similarity (W-RankSim) as a novel regularization technique. W-RankSim encourages closer proximity of weighted vectors in the output layer for similar classes, implying that feature vectors with similar labels would be gradually nudged closer to each other as they converge towards corresponding weighted vectors. Extensive experimental evaluations confirm the effectiveness of our approach in improving ordinal classification performance for ASA. Furthermore, we propose a hybrid model that combines SSL and handcrafted features, showcasing how the inclusion of handcrafted features enhances performance in an ASA system.

batch size, hybrid model, w-ranksim, (13 more...)

arXiv.org Artificial Intelligence

2406.10873

Country: Asia > Taiwan (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

We're Calling an Intervention: Exploring the Fundamental Hurdles in Adapting Language Models to Nonstandard Text

Srivastava, Aarohi, Chiang, David

arXiv.org Artificial IntelligenceJun-15-2024

We present a suite of experiments that allow us to understand the underlying challenges of language model adaptation to nonstandard text. We do so by designing interventions that approximate several types of linguistic variation and their interactions with existing biases of language models. Applying our interventions during language model adaptation with varying size and nature of training data, we gain important insights into when knowledge transfer can be successful, as well as the aspects of linguistic variation that are particularly difficult for language models to deal with. For instance, on text with character-level variation, performance improves with even a few training examples but approaches a plateau, suggesting that more data is not the solution. In contrast, on text with variation involving new words or meanings, far more data is needed, but it leads to a massive breakthrough in performance. Our findings reveal that existing models lack the necessary infrastructure to handle diverse forms of nonstandard text and linguistic variation, guiding the development of more resilient language modeling techniques for the future. We make the code for our interventions, which can be applied to any English text data, publicly available.

experiment, intervention, variation, (16 more...)

arXiv.org Artificial Intelligence

2404.07304

Country:

North America > United States > Indiana > St. Joseph County > Notre Dame (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.34)

Add feedback

Positive-Unlabelled Learning for Identifying New Candidate Dietary Restriction-related Genes among Ageing-related Genes

Paz-Ruza, Jorge, Freitas, Alex A., Alonso-Betanzos, Amparo, Guijarro-Berdiñas, Bertha

arXiv.org Artificial IntelligenceJun-14-2024

Dietary Restriction (DR) is one of the most popular anti-ageing interventions, prompting exhaustive research into genes associated with its mechanisms. Recently, Machine Learning (ML) has been explored to identify potential DR-related genes among ageing-related genes, aiming to minimize costly wet lab experiments needed to expand our knowledge on DR. However, to train a model from positive (DR-related) and negative (non-DR-related) examples, existing ML methods naively label genes without known DR relation as negative examples, assuming that lack of DR-related annotation for a gene represents evidence of absence of DR-relatedness, rather than absence of evidence; this hinders the reliability of the negative examples (non-DR-related genes) and the method's ability to identify novel DR-related genes. This work introduces a novel gene prioritization method based on the two-step Positive-Unlabelled (PU) Learning paradigm: using a similarity-based, KNN-inspired approach, our method first selects reliable negative examples among the genes without known DR associations. Then, these reliable negatives and all known positives are used to train a classifier that effectively differentiates DR-related and non-DR-related genes, which is finally employed to generate a more reliable ranking of promising genes for novel DR-relatedness. Our method significantly outperforms the existing state-of-the-art non-PU approach for DR-relatedness prediction in three relevant performance metrics. In addition, curation of existing literature finds support for the top-ranked candidate DR-related genes identified by our model.

arXiv.org Artificial Intelligence

2406.09898

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Spain (0.04)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Consumer Health (1.00)
Education > Health & Safety > School Nutrition (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Add feedback

Neural networks in non-metric spaces

Galimberti, Luca

arXiv.org Artificial IntelligenceJun-13-2024

Leveraging the infinite dimensional neural network architecture we proposed in arXiv:2109.13512v4 and which can process inputs from Fr\'echet spaces, and using the universal approximation property shown therein, we now largely extend the scope of this architecture by proving several universal approximation theorems for a vast class of input and output spaces. More precisely, the input space $\mathfrak X$ is allowed to be a general topological space satisfying only a mild condition ("quasi-Polish"), and the output space can be either another quasi-Polish space $\mathfrak Y$ or a topological vector space $E$. Similarly to arXiv:2109.13512v4, we show furthermore that our neural network architectures can be projected down to "finite dimensional" subspaces with any desirable accuracy, thus obtaining approximating networks that are easy to implement and allow for fast computation and fitting. The resulting neural network architecture is therefore applicable for prediction tasks based on functional data. To the best of our knowledge, this is the first result which deals with such a wide class of input/output spaces and simultaneously guarantees the numerical feasibility of the ensuing architectures. Finally, we prove an obstruction result which indicates that the category of quasi-Polish spaces is in a certain sense the correct category to work with if one aims at constructing approximating architectures on infinite-dimensional spaces $\mathfrak X$ which, at the same time, have sufficient expressive power to approximate continuous functions on $\mathfrak X$, are specified by a finite number of parameters only and are "stable" with respect to these parameters.

architecture, neural network, topology, (14 more...)

arXiv.org Artificial Intelligence

2406.0931

Country:

North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.73)

Add feedback

Young woman breaks fishing record set in place for nearly half a century

FOX NewsJun-12-2024, 20:43:28 GMT

Fishing enthusiast Hunter Ham recently captured footage of an alligator on a Texas beach eating a bull redfish. Gators are primarily freshwater creatures. A 21-year-old woman from Georgia recently broke a statewide fishing record, officials say. The Georgia Department of Natural Resources announced the new state record in a press release on June 5. St. Marys resident Lauren E. Harden caught a 33-pound crevalle jack on May 24 while fishing on Cumberland Island.

artificial intelligence, fishing record, machine learning, (11 more...)

FOX News

Country:

North America > United States > Texas (0.26)
North America > United States > Oregon (0.06)
Atlantic Ocean (0.06)

Industry:

Government > Regional Government (0.42)
Media > News (0.36)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.40)

Add feedback

Hyperbolic Benchmarking Unveils Network Topology-Feature Relationship in GNN Performance

Aliakbarisani, Roya, Jankowski, Robert, Serrano, M. Ángeles, Boguñá, Marián

arXiv.org Artificial IntelligenceJun-4-2024

Graph Neural Networks (GNNs) have excelled in predicting graph properties in various applications ranging from identifying trends in social networks to drug discovery and malware detection. With the abundance of new architectures and increased complexity, GNNs are becoming highly specialized when tested on a few well-known datasets. However, how the performance of GNNs depends on the topological and features properties of graphs is still an open question. In this work, we introduce a comprehensive benchmarking framework for graph machine learning, focusing on the performance of GNNs across varied network structures. Utilizing the geometric soft configuration model in hyperbolic space, we generate synthetic networks with realistic topological properties and node feature vectors. This approach enables us to assess the impact of network properties, such as topology-feature correlation, degree distributions, local density of triangles (or clustering), and homophily, on the effectiveness of different GNN architectures. Our results highlight the dependency of model performance on the interplay between network structure and node features, providing insights for model selection in various scenarios. This study contributes to the field by offering a versatile tool for evaluating GNNs, thereby assisting in developing and selecting suitable models based on specific data characteristics.

correlation, degree distribution, node, (16 more...)

arXiv.org Artificial Intelligence

2406.02772

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > United States > District of Columbia > Washington (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Information Technology > Security & Privacy (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback