AITopics | Supervised Learning

Collaborating Authors

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Predicting Drug-Gene Relations via Analogy Tasks with Word Embeddings

Yamagiwa, Hiroaki, Hashimoto, Ryoma, Arakane, Kiwamu, Murakami, Ken, Soeda, Shou, Oyama, Momose, Okada, Mariko, Shimodaira, Hidetoshi

arXiv.org Artificial IntelligenceJun-3-2024

Natural language processing (NLP) is utilized in a wide range of fields, where words in text are typically transformed into feature vectors called embeddings. BioConceptVec is a specific example of embeddings tailored for biology, trained on approximately 30 million PubMed abstracts using models such as skip-gram. Generally, word embeddings are known to solve analogy tasks through simple vector arithmetic. For instance, $\mathrm{\textit{king}} - \mathrm{\textit{man}} + \mathrm{\textit{woman}}$ predicts $\mathrm{\textit{queen}}$. In this study, we demonstrate that BioConceptVec embeddings, along with our own embeddings trained on PubMed abstracts, contain information about drug-gene relations and can predict target genes from a given drug through analogy computations. We also show that categorizing drugs and genes using biological pathways improves performance. Furthermore, we illustrate that vectors derived from known relations in the past can predict unknown future relations in datasets divided by year.

analogy task, drug-gene relation, relation, (17 more...)

arXiv.org Artificial Intelligence

2406.00984

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.04)
(9 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government > FDA (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

Adaptive Discretization-based Non-Episodic Reinforcement Learning in Metric Spaces

Kar, Avik, Singh, Rahul

arXiv.org Artificial IntelligenceMay-29-2024

We study non-episodic Reinforcement Learning for Lipschitz MDPs in which state-action space is a metric space, and the transition kernel and rewards are Lipschitz functions. We develop computationally efficient UCB-based algorithm, $\textit{ZoRL-}\epsilon$ that adaptively discretizes the state-action space and show that their regret as compared with $\epsilon$-optimal policy is bounded as $\mathcal{O}(\epsilon^{-(2 d_\mathcal{S} + d^\epsilon_z + 1)}\log{(T)})$, where $d^\epsilon_z$ is the $\epsilon$-zooming dimension. In contrast, if one uses the vanilla $\textit{UCRL-}2$ on a fixed discretization of the MDP, the regret w.r.t. a $\epsilon$-optimal policy scales as $\mathcal{O}(\epsilon^{-(2 d_\mathcal{S} + d + 1)}\log{(T)})$ so that the adaptivity gains are huge when $d^\epsilon_z \ll d$. Note that the absolute regret of any 'uniformly good' algorithm for a large family of continuous MDPs asymptotically scales as at least $\Omega(\log{(T)})$. Though adaptive discretization has been shown to yield $\mathcal{\tilde{O}}(H^{2.5}K^\frac{d_z + 1}{d_z + 2})$ regret in episodic RL, an attempt to extend this to the non-episodic case by employing constant duration episodes whose duration increases with $T$, is futile since $d_z \to d$ as $T \to \infty$. The current work shows how to obtain adaptivity gains for non-episodic RL. The theoretical results are supported by simulations on two systems where the performance of $\textit{ZoRL-}\epsilon$ is compared with that of '$\textit{UCRL-C}$,' the fixed discretization-based extension of $\textit{UCRL-}2$ for systems with continuous state-action spaces.

algorithm, diam, transition kernel, (14 more...)

arXiv.org Artificial Intelligence

2405.18793

Country:

Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.49)

Industry:

Media > Television (0.46)
Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

MLB records set for major shakeup as Negro Leagues stats set to be officially recognized: report

FOX NewsMay-28-2024, 23:15:15 GMT

Major League Baseball's record books will look a lot different later this week. The league is reportedly set to officially recognize statistics from the Negro Leagues and incorporate them into its own data. MLB elevated Negro League stats as "major league" in 2020, a move they said was "a longtime oversight in the game's history." Since then, MLB has been working with the Elias Sports Bureau in order to figure out a way to incorporate them into MLB's history. One player in particular, one you may have never heard of, will now be considered one of MLB's all-time greats.

artificial intelligence, inductive learning, machine learning, (11 more...)

FOX News

Country:

North America > United States > Ohio > Hamilton County > Cincinnati (0.06)
North America > United States > Minnesota (0.06)

Industry: Leisure & Entertainment > Sports > Baseball (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.40)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.40)

Add feedback

Bundle Neural Networks for message diffusion on graphs

Bamberger, Jacob, Barbero, Federico, Dong, Xiaowen, Bronstein, Michael

arXiv.org Artificial IntelligenceMay-24-2024

The dominant paradigm for learning on graph-structured data is message passing. Despite being a strong inductive bias, the local message passing mechanism suffers from pathological issues such as over-smoothing, over-squashing, and limited node-level expressivity. To address these limitations we propose Bundle Neural Networks (BuNN), a new type of GNN that operates via message diffusion over flat vector bundles - structures analogous to connections on Riemannian manifolds that augment the graph by assigning to each node a vector space and an orthogonal map. A BuNN layer evolves the features according to a diffusion-type partial differential equation. When discretized, BuNNs are a special case of Sheaf Neural Networks (SNNs), a recently proposed MPNN capable of mitigating over-smoothing. The continuous nature of message diffusion enables BuNNs to operate on larger scales of the graph and, therefore, to mitigate over-squashing. Finally, we prove that BuNN can approximate any feature transformation over nodes on any (potentially infinite) family of graphs given injective positional encodings, resulting in universal node-level expressivity. We support our theory via synthetic experiments and showcase the strong empirical performance of BuNNs over a range of real-world tasks, achieving state-of-the-art results on several standard benchmarks in transductive and inductive settings.

bundle, bunn, graph, (16 more...)

arXiv.org Artificial Intelligence

2405.1554

Country:

Asia > Macao (0.14)
Asia > China (0.04)
North America > United States > Pennsylvania (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)

Add feedback

Canonical Variates in Wasserstein Metric Space

Li, Jia, Lin, Lin

arXiv.org Machine LearningMay-24-2024

In this paper, we address the classification of instances each characterized not by a singular point, but by a distribution on a vector space. We employ the Wasserstein metric to measure distances between distributions, which are then used by distance-based classification algorithms such as k-nearest neighbors, k-means, and pseudo-mixture modeling. Central to our investigation is dimension reduction within the Wasserstein metric space to enhance classification accuracy. We introduce a novel approach grounded in the principle of maximizing Fisher's ratio, defined as the quotient of between-class variation to within-class variation. The directions in which this ratio is maximized are termed discriminant coordinates or canonical variates axes. In practice, we define both between-class and within-class variations as the average squared distances between pairs of instances, with the pairs either belonging to the same class or to different classes. This ratio optimization is achieved through an iterative algorithm, which alternates between optimal transport and maximization steps within the vector space. We conduct empirical studies to assess the algorithm's convergence and, through experimental validation, demonstrate that our dimension reduction technique substantially enhances classification performance. Moreover, our method outperforms well-established algorithms that operate on vector representations derived from distributional data. It also exhibits robustness against variations in the distributional representations of data clouds.

algorithm, fisher, wasserstein distance, (15 more...)

arXiv.org Machine Learning

2405.15768

Country: North America > United States > Pennsylvania (0.04)

Genre: Research Report > Experimental Study (0.67)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Can We Treat Noisy Labels as Accurate?

Zheng, Yuxiang, Han, Zhongyi, Yin, Yilong, Gao, Xin, Liu, Tongliang

arXiv.org Artificial IntelligenceMay-21-2024

Noisy labels significantly hinder the accuracy and generalization of machine learning models, particularly due to ambiguous instance features. Traditional techniques that attempt to correct noisy labels directly, such as those using transition matrices, often fail to address the inherent complexities of the problem sufficiently. In this paper, we introduce EchoAlign, a transformative paradigm shift in learning from noisy labels. Instead of focusing on label correction, EchoAlign treats noisy labels ($\tilde{Y}$) as accurate and modifies corresponding instance features ($X$) to achieve better alignment with $\tilde{Y}$. EchoAlign's core components are (1) EchoMod: Employing controllable generative models, EchoMod precisely modifies instances while maintaining their intrinsic characteristics and ensuring alignment with the noisy labels. (2) EchoSelect: Instance modification inevitably introduces distribution shifts between training and test sets. EchoSelect maintains a significant portion of clean original instances to mitigate these shifts. It leverages the distinct feature similarity distributions between original and modified instances as a robust tool for accurate sample selection. This integrated approach yields remarkable results. In environments with 30% instance-dependent noise, even at 99% selection accuracy, EchoSelect retains nearly twice the number of samples compared to the previous best method. Notably, on three datasets, EchoAlign surpasses previous state-of-the-art techniques with a substantial improvement.

information, noise, noisy label, (16 more...)

arXiv.org Artificial Intelligence

2405.12969

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.66)

Add feedback

Learning Intersections of Halfspaces with Distribution Shift: Improved Algorithms and SQ Lower Bounds

Klivans, Adam R., Stavropoulos, Konstantinos, Vasilyan, Arsen

arXiv.org Artificial IntelligenceMay-20-2024

Recent work of Klivans, Stavropoulos, and Vasilyan initiated the study of testable learning with distribution shift (TDS learning), where a learner is given labeled samples from training distribution $\mathcal{D}$, unlabeled samples from test distribution $\mathcal{D}'$, and the goal is to output a classifier with low error on $\mathcal{D}'$ whenever the training samples pass a corresponding test. Their model deviates from all prior work in that no assumptions are made on $\mathcal{D}'$. Instead, the test must accept (with high probability) when the marginals of the training and test distributions are equal. Here we focus on the fundamental case of intersections of halfspaces with respect to Gaussian training distributions and prove a variety of new upper bounds including a $2^{(k/\epsilon)^{O(1)}} \mathsf{poly}(d)$-time algorithm for TDS learning intersections of $k$ homogeneous halfspaces to accuracy $\epsilon$ (prior work achieved $d^{(k/\epsilon)^{O(1)}}$). We work under the mild assumption that the Gaussian training distribution contains at least an $\epsilon$ fraction of both positive and negative examples ($\epsilon$-balanced). We also prove the first set of SQ lower-bounds for any TDS learning problem and show (1) the $\epsilon$-balanced assumption is necessary for $\mathsf{poly}(d,1/\epsilon)$-time TDS learning for a single halfspace and (2) a $d^{\tilde{\Omega}(\log 1/\epsilon)}$ lower bound for the intersection of two general halfspaces, even with the $\epsilon$-balanced assumption. Our techniques significantly expand the toolkit for TDS learning. We use dimension reduction and coverings to give efficient algorithms for computing a localized version of discrepancy distance, a key metric from the domain adaptation literature.

artificial intelligence, inductive learning, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2404.02364

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(2 more...)

Genre: Research Report (0.50)

Industry: Education (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.34)

Add feedback

Frugal Algorithm Selection

Kuş, Erdem, Akgün, Özgür, Dang, Nguyen, Miguel, Ian

arXiv.org Artificial IntelligenceMay-17-2024

When solving decision and optimisation problems, many competing algorithms (model and solver choices) have complementary strengths. Typically, there is no single algorithm that works well for all instances of a problem. Automated algorithm selection has been shown to work very well for choosing a suitable algorithm for a given instance. However, the cost of training can be prohibitively large due to running candidate algorithms on a representative set of training instances. In this work, we explore reducing this cost by choosing a subset of the training instances on which to train. We approach this problem in three ways: using active learning to decide based on prediction uncertainty, augmenting the algorithm predictors with a timeout predictor, and collecting training data using a progressively increasing timeout. We evaluate combinations of these approaches on six datasets from ASLib and present the reduction in labelling cost achieved by each option.

algorithm, algorithm selection, selection, (13 more...)

arXiv.org Artificial Intelligence

2405.11059

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
Europe > Italy (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Czechia > Prague (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.68)

Add feedback

Targeted Augmentation for Low-Resource Event Extraction

Wang, Sijia, Huang, Lifu

arXiv.org Artificial IntelligenceMay-14-2024

Addressing the challenge of low-resource information extraction remains an ongoing issue due to the inherent information scarcity within limited training examples. Existing data augmentation methods, considered potential solutions, struggle to strike a balance between weak augmentation (e.g., synonym augmentation) and drastic augmentation (e.g., conditional generation without proper guidance). This paper introduces a novel paradigm that employs targeted augmentation and back validation to produce augmented examples with enhanced diversity, polarity, accuracy, and coherence. Extensive experimental results demonstrate the effectiveness of the proposed paradigm. Furthermore, identified limitations are discussed, shedding light on areas for future improvement.

computational linguistic, event structure, extraction, (14 more...)

arXiv.org Artificial Intelligence

2405.08729

Country:

North America > United States > Nevada (0.05)
North America > Dominican Republic (0.04)
North America > Canada > Ontario > Toronto (0.04)
(8 more...)

Genre: Research Report > New Finding (0.48)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
(2 more...)

Add feedback

Batched Stochastic Bandit for Nondegenerate Functions

Liu, Yu, Shu, Yunlu, Wang, Tianyu

arXiv.org Machine LearningMay-9-2024

This paper studies batched bandit learning problems for nondegenerate functions. We introduce an algorithm that solves the batched bandit problem for nondegenerate functions near-optimally. More specifically, we introduce an algorithm, called Geometric Narrowing (GN), whose regret bound is of order $\widetilde{{\mathcal{O}}} ( A_{+}^d \sqrt{T} )$. In addition, GN only needs $\mathcal{O} (\log \log T)$ batches to achieve this regret. We also provide lower bound analysis for this problem. More specifically, we prove that over some (compact) doubling metric space of doubling dimension $d$: 1. For any policy $\pi$, there exists a problem instance on which $\pi$ admits a regret of order ${\Omega} ( A_-^d \sqrt{T})$; 2. No policy can achieve a regret of order $ A_-^d \sqrt{T} $ over all problem instances, using less than $ \Omega ( \log \log T ) $ rounds of communications. Our lower bound analysis shows that the GN algorithm achieves near optimal regret with minimal number of batches.

algorithm, bandit, nondegenerate function, (16 more...)

arXiv.org Machine Learning

2405.05733

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)

Genre: Research Report (1.00)

Industry: Education (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Data Science > Data Mining > Big Data (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)

Add feedback