AITopics | Supervised Learning

Collaborating Authors

Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. (Wikipedia)

News Overviews Instructional Materials AI-Alerts Classics

Graded Neural Networks

Shaska, Tony

arXiv.org Artificial IntelligenceFeb-24-2025

This paper presents a novel framework for graded neural networks (GNNs) built over graded vector spaces $\V_\w^n$, extending classical neural architectures by incorporating algebraic grading. Leveraging a coordinate-wise grading structure with scalar action $\lambda \star \x = (\lambda^{q_i} x_i)$, defined by a tuple $\w = (q_0, \ldots, q_{n-1})$, we introduce graded neurons, layers, activation functions, and loss functions that adapt to feature significance. Theoretical properties of graded spaces are established, followed by a comprehensive GNN design, addressing computational challenges like numerical stability and gradient scaling. Potential applications span machine learning and photonic systems, exemplified by high-speed laser-based implementations. This work offers a foundational step toward graded computation, unifying mathematical rigor with practical potential, with avenues for future empirical and hardware exploration.

artificial intelligence, machine learning, vector space, (17 more...)

arXiv.org Artificial Intelligence

2502.17751

Country:

North America > United States > New York (0.04)
North America > United States > Michigan > Oakland County > Rochester (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.39)

Add feedback

Volume Optimality in Conformal Prediction with Structured Prediction Sets

Gao, Chao, Shan, Liren, Srinivas, Vaidehi, Vijayaraghavan, Aravindan

arXiv.org Machine LearningFeb-23-2025

Conformal Prediction is a widely studied technique to construct prediction sets of future observations. Most conformal prediction methods focus on achieving the necessary coverage guarantees, but do not provide formal guarantees on the size (volume) of the prediction sets. We first prove an impossibility of volume optimality where any distribution-free method can only find a trivial solution. We then introduce a new notion of volume optimality by restricting the prediction sets to belong to a set family (of finite VC-dimension), specifically a union of $k$-intervals. Our main contribution is an efficient distribution-free algorithm based on dynamic programming (DP) to find a union of $k$-intervals that is guaranteed for any distribution to have near-optimal volume among all unions of $k$-intervals satisfying the desired coverage property. By adopting the framework of distributional conformal prediction (Chernozhukov et al., 2021), the new DP based conformity score can also be applied to achieve approximate conditional coverage and conditional restricted volume optimality, as long as a reasonable estimator of the conditional CDF is available. While the theoretical results already establish volume-optimality guarantees, they are complemented by experiments that demonstrate that our method can significantly outperform existing methods in many settings.

artificial intelligence, machine learning, prediction, (16 more...)

arXiv.org Machine Learning

2502.16658

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Illinois > Cook County > Chicago (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.82)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.64)

Add feedback

Improved Error Bounds for Tree Representations of Metric Spaces

Neural Information Processing SystemsFeb-11-2025, 20:13:10 GMT

Estimating optimal phylogenetic trees or hierarchical clustering trees from metric data is an important problem in evolutionary biology and data analysis. Intuitively, the goodness-of-fit of a metric space to a tree depends on its inherent treeness, as well as other metric properties such as intrinsic dimension. Existing algorithms for embedding metric spaces into tree metrics provide distortion bounds depending on cardinality. Because cardinality is a simple property of any set, we argue that such bounds do not fully capture the rich structure endowed by the metric. We consider an embedding of a metric space into a tree proposed by Gromov.

improved error bound, metric space, tree representation, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (1.00)

Add feedback

Joint Metric Space Embedding by Unbalanced OT with Gromov-Wasserstein Marginal Penalization

Beier, Florian, Piening, Moritz, Beinert, Robert, Steidl, Gabriele

arXiv.org Artificial IntelligenceFeb-11-2025

We propose a new approach for unsupervised alignment of heterogeneous datasets, which maps data from two different domains without any known correspondences to a common metric space. Our method is based on an unbalanced optimal transport problem with Gromov-Wasserstein marginal penalization. It can be seen as a counterpart to the recently introduced joint multidimensional scaling method. We prove that there exists a minimizer of our functional and that for penalization parameters going to infinity, the corresponding sequence of minimizers converges to a minimizer of the so-called embedded Wasserstein distance. Our model can be reformulated as a quadratic, multi-marginal, unbalanced optimal transport problem, for which a bi-convex relaxation admits a numerical solver via block-coordinate descent. We provide numerical examples for joint embeddings in Euclidean as well as non-Euclidean spaces.

artificial intelligence, joint metric space embedding, machine learning, (14 more...)

arXiv.org Artificial Intelligence

2502.0751

Country:

North America > United States > Rhode Island > Providence County > Providence (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Europe > Germany > Berlin (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.64)

Add feedback

Learning Distributed Representations for Structured Output Prediction

Vivek Srikumar, Christopher D. Manning

Neural Information Processing SystemsFeb-9-2025, 22:59:00 GMT

In recent years, distributed representations of inputs have led to performance gains in many applications by allowing statistical information to be shared across inputs. However, the predicted outputs (labels, and more generally structures) are still treated as discrete objects even though outputs are often not discrete units of meaning. In this paper, we present a new formulation for structured prediction where we represent individual labels in a structure as dense vectors and allow semantically similar labels to share parameters. We extend this representation to larger structures by defining compositionality using tensor products to give a natural generalization of standard structured prediction approaches. We define a learning objective for jointly learning the model parameters and the label vectors and propose an alternating minimization algorithm for learning. We show that our formulation outperforms structural SVM baselines in two tasks: multiclass document classification and part-of-speech tagging.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Utah (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Industry: Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.91)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.89)

Add feedback

Predicting Useful Neighborhoods for Lazy Local Learning

Aron Yu, Kristen Grauman

Neural Information Processing SystemsFeb-9-2025, 16:04:11 GMT

Lazy local learning methods train a classifier "on the fly" at test time, using only a subset of the training instances that are most relevant to the novel test example. The goal is to tailor the classifier to the properties of the data surrounding the test example. Existing methods assume that the instances most useful for building the local model are strictly those closest to the test example. However, this fails to account for the fact that the success of the resulting classifier depends on the full distribution of selected training instances. Rather than simply gathering the test example's nearest neighbors, we propose to predict the subset of training data that is jointly relevant to training its local model. We develop an approach to discover patterns between queries and their "good" neighborhoods using large-scale multilabel classification with compressed sensing. Given a novel test point, we estimate both the composition and size of the training subset likely to yield an accurate local model. We demonstrate the approach on image classification tasks on SUN and aPascal and show its advantages over traditional global and local approaches.

artificial intelligence, machine learning, neighborhood, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.47)

Add feedback

Submodular meets Structured: Finding Diverse Subsets in Exponentially-Large Structured Item Sets

Adarsh Prasad, Stefanie Jegelka, Dhruv Batra

Neural Information Processing SystemsFeb-9-2025, 12:39:21 GMT

To cope with the high level of ambiguity faced in domains such as Computer Vision or Natural Language processing, robust prediction methods often search for a diverse set of high-quality candidate solutions or proposals. In structured prediction problems, this becomes a daunting task, as the solution space (image labelings, sentence parses, etc.) is exponentially large. We study greedy algorithms for finding a diverse subset of solutions in structured-output spaces by drawing new connections between submodular functions over combinatorial item sets and High-Order Potentials (HOPs) studied for graphical models. Specifically, we show via examples that when marginal gains of submodular diversity functions allow structured representations, this enables efficient (sub-linear time) approximate maximization by reducing the greedy augmentation step to inference in a factor graph with appropriately constructed HOPs. We discuss benefits, tradeoffs, and show that our constructions lead to significantly better proposals.

machine learning, natural language, segmentation, (17 more...)

Neural Information Processing Systems

Country: North America > United States > Virginia (0.04)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.35)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.34)

Add feedback

Structure Regularization for Structured Prediction

Xu Sun

Neural Information Processing SystemsFeb-9-2025, 10:30:48 GMT

While there are many studies on weight regularization, the study on structure regularization is rare. Many existing systems on structured prediction focus on increasing the level of structural dependencies within the model. However, this trend could have been misdirected, because our study suggests that complex structures are actually harmful to generalization ability in structured prediction. To control structure-based overfitting, we propose a structure regularization framework via structure decomposition, which decomposes training samples into mini-samples with simpler structures, deriving a model with better generalization power. We show both theoretically and empirically that structure regularization can effectively control overfitting risk and lead to better accuracy. As a by-product, the proposed method can also substantially accelerate the training speed. The method and the theoretical results can apply to general graphical models with arbitrary structures. Experiments on well-known tasks demonstrate that our method can easily beat the benchmark systems on those highly-competitive tasks, achieving record-breaking accuracies yet with substantially faster training speed.

artificial intelligence, inductive learning, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Weakly-supervised Discovery of Visual Pattern Configurations

Hyun Oh Song, Yong Jae Lee, Stefanie Jegelka, Trevor Darrell

Neural Information Processing SystemsFeb-8-2025, 19:22:38 GMT

The prominence of weakly labeled data gives rise to a growing demand for object detection methods that can cope with minimal supervision. We propose an approach that automatically identifies discriminative configurations of visual patterns that are characteristic of a given object class. We formulate the problem as a constrained submodular optimization problem and demonstrate the benefits of the discovered configurations in remedying mislocalizations and finding informative positive and negative training examples.

artificial intelligence, inductive learning, machine learning, (20 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Yolo County > Davis (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.87)

Add feedback

Clone-Resistant Weights in Metric Spaces: A Framework for Handling Redundancy Bias

Berriaud, Damien, Wattenhofer, Roger

arXiv.org Artificial IntelligenceFeb-5-2025

We are given a set of elements in a metric space. The distribution of the elements is arbitrary, possibly adversarial. Can we weigh the elements in a way that is resistant to such (adversarial) manipulations? This problem arises in various contexts. For instance, the elements could represent data points, requiring robust domain adaptation. Alternatively, they might represent tasks to be aggregated into a benchmark; or questions about personal political opinions in voting advice applications. This article introduces a theoretical framework for dealing with such problems. We propose clone-proof representation functions as a solution concept. These functions distribute importance across elements of a set such that similar objects (``clones'') share (some of) their weights, thus avoiding a potential bias introduced by their multiplicity. Our framework extends the maximum uncertainty principle to accommodate general metric spaces and includes a set of axioms - symmetry, continuity, and clone-proofness - that guide the construction of representation functions. Finally, we address the existence of representation functions satisfying our axioms in the significant case of Euclidean spaces and propose a general method for their construction.

artificial intelligence, machine learning, representation function, (17 more...)

arXiv.org Artificial Intelligence

2502.03576

Country:

North America > United States > New York (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Santa Clara County > Stanford (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Leisure & Entertainment (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.91)

Add feedback