AITopics | Supervised Learning


Supervised Learning

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. (Wikipedia)


Developing a Foundation of Vector Symbolic Architectures Using Category Theory

Shaw, Nolan P, Furlong, P Michael, Anderson, Britt, Orchard, Jeff

arXiv.org Artificial Intelligence, Jan-9-2025

At the risk of overstating the case, connectionist approaches to machine learning, i.e. neural networks, are enjoying a small vogue right now. However, these methods require large volumes of data and produce models that are uninterpretable to humans. An alternative framework that is compatible with neural networks and gradient-based learning, but explicitly models compositionality, is Vector Symbolic Architectures (VSAs). VSAs are a family of algebras on high-dimensional vector representations. They arose in cognitive science from the need to unify neural processing and the kind of symbolic reasoning that humans perform. While machine learning methods have benefited from category theoretical analyses, VSAs have not yet received similar treatment. In this paper, we present a first attempt at applying category theory to VSAs. Specifically, we conduct a brief literature survey demonstrating how little these two topics currently intersect, provide a list of desiderata for VSAs, and propose that VSAs may be understood as a (division) rig in a category enriched over a monoid in Met (the category of Lawvere metric spaces). This final contribution suggests that VSAs may be generalised beyond current implementations. It is our hope that grounding VSAs in category theory will lead to more rigorous connections with other research, both within and beyond learning and cognition.
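To make "algebras on high-dimensional vector representations" concrete, here is a minimal sketch of one common VSA flavour (bipolar vectors with multiplicative binding and additive bundling). It is an illustration only and is not taken from the paper: the dimensionality `DIM`, the role and filler names, and the helper functions are all assumptions chosen for the example.

```python
import random

DIM = 2000  # hypothetical dimensionality, chosen for the example


def random_vector(rng):
    """Draw a random bipolar (+1/-1) hypervector."""
    return [rng.choice((-1, 1)) for _ in range(DIM)]


def bind(a, b):
    """Binding: elementwise product (its own inverse for bipolar vectors)."""
    return [x * y for x, y in zip(a, b)]


def bundle(*vectors):
    """Bundling: elementwise sum followed by sign (odd count avoids ties)."""
    return [1 if sum(column) > 0 else -1 for column in zip(*vectors)]


def similarity(a, b):
    """Cosine similarity, which for bipolar vectors is dot(a, b) / DIM."""
    return sum(x * y for x, y in zip(a, b)) / DIM


rng = random.Random(0)
color, shape, size = (random_vector(rng) for _ in range(3))  # roles
red, circle, small = (random_vector(rng) for _ in range(3))  # fillers

# Encode the record {color: red, shape: circle, size: small} as one vector.
record = bundle(bind(color, red), bind(shape, circle), bind(size, small))

# Unbinding with a role yields a noisy copy of the matching filler.
recovered = bind(record, color)
print(similarity(recovered, red), similarity(recovered, circle))
```

The recovered vector should be clearly more similar to `red` than to the unrelated filler `circle`, which is the compositional behaviour the abstract refers to.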

operation, vector, vsa, (12 more...)

arXiv.org Artificial Intelligence

2501.05368

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > Mexico (0.04)
(3 more...)

Genre: Overview (0.88)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.35)


Distributionally Robust Optimization via Iterative Algorithms in Continuous Probability Spaces

Zhu, Linglingzhi, Xie, Yao

arXiv.org Machine Learning, Dec-29-2024

We consider a minimax problem motivated by distributionally robust optimization (DRO) when the worst-case distribution is continuous, leading to significant computational challenges due to the infinite-dimensional nature of the optimization problem. Recent research has explored learning the worst-case distribution using neural network-based generative models to address these computational challenges but lacks algorithmic convergence guarantees. This paper bridges this theoretical gap by presenting an iterative algorithm to solve such a minimax problem, achieving global convergence under mild assumptions and leveraging technical tools from vector space minimax optimization and convex analysis in the space of continuous probability densities. In particular, leveraging Brenier's theorem, we represent the worst-case distribution as a transport map applied to a continuous reference measure and reformulate the regularized discrepancy-based DRO as a minimax problem in the Wasserstein space. Furthermore, we demonstrate that the worst-case distribution can be efficiently computed using a modified Jordan-Kinderlehrer-Otto (JKO) scheme with sufficiently large regularization parameters for commonly used discrepancy functions, linked to the radius of the ambiguity set. Additionally, we derive the global convergence rate and quantify the total number of subgradient and inexact modified JKO iterations required to obtain approximate stationary points. These results are potentially applicable to nonconvex and nonsmooth scenarios, with broad relevance to modern machine learning applications.

algorithm, artificial intelligence, machine learning, (19 more...)

arXiv.org Machine Learning

2412.20556

Country: Asia > Middle East > Jordan (0.24)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)


Experimental Machine Learning with Classical and Quantum Data via NMR Quantum Kernels

Sabarad, Vivek, Mahesh, T. S.

arXiv.org Artificial Intelligence, Dec-12-2024

Kernel methods map data into high-dimensional spaces, enabling linear algorithms to learn nonlinear functions without explicitly storing the feature vectors. Quantum kernel methods promise efficient learning by encoding feature maps into exponentially large Hilbert spaces inherent in quantum systems. In this work we implement quantum kernels on a 10-qubit star-topology register in a nuclear magnetic resonance (NMR) platform. We experimentally encode classical data in the evolution of multiple quantum coherence orders using data-dependent unitary transformations and then demonstrate one-dimensional regression and two-dimensional classification tasks. By extending the register to a double-layered star configuration, we propose an extended quantum kernel to handle non-parametrized operator inputs. By numerically simulating the extended quantum kernel, we show classification of entangling and nonentangling unitaries. These results confirm that quantum kernels exhibit strong capabilities in classical as well as quantum machine learning tasks.
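The first sentence of this abstract describes the classical kernel trick. As a rough classical illustration (not the quantum kernel or the NMR experiment from the paper), the sketch below trains a kernel perceptron on the XOR problem. The RBF kernel, the `gamma` value, and all function names are assumptions made for the example.

```python
import math


def rbf_kernel(x, y, gamma=1.0):
    """Gaussian (RBF) kernel: an inner product in an implicit feature space."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def linear_kernel(x, y):
    """Plain dot product, i.e. no feature map at all."""
    return sum(a * b for a, b in zip(x, y))


def decision_score(x, points, labels, alphas, kernel):
    """Weighted sum of kernel evaluations against the training points."""
    return sum(a * yj * kernel(xj, x) for a, xj, yj in zip(alphas, points, labels))


def train_kernel_perceptron(points, labels, kernel, epochs=20):
    """Perceptron in dual form: alphas count the mistakes made on each point."""
    alphas = [0] * len(points)
    for _ in range(epochs):
        mistakes = 0
        for i, (x, y) in enumerate(zip(points, labels)):
            if y * decision_score(x, points, labels, alphas, kernel) <= 0:
                alphas[i] += 1
                mistakes += 1
        if mistakes == 0:
            break
    return alphas


def predict_all(points, labels, alphas, kernel):
    """Classify every training point with the learned dual weights."""
    return [1 if decision_score(x, points, labels, alphas, kernel) > 0 else -1
            for x in points]


# XOR is not linearly separable in the input space.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [-1, 1, 1, -1]

rbf_alphas = train_kernel_perceptron(points, labels, rbf_kernel)
rbf_preds = predict_all(points, labels, rbf_alphas, rbf_kernel)

linear_alphas = train_kernel_perceptron(points, labels, linear_kernel)
linear_preds = predict_all(points, labels, linear_alphas, linear_kernel)

print(rbf_preds, linear_preds)
```

The learning algorithm is identical in both runs; only the kernel changes. The feature vectors are never built explicitly, since the model only ever evaluates `kernel(xj, x)`.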

artificial intelligence, kernel, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2412.09557

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Berlin (0.04)
Asia > India (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)


SVGFusion: Scalable Text-to-SVG Generation via Vector Space Diffusion

Xing, Ximing, Hu, Juncheng, Zhang, Jing, Xu, Dong, Yu, Qian

arXiv.org Artificial Intelligence, Dec-11-2024

The generation of Scalable Vector Graphics (SVG) assets from textual data remains a significant challenge, largely due to the scarcity of high-quality vector datasets and the limitations in scalable vector representations required for modeling intricate graphic distributions. This work introduces SVGFusion, a Text-to-SVG model capable of scaling to real-world SVG data without reliance on a text-based discrete language model or prolonged SDS optimization. The essence of SVGFusion is to learn a continuous latent space for vector graphics with a popular Text-to-Image framework. Specifically, SVGFusion consists of two modules: a Vector-Pixel Fusion Variational Autoencoder (VP-VAE) and a Vector Space Diffusion Transformer (VS-DiT). VP-VAE takes both the SVGs and corresponding rasterizations as inputs and learns a continuous latent space, whereas VS-DiT learns to generate a latent code within this space based on the text prompt. Based on VP-VAE, a novel rendering sequence modeling strategy is proposed to enable the latent space to embed the knowledge of construction logic in SVGs. This empowers the model to achieve human-like design capabilities in vector graphics, while systematically preventing occlusion in complex graphic compositions. Moreover, SVGFusion can be continuously improved by adding more VS-DiT blocks, leveraging the scalability of VS-DiT. A large-scale SVG dataset is collected to evaluate the effectiveness of our proposed method. Extensive experimentation has confirmed the superiority of SVGFusion over existing SVG generation methods, achieving enhanced quality and generalizability, thereby establishing a novel framework for SVG content creation. Code, model, and data will be released at: https://ximinng.github.io/SVGFusionProject/

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2412.10437

Country: Asia > China > Hong Kong (0.04)

Genre: Research Report > Promising Solution (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.61)


Using Images to Find Context-Independent Word Representations in Vector Space

Kumar, Harsh

arXiv.org Artificial Intelligence, Nov-28-2024

Many methods have been proposed to find vector representations for words, but most rely on capturing context from the text to find semantic relationships between these vectors. We propose a novel method of using dictionary meanings and image depictions to find word vectors independent of any context. We use an auto-encoder on the word images to find meaningful representations and use them to calculate the word vectors. Finally, we evaluate our method on word similarity, concept categorization and outlier detection tasks. Our method performs comparably to context-based methods while taking much less training time.

artificial intelligence, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2412.03592

Country:

Europe > Bulgaria > Sofia City Province > Sofia (0.04)
North America > United States > Oregon > Multnomah County > Portland (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(7 more...)

Genre: Research Report (0.71)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.43)


Emergence of Self-Identity in AI: A Mathematical Framework and Empirical Study with Generative Large Language Models

Lee, Minhyeok

arXiv.org Artificial Intelligence, Nov-27-2024

This paper introduces a mathematical framework for defining and quantifying self-identity in artificial intelligence (AI) systems, addressing a critical gap in the theoretical foundations of artificial consciousness. While existing approaches to artificial self-awareness often rely on heuristic implementations or philosophical abstractions, we present a formal framework grounded in metric space theory, measure theory, and functional analysis. Our framework posits that self-identity emerges from two mathematically quantifiable conditions: the existence of a connected continuum of memories $C \subseteq \mathcal{M}$ in a metric space $(\mathcal{M}, d_{\mathcal{M}})$, and a continuous mapping $I: \mathcal{M} \to \mathcal{S}$ that maintains consistent self-recognition across this continuum, where $(\mathcal{S}, d_{\mathcal{S}})$ represents the metric space of possible self-identities. To validate this theoretical framework, we conducted empirical experiments using the Llama 3.2 1B model, employing Low-Rank Adaptation (LoRA) for efficient fine-tuning. The model was trained on a synthetic dataset containing temporally structured memories, designed to capture the complexity of coherent self-identity formation. Our evaluation metrics included quantitative measures of self-awareness, response consistency, and linguistic precision. The experimental results demonstrate substantial improvements in measurable self-awareness metrics, with the primary self-awareness score increasing from 0.276 to 0.801. This enables the structured creation of AI systems with validated self-identity features. The implications of our study are immediately relevant to the fields of humanoid robotics and autonomous systems.

agent, continuum, identity recognition function, (15 more...)

arXiv.org Artificial Intelligence

2411.1853

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.75)


Unlocking Transfer Learning for Open-World Few-Shot Recognition

Kim, Byeonggeun, Lee, Juntae, Shim, Kyuhong, Chang, Simyung

arXiv.org Artificial Intelligence, Nov-15-2024

Few-Shot Open-Set Recognition (FSOSR) targets a critical real-world challenge, aiming to categorize inputs into known categories, termed closed-set classes, while identifying open-set inputs that fall outside these classes. Although transfer learning, where a model is tuned to a given few-shot task, has become a prominent paradigm in closed-world settings, we observe that it fails to extend to open-world settings. To address this challenge, we propose a two-stage method which consists of open-set aware meta-learning followed by open-set free transfer learning. In the open-set aware meta-learning stage, a model is trained to establish a metric space that serves as a beneficial starting point for the subsequent stage. During the open-set free transfer learning stage, the model is further adapted to a specific target task through transfer learning. Additionally, we introduce a strategy to simulate open-set examples by modifying the training dataset or generating pseudo open-set examples. The proposed method achieves state-of-the-art performance on two widely recognized benchmarks, miniImageNet and tieredImageNet, with only a 1.5% increase in training effort. Our work demonstrates the effectiveness of transfer learning in FSOSR.

artificial intelligence, machine learning, oal-ofl-lite, (16 more...)

arXiv.org Artificial Intelligence

2411.09986

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.34)


Gini Coefficient as a Unified Metric for Evaluating Many-versus-Many Similarity in Vector Spaces

Fauber, Ben

arXiv.org Artificial Intelligence, Nov-12-2024

We demonstrate that Gini coefficients can be used as unified metrics to evaluate many-versus-many (all-to-all) similarity in vector spaces. Our analysis of various image datasets shows that images with the highest Gini coefficients tend to be the most similar to one another, while images with the lowest Gini coefficients are the least similar. We also show that this relationship holds true for vectorized text embeddings from various corpuses, highlighting the consistency of our method and its broad applicability across different types of data. Additionally, we demonstrate that selecting machine learning training samples that closely match the distribution of the testing dataset is far more important than ensuring data diversity. Selection of exemplary and iconic training samples with higher Gini coefficients leads to significantly better model performance compared to simply having a diverse training set with lower Gini coefficients. Thus, Gini coefficients can serve as effective criteria for selecting machine learning training samples, with our selection method outperforming random sampling methods in very sparse information settings.
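For reference, the Gini coefficient itself can be computed from the mean absolute difference between all pairs of values. The sketch below shows only that standard formula. How the paper derives the input values from image or text embeddings is not described in this abstract, so the example inputs are purely hypothetical.

```python
def gini(values):
    """Gini coefficient of non-negative values.

    Uses G = sum_i sum_j |x_i - x_j| / (2 * n * sum(x)),
    which is 0 for perfectly equal values and approaches 1
    when a single entry holds almost all of the total.
    """
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    abs_diff_sum = sum(abs(a - b) for a in values for b in values)
    return abs_diff_sum / (2 * n * total)


# Hypothetical inputs, chosen only to show the range of the statistic.
print(gini([1, 1, 1, 1]))  # all values equal
print(gini([1, 2, 3, 4]))  # moderately unequal
print(gini([0, 0, 0, 1]))  # one entry holds everything
```

The double loop is quadratic in the number of values; a sorted, linear-time formulation gives the same result and would be preferable for long vectors.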

artificial intelligence, gini coefficient, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.07983

Country: North America > United States (0.68)

Genre: Research Report (0.65)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)


You are out of context!

Cobino, Giancarlo, Farci, Simone

arXiv.org Machine Learning, Nov-4-2024

This research proposes a novel drift detection methodology for machine learning (ML) models based on the concept of "deformation" in the vector space representation of data. Recognizing that new data can act as forces stretching, compressing, or twisting the geometric relationships learned by a model, we explore various mathematical frameworks to quantify this deformation. We investigate measures such as eigenvalue analysis of covariance matrices to capture global shape changes, local density estimation using kernel density estimation (KDE), and Kullback-Leibler divergence to identify subtle shifts in data concentration. Additionally, we draw inspiration from continuum mechanics by proposing a "strain tensor" analogy to capture multi-faceted deformations across different data types. This requires careful estimation of the displacement field, and we delve into strategies ranging from density-based approaches to manifold learning and neural network methods. By continuously monitoring these deformation metrics and correlating them with model performance, we aim to provide a sensitive, interpretable, and adaptable drift detection system capable of distinguishing benign data evolution from true drift, enabling timely interventions and ensuring the reliability of machine learning systems in dynamic environments. Addressing the computational challenges of this methodology, we discuss mitigation strategies like dimensionality reduction, approximate algorithms, and parallelization for real-time and large-scale applications. The method's effectiveness is demonstrated through experiments on real-world text data, focusing on detecting context shifts in Generative AI. Our results, supported by publicly available code, highlight the benefits of this deformation-based approach in capturing subtle drifts that traditional statistical methods often miss. Furthermore, we present a detailed application example within the healthcare domain, showcasing the methodology's potential in diverse fields. Future work will focus on further improving computational efficiency and exploring additional applications across different ML domains.
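One of the measures named in this abstract, Kullback-Leibler divergence, can be illustrated with a deliberately simplified sketch. Instead of the kernel density estimates the abstract mentions, this version fits a one-dimensional Gaussian to each batch and uses the closed-form KL divergence between the two fits. The sample data, the `DRIFT_THRESHOLD` value, and the function names are assumptions for the example and do not come from the paper or its code.

```python
import math
from statistics import fmean, pvariance


def gaussian_kl(reference, current):
    """KL(N_ref || N_cur) between 1-D Gaussians fitted to two batches.

    Closed form: 0.5 * (log(v1 / v0) + (v0 + (m0 - m1)^2) / v1 - 1),
    where (m0, v0) and (m1, v1) are the means and variances of the
    reference and current batches respectively.
    """
    m0, v0 = fmean(reference), pvariance(reference)
    m1, v1 = fmean(current), pvariance(current)
    return 0.5 * (math.log(v1 / v0) + (v0 + (m0 - m1) ** 2) / v1 - 1.0)


DRIFT_THRESHOLD = 0.5  # hypothetical alert level, to be tuned per application

reference = [0.0, 1.0, 2.0, 3.0, 4.0]  # e.g. one embedding coordinate at training time
shifted = [3.0, 4.0, 5.0, 6.0, 7.0]    # same spread, mean moved by 3

no_drift_score = gaussian_kl(reference, reference)
drift_score = gaussian_kl(reference, shifted)
has_drifted = drift_score > DRIFT_THRESHOLD

print(no_drift_score, drift_score, has_drifted)
```

A single Gaussian fit only captures shifts in mean and variance; the density-based and covariance-eigenvalue measures described in the abstract are aimed at changes in shape that this simplification would miss.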

data distribution, deformation, new data, (13 more...)

arXiv.org Machine Learning

2411.02464

Genre: Research Report (1.00)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning > Representation Of Examples (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)


Controllable Game Level Generation: Assessing the Effect of Negative Examples in GAN Models

Bazzaz, Mahsa, Cooper, Seth

arXiv.org Artificial Intelligence, Oct-30-2024

Generative Adversarial Networks (GANs) are unsupervised models designed to learn and replicate a target distribution. The vanilla versions of these models can be extended to more controllable models. Conditional Generative Adversarial Networks (CGANs) extend vanilla GANs by conditioning both the generator and discriminator on some additional information (labels). Controllable models based on complementary learning, such as Rumi-GAN, have been introduced. Rumi-GANs leverage negative examples to enhance the generator's ability to learn positive examples. We evaluate the performance of two controllable GAN variants, CGAN and Rumi-GAN, in generating game levels targeting specific constraints of interest: playability and controllability. This evaluation is conducted under two scenarios: with and without the inclusion of negative examples. The goal is to determine whether incorporating negative examples helps the GAN models avoid generating undesirable outputs. Our findings highlight the strengths and weaknesses of each method in enforcing the generation of specific conditions when generating outputs based on given positive and negative examples.
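The conditioning step described for CGANs is often implemented by appending an encoding of the label to the network inputs. The sketch below shows only that input construction, with no networks or training. The label meanings, vector sizes, and function names are hypothetical and are not taken from the paper.

```python
import random

NUM_CLASSES = 2  # hypothetical labels: 0 = unplayable, 1 = playable
NOISE_DIM = 4    # hypothetical latent size, kept small for readability


def one_hot(label, num_classes):
    """Encode an integer label as a one-hot list of floats."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def generator_input(noise, label):
    """Generator sees the noise vector with the label appended."""
    return list(noise) + one_hot(label, NUM_CLASSES)


def discriminator_input(sample, label):
    """Discriminator sees the (flattened) sample with the same label appended."""
    return list(sample) + one_hot(label, NUM_CLASSES)


rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(NOISE_DIM)]

g_in = generator_input(noise, label=1)
d_in = discriminator_input([0.0, 1.0, 1.0, 0.0, 1.0, 0.0], label=1)

print(len(g_in), g_in[-NUM_CLASSES:])
print(len(d_in), d_in[-NUM_CLASSES:])
```

Because both networks receive the label, the discriminator can penalise samples that look realistic but do not match the requested condition.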

constraint, negative example, playable segment, (17 more...)

arXiv.org Artificial Intelligence

2410.23108

Genre: Research Report > New Finding (0.66)

Industry: Leisure & Entertainment > Games > Computer Games (0.70)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
