AITopics | random base

An Empirical Study on The Properties of Random Bases for Kernel Methods

Neural Information Processing Systems | Nov-21-2025, 15:49:21 GMT

Kernel machines and neural networks both possess universal function approximation properties. In practice, however, they choose the appropriate function class in different ways. Specifically, neural networks learn a representation by adapting their basis functions to the data and the task at hand, while kernel methods typically use a basis that is not adapted during training. In this work, we contrast random features of approximated kernel machines with learned features of neural networks. Our analysis reveals how these random and adaptive basis functions affect the quality of learning. Furthermore, we present basis adaptation schemes that allow for a more compact representation while retaining the generalization properties of kernel machines.
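To make "random features of approximated kernel machines" concrete, here is a minimal sketch of one common construction, random Fourier features for the Gaussian (RBF) kernel. The abstract does not say which kernel or feature map the authors use, so the kernel choice, the parameter `gamma`, and all function names below are assumptions for illustration only.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """Exact Gaussian kernel k(x, y) = exp(-gamma * ||x - y||^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, y)))


def make_random_basis(dim, n_features, gamma, seed=0):
    """Draw a fixed, data-agnostic basis: frequencies and phase offsets.

    Frequencies are sampled from N(0, 2 * gamma), the spectral density of
    the Gaussian kernel above; offsets are uniform on [0, 2 * pi].
    """
    rng = random.Random(seed)
    scale = math.sqrt(2.0 * gamma)
    weights = [[rng.gauss(0.0, scale) for _ in range(dim)]
               for _ in range(n_features)]
    offsets = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    return weights, offsets


def random_features(x, weights, offsets):
    """Map x to z(x) with z_i(x) = sqrt(2 / D) * cos(w_i . x + b_i)."""
    norm = math.sqrt(2.0 / len(weights))
    return [norm * math.cos(sum(w_k * x_k for w_k, x_k in zip(w, x)) + b)
            for w, b in zip(weights, offsets)]


def approx_kernel(x, y, weights, offsets):
    """Approximate k(x, y) by the inner product z(x) . z(y)."""
    zx = random_features(x, weights, offsets)
    zy = random_features(y, weights, offsets)
    return sum(a * b for a, b in zip(zx, zy))
```

The basis returned by `make_random_basis` is never updated, which is the "not adapted during training" case; a neural network would instead treat `weights` and `offsets` as trainable. With a few thousand features, `approx_kernel` should land close to `rbf_kernel` for the same inputs.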

empirical study, name change, random base, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.77)

8dcf2420e78a64333a59674678fb283b-Supplemental.pdf

Neural Information Processing Systems | Aug-15-2025, 02:03:04 GMT

dimensionality, pure rbd training, pure sgd training, (13 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.97)

RandLoRA: Full-rank parameter-efficient fine-tuning of large models

Albert, Paul, Zhang, Frederic Z., Saratchandran, Hemanth, Rodriguez-Opazo, Cristian, Hengel, Anton van den, Abbasnejad, Ehsan

arXiv.org Artificial Intelligence | Feb-2-2025

Low-Rank Adaptation (LoRA) and its variants have shown impressive results in reducing the number of trainable parameters and memory requirements of large transformer networks while maintaining fine-tuning performance. This raises a critical question: when a performance gap between LoRA and standard fine-tuning is observed, is it due to the reduced number of trainable parameters or to the rank deficiency? This paper aims to answer this question by introducing RandLoRA, a parameter-efficient method that performs full-rank updates using learned linear combinations of low-rank, non-trainable random matrices. Our method limits the number of trainable parameters by restricting optimization to diagonal scaling matrices applied to the fixed random matrices. This allows us to effectively overcome the low-rank limitations while maintaining parameter and memory efficiency during training. Through extensive experimentation across vision, language, and vision-language benchmarks, we systematically evaluate the limitations of LoRA and existing random basis methods. Our findings reveal that full-rank updates are beneficial across vision and language tasks individually, and even more so for vision-language tasks, where RandLoRA significantly reduces, and sometimes eliminates, the performance gap between standard fine-tuning and LoRA, demonstrating its efficacy.

Large pre-trained models that leverage broad data have demonstrated significantly improved generalization capabilities and remarkable versatility across diverse tasks. However, the resultant high parameter count also leads to a significant increase in the computational resources required to fine-tune such models on downstream tasks. To tackle this issue, parameter-efficient fine-tuning (PEFT) approaches such as low-rank adaptation (LoRA) (Hu et al., 2022) draw inspiration from the low intrinsic dimensionality of pre-trained models (Li et al., 2018; Aghajanyan et al., 2021) and characterize the weight updates as the product of two low-rank matrices, substantially reducing the number of trainable parameters and memory requirements during training. Because the number of trainable parameters changes with the rank of the matrices, this formulation offers great flexibility under various resource constraints.
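The idea of a full-rank update built from fixed low-rank random matrices and trainable diagonal scalings can be illustrated with a small sketch. The abstract does not give the exact formula, so the form used here, a sum over bases of `B_j diag(lam_j) A_j diag(gam_j)`, is an assumption, as are all function and variable names. The scalings are set to arbitrary values rather than learned; the point is only to show how the rank of the update grows with the number of bases.

```python
import random


def random_matrix(rows, cols, rng):
    """Fixed (non-trainable) Gaussian random matrix as a list of rows."""
    return [[rng.gauss(0.0, 1.0) for _ in range(cols)] for _ in range(rows)]


def matmul(a, b):
    """Plain matrix product of two list-of-rows matrices."""
    cols_b = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols_b]
            for row in a]


def scaled_update(bases, row_scales, col_scales):
    """Assumed update form: sum_j B_j diag(lam_j) A_j diag(gam_j).

    bases      : list of (B_j, A_j), B_j is d_out x r, A_j is r x d_in
    row_scales : list of lam_j, each of length r     (trainable in spirit)
    col_scales : list of gam_j, each of length d_in  (trainable in spirit)
    """
    d_out = len(bases[0][0])
    d_in = len(bases[0][1][0])
    delta = [[0.0] * d_in for _ in range(d_out)]
    for (b_mat, a_mat), lam, gam in zip(bases, row_scales, col_scales):
        b_scaled = [[v * s for v, s in zip(row, lam)] for row in b_mat]
        a_scaled = [[v * s for v, s in zip(row, gam)] for row in a_mat]
        term = matmul(b_scaled, a_scaled)
        for i in range(d_out):
            for k in range(d_in):
                delta[i][k] += term[i][k]
    return delta


def matrix_rank(m, tol=1e-9):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    m = [row[:] for row in m]
    rows, cols = len(m), len(m[0])
    rank = 0
    for c in range(cols):
        pivot = max(range(rank, rows), key=lambda r: abs(m[r][c]))
        if abs(m[pivot][c]) < tol:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(rank + 1, rows):
            factor = m[r][c] / m[rank][c]
            for k in range(c, cols):
                m[r][k] -= factor * m[rank][k]
        rank += 1
        if rank == rows:
            break
    return rank
```

For example, with an 8 x 8 weight matrix, rank-2 bases, and four bases, a single term has rank 2 (the usual LoRA limitation), while the sum of all four terms can reach rank 8. In this toy setting the scalings hold 4 x (2 + 8) = 40 values, compared with 64 entries in the full matrix.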

large language model, machine learning, natural language, (20 more...)

2502.00987

Country:

North America > United States (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre: Research Report > New Finding (0.87)

Industry: Government > Regional Government (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Reviews: An Empirical Study on The Properties of Random Bases for Kernel Methods

Neural Information Processing Systems | Jan-20-2025, 05:13:21 GMT

Summary: The authors present an empirical study contrasting neural networks and kernel methods, focusing on how random and adaptive schemes make efficient use of features to improve the quality of learning, at four levels of abstraction: a data-agnostic random basis (baseline kernel machines with traditional random features), an unsupervised data-adaptive basis for better approximation of the kernel function, a supervised data-label-adaptive basis obtained by kernel target alignment, and a discriminatively adaptive basis (neural nets). The paper concludes with several suggestions and caveats for the efficient use of random features in practice.

Comments: - 1 - Line 123: especially for the sake of comparison with the UAB case, where the underlying assumption is that using the true kernel function k in prediction yields the "best" performance, so that UAB tries to approximate it, I would suggest including in the experiments a baseline model that uses the true kernel function k in prediction. This would also indicate, for example in Figure 1, at which point of the KAE curve the accuracy is sufficiently good (despite the many theoretical results available). However, to support those conclusions convincingly, more datasets would be needed. For example, the performance scores in Tab. 1 do not seem to differ significantly on each task.

empirical study, kernel method, random base, (6 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Kernel Methods (1.00)

Probabilistic Method of Measuring Linguistic Productivity

Monakhov, Sergei

arXiv.org Artificial Intelligence | Aug-24-2023

In this paper I propose a new way of measuring linguistic productivity that objectively assesses the ability of an affix to be used to coin new complex words and, unlike other popular measures, is not directly dependent upon token frequency. Specifically, I suggest that linguistic productivity may be viewed as the probability that an affix combines with a random base. The advantages of this approach include the following. First, token frequency does not dominate the productivity measure but naturally influences the sampling of bases. Second, we are not just counting attested word types with an affix but rather simulating the construction of these types and then checking whether they are attested in the corpus. Third, a corpus-based approach and randomised design ensure that true neologisms and words coined long ago have equal chances of being selected. The proposed algorithm is evaluated on both English and Russian data. The results provide some valuable insights into the relation of linguistic productivity to the number of types and tokens. It appears that burgeoning linguistic productivity manifests itself in an increasing number of types. However, this process unfolds in two stages: first comes the increase in high-frequency items, and only then follows the increase in low-frequency items.
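The sampling idea in this abstract (draw a random base, attach the affix, check whether the result is attested) can be sketched as a simple Monte Carlo estimate. This is not the author's algorithm, whose details are not given here. It assumes suffixation by plain string concatenation, a pre-supplied list of candidate bases, and sampling of bases in proportion to their token frequency; the function name and parameters are hypothetical.

```python
import random
from collections import Counter


def estimate_productivity(tokens, bases, suffix, n_samples=10000, seed=0):
    """Estimate P(base + suffix is attested) for a randomly sampled base.

    tokens : list of word tokens making up the corpus
    bases  : candidate bases; only those occurring in the corpus are used
    suffix : affix to attach (assumed here to be a plain suffix)
    """
    counts = Counter(tokens)
    attested = set(counts)
    candidates = [b for b in bases if counts[b] > 0]
    if not candidates:
        raise ValueError("none of the candidate bases occur in the corpus")
    # Token frequency only affects how often a base is drawn,
    # not how the derived word itself is counted.
    weights = [counts[b] for b in candidates]
    rng = random.Random(seed)
    sampled = rng.choices(candidates, weights=weights, k=n_samples)
    hits = sum(1 for base in sampled if base + suffix in attested)
    return hits / n_samples
```

On a toy corpus where "kind" occurs three times, "dark" once, and "kindness" is attested but "darkness" is not, the estimate for the suffix "ness" should come out near 0.75, since "kind" is drawn about three times as often as "dark".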

artificial intelligence, machine learning, productivity, (17 more...)

2308.12643

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > Netherlands > South Holland > Dordrecht (0.05)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.48)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Improving Neural Network Training in Low Dimensional Random Bases

Gressmann, Frithjof, Eaton-Rosen, Zach, Luschi, Carlo

arXiv.org Machine Learning | Nov-9-2020

Stochastic Gradient Descent (SGD) has proven to be remarkably effective in optimizing deep neural networks that employ ever-larger numbers of parameters. Yet, improving the efficiency of large-scale optimization remains a vital and highly active area of research. Recent work has shown that deep neural networks can be optimized in randomly projected subspaces of much smaller dimensionality than their native parameter space. While such training is promising for more efficient and scalable optimization schemes, its practical application is limited by inferior optimization performance. Here, we improve on recent random subspace approaches as follows. First, we show that keeping the random projection fixed throughout training is detrimental to optimization. We propose re-drawing the random subspace at each step, which yields significantly better performance. We realize further improvements by applying independent projections to different parts of the network, making the approximation more efficient as network dimensionality grows. To implement these experiments, we leverage hardware-accelerated pseudo-random number generation to construct the random projections on demand at every optimization step, allowing us to distribute the computation of independent random directions across multiple workers with shared random seeds. This yields significant reductions in memory and is up to 10 times faster for the workloads in question.
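The contrast between a fixed random subspace and one re-drawn at every step can be seen on a toy problem. The sketch below is not the authors' implementation: it assumes a simple quadratic loss, computes the full gradient before projecting it, and uses unit-norm Gaussian directions regenerated from a seed and step index so that the basis never needs to be stored. All names and hyperparameters are illustrative.

```python
import math
import random


def draw_basis(seed, step, n_dirs, dim):
    """Regenerate n_dirs unit-norm random directions from (seed, step)."""
    rng = random.Random(seed * 1_000_003 + step)
    basis = []
    for _ in range(n_dirs):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(x * x for x in v))
        basis.append([x / norm for x in v])
    return basis


def loss(theta, target):
    """Toy objective: 0.5 * ||theta - target||^2."""
    return 0.5 * sum((a - b) ** 2 for a, b in zip(theta, target))


def train(target, n_dirs, steps, lr=0.5, seed=0, redraw=True):
    """Gradient descent restricted to a low-dimensional random subspace.

    With redraw=True a fresh basis is drawn at every step; with
    redraw=False the step-0 basis is reused throughout training.
    """
    dim = len(target)
    theta = [0.0] * dim
    for step in range(steps):
        basis = draw_basis(seed, step if redraw else 0, n_dirs, dim)
        grad = [a - b for a, b in zip(theta, target)]
        for direction in basis:
            # Coordinate of the gradient along this random direction.
            coeff = sum(g * v for g, v in zip(grad, direction))
            theta = [t - lr * coeff * v for t, v in zip(theta, direction)]
    return loss(theta, target)
```

With a 20-dimensional target and 4 directions per step, the fixed-basis run can only remove the part of the error that lies in its 4-dimensional subspace, whereas the re-drawn run keeps making progress because each step explores new directions. Because `draw_basis` depends only on the seed and the step, separate workers sharing the seed could regenerate the same directions independently.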

accuracy, dimensionality, optimization, (12 more...)

2011.0472

Country: