AITopics | fast learning

fast learning

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

CARE: A QLoRA-Fine Tuned Multi-Domain Chatbot With Fast Learning On Minimal Hardware

Dutta, Ankit, Ghosh, Nabarup, Chatterjee, Ankush

arXiv.org Artificial Intelligence, Mar-18-2025

Large language models have demonstrated excellent domain-specific question-answering capabilities when fine-tuned on a dataset from that domain. However, fine-tuning these models requires significant training time and considerable hardware. In this work, we propose CARE (Customer Assistance and Response Engine), a lightweight model made by fine-tuning Phi3.5-mini on minimal hardware and data, designed to handle queries primarily across three domains: telecommunications support, medical support, and banking support. For telecommunications and banking, the chatbot addresses problems that customers in those domains face regularly. In the medical domain, CARE provides preliminary support by offering basic diagnoses and medical suggestions that a user might act on before consulting a healthcare professional. Since CARE is built on Phi3.5-mini, it can be used even on mobile devices, increasing its usability. Our research also shows that CARE performs relatively well on various medical benchmarks, indicating that it can be used to make basic medical suggestions.
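
As a rough illustration of why adapter-based fine-tuning (the LoRA part of the QLoRA named in the title) can fit on minimal hardware, the sketch below counts trainable parameters for a single weight matrix. The layer size and rank are hypothetical and are not taken from the paper or from Phi3.5-mini; the 4-bit quantization of the frozen base weights that QLoRA adds is not modeled here.

```python
def lora_param_counts(d_in, d_out, rank):
    """Return (full, lora) trainable-parameter counts for one linear layer.

    Full fine-tuning updates the whole d_out x d_in matrix W.
    LoRA freezes W and trains two low-rank factors B (d_out x rank)
    and A (rank x d_in), so the effective weight is W + B @ A.
    """
    full = d_in * d_out
    lora = rank * (d_in + d_out)
    return full, lora


# Hypothetical layer size and rank, chosen only for illustration.
full, lora = lora_param_counts(d_in=1024, d_out=1024, rank=8)
print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")
```

The count grows linearly in the rank, so the choice of rank is the main knob trading adapter capacity against memory in this simplified view.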

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2503.14136

Country:

Asia > India > West Bengal > Kharagpur (0.05)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.49)
Telecommunications (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Fast Learning from Non-i.i.d. Observations

Neural Information Processing Systems, Feb-16-2024, 11:37:43 GMT

We prove an oracle inequality for generic regularized empirical risk minimization algorithms learning from $\alpha$-mixing processes. To illustrate this oracle inequality, we use it to derive learning rates for some learning methods including least squares SVMs. Since the proof of the oracle inequality uses recent localization ideas developed for independent and identically distributed (i.i.d.) processes, it turns out that these learning rates are close to the optimal rates known in the i.i.d. case.

fast learning, non-i.i.d., oracle inequality

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Fast Learning in Multi-Resolution Hierarchies

Neural Information Processing Systems, Apr-6-2023, 19:57:59 GMT

A class of fast, supervised learning algorithms is presented. Inspired by Albus's CMAC model, the algorithms learn orders of magnitude more rapidly than typical implementations of back propagation, while often achieving comparable qualities of generalization. Furthermore, unlike most traditional function approximation methods, the algorithms are well suited for use in real-time adaptive signal processing. Unlike simpler adaptive systems, such as linear predictive coding, the adaptive linear combiner, and the Kalman filter, the new algorithms are capable of efficiently capturing the structure of complicated non-linear systems. As an illustration, the algorithm is applied to the prediction of a chaotic time series.

algorithm, fast learning, multi-resolution hierarchy, (1 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

Fast Learning with Predictive Forward Models

Neural Information Processing Systems, Apr-6-2023, 19:18:49 GMT

A method for transforming performance evaluation signals distal both in space and time into proximal signals usable by supervised learning algorithms, presented in [Jordan & Jacobs 90], is examined. A simple observation concerning differentiation through models trained with redundant inputs (as one of their networks is) explains a weakness in the original architecture and suggests a modification: an internal world model that encodes action-space exploration and, crucially, cancels input redundancy to the forward model is added. Learning time on an example task, cart-pole balancing, is thereby reduced about 50 to 100 times.

fast learning, predictive forward model

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.51)

Fast Learning by Bounding Likelihoods in Sigmoid Type Belief Networks

Neural Information Processing Systems, Apr-6-2023, 18:28:12 GMT

Sigmoid type belief networks, a class of probabilistic neural networks, provide a natural framework for compactly representing probabilistic information in a variety of unsupervised and supervised learning problems. Often the parameters used in these networks need to be learned from examples. Unfortunately, estimating the parameters via exact probabilistic calculations (i.e., the EM algorithm) is intractable even for networks with fairly small numbers of hidden units. We propose to avoid the infeasibility of the E step by bounding likelihoods instead of computing them exactly. We introduce extended and complementary representations for these networks and show that the estimation of the network parameters can be made fast (reduced to quadratic optimization) by performing the estimation in either of the alternative domains.

bounding likelihood, fast learning, sigmoid type belief network, (2 more...)

Neural Information Processing Systems

Industry: Education > Focused Education > Special Education (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)

Fast Learning of Dynamic Hand Gesture Recognition with Few-Shot Learning Models

Schlüsener, Niels, Bücker, Michael

arXiv.org Artificial Intelligence, Dec-16-2022

We develop Few-Shot Learning models trained to recognize five or ten different dynamic hand gestures, respectively, which are arbitrarily interchangeable by providing the model with one, two, or five examples per hand gesture. All models were built in the Few-Shot Learning architecture of the Relation Network (RN), in which Long Short-Term Memory cells form the backbone. The models use hand reference points extracted from RGB video sequences of the Jester dataset, which was modified to contain 190 different types of hand gestures. Results show accuracy of up to 88.8% for recognition of five and up to 81.2% for ten dynamic hand gestures. The research also sheds light on the potential effort savings of using a Few-Shot Learning approach instead of a traditional Deep Learning approach to detect dynamic hand gestures. Savings were defined as the number of additional observations required when a Deep Learning model is trained on new hand gestures instead of a Few-Shot Learning model. The difference with respect to the total number of observations required to achieve approximately the same accuracy indicates potential savings of up to 630 observations for five and up to 1260 observations for ten hand gestures to be recognized. Since labeling video recordings of hand gestures implies significant effort, these savings can be considered substantial.
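
The reported upper bounds on savings can be put on a per-gesture basis with a line of arithmetic. The snippet below uses only the two figures quoted in the abstract; the variable names are illustrative.

```python
# Maximum reported savings in labeled observations, keyed by number of gestures.
max_savings = {5: 630, 10: 1260}

# Divide each total by the number of gestures it covers.
per_gesture = {k: total / k for k, total in max_savings.items()}
print(per_gesture)
```

Both settings work out to the same per-gesture figure, so in these two cases the upper bound scales linearly with the number of gestures.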

artificial intelligence, machine learning, recognition, (13 more...)

arXiv.org Artificial Intelligence

2212.08363

Country: North America > United States > Washington > King County > Seattle (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Vision > Gesture Recognition (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Generalized vec trick for fast learning of pairwise kernel models

Viljanen, Markus, Airola, Antti, Pahikkala, Tapio

arXiv.org Machine Learning, Sep-2-2020

Pairwise learning corresponds to the supervised learning setting where the goal is to make predictions for pairs of objects. Prominent applications include predicting drug-target or protein-protein interactions, or customer-product preferences. Several kernel functions have been proposed for incorporating prior knowledge about the relationship between the objects when training kernel-based learning methods. However, the number of training pairs n is often very large, making the O(n^2) cost of constructing the pairwise kernel matrix infeasible. If each training pair x = (d, t) consists of drug d and target t, let m and q denote the number of unique drugs and targets appearing in the training pairs. In many real-world applications m, q << n, which can be used to develop computational shortcuts. Recently, an O(nm+nq)-time algorithm we refer to as the generalized vec trick was introduced for training kernel methods with the Kronecker kernel. In this work, we show that a large class of pairwise kernels can be expressed as a sum of product matrices, which generalizes the result to the most commonly used pairwise kernels. This includes symmetric and anti-symmetric, metric-learning, Cartesian, ranking, as well as linear, polynomial and Gaussian kernels. In the experiments, we demonstrate how the introduced approach allows scaling pairwise kernels to much larger data sets than previously feasible, and compare the kernels on a number of biological interaction prediction tasks.
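
The computational shortcut for the Kronecker kernel can be sketched as a matrix-vector product that never forms the n x n pairwise kernel matrix. The code below is a minimal sketch under stated assumptions, not the authors' implementation: it assumes the pairwise kernel between pairs (d_i, t_i) and (d_j, t_j) is K[d_i][d_j] * G[t_i][t_j], and all function names, matrix names, and sizes are hypothetical.

```python
import random


def kron_matvec_naive(K, G, drugs, targets, v):
    """u[i] = sum_j K[d_i][d_j] * G[t_i][t_j] * v[j], in O(n^2) time."""
    n = len(v)
    return [
        sum(K[drugs[i]][drugs[j]] * G[targets[i]][targets[j]] * v[j]
            for j in range(n))
        for i in range(n)
    ]


def kron_matvec_fast(K, G, drugs, targets, v):
    """Same product in O(n*m + n*q) time, never forming the n x n matrix."""
    m, q = len(K), len(G)
    # Stage 1, O(n*m): T[a][t] = sum of K[a][d_j] * v[j] over pairs j with t_j == t.
    T = [[0.0] * q for _ in range(m)]
    for d_j, t_j, v_j in zip(drugs, targets, v):
        for a in range(m):
            T[a][t_j] += K[a][d_j] * v_j
    # Stage 2, O(n*q): u[i] = sum_t T[d_i][t] * G[t_i][t].
    return [
        sum(T[d_i][t] * G[t_i][t] for t in range(q))
        for d_i, t_i in zip(drugs, targets)
    ]


def random_symmetric(size, rng):
    """Random symmetric matrix standing in for a kernel matrix."""
    A = [[rng.random() for _ in range(size)] for _ in range(size)]
    return [[(A[i][j] + A[j][i]) / 2 for j in range(size)] for i in range(size)]


rng = random.Random(0)
m, q, n = 4, 3, 50                      # hypothetical sizes
K = random_symmetric(m, rng)            # drug kernel, m x m
G = random_symmetric(q, rng)            # target kernel, q x q
drugs = [rng.randrange(m) for _ in range(n)]
targets = [rng.randrange(q) for _ in range(n)]
v = [rng.uniform(-1, 1) for _ in range(n)]

u_naive = kron_matvec_naive(K, G, drugs, targets, v)
u_fast = kron_matvec_fast(K, G, drugs, targets, v)
print(max(abs(a - b) for a, b in zip(u_naive, u_fast)))
```

Iterative solvers for kernel methods need only such matrix-vector products, which is why a fast product is enough to avoid the O(n^2) kernel matrix altogether.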

artificial intelligence, inductive learning, machine learning, (16 more...)

arXiv.org Machine Learning

2009.01054

Country:

Europe > Finland > Southwest Finland > Turku (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Portugal > Porto > Porto (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (0.70)

Fast Learning of Graph Neural Networks with Guaranteed Generalizability: One-hidden-layer Case

Zhang, Shuai, Wang, Meng, Liu, Sijia, Chen, Pin-Yu, Xiong, Jinjun

arXiv.org Machine Learning, Jun-24-2020

Although graph neural networks (GNNs) have recently made great progress in practice on learning from graph-structured data, theoretical guarantees on their generalizability remain elusive in the literature. In this paper, we provide a theoretically grounded generalizability analysis of GNNs with one hidden layer for both regression and binary classification problems. Under the assumption that there exists a ground-truth GNN model (with zero generalization error), the objective of GNN learning is to estimate the ground-truth GNN parameters from the training data. To achieve this objective, we propose a learning algorithm that is built on tensor initialization and accelerated gradient descent. We then show that the proposed learning algorithm converges to the ground-truth GNN model for the regression problem, and to a model sufficiently close to the ground truth for the binary classification problem. Moreover, for both cases, the convergence rate of the proposed learning algorithm is proven to be linear and faster than that of the vanilla gradient descent algorithm. We further explore the relationship between the sample complexity of GNNs and their underlying graph properties. Lastly, we provide numerical experiments to demonstrate the validity of our analysis and the effectiveness of the proposed learning algorithm for GNNs.
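
The gap between accelerated and vanilla gradient descent can be seen on a toy problem. The sketch below is not the paper's algorithm and involves no GNN or tensor initialization: it assumes heavy-ball momentum as the form of acceleration and compares iteration counts on a hypothetical two-dimensional quadratic with condition number 100.

```python
import math

MU, L = 1.0, 100.0   # smallest and largest curvature of the toy quadratic


def grad(x):
    """Gradient of f(x) = 0.5 * (MU * x[0]**2 + L * x[1]**2)."""
    return [MU * x[0], L * x[1]]


def iterations_to_converge(step, momentum, tol=1e-8, max_iter=100_000):
    """Run x_{k+1} = x_k - step * grad(x_k) + momentum * (x_k - x_{k-1})."""
    x = [1.0, 1.0]
    prev = x[:]
    for k in range(max_iter):
        g = grad(x)
        if max(abs(gi) for gi in g) < tol:
            return k
        new = [xi - step * gi + momentum * (xi - pi)
               for xi, gi, pi in zip(x, g, prev)]
        prev, x = x, new
    return max_iter


# Vanilla gradient descent with the classical step size 2 / (L + MU).
gd_iters = iterations_to_converge(step=2.0 / (L + MU), momentum=0.0)

# Heavy-ball momentum with the classical tuning for quadratics.
sqrt_l, sqrt_mu = math.sqrt(L), math.sqrt(MU)
hb_step = 4.0 / (sqrt_l + sqrt_mu) ** 2
hb_momentum = ((sqrt_l - sqrt_mu) / (sqrt_l + sqrt_mu)) ** 2
hb_iters = iterations_to_converge(step=hb_step, momentum=hb_momentum)

print(gd_iters, hb_iters)
```

Both runs converge linearly, but the momentum variant needs far fewer iterations on this ill-conditioned problem, which is the qualitative behavior the abstract claims for its accelerated method.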

artificial intelligence, graph neural network, machine learning, (14 more...)

arXiv.org Machine Learning

2006.14117

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Fast Learning from Non-i.i.d. Observations

Steinwart, Ingo, Christmann, Andreas

Neural Information Processing Systems, Feb-15-2020, 03:28:10 GMT

We prove an oracle inequality for generic regularized empirical risk minimization algorithms learning from $\alpha$-mixing processes. To illustrate this oracle inequality, we use it to derive learning rates for some learning methods including least squares SVMs. Since the proof of the oracle inequality uses recent localization ideas developed for independent and identically distributed (i.i.d.) processes, it turns out that these learning rates are close to the optimal rates known in the i.i.d. case.

fast learning, non-i.i.d., oracle inequality

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Partial-Hessian Strategies for Fast Learning of Nonlinear Embeddings

Vladymyrov, Max, Carreira-Perpinan, Miguel

arXiv.org Machine Learning, Jun-18-2012

Stochastic neighbor embedding (SNE) and related nonlinear manifold learning algorithms achieve high-quality low-dimensional representations of similarity data, but are notoriously slow to train. We propose a generic formulation of embedding algorithms that includes SNE and other existing algorithms, and study their relation with spectral methods and graph Laplacians. This allows us to define several partial-Hessian optimization strategies, characterize their global and local convergence, and evaluate them empirically. We achieve up to two orders of magnitude speedup over existing training methods with a strategy (which we call the spectral direction) that adds nearly no overhead to the gradient and yet is simple, scalable and applicable to several existing and future embedding algorithms.

artificial intelligence, iteration, machine learning, (16 more...)

arXiv.org Machine Learning

1206.4646

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
