AITopics

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry:

Information Technology > Security & Privacy (0.68)
Health & Medicine > Diagnostic Medicine > Imaging (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Neural Information Processing SystemsFeb-16-2026, 04:54:54 GMT

Continual Learning with Global Alignment

Specifically, we learn the data representation as a task-specific composition of pre-trained token representations shared across all tasks. Then the correlations between different tasks' data representations are grounded

artificial intelligence, machine learning, natural language, (17 more...)

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.68)
Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing SystemsFeb-10-2026, 02:57:45 GMT

MaximumClassSeparationasInductiveBias inOneMatrix

The main observation behind our approach is that separation does not require optimization butcan besolvedinclosed-form prior totraining and plugged into a network.

artificial intelligence, machine learning, maximum separation, (15 more...)

Country: Asia > China > Henan Province > Zhengzhou (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Dragutinović, Sara, Saxe, Andrew M., Singh, Aaditya K.

Softmax $\geq$ Linear: Transformers may learn to classify in-context by kernel gradient descent

arXiv.org Artificial IntelligenceOct-14-2025

The remarkable ability of transformers to learn new concepts solely by reading examples within the input prompt, termed in-context learning (ICL), is a crucial aspect of intelligent behavior. Here, we focus on understanding the learning algorithm transformers use to learn from context. Existing theoretical work, often based on simplifying assumptions, has primarily focused on linear self-attention and continuous regression tasks, finding transformers can learn in-context by gradient descent. Given that transformers are typically trained on discrete and complex tasks, we bridge the gap from this existing work to the setting of classification, with non-linear (importantly, softmax) activation. We find that transformers still learn to do gradient descent in-context, though on functionals in the kernel feature space and with a context-adaptive learning rate in the case of softmax transformer. These theoretical findings suggest a greater adaptability to context for softmax attention, which we empirically verify and study through ablations. Overall, we hope this enhances theoretical understanding of in-context learning algorithms in more realistic settings, pushes forward our intuitions and enables further theory bridging to larger models.

artificial intelligence, machine learning, transformer, (13 more...)

2510.10425

Genre: Research Report > New Finding (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.83)

Neural Information Processing SystemsOct-10-2025, 07:42:52 GMT

Continual Learning with Global Alignment

data representation, pre-trained token representation, representation, (14 more...)

Country:

North America > United States > New York > Suffolk County > Stony Brook (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report > Experimental Study (0.93)

Industry:

Education (0.68)
Government (0.46)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Neural Information Processing SystemsAug-16-2025, 06:57:01 GMT

Maximum Class Separation as Inductive Bias in One Matrix

The main observation behind our approach is that separation does not require optimization but can be solved in closed-form prior to training and plugged into a network.

artificial intelligence, machine learning, maximum separation, (16 more...)

Country: Europe > Netherlands > North Holland > Amsterdam (0.04)

Genre: Research Report (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Artificial IntelligenceFeb-10-2025

MEMHD: Memory-Efficient Multi-Centroid Hyperdimensional Computing for Fully-Utilized In-Memory Computing Architectures

Kang, Do Yeong, Oh, Yeong Hwan, Hwang, Chanwook, Kim, Jinhee, Jeon, Kang Eun, Ko, Jong Hwan

The implementation of Hyperdimensional Computing (HDC) on In-Memory Computing (IMC) architectures faces significant challenges due to the mismatch between highdimensional vectors and IMC array sizes, leading to inefficient memory utilization and increased computation cycles. This paper presents MEMHD, a Memory-Efficient Multi-centroid HDC framework designed to address these challenges. MEMHD introduces a clustering-based initialization method and quantization aware iterative learning for multi-centroid associative memory. Through these approaches and its overall architecture, MEMHD achieves a significant reduction in memory requirements while maintaining or improving classification accuracy. Our approach achieves full utilization of IMC arrays and enables one-shot (or few-shot) associative search. Experimental results demonstrate that MEMHD outperforms state-of-the-art binary HDC models, achieving up to 13.69% higher accuracy with the same memory usage, or 13.25x more memory efficiency at the same accuracy level. Moreover, MEMHD reduces computation cycles by up to 80x and array usage by up to 71x compared to baseline IMC mapping methods when mapped to 128x128 IMC arrays, while significantly improving energy and computation cycle efficiency.

artificial intelligence, machine learning, vector, (17 more...)

2502.07834

Country:

Asia > South Korea > Gyeonggi-do > Suwon (0.04)
North America > United States > Oregon > Washington County > Beaverton (0.04)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.48)

Cho, Gyusang, Youn, Chan-Hyun

Tilt and Average : Geometric Adjustment of the Last Layer for Recalibration

arXiv.org Artificial IntelligenceJun-14-2024

After the revelation that neural networks tend to produce overconfident predictions, the problem of calibration, which aims to align confidence with accuracy to enhance the reliability of predictions, has gained significant importance. Several solutions based on calibration maps have been proposed to address the problem of recalibrating a trained classifier using additional datasets. In this paper, we offer an algorithm that transforms the weights of the last layer of the classifier, distinct from the calibration-map-based approach. We concentrate on the geometry of the final linear layer, specifically its angular aspect, and adjust the weights of the corresponding layer. We name the method Tilt and Average(\textsc{Tna}), and validate the calibration effect empirically and theoretically. Through this, we demonstrate that our approach, in addition to the existing calibration-map-based techniques, can yield improved calibration performance. Code available : https://github.com/GYYYYYUUUUU/TNA_Angular_Scaling.

artificial intelligence, machine learning, tilt and average, (16 more...)

2406.10017

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Rouzrokh, Pouria, Khosravi, Bardia, Faghani, Shahriar, Moassefi, Mana, Vahdati, Sanaz, Erickson, Bradley J.

Multitask Brain Tumor Inpainting with Diffusion Models: A Methodological Report

arXiv.org Artificial IntelligenceMar-30-2023

Despite the ever-increasing interest in applying deep learning (DL) models to medical imaging, the typical scarcity and imbalance of medical datasets can severely impact the performance of DL models. The generation of synthetic data that might be freely shared without compromising patient privacy is a well-known technique for addressing these difficulties. Inpainting algorithms are a subset of DL generative models that can alter one or more regions of an input image while matching its surrounding context and, in certain cases, non-imaging input conditions. Although the majority of inpainting techniques for medical imaging data use generative adversarial networks (GANs), the performance of these algorithms is frequently suboptimal due to their limited output variety, a problem that is already well-known for GANs. Denoising diffusion probabilistic models (DDPMs) are a recently introduced family of generative networks that can generate results of comparable quality to GANs, but with diverse outputs. In this paper, we describe a DDPM to execute multiple inpainting tasks on 2D axial slices of brain MRI with various sequences, and present proof-of-concept examples of its performance in a variety of evaluation scenarios. Our model and a public online interface to try our tool are available here.

artificial intelligence, diffusion model, machine learning, (15 more...)

2210.12113

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.51)

Industry:

Health & Medicine > Nuclear Medicine (1.00)
Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.90)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial IntelligenceOct-26-2022

Principal Component Classification

Dahyot, Rozenn

We propose to directly compute classification estimates by learning features encoded with their class scores using PCA. Our resulting model has a encoder-decoder structure suitable for supervised learning, it is computationally efficient and performs well for classification on several datasets.

artificial intelligence, machine learning, principal component, (15 more...)

2210.12746

Country:

North America > United States (0.04)
Europe > Serbia > Central Serbia > Belgrade (0.04)
Europe > Ireland (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)