AITopics | learnable parameter

Knowledge Composition using Task Vectors with Learned Anisotropic Scaling

Frederic Z. Zhang, Paul Albert

Neural Information Processing Systems | Feb-16-2026, 01:35:16 GMT

Listed order was determined by a coin toss.

artificial intelligence, machine learning, natural language, (21 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(11 more...)

Genre:

Research Report > Experimental Study (0.92)
Research Report > New Finding (0.67)

Industry: Education (0.92)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Memory-Efficient Fine-Tuning of Compressed Large Language Models via sub-4-bit Integer Quantization

Neural Information Processing Systems | Feb-14-2026, 05:49:10 GMT

While parameter-efficient fine-tuning (PEFT) methods aim to reduce the memory usage of the optimizer state during fine-tuning, the inherent size of pre-trained LLM weights continues to be a pressing concern. Even though quantization techniques are widely proposed to ease memory demands and accelerate LLM inference, most of these techniques are geared towards the deployment phase.
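To make the memory argument concrete, here is a minimal sketch of sub-4-bit integer weight quantization. It uses plain round-to-nearest uniform quantization with one scale and offset per row, which is an assumption chosen for illustration and not necessarily the scheme used in the paper. The function names `quantize_rtn` and `dequantize` are hypothetical.

```python
import numpy as np

def quantize_rtn(w, bits=3):
    """Round-to-nearest uniform quantization with one scale/offset per row.

    Illustrative only: not taken from the paper listed above.
    """
    qmax = 2 ** bits - 1                      # e.g. 7 for 3-bit codes
    w_min = w.min(axis=1, keepdims=True)
    w_max = w.max(axis=1, keepdims=True)
    scale = (w_max - w_min) / qmax
    scale = np.where(scale == 0, 1.0, scale)  # guard against constant rows
    q = np.clip(np.round((w - w_min) / scale), 0, qmax).astype(np.uint8)
    return q, scale, w_min

def dequantize(q, scale, offset):
    # Map integer codes back to approximate floating-point weights.
    return q * scale + offset

rng = np.random.default_rng(0)
weights = rng.normal(size=(4, 8))
codes, scale, offset = quantize_rtn(weights, bits=3)
restored = dequantize(codes, scale, offset)
max_error = np.abs(restored - weights).max()
```

Each weight is stored as a 3-bit code (held in a `uint8` here for simplicity), and only the per-row scale and offset remain in floating point. The reconstruction error of each weight is bounded by half of its row's scale.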

large language model, machine learning, natural language, (19 more...)

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(4 more...)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

fb4ab556bc42d6f0ee0f9e24ec4d1af0-Supplemental.pdf

Neural Information Processing Systems | Feb-11-2026, 05:22:53 GMT

deepset 5 5, learnable parameter, particle, (15 more...)

Country: North America > Canada (0.04)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

5c60ee4d6e8faf0f3b2f2701c983dc8c-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 06:47:38 GMT

average length, dataset, knowledge, (11 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.15)
North America > Canada > British Columbia > Vancouver (0.05)

Technology: Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.42)

547108084f0c2af39b956f8eadb75d1b-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-9-2026, 00:58:58 GMT

dataset, molecule, substructure, (17 more...)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.33)
Health & Medicine > Therapeutic Area > Immunology (0.33)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

45f0d179ef7e10eb7366550cd4e574ae-Supplemental-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 16:06:49 GMT

discriminator, experiment, histogram, (15 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

37a413841a614b5414b333585e7613b8-Paper-Conference.pdf

Neural Information Processing Systems | Feb-8-2026, 08:06:30 GMT

computational linguistic, cpd kernel, kernel, (11 more...)

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(3 more...)

Genre: Research Report (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)

Deep Neural Networks with Box Convolutions

Neural Information Processing Systems | Feb-6-2026, 12:09:14 GMT

Box filters computed using integral images have been part of the computer vision toolset for a long time. Here, we show that a convolutional layer that computes box filter responses in a sliding manner can be used within deep architectures, and that the dimensions and offsets of the sliding boxes in such a layer can be learned as part of end-to-end loss minimization. Crucially, the training process can make the boxes in such a layer arbitrarily large without incurring extra computational cost and without increasing the number of learnable parameters. Because it can integrate information over large boxes, the new layer facilitates long-range propagation of information and efficiently enlarges the receptive fields of downstream units in the network. By incorporating the new layer into existing architectures for semantic segmentation, we achieve both an increase in segmentation accuracy and a decrease in computational cost and in the number of learnable parameters.
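The claim that box size does not affect cost follows from how integral images work: once the cumulative-sum table is built, the sum over any axis-aligned box takes four lookups. The sketch below illustrates only that classical building block with integer box coordinates. It does not implement the learnable layer from the paper, and the helper names `integral_image` and `box_sum` are hypothetical.

```python
import numpy as np

def integral_image(img):
    # Zero-padded cumulative sum so that ii[y, x] == img[:y, :x].sum().
    ii = np.zeros((img.shape[0] + 1, img.shape[1] + 1), dtype=np.float64)
    ii[1:, 1:] = img.cumsum(axis=0).cumsum(axis=1)
    return ii

def box_sum(ii, top, left, bottom, right):
    # Sum of img[top:bottom, left:right] from four table lookups,
    # regardless of how large the box is.
    return ii[bottom, right] - ii[top, right] - ii[bottom, left] + ii[top, left]

img = np.arange(16, dtype=np.float64).reshape(4, 4)
ii = integral_image(img)
small_box = box_sum(ii, 1, 1, 3, 3)   # 2x2 box in the centre
large_box = box_sum(ii, 0, 0, 4, 4)   # the whole image
```

Both calls to `box_sum` perform the same amount of work even though the second box covers four times as many pixels.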

artificial intelligence, machine learning, neural information processing system 31, (9 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.77)

Efficient Turing Machine Simulation with Transformers

Li, Qian, Wang, Yuyi

arXiv.org Artificial Intelligence | Dec-3-2025

Constant bit-size Transformers are known to be Turing complete, but existing constructions require $\Omega(s(n))$ chain-of-thought (CoT) steps per simulated Turing machine (TM) step, leading to impractical reasoning lengths. In this paper, we significantly reduce this efficiency gap by proving that any $(t(n),s(n))$-bounded multi-tape TM can be simulated by a constant bit-size Transformer with an optimal $O(s(n))$-long context window and only $O(s(n)^c)$ CoT steps per TM step, where $c>0$ can be made arbitrarily small by making the Transformer's head-layer product sufficiently large. In addition, our construction shows that sparse attention with fixed geometric offsets suffices for efficient universal computation. Our proof leverages multi-queue TMs as a bridge. The main technical novelty is a more efficient simulation of multi-tape TMs by synchronous multi-queue TMs, improving both time and space complexity under stricter model assumptions.

arxiv preprint arxiv, large language model, machine learning, (17 more...)

2512.00003

Country:

Europe > Austria > Vienna (0.14)
Asia > China > Guangdong Province > Shenzhen (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)

Calibration-Free EEG-based Driver Drowsiness Detection with Online Test-Time Adaptation

Jang, Geun-Deok, Han, Dong-Kyun, Park, Seo-Hyeon, Lee, Seong-Whan

arXiv.org Artificial Intelligence | Dec-1-2025

Drowsy driving is a growing cause of traffic accidents, prompting recent exploration of electroencephalography (EEG)-based drowsiness detection systems. However, the inherent variability of EEG signals due to psychological and physical factors necessitates a cumbersome calibration process. In particular, the inter-subject variability of EEG signals leads to a domain shift problem, which makes it challenging to generalize drowsiness detection models to unseen target subjects. To address these issues, we propose a novel driver drowsiness detection framework that leverages online test-time adaptation (TTA) methods to dynamically adjust to target subject distributions. Our proposed method updates the learnable parameters in batch normalization (BN) layers, while preserving pretrained normalization statistics, resulting in a modified configuration that ensures effective adaptation during test time. We incorporate a memory bank that dynamically manages streaming EEG segments, selecting samples based on their reliability determined by negative energy scores and persistence time. In addition, we introduce prototype learning to ensure robust predictions against distribution shifts over time. We validated our method on the sustained-attention driving dataset collected in a simulated environment, where drowsiness was estimated from delayed reaction times during monotonous lane-keeping tasks. Our experiments show that our method outperforms all baselines, achieving an average F1-score of 81.73%, an improvement of 11.73% over the best TTA baseline. This demonstrates that our proposed method significantly enhances the adaptability of EEG-based drowsiness detection systems in non-i.i.d. scenarios.
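As a rough illustration of reliability scoring with negative energy, the sketch below computes the standard energy-based confidence score, where the negative energy of a sample is the log-sum-exp of its classifier logits, and keeps samples above a threshold. The threshold value, the function names, and the omission of persistence time are all assumptions made for this example. The abstract does not specify how the memory bank combines these signals.

```python
import numpy as np

def negative_energy(logits):
    # Negative energy = log-sum-exp of the logits (temperature 1).
    # Subtracting the row maximum keeps the exponentials numerically stable.
    m = logits.max(axis=-1, keepdims=True)
    return (m + np.log(np.exp(logits - m).sum(axis=-1, keepdims=True))).squeeze(-1)

def select_reliable(logits, threshold):
    # Hypothetical selection rule: keep samples whose score meets the threshold.
    scores = negative_energy(logits)
    return np.flatnonzero(scores >= threshold), scores

# Two toy samples with two classes: one confident, one ambiguous.
batch_logits = np.array([[5.0, 0.0],
                         [0.1, 0.0]])
kept, scores = select_reliable(batch_logits, threshold=2.0)
```

A higher negative energy indicates a more confident prediction, so the first sample is kept and the second is rejected under this illustrative threshold.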

artificial intelligence, machine learning, statistics, (16 more...)

2511.2203

Country: Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
