quanta
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- (2 more...)
QuanTA: Efficient High-Rank Fine-Tuning of LLMs with Quantum-Informed Tensor Adaptation
Chen, Zhuo, Dangovski, Rumen, Loh, Charlotte, Dugan, Owen, Luo, Di, Soljačić, Marin
We propose Quantum-informed Tensor Adaptation (QuanTA), a novel, easy-to-implement fine-tuning method with no inference overhead for large-scale pre-trained language models. By leveraging quantum-inspired methods derived from quantum circuit structures, QuanTA enables efficient high-rank fine-tuning, surpassing the limitations of Low-Rank Adaptation (LoRA), whose low-rank approximation may fail for complicated downstream tasks. Our approach is theoretically supported by the universality theorem and the rank representation theorem to achieve efficient high-rank adaptations. Experiments demonstrate that QuanTA significantly enhances commonsense reasoning, arithmetic reasoning, and scalability compared to traditional methods. Furthermore, QuanTA achieves superior performance with fewer trainable parameters than other approaches and can be combined with existing fine-tuning algorithms for further improvement, providing a scalable and efficient solution for fine-tuning large language models and advancing the state of the art in natural language processing.
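The abstract does not spell out the parameterization, but the quantum-circuit analogy suggests its shape: factor the hidden dimension as $d = d_1 d_2$ and train small square "gates" per axis, so the composed operator $g_1 \otimes g_2$ can reach full rank while training only $d_1^2 + d_2^2$ parameters (versus $2dr$ for rank-$r$ LoRA). In the minimal sketch below, the class name, the two-factor split, and the multiplicative composition with the frozen layer are all illustrative assumptions, not the authors' implementation.

```python
import torch

# Illustrative sketch only: a two-factor, "gate"-style adapter in the spirit
# of the quantum-circuit analogy. The class name, the d = d1 * d2 split, and
# composing multiplicatively with the frozen layer are assumptions.
class QuanTALinear(torch.nn.Module):
    def __init__(self, base: torch.nn.Linear, d1: int, d2: int):
        super().__init__()
        assert base.in_features == d1 * d2
        self.base = base
        for p in self.base.parameters():      # freeze the pre-trained weight
            p.requires_grad_(False)
        self.d1, self.d2 = d1, d2
        # Small square "gates" (cf. quantum circuit unitaries), initialized
        # to the identity so fine-tuning starts exactly at the base model.
        self.g1 = torch.nn.Parameter(torch.eye(d1))
        self.g2 = torch.nn.Parameter(torch.eye(d2))

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (..., d1 * d2)
        h = x.reshape(*x.shape[:-1], self.d1, self.d2)
        h = torch.einsum('...ab,ia->...ib', h, self.g1)   # gate on axis 1
        h = torch.einsum('...ib,jb->...ij', h, self.g2)   # gate on axis 2
        return self.base(h.reshape(*x.shape[:-1], self.d1 * self.d2))
```

With both gates at the identity the module reproduces the base layer exactly, and since $g_1 \otimes g_2$ is full rank whenever each gate is, the adapter is not confined to a low-rank update the way LoRA is.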
The Quantization Model of Neural Scaling
We propose the Quantization Model of neural scaling laws, explaining both the observed power-law dropoff of loss with model and data size and the sudden emergence of new capabilities with scale. We derive this model from what we call the Quantization Hypothesis, where network knowledge and skills are "quantized" into discrete chunks (quanta). We show that when quanta are learned in order of decreasing use frequency, a power law in use frequencies explains the observed power-law scaling of loss. We tentatively find that the frequency at which these quanta are used in the training distribution roughly follows a power law whose exponent corresponds to the empirical scaling exponent for language models, a prediction of our theory.
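The core scaling argument can be checked numerically: if quantum $k$ is used with Zipfian frequency $p_k \propto k^{-(\alpha+1)}$ and a model has learned the $n$ most frequent quanta, its expected loss scales like the leftover mass $\sum_{k>n} p_k \sim n^{-\alpha}$. The toy below (the value of $\alpha$ and the quanta count are arbitrary illustrative choices, not figures from the paper) confirms the power law.

```python
import numpy as np

# Toy check of the Quantization Hypothesis scaling argument. Assumed, not
# from the paper: alpha and the total quanta count are illustrative values.
alpha = 1.0
K = 10**6                          # total number of quanta in the toy world
k = np.arange(1, K + 1)
p = k ** -(alpha + 1.0)
p /= p.sum()                       # Zipfian use frequencies p_k ~ k^-(alpha+1)

cum = np.cumsum(p)                 # cum[n-1] = mass of the n most-used quanta
for n in (10, 100, 1000, 10000):
    loss = 1.0 - cum[n - 1]        # mass of quanta the model has not learned
    print(f"n={n:>5}  loss={loss:.4e}  loss * n**alpha = {loss * n**alpha:.3f}")
```

The near-constant final column is the power law $\mathrm{loss} \propto n^{-\alpha}$; emergence falls out of the same picture, since any single capability appears abruptly once its quantum is learned.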
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- North America > United States > Colorado (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Leisure & Entertainment > Sports > Football (1.00)
- Information Technology > Security & Privacy (0.67)
- Government > Regional Government > North America Government > United States Government (0.67)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)
- Information Technology > Hardware (0.94)
Improvement of Nuclide Detection through Graph Spectroscopic Analysis Framework and its Application to Nuclear Facility Upset Detection
Fernández, Pedro Rodríguez, Svinth, Christian, Hagen, Alex
We present a method to improve the detection limit for radionuclides using spectroscopic radiation detectors and the arrival time of each detected radiation quantum. We enable this method using a neural network with an attention mechanism. We illustrate the method on the detection of Cesium release from a nuclear facility during an upset, where it shows a $2\times$ improvement over the traditional spectroscopic method. We hypothesize that our method achieves this performance increase by modulating its detection probability by the overall rate of probable detections, specifically by adapting detection thresholds based on temporal event distributions and local spectral features, and we show evidence to this effect. We believe this method is broadly applicable and may be more successful for radionuclides with more complicated decay chains than Cesium; we also note that our method can generalize beyond the addition of arrival time and could integrate other data about each detection event, such as pulse quality, location in the detector, or even the combination of energy and time from detections in different detectors.
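The abstract names the ingredients (per-event energy and arrival time, an attention mechanism, a release/no-release decision) without giving the architecture. A minimal sketch under those assumptions might look like the following; the module name, layer sizes, and mean-pooled readout are illustrative choices, not the authors' network.

```python
import torch

# Minimal sketch, not the authors' architecture: each detection event is a
# token (energy, inter-arrival time); self-attention lets the score depend
# on the overall rate and local spectral context of probable detections.
class EventAttentionDetector(torch.nn.Module):
    def __init__(self, d_model: int = 64, n_heads: int = 4, n_layers: int = 2):
        super().__init__()
        self.embed = torch.nn.Linear(2, d_model)   # (energy, delta_t) -> token
        layer = torch.nn.TransformerEncoderLayer(d_model, n_heads,
                                                 batch_first=True)
        self.encoder = torch.nn.TransformerEncoder(layer, num_layers=n_layers)
        self.head = torch.nn.Linear(d_model, 1)    # release vs. background

    def forward(self, events: torch.Tensor) -> torch.Tensor:
        # events: (batch, n_events, 2); attend, mean-pool, then score.
        h = self.encoder(self.embed(events))
        return self.head(h.mean(dim=1)).squeeze(-1)
```

Extending the sketch to the other per-event data the authors mention (pulse quality, detector location, multi-detector energy and time) would amount to widening the two-dimensional event features.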
- North America > United States (0.28)
- Asia > Japan > Honshū > Tōhoku > Fukushima Prefecture > Fukushima (0.04)
- Europe > Belgium > Flanders > Antwerp Province > Antwerp (0.04)
- (2 more...)
- Government > Military (1.00)
- Energy > Power Industry > Utilities > Nuclear (1.00)
Token-Efficient Leverage Learning in Large Language Models
Zeng, Yuanhao, Wang, Min, Wang, Yihang, Shao, Yingxia
Large Language Models (LLMs) excel at a variety of tasks but perform markedly better in high-resource scenarios, which presents challenges in low-resource settings. Data scarcity and the inherent difficulty of adapting LLMs to specific tasks compound the challenge. To address these twin hurdles, we introduce \textbf{Leverage Learning}. We present a streamlined implementation of this methodology called Token-Efficient Leverage Learning (TELL). TELL showcases the potential of Leverage Learning, demonstrating effectiveness across various LLMs and low-resource tasks with training data ranging from $10^4$ to $10^6$ tokens. It reduces task-data requirements by up to nearly an order of magnitude compared to conventional Supervised Fine-Tuning (SFT) while delivering competitive performance. With the same amount of task data, TELL improves task performance more than SFT does. We discuss the mechanism of Leverage Learning, suggesting that it aligns with the quantization hypothesis, and explore its promising potential through empirical testing.
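The abstract quantifies the claim without describing the recipe, so any code here is purely a placeholder. As one hypothetical way to operationalize the headline comparison (none of these names or numbers come from TELL), a harness could find the smallest token budget at which each recipe reaches a target score and report the ratio:

```python
# Hypothetical harness for the headline comparison; function names, recipes,
# and the toy scores below are placeholders, not TELL's actual procedure.
def tokens_needed(scores: dict, target: float):
    """Smallest token budget whose score reaches `target`, else None."""
    hits = [budget for budget, score in scores.items() if score >= target]
    return min(hits) if hits else None

def data_efficiency_gain(sft_scores: dict, tell_scores: dict, target: float):
    """Ratio of task tokens SFT needs to what the TELL-style recipe needs."""
    b_sft = tokens_needed(sft_scores, target)
    b_tell = tokens_needed(tell_scores, target)
    return None if None in (b_sft, b_tell) else b_sft / b_tell

# Toy numbers shaped like the claim (roughly 10x less task data at parity):
sft  = {10**4: 0.52, 10**5: 0.61, 10**6: 0.70}
tell = {10**4: 0.60, 10**5: 0.70, 10**6: 0.74}
print(data_efficiency_gain(sft, tell, target=0.70))   # -> 10.0
```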
- Asia > China > Beijing > Beijing (0.05)
- Europe > Iceland (0.05)
- North America > United States > Washington > King County > Seattle (0.04)
- (4 more...)