AITopics | icb

Scalable iterative pruning of large language and vision models using block coordinate descent

Rosenberg, Gili, Brubaker, J. Kyle, Schuetz, Martin J. A., Zhu, Elton Yechao, Kadıoğlu, Serdar, Borujeni, Sima E., Katzgraber, Helmut G.

arXiv.org Artificial Intelligence, Nov-26-2024

Pruning neural networks, which involves removing a fraction of their weights, can often maintain high accuracy while significantly reducing model complexity, at least up to a certain limit. We present a neural network pruning technique that builds upon the Combinatorial Brain Surgeon, but solves an optimization problem over a subset of the network weights in an iterative, block-wise manner using block coordinate descent. The iterative, block-based nature of this pruning technique, which we dub "iterative Combinatorial Brain Surgeon" (iCBS), allows for scalability to very large models, including large language models (LLMs), that may not be feasible with a one-shot combinatorial optimization approach. When applied to large models like Mistral and DeiT, iCBS achieves higher performance metrics at the same density levels compared to existing pruning methods such as Wanda. This demonstrates the effectiveness of this iterative, block-wise pruning method in compressing and optimizing the performance of large deep learning models, even while optimizing over only a small fraction of the weights. Moreover, our approach allows for a quality-time (or cost) tradeoff that is not available when using a one-shot pruning technique alone. The block-wise formulation of the optimization problem enables the use of hardware accelerators, potentially offsetting the increased computational costs compared to one-shot pruning methods like Wanda. In particular, the optimization problem solved for each block is quantum-amenable in that it could, in principle, be solved by a quantum computer.
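To make the idea of block-wise mask optimization concrete, here is a minimal sketch of block coordinate descent over a pruning mask for a single linear map. It is not the authors' iCBS implementation: the objective (squared reconstruction error on calibration inputs `X`), the magnitude-based starting mask, the random block selection, and the brute-force search over each small block are all assumptions made for illustration, and the function names are hypothetical. Unlike the Combinatorial Brain Surgeon, this sketch does not update the surviving weights.

```python
import itertools
import numpy as np

def reconstruction_error(X, w, mask):
    """Squared error between the dense output X @ w and the pruned output."""
    return float(np.sum((X @ w - X @ (w * mask)) ** 2))

def magnitude_mask(w, density):
    """One-shot baseline: keep the largest-magnitude weights."""
    k = int(round(density * w.size))
    mask = np.zeros(w.size)
    mask[np.argsort(-np.abs(w))[:k]] = 1.0
    return mask

def block_coordinate_prune(X, w, density, block_size=6, sweeps=2, seed=0):
    """Refine a pruning mask one small block of weights at a time.

    Each block subproblem is solved exactly by brute force (an assumption;
    any combinatorial solver could stand in), holding the rest of the mask
    fixed and keeping the number of retained weights in the block constant,
    so the overall density never changes.
    """
    rng = np.random.default_rng(seed)
    mask = magnitude_mask(w, density)
    for _ in range(sweeps):
        order = rng.permutation(w.size)
        for start in range(0, w.size, block_size):
            block = order[start:start + block_size]
            n_keep = int(mask[block].sum())
            best_err = reconstruction_error(X, w, mask)
            best_sub = mask[block].copy()
            for kept in itertools.combinations(range(block.size), n_keep):
                trial = mask.copy()
                trial[block] = 0.0
                trial[block[list(kept)]] = 1.0
                err = reconstruction_error(X, w, trial)
                if err < best_err:
                    best_err, best_sub = err, trial[block].copy()
            mask[block] = best_sub  # accept only non-worsening block updates
    return mask
```

Because a block update is accepted only when it lowers the error, the refined mask is never worse than the one-shot starting mask under this objective, and running more sweeps trades extra compute for a possibly better mask.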

machine learning, natural language, pruning, (17 more...)

arXiv.org Artificial Intelligence

2411.17796

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Washington > King County > Seattle (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)

Genre: Research Report > New Finding (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

KANICE: Kolmogorov-Arnold Networks with Interactive Convolutional Elements

Ferdaus, Md Meftahul, Abdelguerfi, Mahdi, Ioup, Elias, Dobson, David, Niles, Kendall N., Pathak, Ken, Sloan, Steven

arXiv.org Artificial Intelligence, Oct-22-2024

We introduce KANICE (Kolmogorov-Arnold Networks with Interactive Convolutional Elements), a novel neural architecture that combines Convolutional Neural Networks (CNNs) with Kolmogorov-Arnold Network (KAN) principles. KANICE integrates Interactive Convolutional Blocks (ICBs) and KAN linear layers into a CNN framework. This leverages KANs' universal approximation capabilities and ICBs' adaptive feature learning. KANICE captures complex, non-linear data relationships while enabling dynamic, context-dependent feature extraction based on the Kolmogorov-Arnold representation theorem. We evaluated KANICE on four datasets: MNIST, Fashion-MNIST, EMNIST, and SVHN, comparing it against standard CNNs, CNN-KAN hybrids, and ICB variants. KANICE consistently outperformed baseline models, achieving 99.35% accuracy on MNIST and 90.05% on the SVHN dataset. Furthermore, we introduce KANICE-mini, a compact variant designed for efficiency. A comprehensive ablation study demonstrates that KANICE-mini achieves comparable performance to KANICE with significantly fewer parameters. KANICE-mini reached 90.00% accuracy on SVHN with 2,337,828 parameters, compared to KANICE's 25,432,000. This study highlights the potential of KAN-based architectures in balancing performance and computational efficiency in image classification tasks. Our work contributes to research in adaptive neural networks, integrates mathematical theorems into deep learning architectures, and explores the trade-offs between model complexity and performance, advancing computer vision and pattern recognition. The source code for this paper is publicly accessible through our GitHub repository (https://github.com/m-ferdaus/kanice).
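The size comparison quoted in the abstract can be checked with a few lines of arithmetic. The snippet below uses only the parameter counts and SVHN accuracies stated above; the variable names are illustrative. It works out to roughly 10.9 times fewer parameters (about a 91% reduction) for a 0.05 percentage point drop in SVHN accuracy.

```python
# Figures taken from the abstract above.
kanice_params = 25_432_000
kanice_mini_params = 2_337_828
kanice_svhn_acc = 90.05       # percent
kanice_mini_svhn_acc = 90.00  # percent

# How many times smaller KANICE-mini is, and the fraction of parameters removed.
compression_ratio = kanice_params / kanice_mini_params
parameter_reduction = 1 - kanice_mini_params / kanice_params
accuracy_drop = kanice_svhn_acc - kanice_mini_svhn_acc

print(f"{compression_ratio:.2f}x fewer parameters")
print(f"{parameter_reduction:.1%} parameter reduction")
print(f"{accuracy_drop:.2f} percentage point accuracy drop on SVHN")
```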

artificial intelligence, kanice, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2410.17172

Country:

North America > United States > Louisiana > East Baton Rouge Parish > Baton Rouge (0.05)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Mississippi > Warren County > Vicksburg (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.94)

Industry:

Government > Military (0.94)
Government > Regional Government > North America Government > United States Government (0.94)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Bounding generalization error with input compression: An empirical study with infinite-width networks

Galloway, Angus, Golubeva, Anna, Salem, Mahmoud, Nica, Mihai, Ioannou, Yani, Taylor, Graham W.

arXiv.org Artificial Intelligence, Jul-19-2022

Estimating the Generalization Error (GE) of Deep Neural Networks (DNNs) is an important task that often relies on the availability of held-out data. The ability to better predict GE based on a single training set may yield overarching DNN design principles that reduce reliance on trial-and-error, along with other performance assessment advantages. In search of a quantity relevant to GE, we investigate the Mutual Information (MI) between the input and final layer representations, using the infinite-width DNN limit to bound MI. An existing input compression-based GE bound is used to link MI and GE. To the best of our knowledge, this represents the first empirical study of this bound. In our attempt to empirically falsify the theoretical bound, we find that it is often tight for best-performing models. Furthermore, it detects randomization of training labels in many cases, reflects test-time perturbation robustness, and works well given only a few training samples. These results are promising given that input compression is broadly applicable where MI can be estimated with confidence.
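For readers unfamiliar with the quantity, the sketch below computes mutual information for two discrete variables from their joint probability table, using the standard definition. It only illustrates what MI measures; the paper bounds MI between inputs and learned representations via the infinite-width limit, which this toy example does not attempt. The function name is hypothetical.

```python
import numpy as np

def mutual_information(joint):
    """Mutual information I(X; Y) in bits from a joint probability table.

    `joint[i, j]` is proportional to P(X = i, Y = j); the table is
    normalized here, and zero-probability cells contribute nothing.
    """
    joint = np.asarray(joint, dtype=float)
    joint = joint / joint.sum()
    px = joint.sum(axis=1, keepdims=True)  # marginal of X, shape (n, 1)
    py = joint.sum(axis=0, keepdims=True)  # marginal of Y, shape (1, m)
    nonzero = joint > 0
    ratio = joint[nonzero] / (px @ py)[nonzero]
    return float(np.sum(joint[nonzero] * np.log2(ratio)))
```

Two independent fair coins share no information, while a coin paired with an exact copy of itself shares one full bit, which is the sense in which a representation that retains less about its input is "compressed".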

classification task, generalization, icb, (16 more...)

arXiv.org Artificial Intelligence

2207.09408

Country:

North America > Canada > Ontario > Wellington County > Guelph (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Government > Regional Government > North America Government > United States Government (0.92)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Inverse Contextual Bandits: Learning How Behavior Evolves over Time

Hüyük, Alihan, Jarrett, Daniel, van der Schaar, Mihaela

arXiv.org Machine Learning, Jul-13-2021

Understanding an agent's priorities by observing their behavior is critical for transparency and accountability in decision processes, such as in healthcare. While conventional approaches to policy learning almost invariably assume stationarity in behavior, this is hardly true in practice: Medical practice is constantly evolving, and clinical professionals are constantly fine-tuning their priorities. We desire an approach to policy learning that (1) provides interpretable representations of decision-making, (2) accounts for non-stationarity in behavior, and (3) operates in an offline manner. First, we model the behavior of learning agents in terms of contextual bandits, and formalize the problem of inverse contextual bandits (ICB). Second, we propose two algorithms to tackle ICB, each making varying degrees of assumptions regarding the agent's learning strategy. Finally, through both real and simulated data for liver transplantations, we illustrate the applicability and explainability of our method, and validate its accuracy.
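As background for the forward problem that ICB inverts, here is a minimal sketch of a learning agent in a contextual bandit. It uses LinUCB, a standard contextual bandit algorithm, chosen here purely as an illustrative assumption: it is not one of the two ICB algorithms proposed in the paper, and the class and parameter names are hypothetical. The point is that the agent's choices change as its estimates are updated, which is the kind of non-stationary behavior ICB aims to recover from logged decisions.

```python
import numpy as np

class LinUCBAgent:
    """Contextual bandit agent with one ridge-regression model per arm."""

    def __init__(self, n_arms, dim, alpha=1.0):
        self.alpha = alpha  # width of the exploration bonus
        self.A = [np.eye(dim) for _ in range(n_arms)]    # regularized Gram matrices
        self.b = [np.zeros(dim) for _ in range(n_arms)]  # reward-weighted context sums

    def choose(self, x):
        """Pick the arm with the highest upper confidence bound for context x."""
        scores = []
        for A, b in zip(self.A, self.b):
            A_inv = np.linalg.inv(A)
            theta = A_inv @ b  # current reward estimate for this arm
            bonus = self.alpha * np.sqrt(x @ A_inv @ x)
            scores.append(theta @ x + bonus)
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Incorporate the observed reward for the chosen arm."""
        self.A[arm] += np.outer(x, x)
        self.b[arm] += reward * x


# Toy run: a single constant context, arm 1 always pays 1 and arm 0 pays 0.
agent = LinUCBAgent(n_arms=2, dim=1)
context = np.array([1.0])
choices = []
for _ in range(20):
    arm = agent.choose(context)
    agent.update(arm, context, reward=float(arm == 1))
    choices.append(arm)
```

In this toy run the agent tries arm 0 first, observes no reward, and then settles on arm 1, so the same context leads to different decisions at different times.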

agent, icb, transplantation, (16 more...)

arXiv.org Machine Learning

2107.06317

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine > Therapeutic Area (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

ICBS: Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding

AAAI Conferences, Jul-15-2015

Conflict-Based Search (CBS) and its enhancements, Meta-Agent CBS and bypassing conflicts, are among the strongest newly introduced algorithms for Multi-Agent Path Finding. This paper introduces two new improvements to CBS and incorporates them into a coherent, improved version of CBS, namely ICBS. Experimental results show that each of these improvements further reduces the runtime over the existing CBS-based approaches. When all improvements are combined, an even larger improvement is achieved, producing state-of-the-art results for a number of domains.
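CBS-style algorithms plan a path for each agent independently and then look for conflicts between those paths, resolving each one by adding constraints and replanning. The sketch below shows only the conflict detection step for two agents, covering vertex conflicts (same cell at the same time) and edge conflicts (swapping cells). It is a generic illustration, not code from the ICBS paper: the function name, the tuple format of the result, and the assumption that an agent waits at its final cell after finishing are all choices made for this example.

```python
def first_conflict(path_a, path_b):
    """Return the earliest conflict between two timed paths, or None.

    Each path is a list of grid cells, one per time step. A finished agent
    is assumed to wait at its last cell (an assumption of this sketch).
    """
    horizon = max(len(path_a), len(path_b))

    def at(path, t):
        # Position at time t, holding the final cell once the path ends.
        return path[min(t, len(path) - 1)]

    for t in range(horizon):
        # Vertex conflict: both agents occupy the same cell at time t.
        if at(path_a, t) == at(path_b, t):
            return ("vertex", t, at(path_a, t))
        # Edge conflict: the agents swap cells between t and t + 1.
        if (t + 1 < horizon
                and at(path_a, t) == at(path_b, t + 1)
                and at(path_a, t + 1) == at(path_b, t)):
            return ("edge", t, (at(path_a, t), at(path_a, t + 1)))
    return None
```

For example, `first_conflict([(0, 0), (0, 1)], [(0, 1), (0, 0)])` reports an edge conflict at time 0, since the two agents would pass through each other.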

agent, artificial intelligence, conflict, (17 more...)

AAAI Conferences

Twenty-Fourth International Joint Conference on Artificial Intelligence

Country: Asia > Middle East > Israel (0.14)

Genre: Research Report > New Finding (0.48)

Industry: Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

ICBS: The Improved Conflict-Based Search Algorithm for Multi-Agent Pathfinding

Boyarski, Eli (Bar-Ilan University) | Felner, Ariel (Ben-Gurion University of the Negev) | Stern, Roni (Ben-Gurion University of the Negev) | Sharon, Guni (Ben-Gurion University of the Negev) | Betzalel, Oded (Ben-Gurion University of the Negev) | Tolpin, David (Ben-Gurion University of the Negev) | Shimony, Eyal (Ben-Gurion University of the Negev)

AAAI Conferences, May-21-2015

Conflict-Based Search (CBS) and its generalization, Meta-Agent CBS, are among the strongest newly introduced algorithms for Multi-Agent Path Finding. This paper introduces ICBS, an improved version of CBS. ICBS incorporates three orthogonal improvements to CBS which are systematically described and studied. Experimental results show that each of these improvements reduces the runtime over basic CBS by up to 20x in many cases. When all three improvements are combined, an even larger improvement is achieved, producing state-of-the-art results for a number of domains.

agent, conflict, constraint, (12 more...)

AAAI Conferences

Eighth Annual Symposium on Combinatorial Search

Country: Asia > Middle East > Israel (0.06)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.65)
