AITopics | Geifman, Yonatan

Collaborating Authors

Geifman, Yonatan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Puzzle: Distillation-Based NAS for Inference-Optimized LLMs

Bercovich, Akhiad, Ronen, Tomer, Abramovich, Talor, Ailon, Nir, Assaf, Nave, Dabbah, Mohammad, Galil, Ido, Geifman, Amnon, Geifman, Yonatan, Golan, Izhak, Haber, Netanel, Karpas, Ehud, Koren, Roi, Levy, Itay, Molchanov, Pavlo, Mor, Shahar, Moshe, Zach, Nabwani, Najeeb, Puny, Omri, Rubin, Ran, Schen, Itamar, Shahaf, Ido, Tropp, Oren, Argov, Omer Ullman, Zilberstein, Ran, El-Yaniv, Ran

arXiv.org Artificial IntelligenceDec-8-2024

Large language models (LLMs) have demonstrated remarkable capabilities, but their adoption is limited by high computational costs during inference. While increasing parameter counts enhances accuracy, it also widens the gap between state-of-the-art capabilities and practical deployability. We present Puzzle, a framework to accelerate LLM inference on specific hardware while preserving their capabilities. Through an innovative application of neural architecture search (NAS) at an unprecedented scale, Puzzle systematically optimizes models with tens of billions of parameters under hardware constraints. Our approach utilizes blockwise local knowledge distillation (BLD) for parallel architecture exploration and employs mixed-integer programming for precise constraint optimization. We demonstrate the real-world impact of our framework through Llama-3.1-Nemotron-51B-Instruct (Nemotron-51B), a publicly available model derived from Llama-3.1-70B-Instruct. Nemotron-51B achieves a 2.17x inference throughput speedup, fitting on a single NVIDIA H100 GPU while preserving 98.4% of the original model's capabilities. Nemotron-51B currently stands as the most accurate language model capable of inference on a single GPU with large batch sizes. Remarkably, this transformation required just 45B training tokens, compared to over 15T tokens used for the 70B model it was derived from. This establishes a new paradigm where powerful models can be optimized for efficient deployment with only negligible compromise of their capabilities, demonstrating that inference performance, not parameter count alone, should guide model selection. With the release of Nemotron-51B and the presentation of the Puzzle framework, we provide practitioners immediate access to state-of-the-art language modeling capabilities at significantly reduced computational costs.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2411.19146

Country:

Europe (1.00)
Asia (1.00)
North America > United States (0.93)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Industry: Information Technology (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

SelectiveNet: A Deep Neural Network with an Integrated Reject Option

Geifman, Yonatan, El-Yaniv, Ran

arXiv.org Machine LearningJan-26-2019

We consider the problem of selective prediction (also known as reject option) in deep neural networks, and introduce SelectiveNet, a deep neural architecture with an integrated reject option. Existing rejection mechanisms are based mostly on a threshold over the prediction confidence of a pre-trained network. In contrast, SelectiveNet is trained to optimize both classification (or regression) and rejection simultaneously, end-to-end. The result is a deep neural network that is optimized over the covered domain. In our experiments, we show a consistently improved risk-coverage trade-off over several well-known classification and regression datasets, thus reaching new state-of-the-art results for deep selective classification.

deep learning, neural network, selectivenet, (20 more...)

arXiv.org Machine Learning

1901.09192

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep Active Learning with a Neural Architecture Search

Geifman, Yonatan, El-Yaniv, Ran

arXiv.org Machine LearningNov-19-2018

We consider active learning of deep neural networks. Most active learning works in this context have focused on studying effective querying mechanisms and assumed that an appropriate network architecture is a priori known for the problem at hand. We challenge this assumption and propose a novel active strategy whereby the learning algorithm searches for effective architectures on the fly, while actively learning. We apply our strategy using three known querying techniques (softmax response, MC-dropout, and coresets) and show that the proposed approach overwhelmingly outperforms active learning using fixed architectures.

architecture, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

1811.07579

Genre:

Research Report (0.64)
Workflow (0.46)

Industry: Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Add feedback

Boosting Uncertainty Estimation for Deep Neural Classifiers

Geifman, Yonatan, Uziel, Guy, El-Yaniv, Ran

arXiv.org Machine LearningMay-27-2018

We consider the problem of uncertainty estimation in the context of (non-Bayesian) deep neural classification. All current methods are based on extracting uncertainty signals from a trained network optimized to solve the classification problem at hand. We demonstrate that such techniques tend to misestimate instances whose predictions are supposed to be highly confident. This deficiency is an artifact of the training process with SGD-like optimizers. Based on this observation, we develop an uncertainty estimation algorithm that "peels away" highly confident points sequentially and estimates their confidence using earlier snapshots of the trained model, before their uncertainty estimates are jittered. We present extensive experiments indicating that the proposed algorithm provides uncertainty estimates that are consistently better than the best known methods.

algorithm, deep learning, neural network, (21 more...)

arXiv.org Machine Learning

1805.08206

Genre: Research Report (0.82)

Add feedback

Selective Classification for Deep Neural Networks

Geifman, Yonatan, El-Yaniv, Ran

Neural Information Processing SystemsDec-31-2017

Selective classification techniques (also known as reject option) have not yet been considered in the context of deep neural networks (DNNs). These techniques can potentially significantly improve DNNs prediction performance by trading-off coverage. In this paper we propose a method to construct a selective classifier given a trained neural network. Our method allows a user to set a desired risk level. At test time, the classifier rejects instances as needed, to grant the desired risk (with high probability). Empirical results over CIFAR and ImageNet convincingly demonstrate the viability of our method, which opens up possibilities to operate DNNs in mission-critical applications. For example, using our method an unprecedented 2% error in top-5 ImageNet classification can be guaranteed with probability 99.9%, with almost 60% test coverage.

classifier, deep learning, neural network, (17 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

Add feedback