AITopics | Ordentlich, Or

Collaborating Authors

Ordentlich, Or

NestQuant: Nested Lattice Quantization for Matrix Products and LLMs

Savkin, Semyon, Porat, Eitan, Ordentlich, Or, Polyanskiy, Yury

arXiv.org Artificial Intelligence | Feb-13-2025

Post-training quantization (PTQ) has emerged as a critical technique for efficient deployment of large language models (LLMs). This work proposes NestQuant, a novel PTQ scheme for weights and activations that is based on self-similar nested lattices. Recent work has shown mathematically that such quantizers are information-theoretically optimal for low-precision matrix multiplication. We implement a practical low-complexity version of NestQuant based on the Gosset lattice, making it a drop-in quantizer for any matrix multiplication step (e.g., in self-attention, MLP, etc.). For example, NestQuant quantizes weights, KV-cache, and activations of Llama-3-8B to 4 bits, achieving a perplexity of 6.6 on wikitext2. This represents a more than 55% reduction in the perplexity gap with respect to the unquantized model (perplexity of 6.14) compared to the state-of-the-art SpinQuant from Meta (perplexity 7.3). Comparisons on various LLM evaluation benchmarks also show a reduction in the performance degradation induced by quantization.
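
The perplexity-gap figure quoted above can be checked with a few lines of arithmetic. The following is a small sketch that uses only the three perplexities stated in the abstract; the function name `gap_reduction` is a hypothetical helper introduced here, not something taken from the paper.

```python
def gap_reduction(baseline: float, previous: float, new: float) -> float:
    """Fraction of the perplexity gap to the unquantized baseline that is closed."""
    old_gap = previous - baseline  # gap of the earlier method (SpinQuant)
    new_gap = new - baseline       # gap of the new method (NestQuant)
    return (old_gap - new_gap) / old_gap

# Perplexities as quoted in the abstract (Llama-3-8B, wikitext2).
reduction = gap_reduction(baseline=6.14, previous=7.3, new=6.6)
print(f"gap reduction: {reduction:.1%}")
```

With these numbers the gap shrinks from 1.16 to 0.46, which is consistent with the abstract's "more than 55%" claim.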

large language model, natural language, quantization, (19 more...)

arXiv.org Artificial Intelligence

2502.0972

Country:

North America > United States (0.14)
Asia (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Optimal Quantization for Matrix Multiplication

Ordentlich, Or, Polyanskiy, Yury

arXiv.org Artificial Intelligence | Oct-17-2024

Recent work in the machine learning community has proposed multiple methods for performing lossy compression (quantization) of large matrices. This quantization is important for accelerating matrix multiplication (the main component of large language models), which is often bottlenecked by the speed of loading these matrices from memory. Unlike classical vector quantization and rate-distortion theory, the goal of these new compression algorithms is to approximate not the matrices themselves, but their matrix product. Specifically, given a pair of real matrices $A,B$, an encoder (compressor) is applied to each of them independently, producing descriptions with $R$ bits per entry. These representations are subsequently used by the decoder to estimate the matrix product $A^\top B$. In this work, we provide a non-asymptotic lower bound on the mean squared error of this approximation (as a function of rate $R$) for the case of matrices $A,B$ with iid Gaussian entries. Algorithmically, we construct a universal quantizer based on nested lattices with an explicit guarantee of approximation error for any (non-random) pair of matrices $A$, $B$ in terms of only the Frobenius norms $\|A\|_F$, $\|B\|_F$ and $\|A^\top B\|_F$. For iid Gaussian matrices our quantizer achieves the lower bound and is, thus, asymptotically optimal. A practical low-complexity version of our quantizer achieves performance quite close to optimal. In information-theoretic terms, we derive the rate-distortion function for matrix multiplication of iid Gaussian matrices.
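
The setup described above (each matrix compressed independently at $R$ bits per entry, then the product $A^\top B$ estimated from the compressed versions) can be illustrated with a toy experiment. The sketch below is not the paper's nested-lattice quantizer: it swaps in a plain uniform scalar quantizer purely for illustration, and all function names, matrix sizes, and the random seed are assumptions made here.

```python
import random

def quantize(matrix, bits):
    """Uniform scalar quantizer with 2**bits levels spanning [-max_abs, max_abs].
    (Illustrative stand-in; the scale max_abs would also have to be stored.)"""
    max_abs = max(abs(x) for row in matrix for x in row) or 1.0
    step = 2 * max_abs / (2 ** bits - 1)
    return [[round((x + max_abs) / step) * step - max_abs for x in row]
            for row in matrix]

def transpose_product(a, b):
    """Compute A^T B for A (n x p) and B (n x q), both given as lists of rows."""
    n, p, q = len(a), len(a[0]), len(b[0])
    return [[sum(a[k][i] * b[k][j] for k in range(n)) for j in range(q)]
            for i in range(p)]

def mean_squared_error(x, y):
    """Average squared entrywise difference between two equally sized matrices."""
    diffs = [(xi - yi) ** 2 for rx, ry in zip(x, y) for xi, yi in zip(rx, ry)]
    return sum(diffs) / len(diffs)

rng = random.Random(0)
n, p, q = 64, 4, 4  # arbitrary small sizes for the demo
A = [[rng.gauss(0, 1) for _ in range(p)] for _ in range(n)]
B = [[rng.gauss(0, 1) for _ in range(q)] for _ in range(n)]
exact = transpose_product(A, B)

errors = {}
for R in (2, 4, 8):
    # Each matrix is quantized on its own, without looking at the other one.
    approx = transpose_product(quantize(A, R), quantize(B, R))
    errors[R] = mean_squared_error(exact, approx)
    print(f"R = {R} bits/entry: MSE of A^T B estimate = {errors[R]:.4g}")
```

The point of the sketch is only the shape of the problem: distortion is measured on the product rather than on $A$ or $B$ individually, and it falls as the rate $R$ grows.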

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2410.1378

Country:

North America > United States (0.45)
Asia > Middle East > Israel (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.66)

Statistical Inference with Limited Memory: A Survey

Berg, Tomer, Ordentlich, Or, Shayevitz, Ofer

arXiv.org Machine Learning | Dec-23-2023

The problem of statistical inference in its various forms has been the subject of decades of extensive research. Most of the effort has focused on characterizing performance as a function of the number of available samples, with far less attention given to the effect of memory limitations. Recently, this latter topic has drawn much interest in the engineering and computer science literature. In this survey paper, we attempt to review the state of the art in statistical inference under memory constraints for several canonical problems, including hypothesis testing, parameter estimation, and distribution property testing/estimation. We discuss the main results in this developing field, and by identifying recurrent themes, we extract some fundamental building blocks for algorithmic construction, as well as useful techniques for lower bound derivations.

artificial intelligence, machine learning, survey article, (18 more...)

arXiv.org Machine Learning

2312.15225

Country:

Europe (1.00)
North America > United States (0.27)

Genre:

Research Report (1.00)
Overview (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)

Spiked Covariance Estimation from Modulo-Reduced Measurements

Romanov, Elad, Ordentlich, Or

arXiv.org Machine Learning | Oct-10-2021

Consider the rank-1 spiked model: $\mathbf{X}=\sqrt{\nu}\xi \mathbf{u}+ \mathbf{Z}$, where $\nu$ is the spike intensity, $\mathbf{u}\in\mathbb{S}^{k-1}$ is an unknown direction and $\xi\sim \mathcal{N}(0,1)$, $\mathbf{Z}\sim \mathcal{N}(\mathbf{0},\mathbf{I})$. Motivated by recent advances in analog-to-digital conversion, we study the problem of recovering $\mathbf{u}\in \mathbb{S}^{k-1}$ from $n$ i.i.d. modulo-reduced measurements $\mathbf{Y}=[\mathbf{X}]\bmod \Delta$, focusing on the high-dimensional regime ($k\gg 1$). We develop and analyze an algorithm that, for most directions $\mathbf{u}$ and $\nu=\mathrm{poly}(k)$, estimates $\mathbf{u}$ to high accuracy using $n=\mathrm{poly}(k)$ measurements, provided that $\Delta\gtrsim \sqrt{\log k}$. Up to constants, our algorithm accurately estimates $\mathbf{u}$ at the smallest possible $\Delta$ that allows (in an information-theoretic sense) recovery of $\mathbf{X}$ from $\mathbf{Y}$. A key step in our analysis involves estimating the probability that a line segment of length $\approx\sqrt{\nu}$ in a random direction $\mathbf{u}$ passes near a point in the lattice $\Delta \mathbb{Z}^k$. Numerical experiments show that the developed algorithm performs well even in a non-asymptotic setting.
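
To make the measurement model above concrete, the following sketch draws one sample from the rank-1 spiked model and reduces it modulo $\Delta$. It only generates data and does not attempt the recovery algorithm from the paper. The centered convention for the modulo operation (values in $[-\Delta/2, \Delta/2)$), the constant in front of $\sqrt{\log k}$, and every function name are assumptions made for illustration.

```python
import math
import random

def centered_mod(x, delta):
    """Reduce x modulo delta into [-delta/2, delta/2) (assumed convention)."""
    return (x + delta / 2) % delta - delta / 2

def random_unit_vector(k, rng):
    """Draw a direction uniformly from the unit sphere in R^k."""
    v = [rng.gauss(0, 1) for _ in range(k)]
    norm = math.sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def sample(u, nu, delta, rng):
    """Draw X = sqrt(nu) * xi * u + Z and its modulo-reduced version Y."""
    xi = rng.gauss(0, 1)
    x = [math.sqrt(nu) * xi * c + rng.gauss(0, 1) for c in u]
    y = [centered_mod(c, delta) for c in x]
    return x, y

rng = random.Random(1)
k, nu = 8, 25.0                         # arbitrary demo values
delta = 2.0 * math.sqrt(math.log(k))    # constant 2.0 is an illustrative choice
u = random_unit_vector(k, rng)
x, y = sample(u, nu, delta, rng)

# X and Y differ by a point of the lattice delta * Z^k.
offsets = [round((a - b) / delta) for a, b in zip(x, y)]
print("lattice offsets:", offsets)
```

Each entry of `offsets` is the integer multiple of $\Delta$ that the modulo operation removed; an estimator that sees only `y` has to cope with not knowing these integers.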

artificial intelligence, machine learning, probability, (18 more...)

arXiv.org Machine Learning

2110.0115

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.67)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.46)

Constructing Multiclass Classifiers using Binary Classifiers Under Log-Loss

Ben-Yishai, Assaf, Ordentlich, Or

arXiv.org Machine Learning | Feb-16-2021

The construction of multiclass classifiers from binary classifiers is studied in this paper, and performance is quantified by the regret, defined with respect to the Bayes optimal log-loss. We start by proving that the regret of the well-known One vs. All (OVA) method is upper bounded by the sum of the regrets of its constituent binary classifiers. We then present a new method called Conditional OVA (COVA), and prove that its regret is given by the weighted sum of the regrets corresponding to the constituent binary classifiers. Lastly, we present a method termed Leveraged COVA (LCOVA), designed to reduce the regret of a multiclass classifier by breaking it down into independently optimized binary classifiers.
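
One generic way to build a multiclass probability estimate from binary ones is the chain rule: a binary classifier for "is it class $i$, given that it is none of the earlier classes?" is applied in sequence. The sketch below shows that combination together with the log-loss. It is an assumption-laden illustration of the general idea of conditioning, not a confirmed description of the paper's COVA or LCOVA constructions, and the function names are hypothetical.

```python
import math

def chain_to_multiclass(q):
    """Combine conditional binary estimates into class probabilities.

    q[i] is an estimate of P(Y = i | Y >= i) for i = 0..m-2 (assumed ordering);
    the result has m entries and sums to one.
    """
    probs = []
    remaining = 1.0  # probability mass not yet assigned to an earlier class
    for qi in q:
        probs.append(remaining * qi)
        remaining *= 1.0 - qi
    probs.append(remaining)
    return probs

def log_loss(probs, label):
    """Log-loss (in nats) of the estimate on a single labelled example."""
    return -math.log(probs[label])

probs = chain_to_multiclass([0.5, 0.5])
print(probs)
print(f"log-loss if the true class is 2: {log_loss(probs, 2):.4f}")
```

Under this combination the multiclass log-loss of an example is the sum of the binary log-losses incurred along the chain, which is why a decomposition of the regret into per-classifier terms is plausible in such schemes.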

artificial intelligence, classifier, machine learning, (15 more...)

arXiv.org Machine Learning

2102.08184

Genre: Research Report (0.83)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.95)

Multi-reference alignment in high dimensions: sample complexity and phase transition

Romanov, Elad, Bendory, Tamir, Ordentlich, Or

arXiv.org Machine Learning | Jul-22-2020

Multi-reference alignment entails estimating a signal in $\mathbb{R}^L$ from its circularly-shifted and noisy copies. This problem has been studied thoroughly in recent years, focusing on the finite-dimensional setting (fixed $L$). Motivated by single-particle cryo-electron microscopy, we analyze the sample complexity of the problem in the high-dimensional regime $L\to\infty$. Our analysis uncovers a phase transition phenomenon governed by the parameter $\alpha = L/(\sigma^2\log L)$, where $\sigma^2$ is the variance of the noise. When $\alpha>2$, the impact of the unknown circular shifts on the sample complexity is minor. Namely, the number of measurements required to achieve a desired accuracy $\varepsilon$ approaches $\sigma^2/\varepsilon$ for small $\varepsilon$; this is the sample complexity of estimating a signal in additive white Gaussian noise, which does not involve shifts. In sharp contrast, when $\alpha\leq 2$, the problem is significantly harder and the sample complexity grows substantially faster with $\sigma^2$.
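
The observation model and the phase-transition parameter described above are easy to write down in code. The sketch below generates one circularly shifted, noisy copy of a signal and evaluates $\alpha = L/(\sigma^2\log L)$. The uniform distribution of the shifts, the use of the natural logarithm, and the function names are assumptions made here; the abstract does not spell them out.

```python
import math
import random

def alpha(L, sigma2):
    """Phase-transition parameter L / (sigma^2 * log L), natural log assumed."""
    return L / (sigma2 * math.log(L))

def mra_observation(signal, sigma, rng):
    """One noisy copy of `signal` under a uniformly random circular shift."""
    L = len(signal)
    shift = rng.randrange(L)
    return [signal[(i - shift) % L] + rng.gauss(0, sigma) for i in range(L)]

rng = random.Random(2)
L, sigma = 100, 1.0                      # arbitrary demo values
signal = [rng.gauss(0, 1) for _ in range(L)]
copy = mra_observation(signal, sigma, rng)

a = alpha(L, sigma ** 2)
regime = "alpha > 2 (shifts cost little)" if a > 2 else "alpha <= 2 (harder regime)"
print(f"alpha = {a:.2f}: {regime}")
```

Raising `sigma` while keeping `L` fixed pushes `alpha` down towards the threshold of 2, which is the transition the abstract refers to.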

artificial intelligence, machine learning, sample complexity, (17 more...)

arXiv.org Machine Learning

2007.11482

Country: Asia > Middle East > Israel (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.46)
