AITopics | Pananjady, Ashwin

Collaborating Authors

Pananjady, Ashwin

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Estimating stationary mass, frequency by frequency

Nakul, Milind, Muthukumar, Vidya, Pananjady, Ashwin

arXiv.org Machine LearningMar-17-2025

Suppose we observe a trajectory of length $n$ from an $\alpha$-mixing stochastic process over a finite but potentially large state space. We consider the problem of estimating the probability mass placed by the stationary distribution of any such process on elements that occur with a certain frequency in the observed sequence. We estimate this vector of probabilities in total variation distance, showing universal consistency in $n$ and recovering known results for i.i.d. sequences as special cases. Our proposed methodology carefully combines the plug-in (or empirical) estimator with a recently-proposed modification of the Good--Turing estimator called WingIt, which was originally developed for Markovian sequences. En route to controlling the error of our estimator, we develop new performance bounds on WingIt and the plug-in estimator for $\alpha$-mixing stochastic processes. Importantly, the extensively used method of Poissonization can no longer be applied in our non i.i.d. setting, and so we develop complementary tools -- including concentration inequalities for a natural self-normalized statistic of mixing sequences -- that may prove independently useful in the design and analysis of estimators for related problems.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2503.12808

Country: Europe > France (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Just Wing It: Optimal Estimation of Missing Mass in a Markovian Sequence

Pananjady, Ashwin, Muthukumar, Vidya, Thangaraj, Andrew

arXiv.org Machine LearningApr-8-2024

We study the problem of estimating the stationary mass -- also called the unigram mass -- that is missing from a single trajectory of a discrete-time, ergodic Markov chain. This problem has several applications -- for example, estimating the stationary missing mass is critical for accurately smoothing probability estimates in sequence models. While the classical Good--Turing estimator from the 1950s has appealing properties for i.i.d. data, it is known to be biased in the Markov setting, and other heuristic estimators do not come equipped with guarantees. Operating in the general setting in which the size of the state space may be much larger than the length $n$ of the trajectory, we develop a linear-runtime estimator called \emph{Windowed Good--Turing} (\textsc{WingIt}) and show that its risk decays as $\widetilde{\mathcal{O}}(\mathsf{T_{mix}}/n)$, where $\mathsf{T_{mix}}$ denotes the mixing time of the chain in total variation distance. Notably, this rate is independent of the size of the state space and minimax-optimal up to a logarithmic factor in $n / \mathsf{T_{mix}}$. We also present a bound on the variance of the missing mass random variable, which may be of independent interest. We extend our estimator to approximate the stationary mass placed on elements occurring with small frequency in $X^n$. Finally, we demonstrate the efficacy of our estimators both in simulations on canonical chains and on sequences constructed from a popular natural language corpus.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

2404.05819

Country: North America > United States (0.45)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Add feedback

Efficient reductions between some statistical models

Lou, Mengqi, Bresler, Guy, Pananjady, Ashwin

arXiv.org Machine LearningFeb-12-2024

We study the problem of approximately transforming a sample from a source statistical model to a sample from a target statistical model without knowing the parameters of the source model, and construct several computationally efficient such reductions between statistical experiments. In particular, we provide computationally efficient procedures that approximately reduce uniform, Erlang, and Laplace location models to general target families. We illustrate our methodology by establishing nonasymptotic reductions between some canonical high-dimensional problems, spanning mixtures of experts, phase retrieval, and signal denoising. Notably, the reductions are structure preserving and can accommodate missing data. We also point to a possible application in transforming one differentially private mechanism to another.

artificial intelligence, machine learning, reduction, (18 more...)

arXiv.org Machine Learning

2402.07717

Country: Asia > India (0.14)

Genre: Research Report (0.81)

Industry: Information Technology (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Add feedback

Hyperparameter tuning via trajectory predictions: Stochastic prox-linear methods in matrix sensing

Lou, Mengqi, Verchand, Kabir Aladin, Pananjady, Ashwin

arXiv.org Machine LearningFeb-2-2024

This model finds applications in diverse areas of science and engineering, including astronomy, medical imaging, and communications (Jefferies and Christou, 1993; W ang and Poor, 1998; Campisi and Egiazarian, 2017). For instance, it forms an example of the blind deconvolution problem in statistical signal processing (see, e.g., Recht et al. (2010); Ahmed et al. (2013) and the references therein for several applications of this problem). W e are interested in the model-fitting problem, and the natural least squares population objectiveL: R d R d R (corresponding to the scaled negative log-likelihood of our observations under Gaussian noise) can be written as L ( µ, ν) = E nullnull y x, µ z, ν null 2null, (2) where the conditional distribution of y given x, z is as specified by the model (1). Note that L is a jointly nonconvex function in the parameters ( µ, ν) . With the goal of minimizing the population loss L, we consider online algorithms which operate on a mini-batch of size m with 1 m d for which we draw a fresh 1 set of observations { y i, x i, z i} m i = 1 at each iteration and form the averaged loss L m( µ, ν) = 1 m m i = 1 null y i x i, µ z i, ν null 2 .

artificial intelligence, machine learning, null, (19 more...)

arXiv.org Machine Learning

2402.01599

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)

Genre:

Workflow (0.68)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.92)

Add feedback

Modeling and Correcting Bias in Sequential Evaluation

Wang, Jingyan, Pananjady, Ashwin

arXiv.org Machine LearningNov-16-2023

We consider the problem of sequential evaluation, in which an evaluator observes candidates in a sequence and assigns scores to these candidates in an online, irrevocable fashion. Motivated by the psychology literature that has studied sequential bias in such settings -- namely, dependencies between the evaluation outcome and the order in which the candidates appear -- we propose a natural model for the evaluator's rating process that captures the lack of calibration inherent to such a task. We conduct crowdsourcing experiments to demonstrate various facets of our model. We then proceed to study how to correct sequential bias under our model by posing this as a statistical inference problem. We propose a near-linear time, online algorithm for this task and prove guarantees in terms of two canonical ranking metrics. We also prove that our algorithm is information theoretically optimal, by establishing matching lower bounds in both metrics. Finally, we perform a host of numerical experiments to show that our algorithm often outperforms the de facto method of using the rankings induced by the reported scores, both in simulation and on the crowdsourcing data that we collected.

artificial intelligence, estimator, machine learning, (21 more...)

arXiv.org Machine Learning

2205.01607

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.92)

Industry:

Education (0.92)
Leisure & Entertainment > Sports (0.46)
Media > Music (0.45)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Communications > Social Media > Crowdsourcing (0.54)

Add feedback

Perceptual adjustment queries and an inverted measurement paradigm for low-rank metric learning

Xu, Austin, McRae, Andrew D., Wang, Jingyan, Davenport, Mark A., Pananjady, Ashwin

arXiv.org Machine LearningSep-8-2023

We introduce a new type of query mechanism for collecting human feedback, called the perceptual adjustment query ( PAQ). Being both informative and cognitively lightweight, the PAQ adopts an inverted measurement scheme, and combines advantages from both cardinal and ordinal queries. We showcase the PAQ in the metric learning problem, where we collect PAQ measurements to learn an unknown Mahalanobis distance. This gives rise to a high-dimensional, low-rank matrix estimation problem to which standard matrix estimators cannot be applied. Consequently, we develop a two-stage estimator for metric learning from PAQs, and provide sample complexity guarantees for this estimator. We present numerical simulations demonstrating the performance of the estimator and its notable properties.

artificial intelligence, machine learning, matrix, (17 more...)

arXiv.org Machine Learning

2309.04626

Country:

North America > United States (0.14)
Europe > United Kingdom > England (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)

Add feedback

Sharp analysis of EM for learning mixtures of pairwise differences

Dhawan, Abhishek, Mao, Cheng, Pananjady, Ashwin

arXiv.org Artificial IntelligenceJun-22-2023

A common approach is to apply a spectral algorithm to initialize the parameters of interest, and then run a locally convergent nonconvex optimization algorithm; alternatively, one could even run the nonconvex algorithm from a random initialization. These nonconvex optimization algorithms have been extensively analyzed under Gaussian assumptions on the covariates (see, e.g. Balakrishnan et al. (2017); Klusowski et al. (2019); Chen et al. (2019); Kwon and Caramanis (2020)), and it is known that they can exhibit favorable behavior globally in some settings. In many tasks such as ranking (Bradley and Terry, 1952), crowd-labeling (Dawid and Skene, 1979) and web search (Chen et al., 2013), however, measurements are taken as pairwise comparisons between entities, which renders the covariates inherently discrete. These designs or covariates are far from Gaussian, and it is natural to ask how standard nonconvex optimization algorithms now perform. Motivated by this question, we study how the celebrated expectation-maximization (EM) algorithm behaves when estimating a mixture of symmetric linear regressions under the pairwise comparison design.

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Artificial Intelligence

2302.10066

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)

Add feedback

A Dual Accelerated Method for Online Stochastic Distributed Averaging: From Consensus to Decentralized Policy Evaluation

Zhang, Sheng, Pananjady, Ashwin, Romberg, Justin

arXiv.org Artificial IntelligenceAug-8-2022

Motivated by decentralized sensing and policy evaluation problems, we consider a particular type of distributed stochastic optimization problem over a network, called the online stochastic distributed averaging problem. We design a dual-based method for this distributed consensus problem with Polyak--Ruppert averaging and analyze its behavior. We show that the proposed algorithm attains an accelerated deterministic error depending optimally on the condition number of the network, and also that it has an order-optimal stochastic error. This improves on the guarantees of state-of-the-art distributed stochastic optimization algorithms when specialized to this setting, and yields -- among other things -- corollaries for decentralized policy evaluation. Our proofs rely on explicitly studying the evolution of several relevant linear systems, and may be of independent interest. Numerical experiments are provided, which validate our theoretical results and demonstrate that our approach outperforms existing methods in finite-sample scenarios on several natural network topologies.

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2207.11425

Country: North America > United States (0.45)

Genre: Research Report (0.50)

Industry: Energy > Power Industry (0.45)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Alternating minimization for generalized rank one matrix sensing: Sharp predictions from a random initialization

Chandrasekher, Kabir Aladin, Lou, Mengqi, Pananjady, Ashwin

arXiv.org Machine LearningJul-20-2022

We consider the problem of estimating the factors of a rank-$1$ matrix with i.i.d. Gaussian, rank-$1$ measurements that are nonlinearly transformed and corrupted by noise. Considering two prototypical choices for the nonlinearity, we study the convergence properties of a natural alternating update rule for this nonconvex optimization problem starting from a random initialization. We show sharp convergence guarantees for a sample-split version of the algorithm by deriving a deterministic recursion that is accurate even in high-dimensional problems. Notably, while the infinite-sample population update is uninformative and suggests exact recovery in a single step, the algorithm -- and our deterministic prediction -- converges geometrically fast from a random initialization. Our sharp, non-asymptotic analysis also exposes several other fine-grained properties of this problem, including how the nonlinearity and noise level affect convergence behavior. On a technical level, our results are enabled by showing that the empirical error recursion can be predicted by our deterministic sequence within fluctuations of the order $n^{-1/2}$ when each iteration is run with $n$ observations. Our technique leverages leave-one-out tools originating in the literature on high-dimensional $M$-estimation and provides an avenue for sharply analyzing higher-order iterative algorithms from a random initialization in other high-dimensional optimization problems with random data.

artificial intelligence, machine learning, optimization problem, (19 more...)

arXiv.org Machine Learning

2207.0966

Country: North America > United States (0.28)

Genre:

Workflow (0.67)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.92)

Add feedback

Accelerated and instance-optimal policy evaluation with linear function approximation

Li, Tianjiao, Lan, Guanghui, Pananjady, Ashwin

arXiv.org Machine LearningDec-24-2021

We study the problem of policy evaluation with linear function approximation and present efficient and practical algorithms that come with strong optimality guarantees. We begin by proving lower bounds that establish baselines on both the deterministic error and stochastic error in this problem. In particular, we prove an oracle complexity lower bound on the deterministic error in an instance-dependent norm associated with the stationary distribution of the transition kernel, and use the local asymptotic minimax machinery to prove an instance-dependent lower bound on the stochastic error in the i.i.d. observation model. Existing algorithms fail to match at least one of these lower bounds: To illustrate, we analyze a variance-reduced variant of temporal difference learning, showing in particular that it fails to achieve the oracle complexity lower bound. To remedy this issue, we develop an accelerated, variance-reduced fast temporal difference algorithm (VRFTD) that simultaneously matches both lower bounds and attains a strong notion of instance-optimality. Finally, we extend the VRFTD algorithm to the setting with Markovian observations, and provide instance-dependent convergence results that match those in the i.i.d. setting up to a multiplicative factor that is proportional to the mixing time of the chain. Our theoretical guarantees of optimality are corroborated by numerical experiments.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2112.13109

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (0.61)

Add feedback