AITopics | Ohannessian, Mesrob I.

Tight Bounds on the Binomial CDF, and the Minimum of i.i.d Binomials, in terms of KL-Divergence

Zhu, Xiaohan, Ohannessian, Mesrob I., Srebro, Nathan

arXiv.org Machine Learning, Feb-25-2025

We provide finite sample upper and lower bounds on the Binomial tail probability which are a direct application of Sanov's theorem. We then use these to obtain high probability upper and lower bounds on the minimum of i.i.d. Binomials. Both bounds are finite sample, asymptotically tight, and expressed in terms of the KL-divergence. The purpose of this note is to provide, in a self-contained and concise way, both upper and lower bounds on the Binomial tail, and through that, on the minimum of i.i.d. Binomials. The upper bound on the minimum of i.i.d.
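For orientation, the classical Chernoff bound already has the KL form the abstract refers to: for X ~ Binomial(n, p) and k <= np, P(X <= k) <= exp(-n KL(k/n || p)). The sketch below compares that textbook bound with the exact CDF. It is not the sharper pair of bounds derived in the note, and the function names are illustrative assumptions.

```python
import math


def bernoulli_kl(q: float, p: float) -> float:
    """KL(Bernoulli(q) || Bernoulli(p)) in nats, using 0 * log(0) = 0.

    Assumes 0 < p < 1 and 0 <= q <= 1.
    """
    total = 0.0
    if q > 0:
        total += q * math.log(q / p)
    if q < 1:
        total += (1 - q) * math.log((1 - q) / (1 - p))
    return total


def binomial_cdf(k: int, n: int, p: float) -> float:
    """Exact P(X <= k) for X ~ Binomial(n, p), by direct summation."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def chernoff_upper_bound(k: int, n: int, p: float) -> float:
    """Classical Chernoff bound on the lower tail; valid only when k/n <= p."""
    return math.exp(-n * bernoulli_kl(k / n, p))


if __name__ == "__main__":
    n, p = 50, 0.3  # illustrative values, not taken from the note
    for k in (5, 10, 15):
        print(k, binomial_cdf(k, n, p), chernoff_upper_bound(k, n, p))
```

The gap between the two printed columns is what finite-sample lower bounds of the kind described above are meant to control.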

artificial intelligence, binomial, machine learning, (13 more...)

2502.18611

Country: North America > United States > Illinois (0.16)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.48)

See Me and Believe Me: Causality and Intersectionality in Testimonial Injustice in Healthcare

Andrews, Kenya S., Ohannessian, Mesrob I., Zheleva, Elena

arXiv.org Artificial Intelligence, Oct-2-2024

In medical settings, it is critical that all who are in need of care are correctly heard and understood. When this is not the case due to prejudices a listener has, the speaker is experiencing testimonial injustice, which, building upon recent work, we quantify by the presence of several categories of unjust vocabulary in medical notes. In this paper, we use FCI, a causal discovery method, to study the degree to which certain demographic features could lead to marginalization (e.g., age, gender, and race) by way of contributing to testimonial injustice. To achieve this, we review physicians' notes for each patient, where we identify occurrences of unjust vocabulary, along with the demographic features present, and use causal discovery to build a Structural Causal Model (SCM) relating those demographic features to testimonial injustice. We analyze and discuss the resulting SCMs to show the interaction of these factors and how they influence the experience of injustice. Despite the potential presence of some confounding variables, we observe how one contributing feature can make a person more prone to experiencing another contributor of testimonial injustice. There is no single root of injustice and thus intersectionality cannot be ignored. These results call for considering more than singular or equalized attributes of who a person is when analyzing and improving their experiences of bias and injustice. This work is thus a first foray into using causal discovery to understand the nuanced experiences of patients in medical settings, and its insights could be used to guide design principles throughout healthcare, to build trust and promote better patient care.

artificial intelligence, natural language, testimonial injustice, (15 more...)

2410.01227

Country: North America > United States > Illinois (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Consumer Health (0.93)
Health & Medicine > Health Care Technology (0.68)
Health & Medicine > Health Care Providers & Services (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.48)
Information Technology > Artificial Intelligence > Natural Language (0.46)

Induced Model Matching: How Restricted Models Can Help Larger Ones

Muneeb, Usama, Ohannessian, Mesrob I.

arXiv.org Artificial Intelligence, Feb-19-2024

We consider scenarios where a very accurate predictive model using restricted features is available at the time of training of a larger, full-featured model. This restricted model may be thought of as "side-information", derived either from an auxiliary exhaustive dataset or from the same dataset, by forcing the restriction. How can the restricted model be useful to the full model? We propose an approach for transferring the knowledge of the restricted model to the full model, by aligning the full model's context-restricted performance with that of the restricted model. We call this methodology Induced Model Matching (IMM) and first illustrate its general applicability by using logistic regression as a toy example. We then explore IMM's use in language modeling, the application that initially inspired it, and where it offers an explicit foundation in contrast to the implicit use of restricted models in techniques such as noising. We demonstrate the methodology on both LSTM and transformer full models, using N-grams as restricted models. To further illustrate the potential of the principle whenever it is much cheaper to collect restricted rather than full information, we conclude with a simple RL example where POMDP policies can improve learned MDP policies via IMM.

artificial intelligence, imm, machine learning, (14 more...)

2402.12513

Country: North America > United States > Illinois (0.14)

Genre:

Research Report > New Finding (0.49)
Research Report > Experimental Study (0.35)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)

The power of absolute discounting: all-dimensional distribution estimation

Falahatgar, Moein, Ohannessian, Mesrob I., Orlitsky, Alon, Pichapati, Venkatadheeraj

Neural Information Processing Systems, Feb-14-2020, 19:12:58 GMT

absolute discounting, all-dimensional distribution estimation, artificial intelligence, (3 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.65)

From Fair Decision Making to Social Equality

Mouzannar, Hussein, Ohannessian, Mesrob I., Srebro, Nathan

arXiv.org Machine Learning, Dec-7-2018

The study of fairness in intelligent decision systems has mostly ignored long-term influence on the underlying population. Yet fairness considerations (e.g. affirmative action) often have the implicit goal of achieving balance among groups within the population. The most basic notion of balance is eventual equality between the qualifications of the groups. How can we incorporate influence dynamics in decision making? How well do dynamics-oblivious fairness policies fare in terms of reaching equality? In this paper, we propose a simple yet revealing model that encompasses (1) a selection process where an institution chooses from multiple groups according to their qualifications so as to maximize an institutional utility and (2) dynamics that govern the evolution of the groups' qualifications according to the imposed policies. We focus on demographic parity as the formalism of affirmative action. We then give conditions under which an unconstrained policy reaches equality on its own. In this case, surprisingly, imposing demographic parity may break equality. When it doesn't, one would expect the additional constraint to reduce utility; however, we show that utility may in fact increase. In more realistic scenarios, unconstrained policies do not lead to equality. In such cases, we show that although imposing demographic parity may remedy it, there is a danger that groups settle at a worse set of qualifications. As a silver lining, we also identify when the constraint not only leads to equality, but also improves all groups. This gives quantifiable insight into both sides of the mismatch hypothesis. These cases and trade-offs are instrumental in determining when and how imposing demographic parity can be beneficial in selection processes, both for the institution and for society in the long run.

artificial intelligence, educational setting, equality, (19 more...)

1812.02952

Country: North America > United States (1.00)

Genre: Research Report (0.40)

Industry:

Law (1.00)
Education (0.93)
Government > Regional Government > North America Government > United States Government (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

The power of absolute discounting: all-dimensional distribution estimation

Falahatgar, Moein, Ohannessian, Mesrob I., Orlitsky, Alon, Pichapati, Venkatadheeraj

Neural Information Processing Systems, Dec-31-2017

Categorical models are a natural fit for many problems. When learning the distribution of categories from samples, high-dimensionality may dilute the data. Minimax optimality is too pessimistic to remedy this issue. A serendipitously discovered estimator, absolute discounting, corrects empirical frequencies by subtracting a constant from observed categories, which it then redistributes among the unobserved. It outperforms classical estimators empirically, and has been used extensively in natural language modeling. In this paper, we rigorously explain the prowess of this estimator using less pessimistic notions. We show that (1) absolute discounting recovers classical minimax KL-risk rates, (2) it is adaptive to an effective dimension rather than the true dimension, (3) it is strongly related to the Good-Turing estimator and inherits its competitive properties. We use power-law distributions as the cornerstone of these results.
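The estimator described in the abstract can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's exact formulation: the discount value 0.5, the uniform redistribution over unseen categories, the fallback to empirical frequencies when every category has been observed, and the function name are all choices made here for the example.

```python
from collections import Counter


def absolute_discounting(samples, categories, discount=0.5):
    """Estimate a categorical distribution by absolute discounting.

    Each observed category has `discount` subtracted from its count; the
    freed mass is split evenly among the unobserved categories.
    Assumes `samples` is non-empty, every sample is in `categories`,
    and 0 < discount < 1.
    """
    counts = Counter(samples)
    n = len(samples)
    seen = [c for c in categories if counts[c] > 0]
    unseen = [c for c in categories if counts[c] == 0]

    if not unseen:
        # Assumption for this sketch: nothing to redistribute to,
        # so fall back to the empirical frequencies.
        return {c: counts[c] / n for c in categories}

    freed_mass = discount * len(seen) / n
    estimate = {c: (counts[c] - discount) / n for c in seen}
    estimate.update({c: freed_mass / len(unseen) for c in unseen})
    return estimate


if __name__ == "__main__":
    est = absolute_discounting(list("aaabbc"), list("abcde"))
    for category, prob in est.items():
        print(category, round(prob, 4))
```

Unlike additive smoothing, the correction removed from each observed category does not depend on the number of categories, which is one way to read the abstract's remark about adapting to an effective rather than the true dimension.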

absolute discounting, artificial intelligence, renewable energy, (17 more...)

Country: North America > United States (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Near-Optimal Smoothing of Structured Conditional Probability Matrices

Falahatgar, Moein, Ohannessian, Mesrob I., Orlitsky, Alon

Neural Information Processing Systems, Dec-31-2016

Utilizing the structure of a probabilistic model can significantly increase its learning speed. Motivated by several recent applications, in particular bigram models in language processing, we consider learning low-rank conditional probability matrices under expected KL-risk. This choice makes smoothing, that is the careful handling of low-probability elements, paramount. We derive an iterative algorithm that extends classical non-negative matrix factorization to naturally incorporate additive smoothing and prove that it converges to the stationary points of a penalized empirical risk. We then derive sample-complexity bounds for the global minimizer of the penalized risk and show that it is within a small factor of the optimal sample complexity. This framework generalizes to more sophisticated smoothing techniques, including absolute-discounting.

algorithm, artificial intelligence, bayesian inference, (16 more...)

Country: North America > United States > California (0.14)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Tradeoffs for Space, Time, Data and Risk in Unsupervised Learning

Lucic, Mario, Ohannessian, Mesrob I., Karbasi, Amin, Krause, Andreas

arXiv.org Machine Learning, May-2-2016

Faced with massive data, is it possible to trade off (statistical) risk, and (computational) space and time? This challenge lies at the heart of large-scale machine learning. Using k-means clustering as a prototypical unsupervised learning problem, we show how we can strategically summarize the data (control space) in order to trade off risk and time when data is generated by a probabilistic model. Our summarization is based on coreset constructions from computational geometry. We also develop an algorithm, TRAM, to navigate the space/time/data/risk tradeoff in practice. In particular, we show that for a fixed risk (or data size), as the data size increases (resp. risk increases) the running time of TRAM decreases. Our extensive experiments on real data sets demonstrate the existence and practical utility of such tradeoffs, not only for k-means but also for Gaussian Mixture Models.

artificial intelligence, machine learning, running time, (17 more...)

1605.00529

Country: North America > United States > California (0.14)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

On the Impossibility of Learning the Missing Mass

Mossel, Elchanan, Ohannessian, Mesrob I.

arXiv.org Machine Learning, Mar-12-2015

This paper shows that one cannot learn the probability of rare events without imposing further structural assumptions. The event of interest is that of obtaining an outcome outside the coverage of an i.i.d. sample from a discrete distribution. The probability of this event is referred to as the "missing mass". The impossibility result can then be stated as: the missing mass is not distribution-free PAC-learnable in relative error. The proof is semi-constructive and relies on a coupling argument using a dithered geometric distribution. This result formalizes the folklore that in order to predict rare events, one necessarily needs distributions with "heavy tails".
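To make the quantity concrete, the missing mass as defined in the abstract is the total probability of the outcomes that do not appear in the sample. The sketch below computes it when the true distribution is known, which is exactly what a learner does not have; the function name and the toy distribution are illustrative assumptions.

```python
def missing_mass(distribution, sample):
    """Total probability of outcomes that never appear in `sample`.

    `distribution` maps each outcome to its true probability.
    """
    observed = set(sample)
    return sum(prob for outcome, prob in distribution.items() if outcome not in observed)


if __name__ == "__main__":
    # Hypothetical distribution, for illustration only.
    dist = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
    print(missing_mass(dist, ["a", "b", "a"]))
```

The impossibility result concerns estimating this value from the sample alone, to within a relative error, with no assumption on the distribution.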

artificial intelligence, machine learning, probability, (17 more...)

1503.03613

Country: North America > United States > California (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
