Singh, Moninder
Ranking Large Language Models without Ground Truth
Dhurandhar, Amit, Nair, Rahul, Singh, Moninder, Daly, Elizabeth, Ramamurthy, Karthikeyan Natesan
Evaluation and ranking of large language models (LLMs) has become an important problem with the proliferation of these models and their impact. Evaluation methods either require human responses, which are expensive to acquire, or use pairs of LLMs to evaluate each other, which can be unreliable. In this paper, we provide a novel perspective where, given a dataset of prompts (viz. questions, instructions, etc.) and a set of LLMs, we rank them without access to any ground truth or reference responses. Inspired by real life, where both an expert and a knowledgeable person can identify a novice, our main idea is to consider triplets of models, where each one of them evaluates the other two, correctly identifying the worst model in the triplet with high probability. We also analyze our idea and provide sufficient conditions for it to succeed. Applying this idea repeatedly, we propose two methods to rank LLMs. In experiments on different generative tasks (summarization, multiple-choice, and dialog), our methods reliably recover close to true rankings without reference data. This points to a viable low-resource mechanism for practical use.
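As an illustration of the triplet idea, the sketch below ranks models by repeatedly voting out the apparent worst member of each triplet. The judge(...) interface and the vote-counting rule are assumptions for illustration, not the paper's exact evaluation functions or ranking methods.

from itertools import combinations
from collections import Counter

def rank_without_ground_truth(models, prompts, answers, judge):
    """Rank models by repeatedly identifying the likely-worst model in each triplet.

    models  : list of model identifiers
    prompts : list of prompt strings
    answers : dict mapping (model, prompt) -> that model's response
    judge   : hypothetical judge(model, prompt, ans_a, ans_b) -> 'a' or 'b',
              the judging model's preference between two candidate responses
    """
    worst_votes = Counter()
    for a, b, c in combinations(models, 3):
        # Each model in the triplet evaluates the other two on every prompt.
        losses = Counter()
        for evaluator, (x, y) in [(a, (b, c)), (b, (a, c)), (c, (a, b))]:
            for p in prompts:
                pref = judge(evaluator, p, answers[(x, p)], answers[(y, p)])
                losses[y if pref == 'a' else x] += 1
        # The model judged worse most often is declared the worst of this triplet.
        worst_votes[losses.most_common(1)[0][0]] += 1
    # Fewer "worst" votes across all triplets -> higher rank.
    return sorted(models, key=lambda m: worst_votes[m])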
Reasoning about concepts with LLMs: Inconsistencies abound
Uceda-Sosa, Rosario, Ramamurthy, Karthikeyan Natesan, Chang, Maria, Singh, Moninder
The ability to summarize and organize knowledge into abstract concepts is key to learning and reasoning. Many industrial applications rely on the consistent and systematic use of concepts, especially when dealing with decision-critical knowledge. However, we demonstrate that, when methodically questioned, large language models (LLMs) often display significant inconsistencies in their knowledge. Computationally, the basic aspects of the conceptualization of a given domain can be represented as Is-A hierarchies in a knowledge graph (KG) or ontology, together with a few properties or axioms that enable straightforward reasoning. We show that even simple ontologies can be used to reveal conceptual inconsistencies across several LLMs. We also propose strategies that domain experts can use to evaluate and improve the coverage of key domain concepts in LLMs of various sizes. In particular, we have been able to significantly enhance the performance of LLMs of various sizes with openly available weights using simple KG-based prompting strategies.
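A minimal sketch of how a simple Is-A ontology could be used to probe an LLM for conceptual inconsistencies, assuming a hypothetical ask_llm(child, parent) wrapper that poses a yes/no membership question. The transitivity check shown is one illustrative consistency test, not the paper's full methodology.

def probe_isa_consistency(ask_llm, isa_edges):
    """Check whether an LLM's yes/no answers respect Is-A transitivity.

    ask_llm   : hypothetical function ask_llm(child, parent) -> bool, wrapping a
                prompt like "Is every <child> a <parent>? Answer yes or no."
    isa_edges : list of (child, parent) pairs from a simple ontology.
    """
    # Direct assertions the model agrees with.
    agreed = {(c, p) for c, p in isa_edges if ask_llm(c, p)}
    inconsistencies = []
    # Transitivity: agreeing with (x, y) and (y, z) should imply agreeing with (x, z).
    for x, y in agreed:
        for y2, z in agreed:
            if y == y2 and (x, z) not in agreed and not ask_llm(x, z):
                inconsistencies.append((x, y, z))
    return inconsistencies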
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
Achintalwar, Swapnaja, Baldini, Ioana, Bouneffouf, Djallel, Byamugisha, Joan, Chang, Maria, Dognin, Pierre, Farchi, Eitan, Makondo, Ndivhuwo, Mojsilovic, Aleksandra, Nagireddy, Manish, Ramamurthy, Karthikeyan Natesan, Padhi, Inkit, Raz, Orna, Rios, Jesus, Sattigeri, Prasanna, Singh, Moninder, Thwala, Siphiwe, Uceda-Sosa, Rosario A., Varshney, Kush R.
The alignment of large language models is usually done by model providers to add or control behaviors that are common or universally understood across use cases and contexts. In contrast, in this article, we present an approach and architecture that empowers application developers to tune a model to their particular values, social norms, laws and other regulations, and orchestrate between potentially conflicting requirements in context. We lay out three main components of such an Alignment Studio architecture: Framers, Instructors, and Auditors that work in concert to control the behavior of a language model. We illustrate this approach with a running example of aligning a company's internal-facing enterprise chatbot to its business conduct guidelines.
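The sketch below illustrates how Framers, Instructors, and Auditors might fit together in code; the function signatures, the prompt-based steering, and the rule-violation check are assumptions for illustration rather than the actual Alignment Studio implementation.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Regulation:
    """A contextual rule, e.g. one clause of a business conduct guideline."""
    rule_id: str
    text: str

def framer(policy_document: str) -> List[Regulation]:
    # Hypothetical Framer: split a policy document into individual rules the
    # model should follow (in practice this could itself be LLM-assisted).
    return [Regulation(f"rule-{i}", clause.strip())
            for i, clause in enumerate(policy_document.split("\n")) if clause.strip()]

def instructor(generate: Callable[[str], str], regulations: List[Regulation]):
    # Hypothetical Instructor: steer generation by prepending the rules to the
    # prompt (the paper's Instructors may instead fine-tune or adapt the model).
    preamble = "Follow these rules:\n" + "\n".join(r.text for r in regulations)
    return lambda prompt: generate(preamble + "\n\n" + prompt)

def auditor(response: str, regulations: List[Regulation],
            violates: Callable[[str, Regulation], bool]) -> List[str]:
    # Hypothetical Auditor: flag rules the response appears to violate.
    return [r.rule_id for r in regulations if violates(response, r)]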
SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models
Nagireddy, Manish, Chiazor, Lamogha, Singh, Moninder, Baldini, Ioana
Current datasets for unwanted social bias auditing are limited to studying protected demographic features such as race and gender. In this work, we introduce a comprehensive benchmark that is meant to capture the amplification of social bias, via stigmas, in generative language models. Taking inspiration from social science research, we start with a documented list of 93 US-centric stigmas and curate a question-answering (QA) dataset which involves simple social situations. Our benchmark, SocialStigmaQA, contains roughly 10K prompts, with a variety of prompt styles, carefully constructed to systematically test for both social bias and model robustness. We present results for SocialStigmaQA with two open-source generative language models, and we find that the proportion of socially biased output ranges from 45% to 59% across a variety of decoding strategies and prompting styles. We demonstrate that the deliberate design of the templates in our benchmark (e.g., adding biasing text to the prompt or using different verbs that change the answer that indicates bias) impacts the models' tendencies to generate socially biased output. Additionally, through manual evaluation, we discover problematic patterns in the generated chain-of-thought output that range from subtle bias to lack of reasoning. Warning: This paper contains examples of text which are toxic, biased, and potentially harmful.
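A rough sketch of how templated prompts with and without biasing text could be scored for biased output. The template fields, the example biasing sentence, and the answer-matching rule are illustrative assumptions, not the actual SocialStigmaQA templates or evaluation code.

def biased_output_rate(generate, situations, stigmas, add_bias_text=False):
    """Fill simple QA templates with stigmas and measure how often the model
    gives the answer that indicates social bias.

    generate      : hypothetical function prompt -> model answer ('yes'/'no'/...)
    situations    : list of dicts with 'template' and 'biased_answer' keys, e.g.
                    {'template': 'My new roommate {stigma}. Should I look for someone else?',
                     'biased_answer': 'yes'}
    stigmas       : list of stigma descriptions, e.g. 'is a former convict'
    add_bias_text : if True, append an explicitly biasing sentence to the prompt
    """
    biased = total = 0
    for s in situations:
        for stigma in stigmas:
            prompt = s['template'].format(stigma=stigma)
            if add_bias_text:
                prompt += " I have heard bad things about people like that."
            answer = generate(prompt).strip().lower()
            biased += int(answer.startswith(s['biased_answer']))
            total += 1
    return biased / total if total else 0.0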
Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions
Nagireddy, Manish, Singh, Moninder, Hoffman, Samuel C., Ju, Evaline, Ramamurthy, Karthikeyan Natesan, Varshney, Kush R.
Ensuring trustworthiness in machine learning (ML) models is a multi-dimensional task. In addition to the traditional notion of predictive performance, other notions such as privacy, fairness, robustness to distribution shift, adversarial robustness, interpretability, explainability, and uncertainty quantification are important considerations to evaluate and improve (if deficient). However, these sub-disciplines or 'pillars' of trustworthiness have largely developed independently, which has limited our understanding of their interactions in real-world ML pipelines. In this paper, focusing specifically on compositions of functions arising from the different pillars, we aim to reduce this gap, develop new insights for trustworthy ML, and answer questions such as the following. Does the composition of multiple fairness interventions result in a fairer model compared to a single intervention? How do bias mitigation algorithms for fairness affect local post-hoc explanations? Does a defense algorithm for untargeted adversarial attacks continue to be effective when composed with a privacy transformation? Toward this end, we report initial empirical results and new insights from 9 different compositions of functions (or pipelines) on 7 real-world datasets along two trustworthiness dimensions - fairness and explainability. We also report progress, and implementation choices, on an extensible composer tool to encourage the combination of functionalities from multiple pillars. To date, the tool supports bias mitigation algorithms for fairness and post-hoc explainability methods. We hope this line of work encourages the thoughtful consideration of multiple pillars when attempting to formulate and resolve a trustworthiness problem.
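The following sketch shows one way such compositions can be wired up, with a pre-processing bias mitigator and a post-hoc explainer treated as interchangeable functions. The signatures are assumptions for illustration and do not correspond to the composer tool's actual interface.

def compose_pipeline(train_data, test_point, fit_model, mitigate=None, explain=None):
    """Illustrative composition of trustworthiness functions around a model.

    mitigate  : optional pre-processing bias mitigator, train_data -> transformed data
                (e.g. a reweighing or repair step; hypothetical signature)
    fit_model : training function, data -> fitted model exposing .predict
    explain   : optional post-hoc explainer, (model, instance) -> explanation
    """
    data = mitigate(train_data) if mitigate else train_data
    model = fit_model(data)
    prediction = model.predict([test_point])[0]
    explanation = explain(model, test_point) if explain else None
    # Comparing `explanation` with and without `mitigate` is one way to study
    # how a fairness intervention interacts with post-hoc explanations.
    return prediction, explanation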
AI Explainability 360: Impact and Design
Arya, Vijay, Bellamy, Rachel K. E., Chen, Pin-Yu, Dhurandhar, Amit, Hind, Michael, Hoffman, Samuel C., Houde, Stephanie, Liao, Q. Vera, Luss, Ronny, Mojsilovic, Aleksandra, Mourad, Sami, Pedemonte, Pablo, Raghavendra, Ramya, Richards, John, Sattigeri, Prasanna, Shanmugam, Karthikeyan, Singh, Moninder, Varshney, Kush R., Wei, Dennis, Zhang, Yunfeng
The increasing use of artificial intelligence (AI) systems in high stakes domains has been coupled with an increase in societal demands for these systems to provide explanations for their outputs. This societal demand has already resulted in new regulations requiring explanations (Goodman and Flaxman 2016; Wachter, Mittelstadt, and Floridi 2017; Selbst and Powles 2017; Pasternak 2019). Explanations can allow users to gain insight into the system's decision-making process, which is a key component in calibrating appropriate trust and confidence in AI systems (Doshi-Velez and Kim 2017). We also introduced a taxonomy to navigate the space of explanation methods, not only the ten in the toolkit but also the broader literature on explainable AI. The taxonomy was intended to be usable by consumers with varied backgrounds to choose an appropriate explanation method for their application. AIX360 differs from other open source explainability toolkits (see Arya et al. (2020) for a list) in two main ways: 1) its support for a broad and diverse spectrum of explainability methods, implemented in a common architecture, and 2) its educational material as discussed below.
One Explanation Does Not Fit All: A Toolkit and Taxonomy of AI Explainability Techniques
Arya, Vijay, Bellamy, Rachel K. E., Chen, Pin-Yu, Dhurandhar, Amit, Hind, Michael, Hoffman, Samuel C., Houde, Stephanie, Liao, Q. Vera, Luss, Ronny, Mojsilović, Aleksandra, Mourad, Sami, Pedemonte, Pablo, Raghavendra, Ramya, Richards, John, Sattigeri, Prasanna, Shanmugam, Karthikeyan, Singh, Moninder, Varshney, Kush R., Wei, Dennis, Zhang, Yunfeng
As artificial intelligence and machine learning algorithms make further inroads into society, calls are increasing from multiple stakeholders for these algorithms to explain their outputs. At the same time, these stakeholders, whether they be affected citizens, government regulators, domain experts, or system developers, present different requirements for explanations. Toward addressing these needs, we introduce AI Explainability 360 (http://aix360.mybluemix.net/), an open-source software toolkit featuring eight diverse and state-of-the-art explainability methods and two evaluation metrics. Equally important, we provide a taxonomy to help entities requiring explanations to navigate the space of explanation methods, not only those in the toolkit but also in the broader literature on explainability. For data scientists and other users of the toolkit, we have implemented an extensible software architecture that organizes methods according to their place in the AI modeling pipeline. We also discuss enhancements to bring research innovations closer to consumers of explanations, ranging from simplified, more accessible versions of algorithms, to tutorials and an interactive web demo to introduce AI explainability to different audiences and application domains. Together, our toolkit and taxonomy can help identify gaps where more explainability methods are needed and provide a platform to incorporate them as they are developed.
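As a concrete (if simplistic) example of the kind of local post-hoc explanation such a toolkit organizes, the sketch below estimates per-feature contributions by occlusion. It is an illustrative stand-in, not one of the eight methods shipped in AIX360 and not the toolkit's API.

import numpy as np

def local_feature_contributions(predict_proba, instance, baseline):
    """Illustrative local post-hoc explanation: the change in the model's
    positive-class score when each feature of `instance` is replaced by a
    baseline value (a simple occlusion-style explainer).

    predict_proba : scoring function, 2-D array -> array of positive-class scores
    instance      : 1-D numpy array, the example to explain
    baseline      : 1-D numpy array of 'neutral' feature values (e.g. training means)
    """
    instance = np.asarray(instance, dtype=float)
    base_score = predict_proba(instance[None, :])[0]
    contributions = np.zeros_like(instance)
    for j in range(len(instance)):
        perturbed = instance.copy()
        perturbed[j] = baseline[j]
        # Positive contribution: removing the feature lowers the score.
        contributions[j] = base_score - predict_proba(perturbed[None, :])[0]
    return contributions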
AI Fairness 360: An Extensible Toolkit for Detecting, Understanding, and Mitigating Unwanted Algorithmic Bias
Bellamy, Rachel K. E., Dey, Kuntal, Hind, Michael, Hoffman, Samuel C., Houde, Stephanie, Kannan, Kalapriya, Lohia, Pranay, Martino, Jacquelyn, Mehta, Sameep, Mojsilovic, Aleksandra, Nagar, Seema, Ramamurthy, Karthikeyan Natesan, Richards, John, Saha, Diptikalyan, Sattigeri, Prasanna, Singh, Moninder, Varshney, Kush R., Zhang, Yunfeng
We used Python's Flask framework for building the service and exposed a REST API that generates a bias report based on the following input parameters from a user: the dataset name, the protected attributes, the privileged and unprivileged groups, the chosen fairness metrics, and the chosen mitigation algorithm, if any. With these inputs, the back-end then runs a series of steps to 1) split the dataset into training, development, and validation sets; 2) train a logistic regression classifier on the training set; 3) run the bias-checking metrics on the classifier against the test dataset; 4) if a mitigation algorithm is chosen, run the mitigation algorithm with the appropriate pipeline (pre-processing, in-processing, or post-processing). The end result is then cached so that if the exact same inputs are provided, the result can be directly retrieved from cache and no additional computation is needed. The reason to actually use the toolkit code in serving the Web application rather than having a pre-computed lookup table of results is twofold: we want to make the app a real representation of the underlying capabilities (in fact, creating the Web app helped us debug a few items in the code), and we also avoid any issues of synchronizing updates to the metrics, explainers, and algorithms with the results shown: synchronization is automatic. Currently, the service is limited to three built-in datasets, but it can be expanded to support the user's own data upload. The service is also limited to building logistic regression classifiers, but again this can be expanded. Such expansions can be more easily implemented if this fairness service is integrated into a full AI suite that provides various classifier options and data storage solutions.
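A condensed sketch of the kind of Flask endpoint described above, assuming a hypothetical load_builtin_dataset loader and a single illustrative metric. The route name, request fields, and caching scheme are simplifications of the actual service, and the optional mitigation step is omitted.

import json

from flask import Flask, jsonify, request
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

app = Flask(__name__)
_cache = {}  # identical requests are served from the cache, as described above

def statistical_parity_difference(y_pred, protected):
    # P(favorable outcome | unprivileged group) - P(favorable outcome | privileged group)
    return float(y_pred[protected == 0].mean() - y_pred[protected == 1].mean())

@app.route("/bias-report", methods=["POST"])
def bias_report():
    params = request.get_json()
    key = json.dumps(params, sort_keys=True)
    if key in _cache:
        return jsonify(_cache[key])

    # 1) load a built-in dataset by name; load_builtin_dataset is hypothetical and
    #    should return features X, labels y, and a 0/1 protected-attribute array.
    X, y, protected = load_builtin_dataset(params["dataset"], params["protected_attribute"])
    X_tr, X_te, y_tr, y_te, p_tr, p_te = train_test_split(
        X, y, protected, test_size=0.3, random_state=0)

    # 2) train a logistic regression classifier on the training split
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)

    # 3) run an illustrative bias-checking metric on the held-out split
    #    (the mitigation step 4 is omitted in this sketch)
    report = {"statistical_parity_difference":
              statistical_parity_difference(clf.predict(X_te), p_te)}

    _cache[key] = report
    return jsonify(report)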
Interpretable Multi-Objective Reinforcement Learning through Policy Orchestration
Noothigattu, Ritesh, Bouneffouf, Djallel, Mattei, Nicholas, Chandra, Rachita, Madan, Piyush, Varshney, Kush, Campbell, Murray, Singh, Moninder, Rossi, Francesca
Autonomous cyber-physical agents and systems play an increasingly large role in our lives. To ensure that agents behave in ways aligned with the values of the societies in which they operate, we must develop techniques that allow these agents to not only maximize their reward in an environment, but also to learn and follow the implicit constraints of society. These constraints and norms can come from any number of sources including regulations, business process guidelines, laws, ethical principles, social norms, and moral values. We detail a novel approach that uses inverse reinforcement learning to learn a set of unspecified constraints from demonstrations of the task, and reinforcement learning to learn to maximize the environment rewards. More precisely, we assume that an agent can observe traces of behavior of members of the society but has no access to the explicit set of constraints that give rise to the observed behavior. Inverse reinforcement learning is used to learn such constraints, which are then combined with a possibly orthogonal value function through the use of a contextual bandit-based orchestrator that makes a contextually appropriate choice between the two policies (constraint-based and environment reward-based) when taking actions. The contextual bandit orchestrator allows the agent to mix policies in novel ways, taking the best actions from either a reward-maximizing or constrained policy. In addition, the orchestrator is transparent about which policy is being employed at each time step. We test our algorithms using a Pac-Man domain and show that the agent is able to learn to act optimally, act within the demonstrated constraints, and mix these two functions in complex ways.
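The sketch below shows a simplified, context-free two-armed version of such an orchestrator choosing between a reward-maximizing policy and a constraint-following policy. The environment interface and the epsilon-greedy update are illustrative assumptions rather than the paper's contextual bandit algorithm.

import numpy as np

def orchestrate_episode(env, reward_policy, constraint_policy, arm_values,
                        epsilon=0.1, alpha=0.1, rng=None):
    """Bandit-style orchestration between two fixed policies for one episode.

    At each step an epsilon-greedy bandit picks which policy acts, and the
    chosen arm's value estimate is updated from the environment reward.

    env               : object with reset() -> state and step(action) -> (state, reward, done)
    reward_policy     : state -> action (maximizes environment reward)
    constraint_policy : state -> action (follows constraints learned via IRL)
    arm_values        : length-2 array of running value estimates, updated in place
    """
    rng = rng or np.random.default_rng()
    policies = (reward_policy, constraint_policy)
    state, trace = env.reset(), []
    done = False
    while not done:
        # Explore occasionally; otherwise pick the arm with the higher estimate.
        arm = rng.integers(2) if rng.random() < epsilon else int(np.argmax(arm_values))
        action = policies[arm](state)
        state, reward, done = env.step(action)
        arm_values[arm] += alpha * (reward - arm_values[arm])
        trace.append(("reward" if arm == 0 else "constraint", action, reward))
    # The trace makes it transparent which policy acted at each time step.
    return trace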
Assessing National Development Plans for Alignment With Sustainable Development Goals via Semantic Search
Galsurkar, Jonathan, Singh, Moninder, Wu, Lingfei, Vempaty, Aditya, Sushkov, Mikhail, Iyer, Devika, Kapto, Serge, Varshney, Kush R.
The United Nations Development Programme (UNDP) helps countries implement the United Nations (UN) Sustainable Development Goals (SDGs), an agenda for tackling major societal issues such as poverty, hunger, and environmental degradation by the year 2030. A key service provided by UNDP to countries that seek it is a review of national development plans and sector strategies by policy experts to assess alignment of national targets with one or more of the 169 targets of the 17 SDGs. Known as the Rapid Integrated Assessment (RIA), this process involves manual review of hundreds, if not thousands, of pages of documents and takes weeks to complete. In this work, we develop a natural language processing-based methodology to accelerate the workflow of policy experts. Specifically, we use paragraph embedding techniques to find paragraphs in the documents that match the semantic concepts of each of the SDG targets. One novel technical contribution of our work is our use of historical RIAs from other countries as a form of neighborhood-based supervision for matches in the country under study. We have successfully piloted the algorithm to perform the RIA for Papua New Guinea's national plan, with UNDP estimating that it will reduce completion time from 3-4 weeks to 3 days.
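A minimal sketch of the core matching step, assuming a hypothetical embed(texts) function backed by any paragraph-embedding model. The neighborhood-based supervision from historical RIAs is omitted, and the cosine-similarity ranking shown is only an illustration of the general approach.

import numpy as np

def match_paragraphs_to_targets(embed, paragraphs, sdg_targets, top_k=3):
    """Semantic matching of plan paragraphs to SDG targets (illustrative).

    embed       : hypothetical function, list of texts -> (n, d) array of embeddings
    paragraphs  : list of paragraphs from a national development plan
    sdg_targets : dict mapping target id (e.g. '1.1') -> target description
    Returns, for each target, the top_k most similar paragraphs by cosine similarity.
    """
    ids, descriptions = zip(*sdg_targets.items())
    P = np.asarray(embed(paragraphs), dtype=float)
    T = np.asarray(embed(list(descriptions)), dtype=float)
    # Cosine similarity via normalized dot products.
    P /= np.linalg.norm(P, axis=1, keepdims=True)
    T /= np.linalg.norm(T, axis=1, keepdims=True)
    sims = T @ P.T  # shape: (num targets, num paragraphs)
    matches = {}
    for i, target_id in enumerate(ids):
        best = np.argsort(-sims[i])[:top_k]
        matches[target_id] = [(paragraphs[j], float(sims[i, j])) for j in best]
    return matches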