AITopics | Dibia, Victor

Plotting

Dibia, Victor

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Challenges in Human-Agent Communication

Bansal, Gagan, Vaughan, Jennifer Wortman, Amershi, Saleema, Horvitz, Eric, Fourney, Adam, Mozannar, Hussein, Dibia, Victor, Weld, Daniel S.

arXiv.org Artificial IntelligenceNov-27-2024

Remarkable advancements in modern generative foundation models have enabled the development of sophisticated and highly capable autonomous agents that can observe their environment, invoke tools, and communicate with other agents to solve problems. Although such agents can communicate with users through natural language, their complexity and wide-ranging failure modes present novel challenges for human-AI interaction. Building on prior research and informed by a communication grounding perspective, we contribute to the study of \emph{human-agent communication} by identifying and analyzing twelve key communication challenges that these systems pose. These include challenges in conveying information from the agent to the user, challenges in enabling the user to convey information to the agent, and overarching challenges that need to be considered across all human-agent communication. We illustrate each challenge through concrete examples and identify open directions of research. Our findings provide insights into critical gaps in human-agent communication research and serve as an urgent call for new design patterns, principles, and guidelines to support transparency and control in these systems.

artificial intelligence, machine learning, survey article, (19 more...)

arXiv.org Artificial Intelligence

2412.1038

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.48)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents > Agent Societies (0.46)

Add feedback

Magentic-One: A Generalist Multi-Agent System for Solving Complex Tasks

Fourney, Adam, Bansal, Gagan, Mozannar, Hussein, Tan, Cheng, Salinas, Eduardo, Erkang, null, Zhu, null, Niedtner, Friederike, Proebsting, Grace, Bassman, Griffin, Gerrits, Jack, Alber, Jacob, Chang, Peter, Loynd, Ricky, West, Robert, Dibia, Victor, Awadallah, Ahmed, Kamar, Ece, Hosn, Rafah, Amershi, Saleema

arXiv.org Artificial IntelligenceNov-7-2024

Modern AI agents, driven by advances in large foundation models, promise to enhance our productivity and transform our lives by augmenting our knowledge and capabilities. To achieve this vision, AI agents must effectively plan, perform multi-step reasoning and actions, respond to novel observations, and recover from errors, to successfully complete complex tasks across a wide range of scenarios. In this work, we introduce Magentic-One, a high-performing open-source agentic system for solving such tasks. Magentic-One uses a multi-agent architecture where a lead agent, the Orchestrator, plans, tracks progress, and re-plans to recover from errors. Throughout task execution, the Orchestrator directs other specialized agents to perform tasks as needed, such as operating a web browser, navigating local files, or writing and executing Python code. We show that Magentic-One achieves statistically competitive performance to the state-of-the-art on three diverse and challenging agentic benchmarks: GAIA, AssistantBench, and WebArena. Magentic-One achieves these results without modification to core agent capabilities or to how they collaborate, demonstrating progress towards generalist agentic systems. Moreover, Magentic-One's modular design allows agents to be added or removed from the team without additional prompt tuning or training, easing development and making it extensible to future scenarios. We provide an open-source implementation of Magentic-One, and we include AutoGenBench, a standalone tool for agentic evaluation. AutoGenBench provides built-in controls for repetition and isolation to run agentic benchmarks in a rigorous and contained manner -- which is important when agents' actions have side-effects. Magentic-One, AutoGenBench and detailed empirical performance evaluations of Magentic-One, including ablations and error analysis are available at https://aka.ms/magentic-one

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.04468

Country:

North America > United States (0.28)
Europe > Middle East > Malta (0.14)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Add feedback

Data Analysis in the Era of Generative AI

Inala, Jeevana Priya, Wang, Chenglong, Drucker, Steven, Ramos, Gonzalo, Dibia, Victor, Riche, Nathalie, Brown, Dave, Marshall, Dan, Gao, Jianfeng

arXiv.org Artificial IntelligenceSep-27-2024

This paper explores the potential of AI-powered tools to reshape data analysis, focusing on design considerations and challenges. We explore how the emergence of large language and multimodal models offers new opportunities to enhance various stages of data analysis workflow by translating high-level user intentions into executable code, charts, and insights. We then examine human-centered design principles that facilitate intuitive interactions, build user trust, and streamline the AI-assisted analysis workflow across multiple apps. Finally, we discuss the research challenges that impede the development of these AI-based systems such as enhancing model capabilities, evaluating and benchmarking, and understanding end-user needs.

large language model, machine learning, programming language, (19 more...)

arXiv.org Artificial Intelligence

2409.18475

Country: North America > United States (0.46)

Genre:

Workflow (1.00)
Research Report > Experimental Study (0.45)

Industry:

Health & Medicine (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(6 more...)

Add feedback

Towards better Human-Agent Alignment: Assessing Task Utility in LLM-Powered Applications

Arabzadeh, Negar, Kiseleva, Julia, Wu, Qingyun, Wang, Chi, Awadallah, Ahmed, Dibia, Victor, Fourney, Adam, Clarke, Charles

arXiv.org Artificial IntelligenceFeb-22-2024

The rapid development in the field of Large Language Models (LLMs) has led to a surge in applications that facilitate collaboration among multiple agents to assist humans in their daily tasks. However, a significant gap remains in assessing whether LLM-powered applications genuinely enhance user experience and task execution efficiency. This highlights the pressing need for methods to verify utility of LLM-powered applications, particularly by ensuring alignment between the application's functionality and end-user needs. We introduce AgentEval provides an implementation for the math problems, a novel framework designed to simplify the utility verification process by automatically proposing a set of criteria tailored to the unique purpose of any given application. This allows for a comprehensive assessment, quantifying the utility of an application against the suggested criteria. We present a comprehensive analysis of the robustness of quantifier's work.

criteria, large language model, natural language, (18 more...)

arXiv.org Artificial Intelligence

2402.09015

Country:

Asia > China (0.14)
Europe > Sweden (0.14)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Axiomatic Preference Modeling for Longform Question Answering

Rosset, Corby, Zheng, Guoqing, Dibia, Victor, Awadallah, Ahmed, Bennett, Paul

arXiv.org Artificial IntelligenceDec-2-2023

The remarkable abilities of large language models (LLMs) like GPT-4 partially stem from post-training processes like Reinforcement Learning from Human Feedback (RLHF) involving human preferences encoded in a reward model. However, these reward models (RMs) often lack direct knowledge of why, or under what principles, the preferences annotations were made. In this study, we identify principles that guide RMs to better align with human preferences, and then develop an axiomatic framework to generate a rich variety of preference signals to uphold them. We use these axiomatic signals to train a model for scoring answers to longform questions. Our approach yields a Preference Model with only about 220M parameters that agrees with gold human-annotated preference labels more often than GPT-4. The contributions of this work include: training a standalone preference model that can score human- and LLM-generated answers on the same scale; developing an axiomatic framework for generating training data pairs tailored to certain principles; and showing that a small amount of axiomatic signals can help small models outperform GPT-4 in preference scoring. We release our model on huggingface: https://huggingface.co/corbyrosset/axiomatic_preference_model

information, large language model, machine learning, (22 more...)

arXiv.org Artificial Intelligence

2312.02206

Country: North America > United States (0.46)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Media (0.67)
Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Aligning Offline Metrics and Human Judgments of Value for Code Generation Models

Dibia, Victor, Fourney, Adam, Bansal, Gagan, Poursabzi-Sangdeh, Forough, Liu, Han, Amershi, Saleema

arXiv.org Artificial IntelligenceJun-13-2023

Large language models have demonstrated great potential to assist programmers in generating code. For such human-AI pair programming scenarios, we empirically demonstrate that while generated code is most often evaluated in terms of their functional correctness (i.e., whether generations pass available unit tests), correctness does not fully capture (e.g., may underestimate) the productivity gains these models may provide. Through a user study with N = 49 experienced programmers, we show that while correctness captures high-value generations, programmers still rate code that fails unit tests as valuable if it reduces the overall effort needed to complete a coding task. Finally, we propose a hybrid metric that combines functional correctness and syntactic similarity and show that it achieves a 14% stronger correlation with value and can therefore better represent real-world gains when evaluating and comparing models.

machine learning, natural language, programmer, (20 more...)

arXiv.org Artificial Intelligence

2210.16494

Country: North America > United States (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Questionnaire & Opinion Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.49)
Information Technology > Artificial Intelligence > Representation & Reasoning > Automatic Programming (0.44)

Add feedback

LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models

Dibia, Victor

arXiv.org Artificial IntelligenceJun-5-2023

Systems that support users in the automatic creation of visualizations must address several subtasks - understand the semantics of data, enumerate relevant visualization goals and generate visualization specifications. In this work, we pose visualization generation as a multi-stage generation problem and argue that well-orchestrated pipelines based on large language models (LLMs) such as ChatGPT/GPT-4 and image generation models (IGMs) are suitable to addressing these tasks. We present LIDA, a novel tool for generating grammar-agnostic visualizations and infographics. LIDA comprises of 4 modules - A SUMMARIZER that converts data into a rich but compact natural language summary, a GOAL EXPLORER that enumerates visualization goals given the data, a VISGENERATOR that generates, refines, executes and filters visualization code and an INFOGRAPHER module that yields data-faithful stylized graphics using IGMs. LIDA provides a python api, and a hybrid user interface (direct manipulation and multilingual natural language) for interactive chart, infographics and data story generation. Learn more about the project here - https://microsoft.github.io/lida/

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2303.02927

Genre: Research Report (0.64)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

NeuralQA: A Usable Library for Question Answering (Contextual Query Expansion + BERT) on Large Datasets

Dibia, Victor

arXiv.org Artificial IntelligenceJul-29-2020

Existing tools for Question Answering (QA) have challenges that limit their use in practice. They can be complex to set up or integrate with existing infrastructure, do not offer configurable interactive interfaces, and do not cover the full set of subtasks that frequently comprise the QA pipeline (query expansion, retrieval, reading, and explanation/sensemaking). To help address these issues, we introduce NeuralQA - a usable library for QA on large datasets. NeuralQA integrates well with existing infrastructure (e.g., ElasticSearch instances and reader models trained with the HuggingFace Transformers API) and offers helpful defaults for QA subtasks. It introduces and implements contextual query expansion (CQE) using a masked language model (MLM) as well as relevant snippets (RelSnip) - a method for condensing large documents into smaller passages that can be speedily processed by a document reader model. Finally, it offers a flexible user interface to support workflows for research explorations (e.g., visualization of gradient-based explanations to support qualitative inspection of model behaviour) and large scale search deployment. Code and documentation for NeuralQA is available as open source on Github.

neural network, representation, survey article, (17 more...)

arXiv.org Artificial Intelligence

2007.15211

Country:

North America > United States (0.14)
Europe > Italy (0.14)
Asia > China (0.14)
Europe > Belgium (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Designing for Democratization: Introducing Novices to Artificial Intelligence Via Maker Kits

Dibia, Victor, Ashoori, Maryam, Cox, Aaron, Weisz, Justin

arXiv.org Artificial IntelligenceMay-27-2018

Existing research highlight the myriad of benefits realized when technology is sufficiently democratized and made accessible to non-technical or novice users. However, democratizing complex technologies such as artificial intelligence (AI) remains hard. In this work, we draw on theoretical underpinnings from the democratization of innovation, in exploring the design of maker kits that help introduce novice users to complex technologies. We report on our work designing TJBot: an open source cardboard robot that can be programmed using pre-built AI services. We highlight principles we adopted in this process (approachable design, simplicity, extensibility and accessibility), insights we learned from showing the kit at workshops (66 participants) and how users interacted with the project on GitHub over a 12-month period (Nov 2016 - Nov 2017). We find that the project succeeds in attracting novice users (40\% of users who forked the project are new to GitHub) and a variety of demographics are interested in prototyping use cases such as home automation, task delegation, teaching and learning.

deep learning, maker kit, neural network, (23 more...)

arXiv.org Artificial Intelligence

1805.10723

Country: North America > United States > New York > New York County > New York City (0.15)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.46)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Information Technology > Smart Houses & Appliances (0.88)
Education > Educational Setting (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.67)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

Add feedback

Data2Vis: Automatic Generation of Data Visualizations Using Sequence to Sequence Recurrent Neural Networks

Dibia, Victor, Demiralp, Çağatay

arXiv.org Artificial IntelligenceApr-19-2018

Rapidly creating effective visualizations using expressive grammars is challenging for users who have limited time and limited skills in statistics and data visualization. Even high-level, dedicated visualization tools often require users to manually select among data attributes, decide which transformations to apply, and specify mappings between visual encoding variables and raw or transformed attributes. In this paper we introduce Data2Vis, a neural translation model for automatically generating visualizations from given datasets. We formulate visualization generation as a sequence to sequence translation problem where data specifications are mapped to visualization specifications in a declarative language (Vega-Lite). To this end, we train a multilayered attention-based recurrent neural network (RNN) with long short-term memory (LSTM) units on a corpus of visualization specifications. Qualitative results show that our model learns the vocabulary and syntax for a valid visualization specification, appropriate transformations (count, bins, mean) and how to use common data selection patterns that occur within data visualizations. Data2Vis generates visualizations that are comparable to manually-created visualizations in a fraction of the time, with potential to learn more complex visualization strategies at scale.

deep learning, neural network, visualization, (19 more...)

arXiv.org Artificial Intelligence

1804.03126

Country: North America > United States (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Leisure & Entertainment (0.93)
Media > Music (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback