ScatterShot: Interactive In-context Example Curation for Text Transformation
Wu, Tongshuang, Shen, Hua, Weld, Daniel S., Heer, Jeffrey, Ribeiro, Marco Tulio
The in-context learning capabilities of LLMs like GPT-3 allow annotators to customize an LLM to their specific tasks with a small number of examples. However, users tend to include only the most obvious patterns when crafting examples, resulting in underspecified in-context functions that fall short on unseen cases. Further, it is hard to know when "enough" examples have been included, even for known patterns. In this work, we present ScatterShot, an interactive system for building high-quality demonstration sets for in-context learning. ScatterShot iteratively slices unlabeled data into task-specific patterns, samples informative inputs from underexplored or not-yet-saturated slices in an active learning manner, and uses an LLM together with the current example set to help users label more efficiently. In simulation studies on two text perturbation scenarios, ScatterShot sampling improves the resulting few-shot functions by 4-5 percentage points over random sampling, with less variance as more examples are added. In a user study, ScatterShot greatly helps users cover different patterns in the input space and label in-context examples more efficiently, resulting in better in-context learning and less user effort.
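A minimal sketch of the slice-then-sample idea described in the abstract, assuming TF-IDF features and k-means as the slicing step; this is an illustration of the general strategy, not ScatterShot's actual implementation.

```python
# Illustrative slice-based example selection in the spirit of ScatterShot.
# The clustering and scoring choices here are assumptions, not the paper's
# actual algorithm.
from collections import Counter

from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer


def next_candidates(unlabeled, labeled, n_slices=8, k=5):
    """Return k unlabeled inputs drawn from the slices least covered
    by the current in-context example set."""
    texts = unlabeled + [x for x, _ in labeled]
    vecs = TfidfVectorizer().fit_transform(texts)
    slices = KMeans(n_clusters=n_slices, n_init=10).fit_predict(vecs)

    unl_slices = slices[: len(unlabeled)]
    lab_counts = Counter(slices[len(unlabeled):])

    # Prefer inputs whose slice has the fewest labeled examples so far.
    order = sorted(range(len(unlabeled)),
                   key=lambda i: lab_counts.get(unl_slices[i], 0))
    return [unlabeled[i] for i in order[:k]]
```

In the full system, the labeling of the sampled inputs is further assisted by the LLM proposing candidate outputs that the user verifies or corrects.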
The Semantic Scholar Open Data Platform
Kinney, Rodney, Anastasiades, Chloe, Authur, Russell, Beltagy, Iz, Bragg, Jonathan, Buraczynski, Alexandra, Cachola, Isabel, Candra, Stefan, Chandrasekhar, Yoganand, Cohan, Arman, Crawford, Miles, Downey, Doug, Dunkelberger, Jason, Etzioni, Oren, Evans, Rob, Feldman, Sergey, Gorney, Joseph, Graham, David, Hu, Fangzhou, Huff, Regan, King, Daniel, Kohlmeier, Sebastian, Kuehl, Bailey, Langan, Michael, Lin, Daniel, Liu, Haokun, Lo, Kyle, Lochner, Jaron, MacMillan, Kelsey, Murray, Tyler, Newell, Chris, Rao, Smita, Rohatgi, Shaurya, Sayre, Paul, Shen, Zejiang, Singh, Amanpreet, Soldaini, Luca, Subramanian, Shivashankar, Tanaka, Amber, Wade, Alex D., Wagner, Linda, Wang, Lucy Lu, Wilhelm, Chris, Wu, Caroline, Yang, Jiangjiang, Zamarron, Angele, Van Zuylen, Madeleine, Weld, Daniel S.
The volume of scientific output is creating an urgent need for automated tools to help scientists keep up with developments in their field. Semantic Scholar (S2) is an open data platform and website aimed at accelerating science by helping scholars discover and understand scientific literature. We combine public and proprietary data sources using state-of-the-art techniques for scholarly PDF content extraction and automatic knowledge graph construction to build the Semantic Scholar Academic Graph, the largest open scientific literature graph to date, with 200M+ papers, 80M+ authors, 550M+ paper-authorship edges, and 2.4B+ citation edges. The graph includes advanced semantic features such as structurally parsed text, natural language summaries, and vector embeddings. In this paper, we describe the components of the S2 data processing pipeline and the associated APIs offered by the platform. We will update this living document to reflect changes as we add new data offerings and improve existing services.
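The graph data described above is exposed through the public S2 APIs; a small example against the Graph API paper-search endpoint is below. The endpoint and field names follow the public documentation at the time of writing, but verify them against the current API reference before relying on them.

```python
# Query the Semantic Scholar Academic Graph API for papers matching a
# keyword search. Endpoint and field names follow the public docs; check
# the current API reference for changes.
import requests

resp = requests.get(
    "https://api.semanticscholar.org/graph/v1/paper/search",
    params={
        "query": "human-AI complementary performance",
        "fields": "title,year,abstract,citationCount",
        "limit": 5,
    },
    timeout=30,
)
resp.raise_for_status()
for paper in resp.json().get("data", []):
    print(paper["year"], paper["title"])
```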
GENIE: A Leaderboard for Human-in-the-Loop Evaluation of Text Generation
Khashabi, Daniel, Stanovsky, Gabriel, Bragg, Jonathan, Lourie, Nicholas, Kasai, Jungo, Choi, Yejin, Smith, Noah A., Weld, Daniel S.
Leaderboards have eased model development for many NLP datasets by standardizing their evaluation and delegating it to an independent external repository. Their adoption, however, is so far limited to tasks that can be reliably evaluated in an automatic manner. This work introduces GENIE, an extensible human evaluation leaderboard, which brings the ease of leaderboards to text generation tasks. GENIE automatically posts leaderboard submissions to crowdsourcing platforms asking human annotators to evaluate them on various axes (e.g., correctness, conciseness, fluency) and compares their answers to various automatic metrics. We introduce several datasets in English to GENIE, representing four core challenges in text generation: machine translation, summarization, commonsense reasoning, and machine comprehension. We provide formal granular evaluation metrics and identify areas for future research. We make GENIE publicly available and hope that it will spur progress in language generation models as well as their automatic and manual evaluation.
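As a small illustration of the metric-comparison step mentioned above, one can correlate crowd ratings with an automatic metric across submissions; the data layout and scores below are hypothetical and are not GENIE's actual pipeline.

```python
# Illustrative comparison of human ratings against an automatic metric
# across leaderboard submissions; the per-submission scores are made-up
# placeholder values for the example.
from scipy.stats import pearsonr, spearmanr

human_scores = [3.8, 4.2, 2.9, 4.5, 3.1]    # hypothetical mean crowd ratings
metric_scores = [0.31, 0.38, 0.22, 0.41, 0.27]  # hypothetical metric scores

print("Pearson:", pearsonr(human_scores, metric_scores)[0])
print("Spearman:", spearmanr(human_scores, metric_scores)[0])
```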
Augmenting Scientific Papers with Just-in-Time, Position-Sensitive Definitions of Terms and Symbols
Head, Andrew, Lo, Kyle, Kang, Dongyeop, Fok, Raymond, Skjonsberg, Sam, Weld, Daniel S., Hearst, Marti A.
Despite the central importance of research papers to scientific progress, they can be difficult to read. Comprehension is often stymied when the information needed to understand a passage resides somewhere else: in another section, or in another paper. In this work, we envision how interfaces can bring definitions of technical terms and symbols to readers when and where they need them most. We introduce ScholarPhi, an augmented reading interface with four novel features: (1) tooltips that surface position-sensitive definitions from elsewhere in a paper, (2) a filter over the paper that "declutters" it to reveal how the term or symbol is used across the paper, (3) automatic equation diagrams that expose multiple definitions in parallel, and (4) an automatically generated glossary of important terms and symbols. A usability study showed that the tool helps researchers of all experience levels read papers. Furthermore, researchers were eager to have ScholarPhi's definitions available to support their everyday reading.
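A toy version of the position-sensitive lookup behind the tooltip feature, under the simplifying assumption that definitions have already been extracted as (position, term, definition) tuples: surface the most recent definition appearing before the reader's current position.

```python
# Toy position-sensitive definition lookup in the spirit of ScholarPhi's
# tooltips. Assumes definitions were already extracted as
# (char_position, term, definition) tuples, an illustrative simplification
# of the real system.
def definition_at(term, reading_pos, definitions):
    """Return the closest definition of `term` before reading_pos,
    falling back to the first definition anywhere in the paper."""
    earlier = [(pos, d) for pos, t, d in definitions
               if t == term and pos <= reading_pos]
    if earlier:
        return max(earlier)[1]  # most recent preceding definition
    later = [(pos, d) for pos, t, d in definitions if t == term]
    return min(later)[1] if later else None


defs = [(120, "k", "number of clusters"),
        (900, "k", "top-k retrieved passages")]
print(definition_at("k", 1000, defs))  # -> "top-k retrieved passages"
```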
Does the Whole Exceed its Parts? The Effect of AI Explanations on Complementary Team Performance
Bansal, Gagan, Wu, Tongshuang, Zhou, Joyce, Fok, Raymond, Nushi, Besmira, Kamar, Ece, Ribeiro, Marco Tulio, Weld, Daniel S.
Increasingly, organizations are pairing humans with AI systems to improve decision-making and reduce costs. Proponents of human-centered AI argue that team performance can improve even further when the AI model explains its recommendations. However, a careful analysis of existing literature reveals that prior studies observed improvements due to explanations only when the AI, alone, outperformed both the human and the best human-AI team. This raises an important question: can explanations lead to complementary performance, i.e., accuracy higher than both the human and the AI working alone? We address this question by devising comprehensive studies of human-AI teaming, in which participants solve a task with help from an AI system without explanations and from one with varying types of AI explanation support. We carefully controlled conditions to ensure comparable human and AI accuracy across experiments on three NLP datasets (two for sentiment analysis and one for question answering). While we found complementary improvements from AI augmentation, they were not increased by state-of-the-art explanations compared to simpler strategies, such as displaying the AI's confidence. We show that explanations increase the chance that humans will accept the AI's recommendation regardless of whether the AI is correct. While this clarifies the gains in team performance from explanations in prior work, it poses new challenges for human-centered AI: how can we best design systems to produce complementary performance? Can we develop explanatory approaches that help humans decide whether and when to trust AI input?
Optimizing AI for Teamwork
Bansal, Gagan, Nushi, Besmira, Kamar, Ece, Horvitz, Eric, Weld, Daniel S.
In many high-stakes domains such as criminal justice, finance, and healthcare, AI systems may recommend actions to a human expert responsible for final decisions, a context known as AI-advised decision making. When AI practitioners deploy the most accurate system in these domains, they implicitly assume that the system will function alone in the world. We argue that the most accurate AI teammate is not necessarily the best teammate; for example, predictable performance may be worth a slight sacrifice in AI accuracy. We therefore propose training AI systems in a human-centered manner, directly optimizing for team performance. We study this proposal for a specific type of human-AI team, where the human overseer chooses to accept the AI recommendation or solve the task themselves. To optimize team performance, we maximize the team's expected utility, expressed in terms of the quality of the final decision, the cost of verification, and the individual accuracies. Our experiments with linear and non-linear models on real-world, high-stakes datasets show that the improvements in utility, while small and varying across datasets and parameters (such as the cost of a mistake), are real and consistent with our definition of team utility. We discuss the shortcomings of current optimization approaches beyond well-studied loss functions such as log-loss, and encourage future work on human-centered optimization problems motivated by human-AI collaboration.
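A toy illustration of the kind of team-utility objective sketched above, for the accept-or-solve setting: the human accepts the AI's recommendation when it looks trustworthy and otherwise solves the task at an extra cost. The specific utility terms and threshold rule are assumptions for illustration, not the paper's exact formulation.

```python
# Toy expected team utility for AI-advised decision making: the human
# accepts the AI's recommendation when its confidence exceeds a threshold,
# otherwise solves the task themselves at an additional effort cost.
# The terms below are illustrative assumptions, not the paper's objective.
def expected_team_utility(conf, ai_correct_prob, human_acc,
                          threshold=0.8, gain=1.0,
                          mistake_cost=5.0, solve_cost=0.2):
    if conf >= threshold:  # human accepts the AI recommendation
        return ai_correct_prob * gain - (1 - ai_correct_prob) * mistake_cost
    # human solves the task themselves, paying an effort cost
    return human_acc * gain - (1 - human_acc) * mistake_cost - solve_cost


# Example: accepting a confident AI vs. deferring to the human.
print(expected_team_utility(conf=0.9, ai_correct_prob=0.9, human_acc=0.85))
print(expected_team_utility(conf=0.6, ai_correct_prob=0.6, human_acc=0.85))
```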
The Challenge of Crafting Intelligible Intelligence
Weld, Daniel S., Bansal, Gagan
Since Artificial Intelligence (AI) software uses techniques like deep lookahead search and stochastic optimization of huge neural networks to fit mammoth datasets, it often results in complex behavior that is difficult for people to understand. Yet organizations are deploying AI algorithms in many mission-critical settings. To trust their behavior, we must make AI intelligible, either by using inherently interpretable models or by developing new methods for explaining and controlling otherwise overwhelmingly complex decisions using local approximation, vocabulary alignment, and interactive explanation. This paper argues that intelligibility is essential, surveys recent work on building such systems, and highlights key directions for research.
Active Learning with Unbalanced Classes and Example-Generation Queries
Lin, Christopher H., Mausam, Weld, Daniel S.
Machine learning in real-world high-skew domains is difficult, because traditional strategies for crowdsourcing labeled training examples are ineffective at locating the scarce minority-class examples. For example, both random sampling and traditional active learning (which reduces to random sampling when just starting) will most likely recover very few minority-class examples. To bootstrap the machine learning process, researchers have proposed tasking the crowd with finding or generating minority-class examples, but such strategies have their weaknesses as well: they are unnecessarily expensive in well-balanced domains, and they often yield samples from a biased distribution that is unrepresentative of the one being learned. This paper extends the traditional active learning framework by investigating the problem of intelligently switching between various crowdsourcing strategies for obtaining labeled training examples in order to optimally train a classifier. We start by analyzing several such strategies (e.g., annotate an example, generate a minority-class example, etc.), and then develop a novel, skew-robust algorithm, called MB-CB, for the control problem. Experiments show that our method outperforms the state-of-the-art GL-Hybrid algorithm by up to 14.3 points in F1 AUC, across various domains and class-frequency settings.
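To make the control problem concrete, here is a deliberately simple sketch of switching between "label an example" and "generate a minority-class example" actions based on estimated benefit per unit cost. The scoring rule and constants are illustrative assumptions; this is not the MB-CB algorithm from the paper.

```python
# Illustrative control rule for switching between crowdsourcing actions.
# Generation's estimated benefit grows with how far the labeled pool is
# from a target minority fraction; labeling has a fixed unit benefit.
# This heuristic is an assumption for illustration, not MB-CB.
def choose_action(minority_fraction, label_cost=1.0, generate_cost=3.0,
                  target_fraction=0.3):
    label_benefit = 1.0
    generate_benefit = max(0.0, target_fraction - minority_fraction) * 20
    if generate_benefit / generate_cost > label_benefit / label_cost:
        return "generate_minority_example"
    return "label_random_example"


print(choose_action(minority_fraction=0.02))  # -> generate_minority_example
print(choose_action(minority_fraction=0.35))  # -> label_random_example
```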
A Coverage-Based Utility Model for Identifying Unknown Unknowns
Bansal, Gagan, Weld, Daniel S.
A classifier’s low confidence in prediction is often indicative of whether its prediction will be wrong; in this case, inputs are called known unknowns. In contrast, unknown unknowns (UUs) are inputs on which a classifier makes a high confidence mistake. Identifying UUs is especially important in safety-critical domains like medicine (diagnosis) and law (recidivism prediction). Previous work by Lakkaraju et al. (2017) on identifying unknown unknowns assumes that the utility of each revealed UU is independent of the others, rather than considering the set holistically. While this assumption yields an efficient discovery algorithm, we argue that it produces an incomplete understanding of the classifier’s limitations. In response, this paper proposes a new class of utility models that rewards how well the discovered UUs cover (or "explain") a sample distribution of expected queries. Although choosing an optimal cover is intractable, even if the UUs were known, our utility model is monotone submodular, affording a greedy discovery strategy. Experimental results on four datasets show that our method outperforms bandit-based approaches and achieves within 60.9% utility of an omniscient, tractable upper bound.
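Because the coverage utility is monotone submodular, greedy selection enjoys the classic constant-factor approximation guarantee. Below is a sketch of that greedy strategy under an assumed coverage function mapping each candidate UU to the query points it covers; the coverage sets are a stand-in for the paper's actual model.

```python
# Greedy selection for a monotone submodular coverage objective: repeatedly
# pick the candidate unknown-unknown with the largest marginal coverage of
# the expected query points. Coverage sets here are illustrative.
def greedy_cover(candidates, covers, budget):
    """candidates: list of ids; covers: id -> set of covered query points."""
    chosen, covered = [], set()
    for _ in range(budget):
        best = max(candidates,
                   key=lambda c: len(covers[c] - covered),
                   default=None)
        if best is None or not (covers[best] - covered):
            break  # no remaining marginal gain
        chosen.append(best)
        covered |= covers[best]
        candidates = [c for c in candidates if c != best]
    return chosen, covered


covers = {"uu1": {1, 2, 3}, "uu2": {3, 4}, "uu3": {5}}
print(greedy_cover(["uu1", "uu2", "uu3"], covers, budget=2))
# -> (['uu1', 'uu2'], {1, 2, 3, 4})
```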
MicroTalk: Using Argumentation to Improve Crowdsourcing Accuracy
Drapeau, Ryan, Chilton, Lydia B., Bragg, Jonathan, Weld, Daniel S.
Crowd workers are human and thus sometimes make mistakes. In order to ensure the highest quality output, requesters often issue redundant jobs with gold test questions and sophisticated aggregation mechanisms based on expectation maximization (EM). While these methods yield accurate results in many cases, they fail on extremely difficult problems with local minima, such as situations where the majority of workers get the answer wrong. Indeed, this has caused some researchers to conclude that on some tasks crowdsourcing can never achieve high accuracies, no matter how many workers are involved. This paper presents a new quality-control workflow, called MicroTalk, that requires some workers to Justify their reasoning and asks others to Reconsider their decisions after reading counter-arguments from workers with opposing views. Experiments on a challenging NLP annotation task with workers from Amazon Mechanical Turk show that (1) argumentation improves the accuracy of individual workers by 20%, (2) restricting consideration to workers with complex explanations improves accuracy even more, and (3) our complete MicroTalk aggregation workflow produces much higher accuracy than simpler voting approaches for a range of budgets.
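A schematic of a Justify/Reconsider style flow like the one described above, with the routing conditions and the `ask_with_justification` / `ask_reconsider` callables as hypothetical placeholders; this illustrates the general shape of argumentation-based quality control, not MicroTalk's exact workflow or aggregation.

```python
# Schematic argumentation-based quality control in the spirit of MicroTalk.
# ask_with_justification(worker, q) -> (answer, justification) and
# ask_reconsider(worker, q, counter_argument) -> answer are hypothetical
# task interfaces; the routing rule below is an illustrative assumption.
def argue_and_vote(question, workers, ask_with_justification, ask_reconsider):
    # Justify step: every worker answers and explains their reasoning.
    responses = {w: ask_with_justification(w, question) for w in workers}

    # Reconsider step: workers who disagree with someone see a
    # counter-argument from an opposing worker and may revise.
    final = {}
    for w, (ans, _) in responses.items():
        opposing = [why for _, (a, why) in responses.items() if a != ans]
        final[w] = ask_reconsider(w, question, opposing[0]) if opposing else ans

    # Aggregate the (possibly revised) answers with a simple vote.
    votes = list(final.values())
    return max(set(votes), key=votes.count)
```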