Goto

Collaborating Authors

 Hope, Tom


How do Humans and Language Models Reason About Creativity? A Comparative Analysis

arXiv.org Artificial Intelligence

Creativity assessment in science and engineering is increasingly based on both human and AI judgment, but the cognitive processes and biases behind these evaluations remain poorly understood. We conducted two experiments examining how including example solutions with ratings impacts creativity evaluation, using a fine-grained annotation protocol in which raters were asked to explain their originality scores and to rate the facets of remoteness (whether the response is "far" from everyday ideas), uncommonness (whether the response is rare), and cleverness. In Study 1, we analyzed creativity ratings from 72 experts with formal science or engineering training, comparing those who received example solutions with ratings (example) to those who did not (no example). Computational text analysis revealed that, compared to experts with examples, no-example experts used more comparative language (e.g., "better/worse") and emphasized solution uncommonness, suggesting they may have relied more on memory retrieval for comparisons. In Study 2, parallel analyses with state-of-the-art LLMs revealed that models prioritized the uncommonness and remoteness of ideas when rating originality, suggesting an evaluative process centered on the semantic similarity of ideas. In the example condition, LLM accuracy in predicting the true originality scores improved, but the correlations of remoteness, uncommonness, and cleverness with originality also increased substantially - to upwards of 0.99 - suggesting a homogenization of the LLMs' evaluations of the individual facets. These findings highlight important implications for how humans and AI reason about creativity and suggest that the two populations diverge in what they prioritize when rating.
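
To make the Study 2 analysis concrete, here is a minimal sketch of correlating each facet rating with originality scores. The ratings below are made up for illustration and are not the study's data; only the general Pearson-correlation computation is implied by the abstract.

```python
# Minimal sketch of the facet-correlation analysis described above, using
# illustrative (made-up) ratings rather than the study's actual data.
import numpy as np

# Hypothetical per-response ratings on a 1-5 scale.
originality  = np.array([2, 4, 3, 5, 1, 4, 3, 2])
remoteness   = np.array([2, 4, 3, 5, 2, 4, 3, 1])
uncommonness = np.array([1, 4, 3, 5, 1, 5, 2, 2])
cleverness   = np.array([2, 3, 3, 4, 1, 4, 3, 2])

for name, facet in [("remoteness", remoteness),
                    ("uncommonness", uncommonness),
                    ("cleverness", cleverness)]:
    r = np.corrcoef(originality, facet)[0, 1]  # Pearson correlation
    print(f"corr(originality, {name}) = {r:.2f}")
```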


Scideator: Human-LLM Scientific Idea Generation Grounded in Research-Paper Facet Recombination

arXiv.org Artificial Intelligence

A good idea should be relevant to the scientist's interests and novel within the scientific community. Research papers are a major source of inspiration for relevant and novel ideas, as they expose scientists to relevant concepts to re-combine and form new ideas [4, 21, 36]. However, generating relevant and novel scientific ideas by recombining concepts from research papers is difficult for multiple reasons. For one, scientists must wade through an ever-expanding scientific literature to find relevant concepts [2, 19]. Moreover, the phenomenon of fixation biases scientists against considering more diverse concepts and concept recombinations for their research; instead, they are predisposed to thinking about a problem in familiar terms, which hinders the stimulation of novel ideas [11, 37]. Even if a scientist manages to identify interesting concept recombinations to form potential research ideas, assessing the ideas' novelty in comparison to the existing literature is a cumbersome yet critical task. Building a fully or semi-automated ideation system has been an ambition of researchers for decades, and Scideator builds on strong prior work from many other researchers, filling a unique niche. We extend a line of work that presents systems for finding analogies between research papers [4, 21, 36], adopting their facet-based framework but using modern large language model (LLM) methods to identify relevant facets and perform facet recombinations. We are also inspired by recent work showing that LLMs have promise to assist ideation in domains outside science, helping people to generate more ideas [6] and more diverse ideas [27, 40].
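
As a rough illustration of the facet-recombination idea, the sketch below extracts facets from a paper abstract with an LLM and prompts for a recombined idea. The `call_llm` helper and the purpose/mechanism/evaluation facet schema are assumptions for illustration, not Scideator's actual pipeline.

```python
# Illustrative sketch of LLM-based facet extraction and recombination in the
# spirit of Scideator; `call_llm` is a hypothetical stand-in for any chat API,
# and the facet names are assumptions, not the paper's exact schema.
from typing import Dict

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call (e.g., a chat-completion request)."""
    raise NotImplementedError

def extract_facets(abstract: str) -> Dict[str, str]:
    prompt = (
        "Extract the purpose, mechanism, and evaluation facets of this paper "
        "as three short phrases, one per line:\n\n" + abstract
    )
    purpose, mechanism, evaluation = call_llm(prompt).strip().split("\n")[:3]
    return {"purpose": purpose, "mechanism": mechanism, "evaluation": evaluation}

def recombine(facets_a: Dict[str, str], facets_b: Dict[str, str]) -> str:
    prompt = (
        f"Propose a novel research idea that applies the mechanism "
        f"'{facets_b['mechanism']}' to the purpose '{facets_a['purpose']}'. "
        "Answer in one paragraph."
    )
    return call_llm(prompt)
```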


IdeaSynth: Iterative Research Idea Development Through Evolving and Composing Idea Facets with Literature-Grounded Feedback

arXiv.org Artificial Intelligence

Research ideation involves broadly exploring and deeply refining ideas, both of which require deep engagement with the literature. Existing tools focus primarily on broad idea generation, yet offer little support for the iterative specification, refinement, and evaluation needed to further develop initial ideas. To bridge this gap, we introduce IdeaSynth, a research idea development system that uses LLMs to provide literature-grounded feedback for articulating research problems, solutions, evaluations, and contributions. IdeaSynth represents these idea facets as nodes on a canvas and allows researchers to iteratively refine them by creating and exploring variations and composing them. Our lab study (N=20) showed that participants using IdeaSynth explored more alternative ideas and expanded initial ideas with more details compared to a strong LLM-based baseline. Our deployment study (N=7) demonstrated that participants effectively used IdeaSynth for real-world research projects at various ideation stages, from developing initial ideas to revising the framings of mature manuscripts, highlighting the potential to adopt IdeaSynth in researchers' workflows.
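
A toy sketch of how idea facets might be represented as composable nodes, in the spirit of the description above; the field names and facet types are assumptions, not IdeaSynth's implementation.

```python
# A toy data structure for literature-grounded idea facets on a canvas,
# illustrating the node-and-compose model described above.
from dataclasses import dataclass, field
from typing import List

@dataclass
class FacetNode:
    facet_type: str              # "problem", "solution", "evaluation", or "contribution"
    text: str                    # current draft of the facet
    citations: List[str] = field(default_factory=list)  # grounding papers
    variations: List[str] = field(default_factory=list) # alternative phrasings to explore

@dataclass
class IdeaCanvas:
    nodes: List[FacetNode] = field(default_factory=list)

    def compose(self) -> str:
        """Concatenate facets into a single idea description."""
        return "\n".join(f"{n.facet_type.title()}: {n.text}" for n in self.nodes)
```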


Inferring Scientific Cross-Document Coreference and Hierarchy with Definition-Augmented Relational Reasoning

arXiv.org Artificial Intelligence

We address the fundamental task of inferring cross-document coreference and hierarchy in scientific texts, which has important applications in knowledge graph construction, search, recommendation, and discovery. LLMs can struggle when faced with many long-tail technical concepts with nuanced variations. We present a novel method which generates context-dependent definitions of concept mentions by retrieving full-text literature, and uses the definitions to enhance detection of cross-document relations. We further generate relational definitions, which describe how two concept mentions are related or different, and design an efficient re-ranking approach to address the combinatorial explosion involved in inferring links across papers. In both fine-tuning and in-context learning settings, we achieve large gains in performance. We provide an analysis of generated definitions, shedding light on the relational reasoning ability of LLMs over fine-grained scientific concepts.
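
The sketch below illustrates the general two-stage pattern the abstract describes: prune candidate mention pairs cheaply, then apply definition-augmented LLM judgments to the shortlist. The helper functions are hypothetical placeholders rather than the paper's released method.

```python
# Sketch of the two-stage idea: cheap retrieval to prune candidate mention pairs,
# then definition-augmented LLM judgments on the survivors.
from itertools import combinations
from typing import List, Tuple

def embed(text: str) -> List[float]:
    """Placeholder: return a vector for a concept mention plus its context."""
    raise NotImplementedError

def cosine(a: List[float], b: List[float]) -> float:
    num = sum(x * y for x, y in zip(a, b))
    den = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return num / den if den else 0.0

def judge_relation(mention_a: str, def_a: str, mention_b: str, def_b: str) -> str:
    """Placeholder: LLM call returning 'coreferent', 'parent', 'child', or 'none',
    conditioned on context-dependent definitions generated from full-text papers."""
    raise NotImplementedError

def link_mentions(mentions: List[str], definitions: List[str],
                  top_k: int = 100) -> List[Tuple[int, int, str]]:
    pairs = list(combinations(range(len(mentions)), 2))
    # Stage 1: keep only the most similar pairs to avoid the quadratic blow-up.
    scored = sorted(pairs,
                    key=lambda p: cosine(embed(mentions[p[0]]), embed(mentions[p[1]])),
                    reverse=True)
    # Stage 2: expensive definition-augmented judgments on the shortlist.
    return [(i, j, judge_relation(mentions[i], definitions[i],
                                  mentions[j], definitions[j]))
            for i, j in scored[:top_k]]
```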


SciRIFF: A Resource to Enhance Language Model Instruction-Following over Scientific Literature

arXiv.org Artificial Intelligence

We present SciRIFF (Scientific Resource for Instruction-Following and Finetuning), a dataset of 137K instruction-following demonstrations for 54 tasks covering five essential scientific literature understanding capabilities: information extraction, summarization, question answering, claim verification, and classification. SciRIFF demonstrations are notable for their long input contexts, detailed task specifications, and complex structured outputs. While instruction-following resources are available in specific domains such as clinical medicine and chemistry, SciRIFF is the first dataset focused on extracting and synthesizing information from research literature across a wide range of scientific fields. To demonstrate the utility of SciRIFF, we develop a sample-efficient strategy to adapt a general instruction-following model for science by performing additional finetuning on a mix of general-domain and SciRIFF demonstrations. In evaluations on nine held-out scientific tasks, our model -- called SciTulu -- improves over a strong LLM baseline by 28.1% and 6.5% at the 7B and 70B scales respectively, while maintaining general instruction-following performance within 2% of the baseline. We are optimistic that SciRIFF will facilitate the development and evaluation of LLMs to help researchers navigate the ever-growing body of scientific literature. We release our dataset, model checkpoints, and data processing and evaluation code to enable further research.
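
A minimal sketch of the data-mixing step described above: sampling a finetuning set from general-domain and SciRIFF demonstrations. The mixing ratio, record format, and function names are illustrative assumptions, not the paper's exact recipe.

```python
# Illustrative sketch of combining general-domain and SciRIFF
# instruction-following demonstrations before finetuning.
import random

def mix_demonstrations(general, sciriff, sciriff_fraction=0.5, total=10000, seed=0):
    """Sample a finetuning mix with the requested fraction of SciRIFF examples."""
    rng = random.Random(seed)
    n_sci = int(total * sciriff_fraction)
    n_gen = total - n_sci
    mix = rng.sample(sciriff, min(n_sci, len(sciriff))) + \
          rng.sample(general, min(n_gen, len(general)))
    rng.shuffle(mix)
    return mix

# Usage with toy records in a generic {"prompt": ..., "response": ...} shape:
general = [{"prompt": f"g{i}", "response": "..."} for i in range(20000)]
sciriff = [{"prompt": f"s{i}", "response": "..."} for i in range(20000)]
train_set = mix_demonstrations(general, sciriff, sciriff_fraction=0.5, total=10000)
print(len(train_set))  # 10000
```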


What Can Natural Language Processing Do for Peer Review?

arXiv.org Artificial Intelligence

The number of scientific articles produced every year is growing rapidly. Providing quality control over them is crucial for scientists and, ultimately, for the public good. In modern science, this process is largely delegated to peer review -- a distributed procedure in which each submission is evaluated by several independent experts in the field. Peer review is widely used, yet it is hard, time-consuming, and prone to error. Since the artifacts involved in peer review -- manuscripts, reviews, discussions -- are largely text-based, Natural Language Processing has great potential to improve reviewing. As the emergence of large language models (LLMs) has enabled NLP assistance for many new tasks, the discussion on machine-assisted peer review is picking up the pace. Yet, where exactly is help needed, where can NLP help, and where should it stand aside? The goal of our paper is to provide a foundation for future efforts in NLP for peer-reviewing assistance. We discuss peer review as a general process, exemplified by reviewing at AI conferences. We detail each step of the process from manuscript submission to camera-ready revision, and discuss the associated challenges and opportunities for NLP assistance, illustrated by existing work. We then turn to the big challenges in NLP for peer review as a whole, including data acquisition and licensing, operationalization and experimentation, and ethical issues. To help consolidate community efforts, we create a companion repository that aggregates key datasets pertaining to peer review. Finally, we issue a detailed call to action for the scientific community, NLP and AI researchers, policymakers, and funding bodies to help advance research in NLP for peer review. We hope that our work will help set the agenda for research in machine-assisted scientific quality control in the age of AI, within the NLP community and beyond.


On-the-fly Definition Augmentation of LLMs for Biomedical NER

arXiv.org Artificial Intelligence

Despite their general capabilities, LLMs still struggle on biomedical NER tasks, which are difficult due to the presence of specialized terminology and a lack of training data. In this work, we set out to improve LLM performance on biomedical NER in limited-data settings via a new knowledge-augmentation approach that incorporates definitions of relevant concepts on the fly. To provide a test bed for knowledge augmentation, we also perform a comprehensive exploration of prompting strategies. Our experiments show that definition augmentation is useful for both open-source and closed LLMs. For example, it leads to a relative improvement of 15% (on average) in GPT-4 performance (F1) across all six of our test datasets. We conduct extensive ablations and analyses to demonstrate that our performance improvements stem from adding relevant definitional knowledge. We find that careful prompting strategies also improve LLM performance, allowing them to outperform fine-tuned language models in few-shot settings. To facilitate future research in this direction, we release our code at https://github.com/allenai/beacon.
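
A small sketch of what on-the-fly definition augmentation can look like in a prompt, with hypothetical `get_definition` and `call_llm` placeholders; the paper's actual prompting strategies live in its released code.

```python
# Sketch of the on-the-fly definition-augmentation idea for biomedical NER.
# The helper functions are hypothetical placeholders, not the released pipeline.
from typing import List

def get_definition(term: str) -> str:
    """Placeholder: look up or generate a short definition of a candidate concept."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion request."""
    raise NotImplementedError

def ner_with_definitions(sentence: str, candidate_terms: List[str],
                         entity_type: str = "disease") -> str:
    definitions = "\n".join(f"- {t}: {get_definition(t)}" for t in candidate_terms)
    prompt = (
        f"Relevant concept definitions:\n{definitions}\n\n"
        f"Using the definitions above, list every {entity_type} mention "
        f"in the sentence below, one per line.\n\nSentence: {sentence}"
    )
    return call_llm(prompt)
```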


Learning to Generate Novel Scientific Directions with Contextualized Literature-based Discovery

arXiv.org Artificial Intelligence

Literature-Based Discovery (LBD) aims to discover new scientific knowledge by mining papers and generating hypotheses. Standard LBD is limited to predicting pairwise relations between discrete concepts (e.g., drug-disease links), and ignores critical contexts like experimental settings (e.g., a specific patient population where a drug is evaluated) and background motivations (e.g., to find drugs without specific side effects). We address these limitations with a novel formulation of contextualized-LBD (C-LBD): generating scientific hypotheses in natural language, while grounding them in a context that controls the hypothesis search space. We present a modeling framework using retrieval of "inspirations" from past scientific papers. Our evaluations reveal that GPT-4 tends to generate ideas with overall low technical depth and novelty, while our inspiration prompting approaches partially mitigate this issue. Our work represents a first step toward building language models that generate new ideas derived from scientific literature.
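
The sketch below shows one way inspiration-based prompting could be wired up, assuming hypothetical `retrieve_inspirations` and `call_llm` helpers; it mirrors the abstract's description rather than the paper's exact framework.

```python
# Minimal sketch of inspiration-prompted hypothesis generation in the spirit of
# C-LBD; both helper functions are hypothetical placeholders.
from typing import List

def retrieve_inspirations(background: str, k: int = 3) -> List[str]:
    """Placeholder: return k short snippets from past papers related to the background."""
    raise NotImplementedError

def call_llm(prompt: str) -> str:
    raise NotImplementedError

def generate_hypothesis(background: str, seed_terms: List[str]) -> str:
    inspirations = retrieve_inspirations(background)
    prompt = (
        f"Background and motivation: {background}\n"
        f"Seed concepts: {', '.join(seed_terms)}\n"
        "Inspirations from prior work:\n"
        + "\n".join(f"- {s}" for s in inspirations)
        + "\nPropose one new research direction in natural language, grounded "
          "in the background context above."
    )
    return call_llm(prompt)
```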


MARG: Multi-Agent Review Generation for Scientific Papers

arXiv.org Artificial Intelligence

We study the ability of LLMs to generate feedback for scientific papers and develop MARG, a feedback generation approach using multiple LLM instances that engage in internal discussion. By distributing paper text across agents, MARG can consume the full text of papers beyond the input length limitations of the base LLM, and by specializing agents and incorporating sub-tasks tailored to different comment types (experiments, clarity, impact) it improves the helpfulness and specificity of feedback. In a user study, baseline methods using GPT-4 were rated as producing generic or very generic comments more than half the time, and only 1.7 comments per paper were rated as good overall in the best baseline. Our system substantially improves the ability of GPT-4 to generate specific and helpful feedback, reducing the rate of generic comments from 60% to 29% and generating 3.7 good comments per paper (a 2.2x improvement).
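
A toy sketch of the multi-agent pattern described above: split a long paper across worker agents, have each draft comments of a given type, then have a leader agent consolidate. `call_llm` is a hypothetical placeholder and the chunk size is arbitrary; this is not MARG's released implementation.

```python
# Toy sketch of distributing paper text across worker agents and merging
# their drafts, illustrating the pattern described in the abstract.
from typing import List

def call_llm(prompt: str) -> str:
    """Placeholder for a chat-completion request."""
    raise NotImplementedError

def chunk(text: str, size: int = 6000) -> List[str]:
    return [text[i:i + size] for i in range(0, len(text), size)]

def review_paper(paper_text: str, comment_type: str = "experiments") -> str:
    # Each worker agent sees only one chunk, so the full paper fits overall.
    drafts = [
        call_llm(f"You are a reviewer focusing on {comment_type}. "
                 f"Draft specific comments for this excerpt:\n\n{part}")
        for part in chunk(paper_text)
    ]
    # A leader agent consolidates and deduplicates the workers' drafts.
    return call_llm("Merge these draft comments into a concise, specific review, "
                    "removing duplicates:\n\n" + "\n\n".join(drafts))
```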


CHAMP: Efficient Annotation and Consolidation of Cluster Hierarchies

arXiv.org Artificial Intelligence

Various NLP tasks require a complex hierarchical structure over nodes, where each node is a cluster of items. Examples include generating entailment graphs, hierarchical cross-document coreference resolution, annotating event and subevent relations, etc. To enable efficient annotation of such hierarchical structures, we release CHAMP, an open-source tool for incrementally constructing both the clusters and the hierarchy over them, simultaneously, over any type of text. This incremental approach significantly reduces annotation time compared to the common pairwise annotation approach and also guarantees that transitivity is maintained at both the cluster and hierarchy levels. Furthermore, CHAMP includes a consolidation mode, in which an adjudicator can easily compare multiple cluster-hierarchy annotations and resolve disagreements.
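
A toy model of the structure CHAMP annotates - items grouped into clusters, with a hierarchy over clusters - showing how transitivity follows from the representation. This illustrates the data model only, not CHAMP's implementation.

```python
# Toy cluster-hierarchy structure: coreference follows from shared cluster
# membership, and hierarchical transitivity from ancestor traversal.
class ClusterHierarchy:
    def __init__(self):
        self.cluster_of = {}   # item -> cluster id
        self.members = {}      # cluster id -> set of items
        self.parent = {}       # child cluster id -> parent cluster id

    def add_item(self, item, cluster_id):
        self.cluster_of[item] = cluster_id
        self.members.setdefault(cluster_id, set()).add(item)

    def add_parent(self, child_id, parent_id):
        self.parent[child_id] = parent_id

    def coreferent(self, a, b):
        return self.cluster_of[a] == self.cluster_of[b]

    def ancestors(self, cluster_id):
        while cluster_id in self.parent:
            cluster_id = self.parent[cluster_id]
            yield cluster_id

# Usage with hypothetical event mentions:
h = ClusterHierarchy()
h.add_item("hurricane Katrina landfall", "event:katrina")
h.add_item("Katrina makes landfall", "event:katrina")
h.add_parent("event:katrina", "event:2005-hurricane-season")
print(h.coreferent("hurricane Katrina landfall", "Katrina makes landfall"))  # True
print(list(h.ancestors("event:katrina")))  # ['event:2005-hurricane-season']
```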