AITopics | Hoyle, Alexander

Collaborating Authors

Hoyle, Alexander

Large Language Models Struggle to Describe the Haystack without Human Help: Human-in-the-loop Evaluation of LLMs

Li, Zongxia, Calvo-Bartolomé, Lorena, Hoyle, Alexander, Xu, Paiheng, Dima, Alden, Fung, Juan Francisco, Boyd-Graber, Jordan

arXiv.org Artificial Intelligence, Feb-20-2025

A common use of NLP is to facilitate the understanding of large document collections, with a shift from using traditional topic models to Large Language Models. Yet the effectiveness of using LLMs for large corpus understanding in real-world applications remains under-explored. This study measures the knowledge users acquire with unsupervised and supervised LLM-based exploratory approaches or with traditional topic models on two datasets. While LLM-based methods generate more human-readable topics and show higher average win probabilities than traditional models for data exploration, they produce overly generic topics for domain-specific datasets that do not easily allow users to learn much about the documents. Adding human supervision to the LLM generation process improves data exploration by mitigating hallucination and over-genericity but requires greater human effort. In contrast, traditional models like Latent Dirichlet Allocation (LDA) remain effective for exploration but are less user-friendly. We show that LLMs struggle to describe the haystack of large corpora without human help, particularly domain-specific data, and face scaling and hallucination limitations due to context length constraints. Dataset available at https://huggingface.co/datasets/zli12321/Bills.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2502.14748

Country:

Europe (1.00)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Law (0.93)
Health & Medicine (0.93)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

The Prompt Report: A Systematic Survey of Prompting Techniques

Schulhoff, Sander, Ilie, Michael, Balepur, Nishant, Kahadze, Konstantine, Liu, Amanda, Si, Chenglei, Li, Yinheng, Gupta, Aayush, Han, HyoJung, Schulhoff, Sevien, Dulepet, Pranav Sandeep, Vidyadhara, Saurav, Ki, Dayeon, Agrawal, Sweta, Pham, Chau, Kroiz, Gerson, Li, Feileen, Tao, Hudson, Srivastava, Ashay, Da Costa, Hevander, Gupta, Saloni, Rogers, Megan L., Goncearenco, Inna, Sarli, Giuseppe, Galynker, Igor, Peskoff, Denis, Carpuat, Marine, White, Jules, Anadkat, Shyamal, Hoyle, Alexander, Resnik, Philip

arXiv.org Artificial Intelligence, Jul-14-2024

Generative Artificial Intelligence (GenAI) systems are being increasingly deployed across all parts of industry and research settings. Developers and end users interact with these systems through the use of prompting or prompt engineering. While prompting is a widespread and highly researched concept, there exists conflicting terminology and a poor ontological understanding of what constitutes a prompt due to the area's nascency. This paper establishes a structured understanding of prompts by assembling a taxonomy of prompting techniques and analyzing their use. We present a comprehensive vocabulary of 33 terms, a taxonomy of 58 text-only prompting techniques, and 40 techniques for other modalities. We further present a meta-analysis of the entire literature on natural language prefix-prompting.

artificial intelligence, natural language, prompting technique, (2 more...)

arXiv.org Artificial Intelligence

2406.06608

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Natural Language (0.87)

TopicGPT: A Prompt-based Topic Modeling Framework

Pham, Chau Minh, Hoyle, Alexander, Sun, Simeng, Iyyer, Mohit

arXiv.org Artificial Intelligence, Nov-2-2023

Topic modeling is a well-established technique for exploring text corpora. Conventional topic models (e.g., LDA) represent topics as bags of words that often require "reading the tea leaves" to interpret; additionally, they offer users minimal semantic control over topics. To tackle these issues, we introduce TopicGPT, a prompt-based framework that uses large language models (LLMs) to uncover latent topics within a provided text collection. TopicGPT produces topics that align better with human categorizations compared to competing methods: for example, it achieves a harmonic mean purity of 0.74 against human-annotated Wikipedia topics compared to 0.64 for the strongest baseline. Its topics are also more interpretable, dispensing with ambiguous bags of words in favor of topics with natural language labels and associated free-form descriptions. Moreover, the framework is highly adaptable, allowing users to specify constraints and modify topics without the need for model retraining. TopicGPT can be further extended to hierarchical topical modeling, enabling users to explore topics at various levels of granularity. By streamlining access to high-quality and interpretable topics, TopicGPT represents a compelling, human-centered approach to topic modeling.
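
The "harmonic mean purity" figure quoted above can be illustrated with a small sketch. It assumes one common reading of the metric, namely the harmonic mean of cluster purity and inverse purity computed from predicted topic assignments and human labels; the paper's exact formulation may differ. The function names and the toy data are hypothetical and are not taken from TopicGPT.

```python
from collections import Counter, defaultdict


def purity(clusters, classes):
    """Share of items that belong to the majority class of their cluster."""
    if not clusters or len(clusters) != len(classes):
        raise ValueError("inputs must be non-empty and of equal length")
    by_cluster = defaultdict(Counter)
    for cluster_id, class_id in zip(clusters, classes):
        by_cluster[cluster_id][class_id] += 1
    majority_total = sum(max(counts.values()) for counts in by_cluster.values())
    return majority_total / len(clusters)


def harmonic_mean_purity(predicted, gold):
    """Harmonic mean of purity and inverse purity (assumed definition)."""
    p = purity(predicted, gold)          # how clean each predicted topic is
    inverse_p = purity(gold, predicted)  # how well each gold class stays together
    return 2 * p * inverse_p / (p + inverse_p)


# Toy data: eight documents, three predicted topics, three human categories.
predicted_topics = [0, 0, 0, 0, 1, 1, 2, 2]
human_labels = ["x", "x", "x", "y", "y", "y", "z", "z"]
score = harmonic_mean_purity(predicted_topics, human_labels)
print(round(score, 3))
```

In this toy case one "y" document lands in the topic dominated by "x", so both purity and inverse purity are 7/8 and the score is 0.875. A perfect one-to-one match between topics and labels would score 1.0.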

large language model, machine learning, natural language, (23 more...)

arXiv.org Artificial Intelligence

2311.01449

Country:

North America > United States > Illinois (0.14)
North America > United States > Kansas (0.14)
North America > United States > Massachusetts (0.14)
(3 more...)

Genre: Research Report (0.82)

Industry:

Transportation (1.00)
Law (1.00)
Consumer Products & Services > Restaurants (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Natural Language Decompositions of Implicit Content Enable Better Text Representations

Hoyle, Alexander, Sarkar, Rupak, Goel, Pranav, Resnik, Philip

arXiv.org Artificial Intelligence, Oct-24-2023

When people interpret text, they rely on inferences that go beyond the observed language itself. Inspired by this observation, we introduce a method for the analysis of text that takes implicitly communicated content explicitly into account. We use a large language model to produce sets of propositions that are inferentially related to the text that has been observed, then validate the plausibility of the generated content via human judgments. Incorporating these explicit representations of implicit content proves useful in multiple problem settings that involve the human interpretation of utterances: assessing the similarity of arguments, making sense of a body of opinion data, and modeling legislative behavior. Our results suggest that modeling the meanings behind observed language, rather than the literal text alone, is a valuable direction for NLP and particularly its applications to social science.

artificial intelligence, content enable better text representation, natural language decomposition

arXiv.org Artificial Intelligence

2305.14583

Genre: Research Report > New Finding (0.53)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Revisiting Automated Topic Model Evaluation with Large Language Models

Stammbach, Dominik, Zouhar, Vilém, Hoyle, Alexander, Sachan, Mrinmaya, Ash, Elliott

arXiv.org Artificial Intelligence, Oct-22-2023

Topic models are used to make sense of large text collections. However, automatically evaluating topic model output and determining the optimal number of topics have both been longstanding challenges, with no effective automated solutions to date. This paper proposes using large language models to evaluate such output. We find that large language models appropriately assess the resulting topics, correlating more strongly with human judgments than existing automated metrics. We then investigate whether we can use large language models to automatically determine the optimal number of topics. We automatically assign labels to documents, and choosing the configurations with the purest labels returns reasonable values for the optimal number of topics.

large language model, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2305.12152

Country:

North America > United States (1.00)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.47)

Industry:

Law (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)
Government > Regional Government > North America Government > United States Government (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
