Functional-Group-Based Diffusion for Pocket-Specific Molecule Generation and Elaboration

Neural Information Processing Systems

In recent years, AI-assisted drug design methods have been proposed to generate molecules given the pocket structures of target proteins. Most of them are atom-level-based methods, which treat atoms as basic components and generate atom positions and types. In this way, however, it is hard to generate realistic fragments with complicated structures. To solve this, we propose D3FG, a functional-group-based diffusion model for pocket-specific molecule generation and elaboration.


Human-Centred Evaluation of Text-to-Image Generation Models for Self-expression of Mental Distress: A Dataset Based on GPT-4o

He, Sui, Qian, Shenbin

arXiv.org Artificial Intelligence

Effective communication is central to achieving positive healthcare outcomes in mental health contexts, yet international students often face linguistic and cultural barriers that hinder their communication of mental distress. In this study, we evaluate the effectiveness of AI-generated images in supporting self-expression of mental distress. To achieve this, twenty Chinese international students studying at UK universities were invited to describe their personal experiences of mental distress. These descriptions were elaborated using GPT-4o with four persona-based prompt templates rooted in contemporary counselling practice to generate corresponding images. Participants then evaluated the helpfulness of generated images in facilitating the expression of their feelings based on their original descriptions. The resulting dataset comprises 100 textual descriptions of mental distress, 400 generated images, and corresponding human evaluation scores. Findings indicate that prompt design substantially affects perceived helpfulness, with the illustrator persona achieving the highest ratings. This work introduces the first publicly available text-to-image evaluation dataset with human judgment scores in the mental health domain, offering valuable resources for image evaluation, reinforcement learning with human feedback, and multi-modal research on mental health communication.


Levers of Power in the Field of AI

Mackenzie, Tammy, Punj, Sukriti, Perez, Natalie, Bhaduri, Sreyoshi, Radeljic, Branislav

arXiv.org Artificial Intelligence

This paper examines how decision makers in academia, government, business, and civil society navigate questions of power in implementations of artificial intelligence (AI). The study explores how individuals experience and exercise "levers of power," which are presented as social mechanisms that shape institutional responses to technological change. The study reports on responses to personalized questionnaires designed to gather insight into a decision maker's institutional purview, based on an institutional governance framework developed from the work of Neo-Institutionalists. Findings present the anonymized, real responses and circumstances of respondents in the form of twelve fictional personas of high-level decision makers from North America and Europe. These personas illustrate how personal agency, organizational logics, and institutional infrastructures may intersect in the governance of AI. The decision makers' responses to the questionnaires then inform a discussion of the field-level personal power of decision makers, methods of fostering institutional stability in times of change, and methods of influencing institutional change in the field of AI. The final section of the discussion presents a table of the dynamics of the levers of power in the field of AI for change makers and five testable hypotheses for institutional and social movement researchers. In summary, this study provides insight into the means for policymakers within institutions and their counterparts in civil society to personally engage with AI governance.


Annotate Rhetorical Relations with INCEpTION: A Comparison with Automatic Approaches

Emon, Mehedi Hasan

arXiv.org Artificial Intelligence

Automatically identifying rhetorical relations between discourse units is a challenging task in natural language processing (NLP) because it requires logically and semantically connecting the discourse units. Although large language models (LLMs) show potential for application in many domains, including text classification tasks, their effectiveness in predicting rhetorical relations remains open for research. One of the major challenges in this domain is the lack of annotated datasets capturing different rhetorical relations, which makes model training more difficult. In this research, we manually created the datasets from various cricket reports and then annotated the reports as discourse units. We used the INCEpTION annotation tool for annotation and then structured the dataset for the machine-learning model.


Evaluating the Creativity of LLMs in Persian Literary Text Generation

Tourajmehr, Armin, Modarres, Mohammad Reza, Yaghoobzadeh, Yadollah

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated notable creative abilities in generating literary texts, including poetry and short stories. However, prior research has primarily centered on English, with limited exploration of non-English literary traditions and without standardized methods for assessing creativity. In this paper, we evaluate the capacity of LLMs to generate Persian literary text enriched with culturally relevant expressions. We build a dataset of user-generated Persian literary texts spanning 20 diverse topics and assess model outputs along four creativity dimensions (originality, fluency, flexibility, and elaboration) by adapting the Torrance Tests of Creative Thinking. To reduce evaluation costs, we adopt an LLM as a judge for automated scoring and validate its reliability against human judgments using intraclass correlation coefficients, observing strong agreement. In addition, we analyze the models' ability to understand and employ four core literary devices: simile, metaphor, hyperbole, and antithesis. Our results highlight both the strengths and limitations of LLMs in Persian literary text generation, underscoring the need for further refinement.
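The validation step above rests on intraclass correlation coefficients. As a minimal sketch (not the authors' code, and assuming the simplest one-way random-effects variant, ICC(1,1), on an items-by-raters score matrix), the agreement statistic can be computed directly from the ANOVA mean squares:

```python
import numpy as np

def icc_1_1(ratings: np.ndarray) -> float:
    """One-way random-effects ICC(1,1).

    ratings: array of shape (n_targets, k_raters), e.g. rows are
    generated texts, columns are judges (human or LLM).
    """
    n, k = ratings.shape
    grand_mean = ratings.mean()
    row_means = ratings.mean(axis=1)
    # Between-target mean square
    msb = k * ((row_means - grand_mean) ** 2).sum() / (n - 1)
    # Within-target mean square
    msw = ((ratings - row_means[:, None]) ** 2).sum() / (n * (k - 1))
    return (msb - msw) / (msb + (k - 1) * msw)

# Toy example: two judges who differ by a constant offset of 1
scores = np.array([[1.0, 2.0], [2.0, 3.0], [3.0, 4.0]])
print(icc_1_1(scores))  # 0.6
```

A near-1 value, as the abstract reports, would indicate that the LLM judge's scores track human judgments closely; variants such as ICC(2,1) additionally model a rater effect.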


The AI Memory Gap: Users Misremember What They Created With AI or Without

Zindulka, Tim, Goller, Sven, Fernandes, Daniela, Welsch, Robin, Buschek, Daniel

arXiv.org Artificial Intelligence

As large language models (LLMs) become embedded in interactive text generation, disclosure of AI as a source depends on people remembering which ideas or texts came from themselves and which were created with AI. We investigate how accurately people remember the source of content when using AI. In a pre-registered experiment, 184 participants generated and elaborated on ideas both unaided and with an LLM-based chatbot. One week later, they were asked to identify the source (noAI vs withAI) of these ideas and texts. Our findings reveal a significant gap in memory: After AI use, the odds of correct attribution dropped, with the steepest decline in mixed human-AI workflows, where either the idea or elaboration was created with AI. We validated our results using a computational model of source memory. Discussing broader implications, we highlight the importance of considering source confusion in the design and use of interactive text generation technologies.
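The reported drop in "odds of correct attribution" can be made concrete with a toy calculation. The counts below are purely illustrative, not the study's data:

```python
# Hypothetical attribution counts per condition (illustrative only;
# not taken from the paper).
def odds(correct: int, incorrect: int) -> float:
    """Odds of a correct source attribution in one condition."""
    return correct / incorrect

odds_no_ai = odds(150, 50)   # unaided condition: odds = 3.0
odds_mixed = odds(90, 110)   # mixed human-AI workflow: odds < 1

# Odds ratio < 1 means attribution accuracy worsened after AI use,
# the pattern the study reports for mixed workflows.
odds_ratio = odds_mixed / odds_no_ai
print(odds_ratio)
```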


Requirements for Recognition and Rapid Response to Unfamiliar Events Outside of Agent Design Scope

Wray, Robert E., Jones, Steven J., Laird, John E.

arXiv.org Artificial Intelligence

Regardless of past learning, an agent in an open world will face unfamiliar events outside of prior experience, existing models, or policies. Further, the agent will sometimes lack relevant knowledge and/or sufficient time to assess the situation and evaluate response options. How can an agent respond reasonably to situations that are outside of its original design scope? How can it recognize such situations sufficiently quickly and reliably to determine reasonable, adaptive courses of action? We identify key characteristics needed for solutions, review the state of the art, and outline a proposed, novel approach that combines domain-general meta-knowledge (inspired by human cognition) and metareasoning. This approach offers potential for fast, adaptive responses to unfamiliar situations, more fully meeting the performance characteristics required for open-world, general agents.


An Artificial Trend Index for Private Consumption Using Google Trends

Tenorio, Juan, Alpiste, Heidi, Remón, Jakelin, Segil, Arian

arXiv.org Machine Learning

In recent years, the use of databases that analyze trends, sentiments or news to make economic projections or create indicators has gained significant popularity, particularly with the Google Trends platform. This article explores the potential of Google search data to develop a new index that improves economic forecasts, with a particular focus on one of the key components of economic activity: private consumption (64% of GDP in Peru). By selecting and estimating categorized variables, machine learning techniques are applied, demonstrating that Google data can identify patterns to generate a leading indicator in real time and improve the accuracy of forecasts. Finally, the results show that Google's "Food" and "Tourism" categories significantly reduce projection errors, highlighting the importance of using this information in a segmented manner to improve macroeconomic forecasts.
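As a hedged sketch of the general approach (synthetic series standing in for the "Food" and "Tourism" search-volume categories; this is not the authors' model, which the abstract does not specify), a leading-indicator regression for consumption growth might look like:

```python
import numpy as np

rng = np.random.default_rng(0)
T = 120  # synthetic monthly sample

# Stand-ins for standardized Google Trends category indices
food = rng.normal(size=T)
tourism = rng.normal(size=T)

# Synthetic consumption growth: driven by both series plus noise
y = 0.6 * food + 0.3 * tourism + rng.normal(scale=0.5, size=T)

# OLS fit: constant + two category regressors
X = np.column_stack([np.ones(T), food, tourism])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)

# In-sample fit of the leading indicator
rmse = np.sqrt(np.mean((y - X @ beta) ** 2))
```

In practice the regressors would be real category-level search indices (e.g. retrieved via a Google Trends client), and forecast accuracy would be judged out of sample against a benchmark without the search data.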


Understanding Common Ground Misalignment in Goal-Oriented Dialog: A Case-Study with Ubuntu Chat Logs

Sarkar, Rupak, Srikanth, Neha, Hudson, Taylor, Rudinger, Rachel, Bonial, Claire, Resnik, Philip

arXiv.org Artificial Intelligence

While it is commonly accepted that maintaining common ground plays a role in conversational success, little prior research exists connecting conversational grounding to success in task-oriented conversations. We study failures of grounding in the Ubuntu IRC dataset, where participants use text-only communication to resolve technical issues. We find that disruptions in conversational flow often stem from a misalignment in common ground, driven by a divergence in beliefs and assumptions held by participants. These disruptions, which we call conversational friction, significantly correlate with task success. We find that although LLMs can identify overt cases of conversational friction, they struggle with subtler and more context-dependent instances requiring pragmatic or domain-specific reasoning.


LLM+AL: Bridging Large Language Models and Action Languages for Complex Reasoning about Actions

Ishay, Adam, Lee, Joohyung

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have made significant strides in various intelligent tasks but still struggle with complex action reasoning tasks that require systematic search. To address this limitation, we propose a method that bridges the natural language understanding capabilities of LLMs with the symbolic reasoning strengths of action languages. Our approach, termed "LLM+AL," leverages the LLM's strengths in semantic parsing and commonsense knowledge generation alongside the action language's proficiency in automated reasoning based on encoded knowledge. We compare LLM+AL against state-of-the-art LLMs, including ChatGPT-4, Claude 3 Opus, Gemini Ultra 1.0, and o1-preview, using benchmarks for complex reasoning about actions. Our findings indicate that, although all methods exhibit errors, LLM+AL, with relatively minimal human corrections, consistently leads to correct answers, whereas standalone LLMs fail to improve even with human feedback. LLM+AL also contributes to automated generation of action languages.