 inquirer


At Least Two Newspapers Syndicated AI Garbage

The Atlantic - Technology

At first glance, "Heat Index" appears as inoffensive as newspaper features get. A "summer guide" sprawling across more than 50 pages, the feature, which was syndicated over the past week in both the Chicago Sun-Times and The Philadelphia Inquirer, contains "303 Must-Dos, Must-Tastes, and Must-Tries" for the sweaty months ahead. Readers are advised in one section to "Take a moonlight hike on a well-marked trail" and "Fly a kite on a breezy afternoon." In others, they receive tips about running a lemonade stand and enjoying "unexpected frozen treats." Yet close readers of the guide noticed that something was very off.


Breaking the Stigma! Unobtrusively Probe Symptoms in Depression Disorder Diagnosis Dialogue

Cao, Jieming, Huang, Chen, Zhang, Yanan, Deng, Ruibo, Zhang, Jincheng, Lei, Wenqiang

arXiv.org Artificial Intelligence

Stigma has emerged as one of the major obstacles to effectively diagnosing depression, as it prevents users from having open conversations about their struggles. This requires advanced questioning skills to carefully probe the presence of specific symptoms in an unobtrusive manner. While recent efforts have been made on depression-diagnosis-oriented dialogue systems, they largely ignore this problem, ultimately hampering their practical utility. To this end, we propose a novel and effective method, UPSD$^{4}$, developing a series of strategies to promote a sense of unobtrusiveness within the dialogue system and assessing depression disorder by probing symptoms. We experimentally show that UPSD$^{4}$ demonstrates a significant improvement over current baselines, including unobtrusiveness evaluation of dialogue content and diagnostic accuracy. We believe our work contributes to developing more accessible and user-friendly tools for addressing the widespread need for depression diagnosis.


Fine-tuning Large Language Models for Improving Factuality in Legal Question Answering

Hu, Yinghao, Gan, Leilei, Xiao, Wenyi, Kuang, Kun, Wu, Fei

arXiv.org Artificial Intelligence

Hallucination, or the generation of incorrect or fabricated information, remains a critical challenge in large language models (LLMs), particularly in high-stakes domains such as legal question answering (QA). In order to mitigate the hallucination rate in legal QA, we first introduce a benchmark called LegalHalBench and three automatic metrics to evaluate the common hallucinations when LLMs answer legal questions. We then propose a hallucination mitigation method that integrates behavior cloning and a novel Hard Sample-aware Iterative Direct Preference Optimization (HIPO). We conduct extensive real-data experiments to validate the effectiveness of our approach. Our results demonstrate remarkable improvements in various metrics, including the newly proposed Non-Hallucinated Statute Rate, Statute Relevance Rate, Legal Claim Truthfulness, as well as traditional metrics such as METEOR, BERTScore, ROUGE-L, and win rates.


LLM Roleplay: Simulating Human-Chatbot Interaction

Tamoyan, Hovhannes, Schuff, Hendrik, Gurevych, Iryna

arXiv.org Artificial Intelligence

The development of chatbots requires collecting a large number of human-chatbot dialogues to reflect the breadth of users' sociodemographic backgrounds and conversational goals. However, the resource requirements to conduct the respective user studies can be prohibitively high and often only allow for a narrow analysis of specific dialogue goals and participant demographics. In this paper, we propose LLM-Roleplay: a goal-oriented, persona-based method to automatically generate diverse multi-turn dialogues simulating human-chatbot interaction. LLM-Roleplay can be applied to generate dialogues with any type of chatbot and uses large language models (LLMs) to play the role of textually described personas. To validate our method we collect natural human-chatbot dialogues from different sociodemographic groups and conduct a human evaluation to compare real human-chatbot dialogues with our generated dialogues. We compare the abilities of state-of-the-art LLMs in embodying personas and holding a conversation and find that our method can simulate human-chatbot dialogues with a high indistinguishability rate.


Fetterman campaign says Dem nominee is healthy after two cognitive tests, won't provide documentation: Report

FOX News

In an interview with MSNBC's Alex Wagner on Thursday, John Fetterman, the Democratic U.S. Senate candidate in Pennsylvania, ripped Republican opponent Dr. Mehmet Oz for taking "cheap shots" at his health on the campaign trail. The campaign for Pennsylvania Democratic Senate nominee John Fetterman has released some results from two cognitive tests that he recently took, as questions about his ability to serve in the Senate continue to circulate ahead of the November election. According to the Philadelphia Inquirer, Fetterman took two cognitive tests earlier this year: the Saint Louis University Mental Status Examination (SLUMS) and the Repeatable Battery for the Assessment of Neuropsychological Status (RBANS). Fetterman took the SLUMS test, which consists of simple memory questions and requires patients to perform basic tasks like recognizing a shape and drawing an X inside of it, on July 14, and the RBANS test this week, according to the Inquirer. The RBANS test assesses immediate memory, delayed memory, attention, language, and other functions.


Turing Test and the Practice of Law: The Role of Autonomous Levels of AI Legal Reasoning

Eliot, Lance

arXiv.org Artificial Intelligence

A major question that has generally been unaddressed involves how we will know when AILR has achieved autonomous capacities. So far, AI as applied to the legal profession has primarily consisted of aiding or supporting the legal work of human lawyers but has not reached the capability of being able to autonomously perform legal tasks. A base assumption is that inexorably there will be advances made in AI that will boost AILR systems and ultimately transcend them into having autonomous capacities, but there does not yet exist any bona fide or rigorous means to viably attest to whether such AILR autonomy has been achieved [44]. Artificial Intelligence (AI) is increasingly being applied to law and a myriad of legal tasks amid attempts to bolster AI Legal Reasoning (AILR) autonomous capabilities. A major question that has generally been unaddressed involves how we will know when AILR has achieved autonomous capacities. The field of AI has grappled with similar quandaries over how to assess the attainment of Artificial General Intelligence (AGI), a persistently discussed issue among scholars since the inception of AI, with the Turing Test communally being considered


A Personalized System for Conversational Recommendations

Goker, M. H., Langley, P., Thompson, C. A.

arXiv.org Artificial Intelligence

Searching for and making decisions about information is becoming increasingly difficult as the amount of information and number of choices increase. Recommendation systems help users find items of interest of a particular type, such as movies or restaurants, but are still somewhat awkward to use. Our solution is to take advantage of the complementary strengths of personalized recommendation systems and dialogue systems, creating personalized aides. We present a system -- the Adaptive Place Advisor -- that treats item selection as an interactive, conversational process, with the program inquiring about item attributes and the user responding. Individual, long-term user preferences are unobtrusively obtained in the course of normal recommendation dialogues and used to direct future conversations with the same user. We present a novel user model that influences both item search and the questions asked during a conversation. We demonstrate the effectiveness of our system in significantly reducing the time and number of interactions required to find a satisfactory item, as compared to a control group of users interacting with a non-adaptive version of the system.


A Personalized System for Conversational Recommendations

Thompson, C. A., Goker, M. H., Langley, P.

Journal of Artificial Intelligence Research

Searching for and making decisions about information is becoming increasingly difficult as the amount of information and number of choices increase. Recommendation systems help users find items of interest of a particular type, such as movies or restaurants, but are still somewhat awkward to use. Our solution is to take advantage of the complementary strengths of personalized recommendation systems and dialogue systems, creating personalized aides. We present a system -- the Adaptive Place Advisor -- that treats item selection as an interactive, conversational process, with the program inquiring about item attributes and the user responding. Individual, long-term user preferences are unobtrusively obtained in the course of normal recommendation dialogues and used to direct future conversations with the same user. We present a novel user model that influences both item search and the questions asked during a conversation. We demonstrate the effectiveness of our system in significantly reducing the time and number of interactions required to find a satisfactory item, as compared to a control group of users interacting with a non-adaptive version of the system.