Black Eyed Peas star taps AI bot as radio show co-host: 'Didn't want to just do a traditional show'
Black Eyed Peas member Will.i.am is taking another step into the future, partnering with an AI to co-host a radio show. Will.i.am is set to debut "Will.i.am Presents the FYI Show" Jan. 25 on SiriusXM radio, a new weekly show co-hosted by the musician and "the first ever AI co-host on the SiriusXM platform, qd.pi [pronounced cutie pi]," per a press release for the show. In an interview with The Hollywood Reporter, Will.i.am said, "I'm ultra-freaking colorful and expressive. And that combination, we ain't seen in the history of freaking broadcasts anywhere."
TypeDance: Creating Semantic Typographic Logos from Image through Personalized Generation
Xiao, Shishi, Wang, Liangwei, Ma, Xiaojuan, Zeng, Wei
One notable application is the semantic typographic logo, which symbolizes a unique identity in a concise yet informative manner. Due to its expressiveness and memorability [7], the semantic typographic logo has been widely used as a visual signature for individuals [28], a brand logo with commercial value [15, 20], and a symbol for significant events and city promotions [3, 43]. However, crafting a semantic typographic logo presents a formidable challenge, requiring seamless blending of typeface and imagery while preserving readability. Experienced designers often rely on professional software like Adobe Illustrator to manually adjust the outline of the typeface to incorporate specific imagery, which is a time-consuming and error-prone process. They often experiment with different strokes or letters of a typeface and with various imagery to find a visually appealing and memorable representation, which lengthens the process further. This requires creative thinking, practical skills, and the ability to persist through continuous trial and error.
Privacy-Preserving Neural Graph Databases
Hu, Qi, Li, Haoran, Bai, Jiaxin, Song, Yangqiu
In the era of big data and rapidly evolving information systems, efficient and accurate data retrieval has become increasingly crucial. Neural graph databases (NGDBs) have emerged as a powerful paradigm that combines the strengths of graph databases (graph DBs) and neural networks to enable efficient storage, retrieval, and analysis of graph-structured data. The use of neural embedding storage and complex neural logical query answering provides NGDBs with generalization ability. When the graph is incomplete, by extracting latent patterns and representations, neural graph databases can fill gaps in the graph structure, revealing hidden relationships and enabling accurate query answering. Nevertheless, this capability comes with inherent trade-offs, as it introduces additional privacy risks to the database. Malicious attackers can infer more sensitive information in the database using well-designed combinatorial queries. For example, by comparing the answer sets for where Turing Award winners born before 1950 lived and where those born after 1940 lived, an attacker can probably expose the living places of Turing Award winner Hinton, even though those living places may have been deleted from the training data due to privacy concerns. In this work, inspired by privacy protection in graph embeddings, we propose a privacy-preserving neural graph database (P-NGDB) to alleviate the risks of privacy leakage in NGDBs. We introduce adversarial training techniques in the training stage to force the NGDBs to generate indistinguishable answers when queried with private information, making it harder to infer sensitive information through combinations of multiple innocuous queries. Extensive experimental results on three datasets show that P-NGDB can effectively protect private information in the graph database while delivering high-quality public answers to queries.
Generative User-Experience Research for Developing Domain-specific Natural Language Processing Applications
Zhukova, Anastasia, von Sperl, Lukas, Matt, Christian E., Gipp, Bela
Natural Language Processing (NLP) has recently been extensively incorporated into industrial and domain applications. For example, NLP is used for speeding up processes, e.g., automated classification of types of customer feedback or filtering out spam emails; for information extraction, e.g., named entity recognition to extract symptoms, diagnoses, and treatments from medical records; or for auto-completing input forms with language models. Despite the broad integration, domain-specific NLP applications may require practicing more user-driven methodologies to address user needs with these applications. Often, the data-driven approach falls short in exploring the needs of the domain users (Yang, 2018). On the one hand, domain users are often integrated into development only at the late test phase to evaluate the usability of ML/NLP applications (Carney, 2019). Unlike user-driven software development, the development of NLP applications depends mainly on data availability or on experimenting with machine learning (ML)/NLP trends, which thus become the major drivers of application development. On the other hand, the user-driven development of a domain-specific ML/NLP application in medicine showed that close collaboration with the domain users in the earlier stages increases the effectiveness of the final product (Yang, 2017). Therefore, integrating user experience (UX) and human-computer interaction (HCI) research into ML/NLP research addresses users' needs, fuses their expertise, and increases intuitiveness, transparency, simplicity, and trust for the system users (Boukhelifa et al., 2018; Paleyes et al., 2022).
FactCHD: Benchmarking Fact-Conflicting Hallucination Detection
Chen, Xiang, Song, Duanzheng, Gui, Honghao, Wang, Chenxi, Zhang, Ningyu, Yong, Jiang, Huang, Fei, Lv, Chengfei, Zhang, Dan, Chen, Huajun
Despite their impressive generative capabilities, LLMs are hindered by fact-conflicting hallucinations in real-world applications. The accurate identification of hallucinations in texts generated by LLMs, especially in complex inferential scenarios, is a relatively unexplored area. To address this gap, we present FactCHD, a dedicated benchmark designed for the detection of fact-conflicting hallucinations from LLMs. FactCHD features a diverse dataset that spans various factuality patterns, including vanilla, multi-hop, comparison, and set operation. A distinctive element of FactCHD is its integration of fact-based evidence chains, significantly enhancing the depth of evaluating the detectors' explanations. Experiments on different LLMs expose the shortcomings of current approaches in detecting factual errors accurately. Furthermore, we introduce Truth-Triangulator that synthesizes reflective considerations by tool-enhanced ChatGPT and LoRA-tuning based on Llama2, aiming to yield more credible detection through the amalgamation of predictive results and evidence. The benchmark dataset is available at https://github.com/zjunlp/FactCHD.
Impact of Large Language Model Assistance on Patients Reading Clinical Notes: A Mixed-Methods Study
Mannhardt, Niklas, Bondi-Kelly, Elizabeth, Lam, Barbara, O'Connell, Chloe, Asiedu, Mercy, Mozannar, Hussein, Agrawal, Monica, Buendia, Alejandro, Urman, Tatiana, Riaz, Irbaz B., Ricciardi, Catherine E., Ghassemi, Marzyeh, Sontag, David
Patients derive numerous benefits from reading their clinical notes, including an increased sense of control over their health and improved understanding of their care plan. However, complex medical concepts and jargon within clinical notes hinder patient comprehension and may lead to anxiety. We developed a patient-facing tool to make clinical notes more readable, leveraging large language models (LLMs) to simplify, extract information from, and add context to notes. We prompt-engineered GPT-4 to perform these augmentation tasks on real clinical notes donated by breast cancer survivors and synthetic notes generated by a clinician, a total of 12 notes with 3868 words. In June 2023, 200 female-identifying US-based participants were randomly assigned three clinical notes with varying levels of augmentations using our tool. Participants answered questions about each note, evaluating their understanding of follow-up actions and self-reported confidence. We found that augmentations were associated with a significant increase in action understanding score (0.63 $\pm$ 0.04 for select augmentations, compared to 0.54 $\pm$ 0.02 for the control) with p=0.002. In-depth interviews of self-identifying breast cancer patients (N=7) were also conducted via video conferencing. Augmentations, especially definitions, elicited positive responses among the seven participants, with some concerns about relying on LLMs. Augmentations were evaluated for errors by clinicians, and we found that misleading errors occur, with errors more common in real donated notes than in synthetic notes, illustrating the importance of carefully written clinical notes. Augmentations improve some but not all readability metrics. This work demonstrates the potential of LLMs to improve patients' experience with clinical notes at a lower burden to clinicians. However, having a human in the loop is important to correct potential model errors.
Challenge design roadmap
Balderas, Hugo Jair Escalante, Guyon, Isabelle, Howard, Addison, Reade, Walter, Treguer, Sebastien
Challenges can be seen as a type of game that motivates participants to solve serious tasks. As a result, competition organizers must develop effective game rules. However, these rules have multiple objectives beyond making the game enjoyable for participants. These objectives may include solving real-world problems, advancing scientific or technical areas, making scientific discoveries, and educating the public. In many ways, creating a challenge is similar to launching a product. It requires the same level of excitement and rigorous testing, and the goal is to attract "customers" in the form of participants. The process begins with a solid plan, such as a competition proposal that will eventually be submitted to an international conference and subjected to peer review. Although peer review does not guarantee quality, it does force organizers to consider the impact of their challenge, identify potential oversights, and generally improve its quality. This chapter provides guidelines for creating a strong plan for a challenge. The material draws on the preparation guidelines from organizations such as Kaggle, ChaLearn, and Tailor, as well as the NeurIPS proposal template, which some of the authors contributed to.
Truth Forest: Toward Multi-Scale Truthfulness in Large Language Models through Intervention without Tuning
Chen, Zhongzhi, Sun, Xingwu, Jiao, Xianfeng, Lian, Fengzong, Kang, Zhanhui, Wang, Di, Xu, Cheng-Zhong
Despite the great success of large language models (LLMs) in various tasks, they suffer from generating hallucinations. We introduce Truth Forest, a method that enhances truthfulness in LLMs by uncovering hidden truth representations using multi-dimensional orthogonal probes. Specifically, it creates multiple orthogonal bases for modeling truth by incorporating orthogonal constraints into the probes. Moreover, we introduce Random Peek, a systematic technique considering an extended range of positions within the sequence, reducing the gap between discerning and generating truth features in LLMs. By employing this approach, we improved the truthfulness of Llama-2-7B from 40.8\% to 74.5\% on TruthfulQA. Likewise, significant improvements are observed in fine-tuned models. We conducted a thorough analysis of truth features using probes. Our visualization results show that orthogonal probes capture complementary truth-related features, forming well-defined clusters that reveal the inherent structure of the dataset.
ChatGPT's FarmVille Moment
ChatGPT has certainly captured the world's imagination since its release at the end of 2022. But in day-to-day life, it is still a relatively niche product--a curiosity that leads people to ask questions that begin "Have you tried …?" or "What do you think about …?" Its maker, OpenAI, has a much more expansive vision. Its aim is seemingly to completely remake how people use the internet. For that to happen, the bot needs to be more than a conversation starter: It has to be a functioning business.
What should I say? -- Interacting with AI and Natural Language Interfaces
As Artificial Intelligence (AI) technology becomes more and more prevalent, it becomes increasingly important to explore how we as humans interact with AI. The Human-AI Interaction (HAI) sub-field has emerged from the Human-Computer Interaction (HCI) field and aims to examine this very notion. Many interaction patterns have been implemented without fully understanding the changes in required cognition or the cognitive science implications of using these alternative interfaces that aim to be more human-like in nature. Prior research suggests that theory of mind representations are crucial to successful and effortless communication; however, very little is understood about how theory of mind representations are established when interacting with AI.