Large Language Model


NCAA athlete claims she was scolded by AI over message about women's sports

FOX News

College volleyball player Macy Petty reacts to the U.S. House passing a bill that would ban biological males from competing in women's sports on 'Fox News @ Night.' An NCAA volleyball player claims ChatGPT scolded her when she asked the artificial intelligence platform to shorten a tweet about the debate over transgender athletes participating in women's sports. "I was trying to explain [in the tweet] that I'm an NCAA athlete, and that it's important to champion the voice of female athletes and to stand up against this ideological war that's going on that's putting women in danger and taking away the opportunities for scholarships," Macy Petty told Fox News Digital in a phone interview Thursday, explaining it was "a lot of information to cram in one tweet." Petty said she is a novice when it comes to using ChatGPT - OpenAI's wildly popular chatbot that can mimic human conversation based on prompts - and had seen an Instagram reel touting the importance of using the platform as the future of technology. After watching the reel, Petty said she was presented with a great opportunity to use the system: allegedly asking ChatGPT to shorten a tweet on women's sports that had gone over the social media platform's character limit.


The Rise of the Chatbots

Communications of the ACM

During the 2016 U.S. presidential race, a Russian "troll-farm" calling itself the Internet Research Agency sought to harm Hillary Clinton's election chances and help Donald Trump reach the White House by using Twitter to spread false news stories and other disinformation, according to a 2020 report from the Senate Intelligence Committee. Most of that content apparently was produced by human beings, a supposition supported by the fact that activity dropped off on Russian holidays. Soon, though, if not already, such propaganda will be produced automatically by artificial intelligence (AI) systems such as ChatGPT, a chatbot capable of creating human-sounding text. "Imagine a scenario where you have ChatGPT generating these tweets. The number of fake accounts you could manage for the same price would be much larger," says V.S. Subrahmanian, a professor of computer science at Northwestern University, whose research focuses on the intersection of AI and security problems.


ToolQA: A Dataset for LLM Question Answering with External Tools

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have demonstrated impressive performance in various NLP tasks, but they still suffer from challenges such as hallucination and weak numerical reasoning. To overcome these challenges, external tools can be used to enhance LLMs' question-answering abilities. However, current evaluation methods do not distinguish between questions that can be answered using LLMs' internal knowledge and those that require external information through tool use. To address this issue, we introduce a new dataset called ToolQA, which is designed to faithfully evaluate LLMs' ability to use external tools for question answering. Our development of ToolQA involved a scalable, automated process for dataset curation, along with 13 specialized tools designed for interaction with external knowledge in order to answer questions. Importantly, we strive to minimize the overlap between our benchmark data and LLMs' pre-training data, enabling a more precise evaluation of LLMs' tool-use reasoning abilities. We conducted an in-depth diagnosis of existing tool-use LLMs to highlight their strengths, weaknesses, and potential improvements. Our findings set a new benchmark for evaluating LLMs and suggest new directions for future advancements. Our data and code are freely available to the broader scientific community on GitHub.


Social AI and the Challenges of the Human-AI Ecosystem

arXiv.org Artificial Intelligence

The rise of large-scale socio-technical systems in which humans interact with artificial intelligence (AI) systems (including assistants and recommenders, in short AIs) multiplies the opportunity for the emergence of collective phenomena and tipping points, with unexpected, possibly unintended, consequences. For example, navigation systems' suggestions may create chaos if too many drivers are directed on the same route, and personalised recommendations on social media may amplify polarisation, filter bubbles, and radicalisation. On the other hand, we may learn how to foster the "wisdom of crowds" and collective action effects to face social and environmental challenges. In order to understand the impact of AI on socio-technical systems and design next-generation AIs that team with humans to help overcome societal problems rather than exacerbate them, we propose to build the foundations of Social AI at the intersection of Complex Systems, Network Science and AI. In this perspective paper, we discuss the main open questions in Social AI, outlining possible technical and scientific challenges and suggesting research avenues.


ChatGPT may excel in States Medical Licensing Examination but falters in basic Linear Algebra

arXiv.org Artificial Intelligence

The emergence of ChatGPT has been rapid, and although it has demonstrated positive impacts in certain domains, its influence is not universally advantageous. Our analysis focuses on ChatGPT's capabilities in Mathematics Education, particularly in teaching basic Linear Algebra. While there are instances where ChatGPT delivers accurate and well-motivated answers, it is crucial to recognize numerous cases where it makes significant mathematical errors and fails in logical inference. These occurrences raise concerns regarding the system's genuine understanding of mathematics, as it appears to rely more on visual patterns than on true comprehension. Additionally, the suitability of ChatGPT as a teacher for students also warrants consideration.


LLM-Assisted Content Analysis: Using Large Language Models to Support Deductive Coding

arXiv.org Artificial Intelligence

Deductive coding is a widely used qualitative research method for determining the prevalence of themes across documents. While useful, deductive coding is often burdensome and time consuming since it requires researchers to read, interpret, and reliably categorize a large body of unstructured text documents. Large language models (LLMs), like ChatGPT, are a class of quickly evolving AI tools that can perform a range of natural language processing and reasoning tasks. In this study, we explore the use of LLMs to reduce the time it takes for deductive coding while retaining the flexibility of a traditional content analysis. We outline the proposed approach, called LLM-assisted content analysis (LACA), along with an in-depth case study using GPT-3.5 for LACA on a publicly available deductive coding data set. Additionally, we conduct an empirical benchmark using LACA on 4 publicly available data sets to assess the broader question of how well GPT-3.5 performs across a range of deductive coding tasks. Overall, we find that GPT-3.5 can often perform deductive coding at levels of agreement comparable to human coders. Additionally, we demonstrate that LACA can help refine prompts for deductive coding, identify codes for which an LLM is randomly guessing, and help assess when to use LLMs vs. human coders for deductive coding. We conclude with several implications for future practice of deductive coding and related research methods.
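
The abstract compares GPT-3.5 to human coders by level of agreement but does not name the statistic used. As a minimal sketch, assuming Cohen's kappa as the agreement measure (an assumption, not something the abstract confirms), the comparison between a human coder and a model on one code could look like this. The label lists are made-up illustration data, not results from the study.

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two coders over the same documents."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equal-length, non-empty label lists")
    n = len(labels_a)
    # Fraction of documents on which both coders assigned the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each coder's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels for one code (1 = code applies, 0 = it does not).
human = [1, 1, 0, 0, 1, 0, 1, 0]
model = [1, 1, 0, 0, 1, 0, 0, 1]
kappa = cohens_kappa(human, model)
print(round(kappa, 3))
```

A kappa near 0 means the model agrees with the human coder no more often than chance would predict, which is one way to flag a code on which the model may be guessing.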


Product Information Extraction using ChatGPT

arXiv.org Artificial Intelligence

Structured product data in the form of attribute/value pairs is the foundation of many e-commerce applications such as faceted product search, product comparison, and product recommendation. Product offers often only contain textual descriptions of the product attributes in the form of titles or free text. Hence, extracting attribute/value pairs from textual product descriptions is an essential enabler for e-commerce applications. In order to excel, state-of-the-art product information extraction methods require large quantities of task-specific training data. The methods also struggle with generalizing to out-of-distribution attributes and attribute values that were not a part of the training data. Due to being pre-trained on huge amounts of text as well as due to emergent effects resulting from the model size, Large Language Models like ChatGPT have the potential to address both of these shortcomings. This paper explores the potential of ChatGPT for extracting attribute/value pairs from product descriptions. We experiment with different zero-shot and few-shot prompt designs. Our results show that ChatGPT achieves a performance similar to a pre-trained language model but requires much smaller amounts of training data and computation for fine-tuning.
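
To make the zero-shot and few-shot setting concrete, here is a minimal sketch of how such prompts might be assembled and how a JSON reply might be parsed. The prompt wording, the `build_prompt` and `parse_answer` helpers, and the sample product titles are all hypothetical; the abstract does not give the prompt designs the authors tested, and no model is called here.

```python
import json


def build_prompt(title, attributes, examples=()):
    """Assemble an extraction prompt; pass `examples` for the few-shot variant."""
    lines = [
        "Extract the following attributes from the product title.",
        "Attributes: " + ", ".join(attributes),
        'Answer with a JSON object; use "n/a" for attributes that are not mentioned.',
        "",
    ]
    # Each demonstration is a (title, {attribute: value}) pair.
    for ex_title, ex_values in examples:
        lines += ["Title: " + ex_title, "Answer: " + json.dumps(ex_values), ""]
    lines += ["Title: " + title, "Answer:"]
    return "\n".join(lines)


def parse_answer(raw, attributes):
    """Turn the model's JSON reply into a complete attribute/value dict."""
    data = json.loads(raw)
    return {a: data.get(a, "n/a") for a in attributes}


attrs = ["Brand", "Screen Size"]
zero_shot = build_prompt("Acme 27-inch 4K Monitor", attrs)
few_shot = build_prompt(
    "Acme 27-inch 4K Monitor",
    attrs,
    examples=[("Globex 15.6-inch Laptop", {"Brand": "Globex", "Screen Size": "15.6-inch"})],
)
# A reply string as a model might return it (hand-written for illustration).
parsed = parse_answer('{"Brand": "Acme"}', attrs)
print(few_shot)
print(parsed)
```

The few-shot variant differs from the zero-shot one only in the demonstrations placed before the target title, so the two designs can be compared with the same parsing code.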


The Double Helix inside the NLP Transformer

arXiv.org Artificial Intelligence

We introduce a framework for analyzing various types of information in an NLP Transformer. In this approach, we distinguish four layers of information: positional, syntactic, semantic, and contextual. We also argue that the common practice of adding positional information to semantic embedding is sub-optimal and propose instead a Linear-and-Add approach. Our analysis reveals an autogenetic separation of positional information through the deep layers. We show that the distilled positional components of the embedding vectors follow the path of a helix, both on the encoder side and on the decoder side. We additionally show that on the encoder side, the conceptual dimensions generate Part-of-Speech (PoS) clusters. On the decoder side, we show that a di-gram approach helps to reveal the PoS clusters of the next token. Our approach paves a way to elucidate the processing of information through the deep layers of an NLP Transformer.


Deconstructing Classifiers: Towards A Data Reconstruction Attack Against Text Classification Models

arXiv.org Artificial Intelligence

Natural language processing (NLP) models have become increasingly popular in real-world applications, such as text classification. However, they are vulnerable to privacy attacks, including data reconstruction attacks that aim to extract the data used to train the model. Most previous studies on data reconstruction attacks have focused on LLMs, while classification models were assumed to be more secure. In this work, we propose a new targeted data reconstruction attack called the Mix And Match attack, which takes advantage of the fact that most classification models are based on LLMs. The Mix And Match attack uses the base model of the target model to generate candidate tokens and then prunes them using the classification head. We extensively demonstrate the effectiveness of the attack using both random and organic canaries. This work highlights the importance of considering the privacy risks associated with data reconstruction attacks in classification models and offers insights into possible leakages.
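
The generate-then-prune idea can be illustrated with a toy sketch. This is not the paper's implementation: the abstract gives no algorithmic details, so the beam search, the lookup table standing in for the base language model, and the position-matching score standing in for the classification head are all assumptions made for illustration.

```python
def reconstruct(prefix, propose, score, steps, beam_width=2):
    """Grow a sequence token by token: propose candidates, keep the best-scoring."""
    beams = [list(prefix)]
    for _ in range(steps):
        # "Generate" step: extend every beam with each proposed next token.
        expanded = [b + [tok] for b in beams for tok in propose(b)]
        if not expanded:
            break
        # "Prune" step: keep only the sequences the scorer rates highest.
        expanded.sort(key=score, reverse=True)
        beams = expanded[:beam_width]
    return beams[0]


# Toy stand-in for the base model: a fixed table of plausible next tokens.
NEXT_TOKENS = {
    "my": ["pin", "name"],
    "pin": ["is", "was"],
    "is": ["1234", "4321"],
}

# Toy canary that the hypothetical classifier is assumed to have memorised.
CANARY = ["my", "pin", "is", "4321"]


def propose(sequence):
    return NEXT_TOKENS.get(sequence[-1], [])


def score(sequence):
    # Stand-in for the classification head's signal: here, simply the number
    # of positions that match the canary. A real attack has no such oracle.
    return sum(a == b for a, b in zip(sequence, CANARY))


recovered = reconstruct(["my"], propose, score, steps=3)
print(recovered)
```

In this toy setup the scorer leaks the canary directly, which makes pruning trivial; the sketch only shows the control flow of proposing candidates with one component and filtering them with another.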


Exploring the Potential of AI-Generated Synthetic Datasets: A Case Study on Telematics Data with ChatGPT

arXiv.org Artificial Intelligence

This research delves into the construction and utilization of synthetic datasets, specifically within the telematics sphere, leveraging OpenAI's powerful language model, ChatGPT. Synthetic datasets present an effective solution to challenges pertaining to data privacy, scarcity, and control over variables - characteristics that make them particularly valuable for research pursuits. The utility of these datasets, however, largely depends on their quality, measured through the lenses of diversity, relevance, and coherence. To illustrate this data creation process, a hands-on case study is conducted, focusing on the generation of a synthetic telematics dataset. The experiment involved an iterative guidance of ChatGPT, progressively refining prompts and culminating in the creation of a comprehensive dataset for a hypothetical urban planning scenario in Columbus, Ohio. Upon generation, the synthetic dataset was subjected to an evaluation, focusing on the previously identified quality parameters and employing descriptive statistics and visualization techniques for a thorough analysis. Despite synthetic datasets not serving as perfect replacements for actual world data, their potential in specific use-cases, when executed with precision, is significant. This research underscores the potential of AI models like ChatGPT in enhancing data availability for complex sectors like telematics, thus paving the way for a myriad of new research opportunities.