 openai api


Reducing Latency in LLM-Based Natural Language Commands Processing for Robot Navigation

Pollini, Diego, Guterres, Bruna V., Guerra, Rodrigo S., Grando, Ricardo B.

arXiv.org Artificial Intelligence

The integration of Large Language Models (LLMs), such as GPT, in industrial robotics enhances operational efficiency and human-robot collaboration. However, the computational complexity and size of these models often cause latency problems in request and response times. This study explores the integration of the ChatGPT natural language model with the Robot Operating System 2 (ROS 2) to mitigate interaction latency and improve robotic system control within a simulated Gazebo environment. We present an architecture that integrates these technologies without requiring a middleware transport platform, detailing how a simulated mobile robot responds to text and voice commands. Experimental results demonstrate that this integration improves execution speed, usability, and accessibility of the human-robot interaction by decreasing the communication latency by 7.01% on average. Such improvements facilitate smoother, real-time robot operations, which are crucial for industrial automation and precision tasks.


An Empirical Study of OpenAI API Discussions on Stack Overflow

Chen, Xiang, Wang, Jibin, Gao, Chaoyang, Ju, Xiaolin, Cui, Zhanqi

arXiv.org Artificial Intelligence

The rapid advancement of large language models (LLMs), represented by OpenAI's GPT series, has significantly impacted various domains such as natural language processing, software development, education, healthcare, finance, and scientific research. However, OpenAI APIs introduce unique challenges that differ from traditional APIs, such as the complexities of prompt engineering, token-based cost management, non-deterministic outputs, and operation as black boxes. To the best of our knowledge, the challenges developers encounter when using OpenAI APIs have not been explored in previous empirical studies. To fill this gap, we conduct the first comprehensive empirical study by analyzing 2,874 OpenAI API-related discussions from the popular Q&A forum Stack Overflow. We first examine the popularity and difficulty of these posts. After manually categorizing them into nine OpenAI API-related categories, we identify specific challenges associated with each category through topic modeling analysis. Based on our empirical findings, we finally propose actionable implications for developers, LLM vendors, and researchers.


Using Machine Learning to Distinguish Human-written from Machine-generated Creative Fiction

McGlinchey, Andrea Cristina, Barclay, Peter J

arXiv.org Artificial Intelligence

Following the universal availability of generative AI systems with the release of ChatGPT, automatic detection of deceptive text created by Large Language Models has focused on domains such as academic plagiarism and "fake news". However, generative AI also poses a threat to the livelihood of creative writers, and perhaps to literary culture in general, through reduction in quality of published material. Training a Large Language Model on writers' output to generate "sham books" in a particular style seems to constitute a new form of plagiarism. This problem has been little researched. In this study, we trained Machine Learning classifier models to distinguish short samples of human-written from machine-generated creative fiction, focusing on classic detective novels. Our results show that a Naive Bayes and a Multi-Layer Perceptron classifier achieved a high degree of success (accuracy > 95%), significantly outperforming human judges (accuracy < 55%). This approach worked well with short text samples (around 100 words), which previous research has shown to be difficult to classify. We have deployed an online proof-of-concept classifier tool, AI Detective, as a first step towards developing lightweight and reliable applications for use by editors and publishers, with the aim of protecting the economic and cultural contribution of human authors.


Large Language Models (LLMs) as Agents for Augmented Democracy

Gudiño-Rosero, Jairo, Grandi, Umberto, Hidalgo, César A.

arXiv.org Artificial Intelligence

We explore the capabilities of an augmented democracy system built on off-the-shelf LLMs fine-tuned on data summarizing individual preferences across 67 policy proposals collected during the 2022 Brazilian presidential elections. We use a train-test cross-validation setup to estimate the accuracy with which the LLMs predict both a subject's individual political choices and the aggregate preferences of the full sample of participants. At the individual level, the accuracy of the out-of-sample predictions lies in the range 69%-76%, and the predictions are significantly more accurate for liberal and college-educated participants. At the population level, we aggregate preferences using an adaptation of the Borda score and compare the ranking of policy proposals obtained from a probabilistic sample of participants and from data augmented using LLMs. We find that the augmented data predicts the preferences of the full population of participants better than probabilistic samples alone when these represent less than 30% to 40% of the total population. These results indicate that LLMs are potentially useful for the construction of systems of augmented democracy.


RigorLLM: Resilient Guardrails for Large Language Models against Undesired Content

Yuan, Zhuowen, Xiong, Zidi, Zeng, Yi, Yu, Ning, Jia, Ruoxi, Song, Dawn, Li, Bo

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have showcased remarkable capabilities across various tasks in different domains. However, the emergence of biases and the potential for generating harmful content in LLMs, particularly under malicious inputs, pose significant challenges. Current mitigation strategies, while effective, are not resilient under adversarial attacks. This paper introduces Resilient Guardrails for Large Language Models (RigorLLM), a novel framework designed to efficiently and effectively moderate harmful and unsafe inputs and outputs for LLMs. By employing a multi-faceted approach that includes energy-based training data augmentation through Langevin dynamics, optimizing a safe suffix for inputs via minimax optimization, and integrating a fusion-based model combining robust KNN with LLMs based on our data augmentation, RigorLLM offers a robust solution to harmful content moderation. Our experimental evaluations demonstrate that RigorLLM not only outperforms existing baselines like OpenAI API and Perspective API in detecting harmful content but also exhibits unparalleled resilience to jailbreaking attacks. The innovative use of constrained optimization and a fusion-based guardrail approach represents a significant step forward in developing more secure and reliable LLMs, setting a new standard for content moderation frameworks in the face of evolving digital threats.


Making a prototype of Seoul historical sites chatbot using Langchain

Suh, Jae Young, Kwak, Minsoo, Kim, Soo Yong, Cho, Hyoungseo

arXiv.org Artificial Intelligence

In this paper, we share a draft of the development of a conversational agent created to disseminate information about historical sites located in Seoul. The primary objective of the agent is to increase awareness among visitors who are not familiar with Seoul about the presence and precise locations of valuable cultural heritage sites. It aims to promote a basic understanding of Korea's rich and diverse cultural history. The agent is thoughtfully designed for accessibility in English and utilizes data generously provided by the Seoul Metropolitan Government. Despite the limited data volume, it consistently delivers reliable and accurate responses, seamlessly aligning with the available information. We have meticulously detailed the methodologies employed in creating this agent and provided a comprehensive overview of its underlying structure within the paper. Additionally, we delve into potential improvements to enhance this initial version of the system, with a primary emphasis on expanding the available data through our prompting. In conclusion, we provide an in-depth discussion of our expectations regarding the future impact of this agent in promoting and facilitating the sharing of historical sites.


A Trade-off Analysis of Replacing Proprietary LLMs with Open Source SLMs in Production

Irugalbandara, Chandra, Mahendra, Ashish, Daynauth, Roland, Arachchige, Tharuka Kasthuri, Flautner, Krisztian, Tang, Lingjia, Kang, Yiping, Mars, Jason

arXiv.org Artificial Intelligence

Many companies rely on APIs of managed AI models such as OpenAI's GPT-4 to create AI-enabled experiences in their products. Along with the benefits of ease of use and shortened time to production, this reliance on proprietary APIs has downsides in terms of model control, performance reliability, up-time predictability, and cost. At the same time, there has been a flurry of open source small language models (SLMs) that have been made available for commercial use. However, their readiness to replace existing capabilities remains unclear, and a systematic approach to test these models is not readily available. In this paper, we present a systematic evaluation methodology for, and characterization of, modern open source SLMs and their trade-offs when replacing proprietary LLM APIs for a real-world product feature. We have designed SLaM, an automated analysis tool that enables the quantitative and qualitative testing of product features utilizing arbitrary SLMs. Using SLaM, we examine both the quality and the performance characteristics of modern SLMs relative to an existing customer-facing OpenAI-based implementation. Across 9 SLMs and 29 variants, we observe competitive quality-of-results for our use case, significant performance consistency improvement, and a cost reduction of 5x-29x when compared to OpenAI GPT-4.


Analyzing ChatGPT's Aptitude in an Introductory Computer Engineering Course

Deshpande, Sanjay, Szefer, Jakub

arXiv.org Artificial Intelligence

ChatGPT has recently gathered attention from the general public and academia as a tool that is able to generate plausible and human-sounding text answers to various questions. One potential use, or abuse, of ChatGPT is in answering various questions or even generating whole essays and research papers in an academic or classroom setting. While recent works have explored the use of ChatGPT in the context of humanities, business school, or medical school, this work explores how ChatGPT performs in the context of an introductory computer engineering course. This work assesses ChatGPT's aptitude in answering quizzes, homework, exam, and laboratory questions in an introductory-level computer engineering course. This work finds that ChatGPT can do well on questions asking about generic concepts. However, predictably, as a text-only tool, it cannot handle questions with diagrams or figures, nor can it generate diagrams and figures. Equally clearly, the tool cannot do hands-on lab experiments, breadboard assembly, etc., but it can generate plausible answers to some laboratory manual questions. One of the key observations presented in this work is that the ChatGPT tool could not be used to pass all components of the course. Nevertheless, it does well on quizzes and short-answer questions. On the other hand, its plausible, human-sounding answers could confuse students when those answers are incorrect.


ChatGPT Security: OpenAI's Bug Bounty Program Offers Up to $20,000 Prizes

#artificialintelligence

OpenAI, the company behind the massively popular ChatGPT AI chatbot, has launched a bug bounty program in an attempt to ensure its systems are "safe and secure." To that end, it has partnered with the crowdsourced security platform Bugcrowd for independent researchers to report vulnerabilities discovered in its product in exchange for rewards ranging from "$200 for low-severity findings to up to $20,000 for exceptional discoveries." It's worth noting that the program does not cover model safety or hallucination issues, wherein the chatbot is prompted to generate malicious code or other faulty outputs. The company noted that "addressing these issues often involves substantial research and a broader approach." Other prohibited categories are denial-of-service (DoS) attacks, brute-forcing OpenAI APIs, and demonstrations that aim to destroy data or gain unauthorized access to sensitive information.


How to build an AI application using OpenAI API under 15 minutes

#artificialintelligence

Recent improvements in machine learning and deep learning algorithms, as well as the accessibility of enormous amounts of data and processing power, have fuelled the rapid evolution of AI technology. Large-scale language models like GPT-3, as well as research in other fields like robotics and computer vision, are just a few of the substantial contributions that OpenAI has made to the field of artificial intelligence. Without any prior experience of AI, we will learn how to use the OpenAI API and build an AI application in this tutorial. OpenAI is a research organisation that aims to advance artificial intelligence in a way that is safe and beneficial for humanity. Founded in 2015, the organisation has quickly established itself as a leader in the field of AI research and development.