Goto

Collaborating Authors

 Large Language Model


The AI Arms Race: A Deadly Rivalry Between the USA and China

#artificialintelligence

Artificial intelligence (AI) has recently been a hot topic among academics, policymakers, and business titans. There is rising concern that we could be on the verge of an artificial arms race between the US and China due to the development of sophisticated AI models like ChatGPT and the usage of AI in military weaponry. Let's investigate this matter in greater detail. AI weapons have developed into a key source of concern as the globe deals with escalating technological breakthroughs. These weapons are particularly frightening since they have the capability to unleash mass destruction without human participation.


Elon Musk's Latest Venture: A Chatbot to Rival ChatGPT...

#artificialintelligence

Elon Musk is known for his innovative ideas and it seems he may be working on a ChatGPT rival. Find out more about this exciting development here! Elon Musk, known for his involvement in various tech companies, was one of the co-founders of OpenAI, the company responsible for ChatGPT. However, Musk left the company after a few years as he wanted it to be a non-profit organization. Recent reports suggest that Musk is now planning to launch his own AI startup, X.AI, which will compete with OpenAI.


How To Create Your Own Auto-GPT AI Agent

#artificialintelligence

To get good output from ChatGPT or another LLM, you usually have to feed it several prompts. But what if you could just give your AI bot a set of fairly broad goals at the start of a session and then sit back while it generates its own set of tasks to fulfill those goals? That's the idea behind Auto-GPT, a new open-source tool that uses the OpenAI API (same LLM as ChatGPT) to prompt itself, based on your initial input. We've already seen a number of Twitter users talk about how they are using Auto-GPT for everything from creating marketing plans to analyzing market data for investments to preparing topics for a podcast. Based on our hands-on experience, we can't say that it always works well (we asked it to write a Windows 11 how-to and the result was awful), but it's early days and some tasks may work better than others.


Exploring the Use of ChatGPT as a Tool for Learning and Assessment in Undergraduate Computer Science Curriculum: Opportunities and Challenges

arXiv.org Artificial Intelligence

The application of Artificial intelligence for teaching and learning in the academic sphere is a trending subject of interest in the computing education. ChatGPT, as an AI-based tool, provides various advantages, such as heightened student involvement, cooperation, accessibility and availability. This paper addresses the prospects and obstacles associated with utilizing ChatGPT as a tool for learning and assessment in undergraduate Computer Science curriculum in particular to teaching and learning fundamental programming courses. Students having completed the course work for a Data Structures and Algorithms (a sophomore level course) participated in this study. Two groups of students were given programming challenges to solve within a short period of time. The control group (group A) had access to text books and notes of programming courses, however no Internet access was provided. Group B students were given access to ChatGPT and were encouraged to use it to help solve the programming challenges. The challenge was conducted in a computer lab environment using PC2 environment. Each team of students address the problem by writing executable code that satisfies certain number of test cases. Student teams were scored based on their performance in terms of number of successful passed testcases. Results show that students using ChatGPT had an advantage in terms of earned scores, however there were inconsistencies and inaccuracies in the submitted code consequently affecting the overall performance. After a thorough analysis, the paper's findings indicate that incorporating AI in higher education brings about various opportunities and challenges.


Testing the Reliability of ChatGPT for Text Annotation and Classification: A Cautionary Remark

arXiv.org Artificial Intelligence

Recent studies have demonstrated promising potential of ChatGPT for various text annotation and classification tasks. However, ChatGPT is non-deterministic which means that, as with human coders, identical input can lead to different outputs. Given this, it seems appropriate to test the reliability of ChatGPT. Therefore, this study investigates the consistency of ChatGPT's zero-shot capabilities for text annotation and classification, focusing on different model parameters, prompt variations, and repetitions of identical inputs. Based on the real-world classification task of differentiating website texts into news and not news, results show that consistency in ChatGPT's classification output can fall short of scientific thresholds for reliability. For example, even minor wording alterations in prompts or repeating the identical input can lead to varying outputs. Although pooling outputs from multiple repetitions can improve reliability, this study advises caution when using ChatGPT for zero-shot text annotation and underscores the need for thorough validation, such as comparison against human-annotated data. The unsupervised application of ChatGPT for text annotation and classification is not recommended.


Solving Math Word Problems by Combining Language Models With Symbolic Solvers

arXiv.org Artificial Intelligence

Automatically generating high-quality step-by-step solutions to math word problems has many applications in education. Recently, combining large language models (LLMs) with external tools to perform complex reasoning and calculation has emerged as a promising direction for solving math word problems, but prior approaches such as Program-Aided Language model (PAL) are biased towards simple procedural problems and less effective for problems that require declarative reasoning. We propose an approach that combines an LLM that can incrementally formalize word problems as a set of variables and equations with an external symbolic solver that can solve the equations. Our approach achieves comparable accuracy to the original PAL on the GSM8K benchmark of math word problems and outperforms PAL by an absolute 20% on ALGEBRA, a new dataset of more challenging word problems extracted from Algebra textbooks. Our work highlights the benefits of using declarative and incremental representations when interfacing with an external tool for solving complex math word problems. Our data and prompts are publicly available at https://github.com/joyheyueya/declarative-math-word-problem.


Diversity is Definitely Needed: Improving Model-Agnostic Zero-shot Classification via Stable Diffusion

arXiv.org Artificial Intelligence

In this work, we investigate the problem of Model-Agnostic Zero-Shot Classification (MA-ZSC), which refers to training non-specific classification architectures (downstream models) to classify real images without using any real images during training. Recent research has demonstrated that generating synthetic training images using diffusion models provides a potential solution to address MA-ZSC. However, the performance of this approach currently falls short of that achieved by large-scale vision-language models. One possible explanation is a potential significant domain gap between synthetic and real images. Our work offers a fresh perspective on the problem by providing initial insights that MA-ZSC performance can be improved by improving the diversity of images in the generated dataset. We propose a set of modifications to the text-to-image generation process using a pre-trained diffusion model to enhance diversity, which we refer to as our $\textbf{bag of tricks}$. Our approach shows notable improvements in various classification architectures, with results comparable to state-of-the-art models such as CLIP. To validate our approach, we conduct experiments on CIFAR10, CIFAR100, and EuroSAT, which is particularly difficult for zero-shot classification due to its satellite image domain. We evaluate our approach with five classification architectures, including ResNet and ViT. Our findings provide initial insights into the problem of MA-ZSC using diffusion models. All code will be available on GitHub.


SikuGPT: A Generative Pre-trained Model for Intelligent Information Processing of Ancient Texts from the Perspective of Digital Humanities

arXiv.org Artificial Intelligence

The rapid advance in artificial intelligence technology has facilitated the prosperity of digital humanities research. Against such backdrop, research methods need to be transformed in the intelligent processing of ancient texts, which is a crucial component of digital humanities research, so as to adapt to new development trends in the wave of AIGC. In this study, we propose a GPT model called SikuGPT based on the corpus of Siku Quanshu. The model's performance in tasks such as intralingual translation and text classification exceeds that of other GPT-type models aimed at processing ancient texts. SikuGPT's ability to process traditional Chinese ancient texts can help promote the organization of ancient information and knowledge services, as well as the international dissemination of Chinese ancient culture.


Advances in apparent conceptual physics reasoning in GPT-4

arXiv.org Artificial Intelligence

ChatGPT is built on a large language model trained on an enormous corpus of human text to emulate human conversation. Despite lacking any explicit programming regarding the laws of physics, recent work has demonstrated that GPT-3.5 could pass an introductory physics course at some nominal level and register something close to a minimal understanding of Newtonian Mechanics on the Force Concept Inventory. This work replicates those results and also demonstrates that the latest version, GPT-4, has reached a much higher mark in the latter context. Indeed, its responses come quite close to perfectly demonstrating expert-level competence, with a few very notable exceptions and limitations. We briefly comment on the implications of this for the future of physics education and pedagogy.


Towards Better Instruction Following Language Models for Chinese: Investigating the Impact of Training Data and Evaluation

arXiv.org Artificial Intelligence

Recently, significant public efforts have been directed towards developing low-cost models with capabilities akin to ChatGPT, thereby fostering the growth of open-source conversational models. However, there remains a scarcity of comprehensive and in-depth evaluations of these models' performance. In this study, we examine the influence of training data factors, including quantity, quality, and linguistic distribution, on model performance. Our analysis is grounded in several publicly accessible, high-quality instruction datasets, as well as our own Chinese multi-turn conversations. We assess various models using a evaluation set of 1,000 samples, encompassing nine real-world scenarios. Our goal is to supplement manual evaluations with quantitative analyses, offering valuable insights for the continued advancement of open-source chat models. Furthermore, to enhance the performance and training and inference efficiency of models in the Chinese domain, we extend the vocabulary of LLaMA - the model with the closest open-source performance to proprietary language models like GPT-3 - and conduct secondary pre-training on 3.4B Chinese words. We make our model, data, as well as code publicly available.