AITopics | gpt-4 and gpt-3

Collaborating Authors

gpt-4 and gpt-3

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

How is ChatGPT's behavior changing over time?

Chen, Lingjiao, Zaharia, Matei, Zou, James

arXiv.org Artificial IntelligenceOct-31-2023

GPT-3.5 and GPT-4 are the two most widely used large language model (LLM) services. However, when and how these models are updated over time is opaque. Here, we evaluate the March 2023 and June 2023 versions of GPT-3.5 and GPT-4 on several diverse tasks: 1) math problems, 2) sensitive/dangerous questions, 3) opinion surveys, 4) multi-hop knowledge-intensive questions, 5) generating code, 6) US Medical License tests, and 7) visual reasoning. We find that the performance and behavior of both GPT-3.5 and GPT-4 can vary greatly over time. For example, GPT-4 (March 2023) was reasonable at identifying prime vs. composite numbers (84% accuracy) but GPT-4 (June 2023) was poor on these same questions (51% accuracy). This is partly explained by a drop in GPT-4's amenity to follow chain-of-thought prompting. Interestingly, GPT-3.5 was much better in June than in March in this task. GPT-4 became less willing to answer sensitive questions and opinion survey questions in June than in March. GPT-4 performed better at multi-hop questions in June than in March, while GPT-3.5's performance dropped on this task. Both GPT-4 and GPT-3.5 had more formatting mistakes in code generation in June than in March. We provide evidence that GPT-4's ability to follow user instructions has decreased over time, which is one common factor behind the many behavior drifts. Overall, our findings show that the behavior of the "same" LLM service can change substantially in a relatively short amount of time, highlighting the need for continuous monitoring of LLMs.

gpt-3, gpt-4, instruction, (14 more...)

arXiv.org Artificial Intelligence

2307.09009

Country: North America > United States (0.45)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.86)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Probing the Creativity of Large Language Models: Can models produce divergent semantic association?

Chen, Honghua, Ding, Nai

arXiv.org Artificial IntelligenceOct-17-2023

Large language models possess remarkable capacity for processing language, but it remains unclear whether these models can further generate creative content. The present study aims to investigate the creative thinking of large language models through a cognitive perspective. We utilize the divergent association task (DAT), an objective measurement of creativity that asks models to generate unrelated words and calculates the semantic distance between them. We compare the results across different models and decoding strategies. Our findings indicate that: (1) When using the greedy search strategy, GPT-4 outperforms 96% of humans, while GPT-3.5-turbo exceeds the average human level. (2) Stochastic sampling and temperature scaling are effective to obtain higher DAT scores for models except GPT-4, but face a trade-off between creativity and stability. These results imply that advanced large language models have divergent semantic associations, which is a fundamental process underlying creativity.

computational linguistic, creativity, surprisal, (11 more...)

arXiv.org Artificial Intelligence

2310.11158

Country:

Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
North America > United States > New York > New York County > New York City (0.04)
(4 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Neural Authorship Attribution: Stylometric Analysis on Large Language Models

Kumarage, Tharindu, Liu, Huan

arXiv.org Artificial IntelligenceAug-14-2023

Large language models (LLMs) such as GPT-4, PaLM, and Llama have significantly propelled the generation of AI-crafted text. With rising concerns about their potential misuse, there is a pressing need for AI-generated-text forensics. Neural authorship attribution is a forensic effort, seeking to trace AI-generated text back to its originating LLM. The LLM landscape can be divided into two primary categories: proprietary and open-source. In this work, we delve into these emerging categories of LLMs, focusing on the nuances of neural authorship attribution. To enrich our understanding, we carry out an empirical analysis of LLM writing signatures, highlighting the contrasts between proprietary and open-source models, and scrutinizing variations within each group. By integrating stylometric features across lexical, syntactic, and structural aspects of language, we explore their potential to yield interpretable results and augment pre-trained language model-based classifiers utilized in neural authorship attribution. Our findings, based on a range of state-of-the-art LLMs, provide empirical insights into neural authorship attribution, paving the way for future investigations aimed at mitigating the threats posed by AI-generated misinformation.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2308.07305

Country:

North America > United States > Arizona (0.05)
Europe > Norway > Eastern Norway > Oslo (0.04)

Genre: Research Report > New Finding (0.50)

Industry:

Information Technology > Security & Privacy (0.69)
Media > News (0.49)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Large language models and (non-)linguistic recursion

Dąbkowski, Maksymilian, Beguš, Gašper

arXiv.org Artificial IntelligenceJun-12-2023

Recursion is one of the hallmarks of human language. While many design features of language have been shown to exist in animal communication systems, recursion has not. Previous research shows that GPT-4 is the first large language model (LLM) to exhibit metalinguistic abilities (Begu\v{s}, D\k{a}bkowski, and Rhodes 2023). Here, we propose several prompt designs aimed at eliciting and analyzing recursive behavior in LLMs, both linguistic and non-linguistic. We demonstrate that when explicitly prompted, GPT-4 can both produce and analyze recursive structures. Thus, we present one of the first studies investigating whether meta-linguistic awareness of recursion -- a uniquely human cognitive property -- can emerge in transformers with a high number of parameters such as GPT-4.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2306.07195

Country:

North America > United States > New York (0.05)
North America > United States > California > Alameda County > Berkeley (0.04)
North America > United States > Massachusetts (0.04)
(3 more...)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

GPT-4 vs GPT-3.5: The Battle of AI Titans

#artificialintelligenceMar-18-2023, 22:31:12 GMT

Despite their impressive capabilities, both GPT-4 and GPT-3.5 have limitations. These models can occasionally generate incorrect or nonsensical information, especially when dealing with ambiguous or complex queries. Additionally, both models may produce text that is overly verbose, repeating the same information in different ways. It's important to keep these limitations in mind when using these models for critical applications.

gpt-3, gpt-4 and gpt-3, gpt-4 vs gpt-3, (8 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback