Exploring Next Token Prediction in Theory of Mind (ToM) Tasks: Comparative Experiments with GPT-2 and LLaMA-2 AI Models
Pavan Yadav, Nikhil Khandalkar, Krishna Shinde, Lokesh B. Ramegowda, Rajarshi Das
Language models have made significant progress in generating coherent text and predicting next tokens based on input prompts. This study compares the next-token prediction performance of two well-known models, OpenAI's GPT-2 and Meta's Llama-2-7b-chat-hf, on Theory of Mind (ToM) tasks. To evaluate their capabilities, we built a dataset from 10 short stories sourced from the Explore ToM Dataset. We enhanced these stories by programmatically inserting additional sentences (infills) using GPT-4, creating variations that introduce different levels of contextual complexity. This setup enables analysis of how increasing context affects model performance. We tested both models under four temperature settings (0.01, 0.5, 1.0, 2.0) and evaluated their ability to predict the next token across three reasoning levels. Zero-order reasoning involves tracking the state of the world, either current (ground truth) or past (memory). First-order reasoning concerns understanding another's mental state (e.g., "Does Anne know the apple is salted?"). Second-order reasoning adds recursion (e.g., "Does Anne think that Charles knows the apple is salted?"). Our results show that adding more infill sentences slightly reduces prediction accuracy, as the added context increases complexity and ambiguity. Llama-2 consistently outperforms GPT-2 in prediction accuracy, especially at lower temperatures, where it shows greater confidence in selecting the most probable token. As reasoning complexity rises, model responses diverge more: both GPT-2 and Llama-2 display greater variability in predictions on first- and second-order reasoning tasks. These findings illustrate how model architecture, temperature, and contextual complexity influence next-token prediction, contributing to a better understanding of the strengths and limitations of current language models.
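The temperature settings central to the abstract control how sharply a model's next-token distribution is peaked: logits are divided by the temperature before the softmax, so low values concentrate probability on the top token while high values flatten the distribution. The following minimal sketch (not the authors' code; the toy logits are invented for illustration) shows the effect at the paper's four settings:

```python
import math

def softmax_with_temperature(logits, temperature):
    # Divide logits by T before the softmax: low T sharpens the
    # distribution toward the argmax; high T flattens it toward uniform.
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical logits for four candidate next tokens.
logits = [3.0, 1.5, 1.0, 0.2]

for t in (0.01, 0.5, 1.0, 2.0):  # the paper's four temperature settings
    probs = softmax_with_temperature(logits, t)
    print(f"T={t}: {[round(p, 3) for p in probs]}")
```

At T=0.01 essentially all probability mass lands on the highest-logit token (near-greedy decoding), which is why accuracy comparisons at low temperature reflect each model's single most probable prediction; at T=2.0 the distribution is much flatter, so sampled predictions vary more from run to run.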
What Happens When Two Artificial Intelligences Try To Prank Each Other?
Our Artificial Intelligence app, Hugging Face, has been running smoothly following a big influx of new users. It's a normal day, and I'm looking over activity readouts when suddenly the app grinds to a complete halt. Thousands of teens chatting with their AI friends are getting nothing but silence in return. I pull up Slack and ask the tech team if we're down. Julien, my co-founder and the CTO of Hugging Face, looks over the brains of our AIs and finds nothing out of the ordinary.