AITopics

doi: 10.5121/ijci.2025.140302

2507.00007

Country: North America > United States > New York (0.14)

Genre:

Research Report (1.00)
Instructional Material > Course Syllabus & Notes (0.46)

Industry:

Education > Educational Setting > Higher Education (0.66)
Education > Educational Technology > Educational Software > Computer Based Training (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Lelièvre, Maxime, Waldock, Amy, Liu, Meng, Aspillaga, Natalia Valdés, Mackintosh, Alasdair, Portela, María José Ogando, Lee, Jared, Atherton, Paul, Ince, Robin A. A., Garrod, Oliver G. B.

Benchmarking the Pedagogical Knowledge of Large Language Models

arXiv.org Artificial IntelligenceJul-2-2025

Benchmarks like Massive Multitask Language Understanding (MMLU) have played a pivotal role in evaluating AI's knowledge and abilities across diverse domains. However, existing benchmarks predominantly focus on content knowledge, leaving a critical gap in assessing models' understanding of pedagogy - the method and practice of teaching. This paper introduces The Pedagogy Benchmark, a novel dataset designed to evaluate large language models on their Cross-Domain Pedagogical Knowledge (CDPK) and Special Education Needs and Disability (SEND) pedagogical knowledge. These benchmarks are built on a carefully curated set of questions sourced from professional development exams for teachers, which cover a range of pedagogical subdomains such as teaching strategies and assessment methods. Here we outline the methodology and development of these benchmarks. We report results for 97 models, with accuracies spanning a range from 28% to 89% on the pedagogical knowledge questions. We consider the relationship between cost and accuracy and chart the progression of the Pareto value frontier over time. We provide online leaderboards at https://rebrand.ly/pedagogy which are updated with new models and allow interactive exploration and filtering based on various model properties, such as cost per token and open-vs-closed weights, as well as looking at performance in different subjects. LLMs and generative AI have tremendous potential to influence education and help to address the global learning crisis. Education-focused benchmarks are crucial to measure models' capacities to understand pedagogical concepts, respond appropriately to learners' needs, and support effective teaching practices across diverse contexts. They are needed for informing the responsible and evidence-based deployment of LLMs and LLM-based tools in educational settings, and for guiding both development and policy decisions.

benchmark, large language model, machine learning, (19 more...)

2506.1871

Genre:

Instructional Material (0.87)
Research Report (0.82)
Overview (0.67)

Industry: Education > Educational Setting > K-12 Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.50)

WIREDJul-1-2025, 18:48:00 GMT

Here's What Mark Zuckerberg Is Offering Top AI Talent

As Mark Zuckerberg staffs up Meta's new superintelligence lab, he's offering top research talent pay packages of up to 300 million over four years, with more than 100 million in total compensation for the first year, WIRED has learned. Meta has made at least 10 of these staggeringly high offers to OpenAI staffers. One high-ranking researcher was pitched on the role of chief scientist but turned it down, according to multiple sources with direct knowledge of the negotiations. While the pay package includes equity, in the first year the stock vests immediately, sources say. "That's about how much it would take for me to go work at Meta," says one OpenAI staffer who spoke with WIRED on the condition of anonymity as they aren't authorized to speak publicly about the company.

meta, openai, zuckerberg, (13 more...)

WIRED

Industry: Information Technology > Services (0.61)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.62)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.62)

WIREDJul-1-2025, 17:46:26 GMT

Sam Altman Slams Meta's AI Talent Poaching Spree: 'Missionaries Will Beat Mercenaries'

OpenAI CEO Sam Altman is hitting back at Meta CEO Mark Zuckerberg's recent AI talent poaching spree. In a full-throated response sent to OpenAI researchers Monday evening and obtained by WIRED, Altman made his pitch for why staying at OpenAI is the only answer for those looking to build artificial general intelligence, hinting that the company is evaluating compensation for the entire research organization. He also dismissed Meta's recruiting efforts, saying what the company is doing could lead to deep cultural problems down the road. "We have gone from some nerds in the corner to the most interesting people in the tech industry (at least)," he wrote on Slack. "Al Twitter is toxic; Meta is acting in a way that feels somewhat distasteful; I assume things will get even crazier in the future. After I got fired and came back I said that was not the craziest thing that would happen in OpenAl history; certainly neither is this."

ai talent poaching spree, meta, openai, (7 more...)

WIRED

Industry: Information Technology > Services (0.56)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.76)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.76)
Information Technology > Communications > Social Media (0.56)

Woo, Jaeyun, Lee, Jiseok, Iwana, Brian Kenji

Towards Time Series Generation Conditioned on Unstructured Natural Language

Generative Artificial Intelligence (AI) has rapidly become a powerful tool, capable of generating various types of data, such as images and text. However, despite the significant advancement of generative AI, time series generative AI remains underdeveloped, even though the application of time series is essential in finance, climate, and numerous fields. In this research, we propose a novel method of generating time series conditioned on unstructured natural language descriptions. We use a diffusion model combined with a language model to generate time series from the text. Through the proposed method, we demonstrate that time series generation based on natural language is possible. The proposed method can provide various applications such as custom forecasting, time series manipulation, data augmentation, and transfer learning. Furthermore, we construct and propose a new public dataset for time series generation, consisting of 63,010 time series-description pairs.

artificial intelligence, machine learning, natural language, (17 more...)

2506.22927

Country:

North America > United States > California > Riverside County > Riverside (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre: Research Report (0.70)

Industry:

Banking & Finance > Trading (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Innovative Research on IoT Architecture and Robotic Operating Platforms: Applications of Large Language Models and Generative AI

Han, Huiwen

This paper introduces an innovative design for robotic operating platforms, underpinned by a transformative Internet of Things (IoT) architecture, seamlessly integrating cutting-edge technologies such as large language models (LLMs), generative AI, edge computing, and 5G networks. The proposed platform aims to elevate the intelligence and autonomy of IoT systems and robotics, enabling them to make real-time decisions and adapt dynamically to changing environments. Through a series of compelling case studies across industries including smart manufacturing, healthcare, and service sectors, this paper demonstrates the substantial potential of IoT-enabled robotics to optimize operational workflows, enhance productivity, and deliver innovative, scalable solutions. By emphasizing the roles of LLMs and generative AI, the research highlights how these technologies drive the evolution of intelligent robotics and IoT, shaping the future of industry-specific advancements. The findings not only showcase the transformative power of these technologies but also offer a forward-looking perspective on their broader societal and industrial implications, positioning them as catalysts for next-generation automation and technological convergence.

generative ai, large language model, machine learning, (13 more...)

doi: 10.1109/RICAI64321.2024.10911316

2506.22477

Country:

Europe > France (0.14)
Asia > Japan (0.14)
Asia > China (0.14)

Genre:

Research Report (0.83)
Overview > Innovation (0.34)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Information Technology > Smart Houses & Appliances (0.90)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Computational Analysis of Climate Policy

Hicks, Carolyn

This thesis explores the impact of the Climate Emergency movement on local government climate policy, using computational methods. The Climate Emergency movement sought to accelerate climate action at local government level through the mechanism of Climate Emergency Declarations (CEDs), resulting in a series of commitments from councils to treat climate change as an emergency. With the aim of assessing the potential of current large language models to answer complex policy questions, I first built and configured a system named PALLM (Policy Analysis with a Large Language Model), using the OpenAI model GPT-4. This system is designed to apply a conceptual framework for climate emergency response plans to a dataset of climate policy documents. I validated the performance of this system with the help of local government policymakers, by generating analyses of the climate policies of 11 local governments in Victoria and assessing the policymakers' level of agreement with PALLM's responses. Having established that PALLM's performance is satisfactory, I used it to conduct a large-scale analysis of current policy documents from local governments in the state of Victoria, Australia. This thesis presents the methodology and results of this analysis, comparing the results for councils which have passed a CED to those which did not. This study finds that GPT-4 is capable of high-level policy analysis, with limitations including a lack of reliable attribution, and can also enable more nuanced analysis by researchers. Its use in this research shows that councils which have passed a CED are more likely to have a recent and climate-specific policy, and show more attention to urgency, prioritisation, and equity and social justice, than councils which have not. It concludes that the ability to assess policy documents at scale opens up exciting new opportunities for policy researchers.

large language model, machine learning, natural language, (23 more...)

2506.22449

Country: Oceania > Australia > Victoria (0.24)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Questionnaire & Opinion Survey (0.92)

Industry:

Law (1.00)
Government (1.00)
Energy > Renewable (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.48)

Hasan, Sharique, Oettl, Alexander, Samila, Sampsa

From Model Design to Organizational Design: Complexity Redistribution and Trade-Offs in Generative AI

We argue that viewing AI as a simple reduction in input costs overlooks two critical dynamics: (a) the inherent trade-offs among generality, accuracy, and simplicity, and (b) the redistribution of complexity across stakeholders. While LLMs appear to defy the traditional trade-off by offering high generality and accuracy through simple interfaces, this user-facing simplicity masks a significant shift of complexity to infrastructure, compliance, and specialized personnel. The GAS trade-off, therefore, does not disappear but is relocated from the user to the organization, creating new managerial challenges, particularly around accuracy in high-stakes applications. We contend that competitive advantage no longer stems from mere AI adoption, but from mastering this redistributed complexity through the design of abstraction layers, workflow alignment, and complementary expertise.

large language model, machine learning, natural language, (21 more...)

2506.2244

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Instructional Material (1.00)

Industry:

Law (1.00)
Information Technology (1.00)
Health & Medicine (1.00)
(5 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(3 more...)

Improving Student-AI Interaction Through Pedagogical Prompting: An Example in Computer Science Education

Xiao, Ruiwei, Hou, Xinying, Ye, Runlong, Kazemitabaar, Majeed, Diana, Nicholas, Liut, Michael, Stamper, John

With the proliferation of large language model (LLM) applications since 2022, their use in education has sparked both excitement and concern. Recent studies consistently highlight students' (mis)use of LLMs can hinder learning outcomes. This work aims to teach students how to effectively prompt LLMs to improve their learning. We first proposed pedagogical prompting, a theoretically-grounded new concept to elicit learning-oriented responses from LLMs. To move from concept design to a proof-of-concept learning intervention in real educational settings, we selected early undergraduate CS education (CS1/CS2) as the example context. We began with a formative survey study with instructors (N=36) teaching early-stage undergraduate-level CS courses to inform the instructional design based on classroom needs. Based on their insights, we designed and developed a learning intervention through an interactive system with scenario-based instruction to train pedagogical prompting skills. Finally, we evaluated its instructional effectiveness through a user study with CS novice students (N=22) using pre/post-tests. Through mixed methods analyses, our results indicate significant improvements in learners' LLM-based pedagogical help-seeking skills, along with positive attitudes toward the system and increased willingness to use pedagogical prompts in the future. Our contributions include (1) a theoretical framework of pedagogical prompting; (2) empirical insights into current instructor attitudes toward pedagogical prompting; and (3) a learning intervention design with an interactive learning tool and scenario-based instruction leading to promising results on teaching LLM-based help-seeking. Our approach is scalable for broader implementation in classrooms and has the potential to be integrated into tools like ChatGPT as an on-boarding experience to encourage learning-oriented use of generative AI.

large language model, machine learning, natural language, (19 more...)

2506.19107

Country: North America > United States > Michigan (0.28)

Genre:

Research Report > New Finding (1.00)
Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Education > Educational Technology > Educational Software > Computer Based Training (1.00)
Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

AI Risk-Management Standards Profile for General-Purpose AI (GPAI) and Foundation Models

Barrett, Anthony M., Newman, Jessica, Nonnecke, Brandie, Madkour, Nada, Hendrycks, Dan, Murphy, Evan R., Jackson, Krystal, Raman, Deepika

Increasingly multi-purpose AI models, such as cutting-edge large language models or other 'general-purpose AI' (GPAI) models, 'foundation models,' generative AI models, and 'frontier models' (typically all referred to hereafter with the umbrella term 'GPAI/foundation models' except where greater specificity is needed), can provide many beneficial capabilities but also risks of adverse events with profound consequences. This document provides risk-management practices or controls for identifying, analyzing, and mitigating risks of GPAI/foundation models. We intend this document primarily for developers of large-scale, state-of-the-art GPAI/foundation models; others that can benefit from this guidance include downstream developers of end-use applications that build on a GPAI/foundation model. This document facilitates conformity with or use of leading AI risk management-related standards, adapting and building on the generic voluntary guidance in the NIST AI Risk Management Framework and ISO/IEC 23894, with a focus on the unique issues faced by developers of GPAI/foundation models.

large language model, machine learning, natural language, (23 more...)

2506.23949

Country:

North America > United States (1.00)
Europe (0.67)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)