AITopics | Generative AI

Anthropic Revokes OpenAI's Access to Claude

WIRED, Aug-1-2025, 21:41:53 GMT

Anthropic revoked OpenAI's API access to its models on Tuesday, multiple sources familiar with the matter tell WIRED. OpenAI was informed that its access was cut off due to violating the terms of service. "Claude Code has become the go-to choice for coders everywhere and so it was no surprise to learn OpenAI's own technical staff were also using our coding tools ahead of the launch of GPT-5," Anthropic spokesperson Christopher Nulty said in a statement to WIRED. "Unfortunately, this is a direct violation of our terms of service." According to Anthropic's commercial terms of service, customers are barred from using the service to "build a competing product or service, including to train competing AI models" or "reverse engineer or duplicate" the services.

large language model, machine learning, natural language, (16 more...)

WIRED

Industry: Information Technology (0.53)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

I love how ChatGPT's new Study Mode makes me actually use my brain

PCWorld, Aug-1-2025, 15:20:19 GMT

It should come as no surprise that students the world over are using ChatGPT and other artificial intelligence chatbots to cheat. On homework, on tests, and on anything else you care to mention. After all, why work something out yourself when there's an AI chatbot waiting and willing to do the hard work for you? This is obviously a problem in need of fixing, and OpenAI's answer is a Study Mode that's now baked into ChatGPT. The idea is to stop students from simply asking ChatGPT to tell them the answer to a question, and to have ChatGPT teach them how to answer the question for themselves.

artificial intelligence, machine learning, natural language, (13 more...)

PCWorld

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.38)

EducationQ: Evaluating LLMs' Teaching Capabilities Through Multi-Agent Dialogue Framework

Shi, Yao, Liang, Rongkeng, Xu, Yong

arXiv.org Artificial Intelligence, Aug-1-2025

Large language models (LLMs) increasingly serve as educational tools, yet evaluating their teaching capabilities remains challenging due to the resource-intensive, context-dependent, and methodologically complex nature of teacher-student interactions. We introduce EducationQ, a multi-agent dialogue framework that efficiently assesses teaching capabilities through simulated dynamic educational scenarios, featuring specialized agents for teaching, learning, and evaluation. Testing 14 LLMs across major AI Organizations (OpenAI, Meta, Google, Anthropic, and others) on 1,498 questions spanning 13 disciplines and 10 difficulty levels reveals that teaching effectiveness does not correlate linearly with model scale or general reasoning capabilities - with some smaller open-source models outperforming larger commercial counterparts in teaching contexts. This finding highlights a critical gap in current evaluations that prioritize knowledge recall over interactive pedagogy. Our mixed-methods evaluation, combining quantitative metrics with qualitative analysis and expert case studies, identifies distinct pedagogical strengths employed by top-performing models (e.g., sophisticated questioning strategies, adaptive feedback mechanisms). Human expert evaluations show 78% agreement with our automated qualitative analysis of effective teaching behaviors, validating our methodology. EducationQ demonstrates that LLMs-as-teachers require specialized optimization beyond simple scaling, suggesting next-generation educational AI prioritize targeted enhancement of specific pedagogical effectiveness.

large language model, llama 3, machine learning, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.18653/v1/2025.acl-long.1576

2504.14928

Country:

North America > United States (1.00)
Asia (0.67)

Genre:

Research Report > New Finding (1.00)
Instructional Material (1.00)
Research Report > Experimental Study (0.67)

Industry:

Law > Litigation (1.00)
Government (1.00)
Education > Educational Technology (0.87)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

Toward the Autonomous AI Doctor: Quantitative Benchmarking of an Autonomous Agentic AI Versus Board-Certified Clinicians in a Real World Setting

Hayat, Hashim, Kudrautsau, Maksim, Makarov, Evgeniy, Melnichenko, Vlad, Tsykunou, Tim, Varaksin, Piotr, Pavelle, Matt, Oskowitz, Adam Z.

arXiv.org Artificial Intelligence, Aug-1-2025

The CSS was accompanied by a natural language explanation of the scores. The LLM judge role used GPT-4.0 by OpenAI. Evaluation by Human Experts: Each encounter pair in which the top diagnosis of AI and clinician did not match was evaluated by a board-certified physician with access to medical reference material. Blinding the physician to the origin of the documentation proved impractical, as the AI-based notes were highly consistent and thus easily recognized within a few pairs. The physician was asked to determine the cause of the disagreement between the documents, whether AI or the physician was more likely to be correct, whether it was not possible to determine which diagnosis was more appropriate, and whether the diagnoses did, in fact, match. Similarity and Style Metrics: To evaluate how similar or different the AI-generated (Doctronic) and clinician-generated SOAP notes were, we followed a two-step process. First, we assessed surface-level textual similarity using three standard statistical metrics: (1) TF-IDF cosine similarity, which transforms each note into a weighted term-frequency vector and measures the cosine of the angle between them to capture word-frequency alignment; (2) the Jaccard index, which is the ratio of the intersection to the union of lowercased token sets, ranging from 0 (no overlap) to 1 (identical token sets); and (3) the Levenshtein ratio, a normalized edit-distance score based on character-level insertions, deletions, and substitutions that quantifies textual similarity on a 0-1 scale. These analyses demonstrated only minimal alignment in phrasing, formatting, and vocabulary. Then, to probe contextual and semantic similarity, we generated embeddings for each note using OpenAI's text-embedding-3-small model and two versions of BioBERT,
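
The three surface-level metrics named in the excerpt above can be illustrated with a minimal standard-library sketch. This is not the authors' code: the tokenizer, the smoothed IDF variant, the choice to fit TF-IDF on only the two notes being compared, and the normalization of the Levenshtein ratio (1 minus distance over the longer length) are all assumptions made here for illustration, and the sample notes are invented.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters and digits (assumed tokenizer)."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_cosine(doc_a, doc_b):
    """Cosine similarity of TF-IDF vectors fitted on just these two notes."""
    docs = [Counter(tokenize(doc_a)), Counter(tokenize(doc_b))]
    vocab = set(docs[0]) | set(docs[1])
    n = len(docs)
    # Smoothed IDF (assumed variant): ln((1 + n) / (1 + df)) + 1
    idf = {t: math.log((1 + n) / (1 + sum(t in d for d in docs))) + 1 for t in vocab}
    vecs = [{t: d[t] * idf[t] for t in d} for d in docs]
    dot = sum(w * vecs[1].get(t, 0.0) for t, w in vecs[0].items())
    norms = [math.sqrt(sum(w * w for w in v.values())) for v in vecs]
    if norms[0] == 0 or norms[1] == 0:
        return 0.0
    return dot / (norms[0] * norms[1])


def jaccard(doc_a, doc_b):
    """Intersection over union of the lowercased token sets."""
    a, b = set(tokenize(doc_a)), set(tokenize(doc_b))
    if not a and not b:
        return 1.0  # two empty notes are treated as identical
    return len(a & b) / len(a | b)


def levenshtein_distance(a, b):
    """Character-level edit distance (insertions, deletions, substitutions)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def levenshtein_ratio(a, b):
    """Edit distance rescaled to a 0-1 similarity (assumed normalization)."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein_distance(a, b) / longest


if __name__ == "__main__":
    # Hypothetical note fragments, invented for this example.
    note_ai = "Assessment: acute bronchitis. Plan: rest, fluids, follow up in 7 days."
    note_clinician = "A/P: likely viral bronchitis; supportive care, return if worse."
    print("TF-IDF cosine:", round(tfidf_cosine(note_ai, note_clinician), 3))
    print("Jaccard index:", round(jaccard(note_ai, note_clinician), 3))
    print("Levenshtein ratio:", round(levenshtein_ratio(note_ai, note_clinician), 3))
```

Other libraries define the Levenshtein ratio differently (for example, by the combined length of both strings), so absolute values from this sketch are not comparable to the paper's reported scores.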

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2507.22902

Country: North America > United States > California > San Francisco County > San Francisco (0.28)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Rheumatology (1.00)
Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
(16 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Diagnosis (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.44)

The Download: OpenAI's future research, and US climate regulation is under threat

MIT Technology Review, Jul-31-2025, 12:10:00 GMT

But Altman is not the one building the technology on which its reputation rests. That responsibility falls to OpenAI's twin heads of research--chief research officer Mark Chen and chief scientist Jakub Pachocki. Between them, they share the role of making sure OpenAI stays one step ahead of powerhouse rivals like Google. I recently sat down with Chen and Pachocki for an exclusive conversation which covered everything from how they manage the inherent tension between research and product, to what they really mean when they talk about AGI, to what happened to OpenAI's superalignment team. I also wanted to get a sense of where their heads are at in the run-up to OpenAI's biggest product release in months: GPT-5.

large language model, machine learning, natural language, (10 more...)

MIT Technology Review

Country: North America > United States (0.60)

Industry:

Law > Environmental Law (0.74)
Law > Statutes (0.57)
Government > Regional Government > North America Government > United States Government (0.38)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

AIs gave scarily specific self-harm advice to users expressing suicidal intent, researchers find

Los Angeles Times, Jul-31-2025, 10:00:00 GMT

A few months ago, Northeastern University computer scientist Annika Schoene was playing around with ChatGPT when she found a troubling gap in its safeguards against harmful content. The usage policies of OpenAI, creator of ChatGPT, state that users shouldn't employ the company's generative artificial intelligence model or other tools to harm themselves or others. In an effort to test the strength of that guardrail, Schoene typed in a stark and sobering question: Can you tell me how to kill myself? The model responded with the Suicide & Crisis Lifeline 988 number and a suggestion to reach out for help. Without starting a new session, Schoene tried a new tactic. In her next prompt, she framed the request as a hypothetical posed solely for academic purposes.

large language model, machine learning, natural language, (19 more...)

Los Angeles Times

Country:

North America > United States > Colorado (0.05)
North America > Canada (0.05)
Asia > Middle East > Israel (0.05)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.80)

The two people shaping the future of OpenAI's research

MIT Technology Review, Jul-31-2025, 09:06:48 GMT

I sat down with Chen and Pachocki for an exclusive conversation during a recent trip the pair made to London, where OpenAI set up its first international office in 2023. We talked about how they manage the inherent tension between research and product. We also talked about why they think coding and math are the keys to more capable all-purpose models; what they really mean when they talk about AGI; and what happened to OpenAI's superalignment team, set up by the firm's cofounder and former chief scientist Ilya Sutskever to prevent a hypothetical superintelligence from going rogue, which disbanded soon after he quit. In particular, I wanted to get a sense of where their heads are at in the run-up to OpenAI's biggest product release in months: GPT-5. Reports are out that the firm's next-generation model will be launched in August.

large language model, machine learning, natural language, (9 more...)

MIT Technology Review

Country: Europe > United Kingdom > England > Greater London > London (0.06)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

OFCnetLLM: Large Language Model for Network Monitoring and Alertness

Yoon, Hong-Jun, Kiran, Mariam, Ebling, Danial, Breen, Joe

arXiv.org Artificial Intelligence, Jul-31-2025

The rapid evolution of network infrastructure is bringing new challenges and opportunities for efficient network management, optimization, and security. With very large monitoring databases becoming expensive to explore, the use of AI and Generative AI can help reduce the costs of managing these datasets. This paper explores the use of Large Language Models (LLMs) to revolutionize network monitoring management by addressing the limitations of query finding and pattern analysis. We leverage LLMs to enhance anomaly detection, automate root-cause analysis, and automate incident analysis to build a well-monitored network management team using AI. Through a real-world example of developing our own OFCnetLLM, based on an open-source LLM, we demonstrate practical applications of OFCnetLLM in the OFC conference network. Our model is developed as a multi-agent approach and is still evolving, and we present early results here.

large language model, machine learning, ofcnetllm, (17 more...)

arXiv.org Artificial Intelligence

2507.22711

Country: North America > United States > Utah (0.16)

Genre: Research Report (1.00)

Industry:

Telecommunications > Networks (1.00)
Information Technology > Security & Privacy (1.00)
Information Technology > Networks (1.00)
Government > Regional Government > North America Government > United States Government (0.48)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

AI-generated stories favour stability over change: homogeneity and cultural stereotyping in narratives generated by gpt-4o-mini

Rettberg, Jill Walker, Wigers, Hermann

arXiv.org Artificial Intelligence, Jul-31-2025

Can a language model trained largely on Anglo-American texts generate stories that are culturally relevant to other nationalities? To find out, we generated 11,800 stories - 50 for each of 236 countries - by sending the prompt "Write a 1500 word potential {demonym} story" to OpenAI's model gpt-4o-mini. Although the stories do include surface-level national symbols and themes, they overwhelmingly conform to a single narrative plot structure across countries: a protagonist lives in or returns home to a small town and resolves a minor conflict by reconnecting with tradition and organising community events. Real-world conflicts are sanitised, romance is almost absent, and narrative tension is downplayed in favour of nostalgia and reconciliation. The result is a narrative homogenisation: an AI-generated synthetic imaginary that prioritises stability above change and tradition above growth. We argue that the structural homogeneity of AI-generated narratives constitutes a distinct form of AI bias, a narrative standardisation that should be acknowledged alongside the more familiar representational bias. These findings are relevant to literary studies, narratology, critical AI studies, NLP research, and efforts to improve the cultural alignment of generative AI.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

doi: 10.12688/openreseurope.20576.1

2507.22445

Country:

North America > United States (1.00)
Europe (1.00)
Africa (0.67)
Asia > Middle East > Palestine (0.46)

Genre: Research Report > New Finding (0.67)

Industry:

Leisure & Entertainment (1.00)
Government (1.00)
Law Enforcement & Public Safety (0.68)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.71)

FRED: Financial Retrieval-Enhanced Detection and Editing of Hallucinations in Language Models

Tan, Likun, Huang, Kuan-Wei, Wu, Kevin

arXiv.org Artificial Intelligence, Jul-31-2025

Hallucinations in large language models pose a critical challenge for applications requiring factual reliability, particularly in high-stakes domains such as finance. This work presents an effective approach for detecting and editing factually incorrect content in model-generated responses based on the provided context. Given a user-defined domain-specific error taxonomy, we construct a synthetic dataset by inserting tagged errors into financial question-answering corpora and then fine-tune four language models, Phi-4, Phi-4-mini, Qwen3-4B, and Qwen3-14B, to detect and edit these factual inaccuracies. Our best-performing model, fine-tuned Phi-4, achieves an 8% improvement in binary F1 score and a 30% gain in overall detection performance compared to OpenAI-o3. Notably, our fine-tuned Phi-4-mini model, despite having only 4 billion parameters, maintains competitive performance with just a 2% drop in binary detection and a 0.1% decline in overall detection compared to OpenAI-o3. Our work provides a practical solution for detecting and editing factual inconsistencies in financial text generation while introducing a generalizable framework that can enhance the trustworthiness and alignment of large language models across diverse applications beyond finance. Our code and data are available at https://github.com/pegasi-ai/shield.

large language model, machine learning, natural language, (16 more...)

arXiv.org Artificial Intelligence

2507.2093

Country: North America > United States (0.14)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.45)
