AITopics | Large Language Model

Large Language Model

Inside Anduril and Meta's quest to make smart glasses for warfare

MIT Technology Review | May-18-2026, 16:01:39 GMT

It's been a year since the duo entered the US Army's troubled augmented-reality contest. Here's what it looks like so far. The defense-tech company Anduril has shared new details about the augmented-reality headset for the military it's prototyping with Meta, including a vision for ordering drone strikes via eye-tracking and voice commands. Quay Barnett, who leads the efforts as a vice president at Anduril following a career in the Army's Special Operations Command, says his fundamental goal is to optimize "the human as a weapons system." The vision is undoubtedly cyborg-inspired: Barnett wants drones and soldiers to see together, share information seamlessly, and make decisions as one. Anduril actually has two such projects in the works.

anduril, large language model, machine learning, (19 more...)

MIT Technology Review

Country: North America > United States (0.90)

Industry:

Government > Military > Army (1.00)
Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Human Computer Interaction > Interfaces > Virtual Reality (0.76)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.50)
(2 more...)

The Download: Musk v. Altman week 3, and Trump's tech trading

MIT Technology Review | May-18-2026, 12:10:00 GMT

Musk v. Altman week 3: Musk and Altman traded blows over each other's credibility. Now the jury will pick a side. In the final week of the Musk v. Altman trial, lawyers attacked the credibility of the two tech leaders. Sam Altman was accused of lying and self-dealing, while Elon Musk was portrayed as a power-seeker trying to control artificial general intelligence. The case unearthed new details about the two arch-rivals and OpenAI's contested nonprofit status, as well as a golden trophy of a donkey's ass awarded to an employee who challenged Musk. Michelle Kim, who's also a lawyer, has been in court throughout the Musk v. Altman trial.

large language model, machine learning, natural language, (19 more...)

MIT Technology Review

Country:

North America > United States (0.48)
Asia > Middle East > Iran (0.15)

Industry:

Energy (1.00)
Law (0.88)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)

AI Has Broken Containment

The Atlantic - Technology | May-18-2026, 11:30:00 GMT

Once-speculative concerns about the technology have now become pressing matters. AI has ascended to the role of main character. When Donald Trump traveled to Beijing for a historic summit last week, AI was one of the central topics of his discussions with Xi Jinping. As the two nations remain locked in a technological arms race, the president brought along some of the United States' most powerful AI executives, including Elon Musk and Nvidia's Jensen Huang. A continent away, the European Union has been unsuccessfully petitioning Anthropic to grant access to its advanced cybersecurity model, Mythos. Back in the United States, millions of students and teachers are dealing with the fallout of a devastating ransomware attack on the software platform Canvas, a hack that was likely aided by AI tools.

large language model, machine learning, popular latest newsletter, (13 more...)

The Atlantic - Technology

Country:

North America > United States (1.00)
Asia > China > Beijing > Beijing (0.25)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.51)
Information Technology > Communications > Social Media (0.49)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.32)

The Prehistory of A.I. Slop

The New Yorker | May-18-2026, 10:00:00 GMT

Jill Lepore chronicles the rise of machine-generated writing, from a Hollywood plot-writing grift and Cold War computer poetry to the age of ChatGPT.

large language model, natural language, poetry, (13 more...)

The New Yorker

Country: North America > United States (0.14)

Industry:

Media (1.00)
Health & Medicine > Therapeutic Area (0.48)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.34)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.34)

Representation Without Reward: A JEPA Audit for LLM Fine-Tuning

Sengupta, Biswa

arXiv.org Machine Learning | May-18-2026

Joint-embedding predictive architectures (JEPAs) propose that a model should learn more useful abstractions when trained to predict latent representations rather than observed outputs. For autoregressive language-model fine-tuning the principle entails a stricter requirement: the induced hidden-state geometry must reach the language-model head and improve the decoded task metric. We test that requirement under a fixed Llama-3.2-1B-Instruct LoRA harness on natural-language-to-regex generation, comparing twenty-two training-time auxiliaries across trajectory-shape regularisation, distributional constraints, predictor/target asymmetry, Fisher-metric Jacobi residuals, and a decoder-visible JEPA objective constructed to lie in cross-entropy's positive cone. The empirical answer is a structured null: several auxiliaries clear single-cell paired $α = 0.10$ without correction (T3-Local at $Δ = +2.53$ pp, $p = 0.003$ being the strongest), but none survives Bonferroni or Holm–Bonferroni at the relevant family-wise threshold, even though many change curvature, anisotropy, variance, and gradient direction. Decoder-visible JEPA yields the first positive auxiliary–cross-entropy gradient cosine in the study, yet exact match remains inside seed noise; a full-fine-tuning replication of the same auxiliary at $n = 5$ seeds reproduces the null on both benchmarks (TURK: $Δ = +0.04$ pp, $p_{\text{paired}} = 0.96$; SYNTH: $Δ = +0.52$ pp, $p_{\text{paired}} = 0.28$), so the null is robust across LoRA and full fine-tuning for the decoder-visible construction. Hidden-state representation work and decoded-task accuracy are therefore weakly coupled in this regime; we accordingly reframe LLM-domain JEPA evaluation as a coupling problem, in which the operative question is under which metrics useful hidden geometry becomes decoder-visible task signal.

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2605.15394

Genre:

Research Report > New Finding (0.48)
Research Report > Experimental Study (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.72)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)
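
The abstract above turns on the difference between an uncorrected per-test threshold and a family-wise correction. The following is a minimal sketch of the standard Bonferroni and Holm-Bonferroni procedures, not the authors' analysis code. The p-values, the family size of five, and the level `alpha = 0.05` are hypothetical; the abstract does not state the family size or threshold the paper actually used.

```python
def bonferroni(p_values, alpha):
    """Reject H0 for each test whose p-value is at most alpha / m."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def holm_bonferroni(p_values, alpha):
    """Step-down Holm procedure: compare the k-th smallest p-value
    (k = 0, 1, ...) against alpha / (m - k) and stop at the first failure."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    reject = [False] * m
    for rank, idx in enumerate(order):
        if p_values[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # every larger p-value is also retained
    return reject


# Hypothetical p-values for five auxiliaries (illustration only).
p_values = [0.012, 0.020, 0.040, 0.300, 0.600]
alpha = 0.05

uncorrected = [p <= alpha for p in p_values]
print("uncorrected:    ", uncorrected)
print("Bonferroni:     ", bonferroni(p_values, alpha))
print("Holm-Bonferroni:", holm_bonferroni(p_values, alpha))
```

With these made-up numbers, three tests pass the uncorrected threshold but none passes either correction, which is the same pattern the abstract calls a structured null. Holm-Bonferroni is never more conservative than Bonferroni, so a result that fails Holm also fails Bonferroni.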

$ϕ$-Balancing for Mixture-of-Experts Training

Chen, Lizhang, Li, Jonathan, Wang, Qi, Liao, Runlong, Li, Shuozhe, Liang, Chen, Lao, Ni, Liu, Qiang

arXiv.org Machine Learning | May-18-2026

Mixture-of-Experts (MoE) models rely on balanced expert utilization to fully realize their scalability. However, existing load-balancing methods are largely heuristic and operate on noisy mini-batch assignment statistics, introducing bias relative to population-level objectives. We propose $ϕ$-balancing, a principled framework that directly targets population-level expert balance by minimizing a strictly convex, symmetric, and differentiable potential of the expected routing distribution. Using convex duality, we derive an equivalent min-max formulation and obtain a simple online algorithm via mirror descent, yielding an efficient EMA-based routing adjustment with negligible overhead. Across large-scale pretraining and downstream fine-tuning, $ϕ$-balancing consistently outperforms prior Switch-style and loss-free baselines, demonstrating more stable and effective expert utilization.

large language model, machine learning, natural language, (21 more...)

arXiv.org Machine Learning

2605.15403

Country:

North America > United States (0.28)
Asia (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.95)

Reasoning Models Don't Just Think Longer, They Move Differently

Gjølbye, Anders, Hansen, Lars Kai, Koyejo, Sanmi

arXiv.org Machine Learning | May-18-2026

Reasoning-trained language models often spend more tokens on harder problems, but longer chains of thought do not show whether a model is merely computing for more steps or following a different internal trajectory. We study this distinction through hidden-state trajectories during chain-of-thought generation across competitive programming, mathematics, and Boolean satisfiability. Raw trajectory geometry is strongly shaped by generation length: longer generations mechanically alter path statistics, so difficulty-dependent comparisons are misleading without adjustment. After residualizing trajectory statistics on length, difficulty remains systematically coupled to corrected trajectory geometry across all domains studied. The clearest reasoning-specific separation appears in the code domain, where harder problems show more direct corrected trajectories and less heterogeneous local curvature in reasoning-trained models than in matched instruction-tuned baselines. Corrected difficulty-geometry coupling is weaker, but still present, in mathematics and Boolean satisfiability. Prompt-stage linear probes do not mirror the code-domain separation, and behavioral annotations show that stronger corrected coupling co-occurs with strategy shifts and uncertainty monitoring. Together, these findings establish length correction as a prerequisite for generation-time trajectory analysis and show that reasoning training can be associated with distinct corrected trajectory geometry, with the strength of the effect depending on the domain.

large language model, machine learning, trajectory, (22 more...)

arXiv.org Machine Learning

2605.15454

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
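
The abstract above describes residualizing trajectory statistics on generation length before relating them to difficulty. One common way to do that is an ordinary least-squares fit of the statistic on length, keeping the residuals. The sketch below assumes that linear form; the abstract does not specify the paper's exact procedure, and the data and the names `lengths`, `difficulty`, and `path_stat` are hypothetical.

```python
def mean(xs):
    return sum(xs) / len(xs)


def residualize(values, covariate):
    """Residuals of the least-squares fit values ~ intercept + slope * covariate."""
    mx, my = mean(covariate), mean(values)
    sxx = sum((x - mx) ** 2 for x in covariate)
    sxy = sum((x - mx) * (y - my) for x, y in zip(covariate, values))
    slope = sxy / sxx
    intercept = my - slope * mx
    return [y - (intercept + slope * x) for x, y in zip(covariate, values)]


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    ma, mb = mean(a), mean(b)
    num = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    den = (sum((x - ma) ** 2 for x in a) * sum((y - mb) ** 2 for y in b)) ** 0.5
    return num / den


# Hypothetical data: a path statistic driven mostly by generation length,
# plus a smaller contribution from problem difficulty (0 = easy, 1 = hard).
lengths = [10, 20, 30, 40, 50, 60]
difficulty = [0, 1, 0, 1, 0, 1]
path_stat = [0.5 * n + d for n, d in zip(lengths, difficulty)]

corrected = residualize(path_stat, lengths)

print("raw stat vs length:      ", round(pearson(path_stat, lengths), 3))
print("corrected vs length:     ", round(pearson(corrected, lengths), 3))
print("corrected vs difficulty: ", round(pearson(corrected, difficulty), 3))
```

By construction the corrected statistic is uncorrelated with length, so any remaining association with difficulty cannot be a purely linear length effect. One caveat: when difficulty and length are themselves correlated, regressing on length alone also removes the variance they share, so the residual coupling is a conservative estimate.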

OpenAI is offering ChatGPT Plus to citizens of Malta for a year

Engadget | May-16-2026, 16:34:06 GMT

OpenAI has signed deals with fintech startups, tech giants and even Disney, but it's breaking new ground by announcing a world-first partnership with the country of Malta. In a post on its website, OpenAI said that it would provide ChatGPT Plus for one year to every Maltese resident or citizen. "Malta is the first country to launch a partnership of this scale because we refuse to let our citizens stay behind in the digital age," Silvio Schembri, Malta's minister for Economy, Enterprise and Strategic Projects, said in a statement. "We are putting our people at the very forefront of global change." The approximately 574,250 residents living in Malta will have to complete a course developed by the University of Malta before launching the ChatGPT Plus subscription, which costs $20 a month in the US.

large language model, machine learning, natural language, (12 more...)

Engadget

Country: Europe > Middle East > Malta (1.00)

Genre: Instructional Material > Course Syllabus & Notes (0.57)

Industry: Leisure & Entertainment > Games > Computer Games (0.77)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.99)

What we learned from the cringey courtroom drama between Elon Musk and Sam Altman

The Guardian | May-16-2026, 15:00:12 GMT

Both Musk and Altman took the stand for hours, facing combative cross-examinations that painted them each as untrustworthy. Two of the world's richest people faced an airing of their dirty laundry amid their messy, bitter feud over OpenAI. A nine-person jury is set to decide whether Elon Musk's allegations of "stealing a charity" against Sam Altman and OpenAI are legitimate, with deliberations to begin in earnest on Monday. Whatever its outcome, the case has been an illuminating, at times exhausting, look behind the scenes at the history of OpenAI and how some of the most powerful figures in the tech industry operate. Attorneys for both sides have introduced reams of private text messages, emails and even diary entries to support their arguments.

large language model, machine learning, musk, (18 more...)

The Guardian

Country: North America > United States > California (0.31)

Industry:

Information Technology (1.00)
Law > Litigation (0.89)
Leisure & Entertainment > Sports (0.71)
Government > Regional Government > North America Government > United States Government (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.78)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.78)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.78)
Information Technology > Communications > Social Media (0.71)

Cybercriminal Twins Caught After They Forgot to Turn Off Microsoft Teams Recording

WIRED | May-16-2026, 10:30:00 GMT

Plus: Instructure's Canvas ransomware debacle comes to a close, an alleged dark net market kingpin gets arrested, OpenAI workers fall victim to a supply chain attack, and more. The worst part of your iPhone getting stolen may not be the theft itself. Instead, it's the phishing attacks waged against people in your contacts. New research this week shows that there's a thriving ecosystem for tools that let criminals unlock iPhones and target the phone numbers they find inside. Foxconn, the electronics manufacturing giant known for its role in building iPhones, revealed this week that it recently "suffered a cyberattack."

artificial intelligence, large language model, natural language, (15 more...)

WIRED

Country:

North America > United States (0.70)
Asia > Middle East > Iran (0.29)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.35)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Mobile (0.91)
Information Technology > Communications > Social Media (0.85)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.36)
