AITopics | Large Language Model

When the AI bubble bursts, humans will finally have their chance to take back control

Rafael Behr

The Guardian | Dec-23-2025, 06:00:49 GMT

The US economy is pumped up on tech-bro vanity. If AI did not change your life in 2025, next year it will. That is one of few forecasts that can be made with confidence in unpredictable times. This is not an invitation to believe the hype about what the technology can do today, or may one day achieve. The hype doesn't need your credence.

ai bubble burst, intelligence, take back control rafael behr, (7 more...)

The Guardian

Country:

North America > United States > California (0.07)
Oceania > Australia (0.05)
Europe > United Kingdom (0.05)
(2 more...)

Industry:

Leisure & Entertainment > Sports (0.71)
Government > Regional Government > North America Government > United States Government (0.69)

Technology:

Information Technology > Communications > Social Media (0.74)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.52)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.51)

Secret mixtures of experts inside your LLM

Boix-Adsera, Enric

arXiv.org Machine Learning | Dec-23-2025

Despite being one of the earliest neural network layers, the Multilayer Perceptron (MLP) is arguably one of the least understood parts of the transformer architecture due to its dense computation and lack of easy visualization. This paper seeks to understand the MLP layers in dense LLM models by hypothesizing that these layers secretly approximately perform a sparse computation -- namely, that they can be well approximated by sparsely-activating Mixture of Experts (MoE) layers. Our hypothesis is based on a novel theoretical connection between MoE models and Sparse Autoencoder (SAE) structure in activation space. We empirically validate the hypothesis on pretrained LLMs, and demonstrate that the activation distribution matters -- these results do not hold for Gaussian data, but rather rely crucially on structure in the distribution of neural network activations. Our results shine light on a general principle at play in MLP layers inside LLMs, and give an explanation for the effectiveness of modern MoE-based transformers. Additionally, our experimental explorations suggest new directions for more efficient MoE architecture design based on low-rank routers.
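For readers unfamiliar with the layer type the abstract refers to, the following is a minimal sketch of a sparsely-activating MoE layer with top-k routing. It is not the paper's construction: the softmax gating over the selected experts, the ReLU experts, and every name here (`top_k_moe`, `router`, `experts`) are assumptions chosen for illustration.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def relu_mlp(W_in, W_out, x):
    """One small expert: linear map, ReLU, linear map."""
    hidden = [max(0.0, v) for v in matvec(W_in, x)]
    return matvec(W_out, hidden)

def top_k_moe(x, router, experts, k):
    """Hypothetical top-k MoE layer.

    router  -- one score row per expert (assumed linear router)
    experts -- list of (W_in, W_out) pairs
    Only the k highest-scoring experts are evaluated for this input.
    """
    scores = matvec(router, x)
    chosen = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # Softmax restricted to the chosen experts (an assumed gating rule).
    peak = max(scores[i] for i in chosen)
    gates = [math.exp(scores[i] - peak) for i in chosen]
    total = sum(gates)
    out = [0.0] * len(experts[0][1])
    for i, g in zip(chosen, gates):
        y = relu_mlp(experts[i][0], experts[i][1], x)
        out = [o + (g / total) * yi for o, yi in zip(out, y)]
    return out, chosen

# Toy setup: three experts that scale the input by 1, 2 and 3.
router = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]
experts = [(identity, [[s, 0.0], [0.0, s]]) for s in (1.0, 2.0, 3.0)]
x = [2.0, 1.0]
out, chosen = top_k_moe(x, router, experts, k=2)
```

In this toy setup only two of the three experts run for the input `x`, which is the sense in which the computation is sparse; the third expert's weights are never touched.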

arxiv preprint arxiv, mlp layer, moe, (14 more...)

arXiv.org Machine Learning

2512.18452

Country:

North America > United States > Pennsylvania (0.04)
North America > United States > District of Columbia > Washington (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

OpenAI's Child Exploitation Reports Increased Sharply This Year

WIRED | Dec-22-2025, 16:32:20 GMT

The company made 80 times as many reports to the National Center for Missing & Exploited Children during the first six months of 2025 as it did in the same period a year prior. OpenAI sent 80 times as many child exploitation incident reports to the National Center for Missing & Exploited Children during the first half of 2025 as it did during a similar time period in 2024, according to a recent update from the company. The NCMEC's CyberTipline is a Congressionally authorized clearinghouse for reporting child sexual abuse material (CSAM) and other forms of child exploitation. Companies are required by law to report apparent child exploitation to the CyberTipline. When a company sends a report, NCMEC reviews it and then forwards it to the appropriate law enforcement agency for investigation.

chatgpt, child exploitation report, openai, (12 more...)

WIRED

Country:

North America > United States > California (0.05)
Europe > Slovakia (0.05)
Europe > Czechia (0.05)
(2 more...)

Industry:

Law Enforcement & Public Safety > Crime Prevention & Enforcement (1.00)
Law (1.00)
Information Technology (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Five AI Developments That Changed Everything This Year

TIME - Tech | Dec-22-2025, 13:00:00 GMT

President Donald Trump speaks in the Roosevelt Room flanked by Masayoshi Son, Larry Ellison, and Sam Altman at the White House on January 21, 2025. In case you missed it, 2025 was a big year for AI. It became an economic force, propping up the stock market, and a geopolitical pawn, redrawing the frontlines of Great Power competition. It had both global and deeply personal effects, changing the ways that we think, write, and relate.

advertisement, ai development, openai, (15 more...)

TIME - Tech

Country:

Asia > China (0.43)
Europe > France (0.05)
North America > United States > Missouri (0.05)
(2 more...)

Industry: Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.81)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.58)

Brain Gear Is the Hot New Wearable

WIRED | Dec-22-2025, 11:00:00 GMT

Smartwatches are cool and all, but have you considered wearable neurotech? Ten years ago, a Fitbit was about as sophisticated a wearable as you could get. Then came the sleeker, more unassuming Oura ring. Now there's a new breed of wearables, built for your head. Instead of tracking your step count, heart rate, and skin temperature, these devices are designed to read your brain waves.

brain gear, brain signal, hot new wearable, (10 more...)

WIRED

Country:

Europe > United Kingdom (0.05)
Asia > China (0.05)
Oceania > Australia (0.05)
(7 more...)

Genre: Research Report (0.50)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.96)

Technology:

Information Technology > Hardware (0.91)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.30)

Tech Disrupted Friendship. It's Time to Bring It Back

WIRED | Dec-22-2025, 11:00:00 GMT

Two decades ago, social media promised to connect people with pals far and wide. Twenty years online has left us turning to AI for kinship. IRL companionship is the future. Anyone looking for a vibe check on the populace's current feelings about AI would do well to check out the walls of the New York City subway system. This fall, alongside posters for everything from dating apps to Skechers, a newcomer made its debut: Friend.

chatbot, tech disrupted friendship, time, (14 more...)

WIRED

Country:

North America > United States > New York (0.25)
North America > United States > California (0.15)
Asia > China (0.05)
(3 more...)

Industry:

Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)
Information Technology > Services (0.95)
Transportation (0.69)
Media (0.69)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.75)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.71)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Mitigating Forgetting in Low Rank Adaptation

Sliwa, Joanna, Schneider, Frank, Hennig, Philipp, Hernandez-Lobato, Jose Miguel

arXiv.org Machine Learning | Dec-22-2025

Parameter-efficient fine-tuning methods, such as Low-Rank Adaptation (LoRA), enable fast specialization of large pre-trained models to different downstream applications. However, this process often leads to catastrophic forgetting of the model's prior domain knowledge. We address this issue with LaLoRA, a weight-space regularization technique that applies a Laplace approximation to Low-Rank Adaptation. Our approach estimates the model's confidence in each parameter and constrains updates in high-curvature directions, preserving prior knowledge while enabling efficient target-domain learning. By applying the Laplace approximation only to the LoRA weights, the method remains lightweight. We evaluate LaLoRA by fine-tuning a Llama model for mathematical reasoning and demonstrate an improved learning-forgetting trade-off, which can be directly controlled via the method's regularization strength. We further explore different loss landscape curvature approximations for estimating parameter confidence, analyze the effect of the data used for the Laplace approximation, and study robustness across hyperparameters.
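The idea of constraining updates in high-curvature directions can be illustrated with a quadratic penalty weighted by a per-parameter curvature estimate. The sketch below is an assumption-laden illustration, not LaLoRA as published: it assumes a diagonal curvature estimate built from averaged squared gradients, a flat list of adapter parameters, and hypothetical helper names (`diagonal_curvature`, `penalty`, `regularized_step`).

```python
def diagonal_curvature(per_example_grads):
    """Average of squared per-example gradients (assumed diagonal estimate)."""
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    return [sum(g[i] ** 2 for g in per_example_grads) / n for i in range(dim)]

def penalty(theta, theta_ref, curvature, strength):
    """0.5 * strength * sum_i curvature_i * (theta_i - theta_ref_i)^2."""
    return 0.5 * strength * sum(
        c * (t - r) ** 2 for t, r, c in zip(theta, theta_ref, curvature)
    )

def penalty_grad(theta, theta_ref, curvature, strength):
    """Gradient of the penalty with respect to theta."""
    return [strength * c * (t - r) for t, r, c in zip(theta, theta_ref, curvature)]

def regularized_step(theta, task_grad, theta_ref, curvature, strength, lr):
    """One gradient step on task loss plus penalty."""
    reg = penalty_grad(theta, theta_ref, curvature, strength)
    return [t - lr * (g + r) for t, g, r in zip(theta, task_grad, reg)]

# Toy example: the first parameter has high curvature, the second has none.
curv = diagonal_curvature([[1.0, 0.0], [3.0, 0.0]])
cost = penalty([1.0, 1.0], [0.0, 0.0], curv, strength=2.0)
new_theta = regularized_step(
    [1.0, 1.0], [0.0, 0.0], [0.0, 0.0], curv, strength=2.0, lr=0.1
)
```

With a zero task gradient, the step pulls only the high-curvature parameter back toward its reference value and leaves the zero-curvature parameter free to move. Raising `strength` tightens that pull, which is one way to read the abstract's remark that the trade-off is controlled by the regularization strength.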

accuracy, approximation, fine-tuning, (13 more...)

arXiv.org Machine Learning

2512.1772

Country:

Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > United States > Virginia (0.04)
(2 more...)

Genre: Research Report > New Finding (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Smoothing DiLoCo with Primal Averaging for Faster Training of LLMs

Defazio, Aaron, Mishchenko, Konstantin, Raman, Parameswaran, Shi, Hao-Jun Michael, Xiao, Lin

arXiv.org Machine Learning | Dec-22-2025

We propose Generalized Primal Averaging (GPA), an extension of Nesterov's method in its primal averaging formulation that addresses key limitations of recent averaging-based optimizers such as single-worker DiLoCo and Schedule-Free (SF) in the non-distributed setting. These two recent algorithmic approaches improve the performance of base optimizers, such as AdamW, through different iterate averaging strategies. Schedule-Free explicitly maintains a uniform average of past weights, while single-worker DiLoCo performs implicit averaging by periodically aggregating trajectories, called pseudo-gradients, to update the model parameters. However, single-worker DiLoCo's periodic averaging introduces a two-loop structure, increasing its memory requirements and number of hyperparameters. GPA overcomes these limitations by decoupling the interpolation constant in the primal averaging formulation of Nesterov. This decoupling enables GPA to smoothly average iterates at every step, generalizing and improving upon single-worker DiLoCo. Empirically, GPA consistently outperforms single-worker DiLoCo while removing the two-loop structure, simplifying hyperparameter tuning, and reducing its memory overhead to a single additional buffer. On the Llama-160M model, GPA provides a 24.22% speedup in terms of steps to reach the baseline (AdamW's) validation loss. Likewise, GPA achieves speedups of 12% and 27% on small and large batch setups, respectively, to attain AdamW's validation accuracy on the ImageNet ViT workload. Furthermore, we prove that for any base optimizer with regret bounded by $O(\sqrt{T})$, where $T$ is the number of iterations, GPA can match or exceed the convergence guarantee of the original optimizer, depending on the choice of interpolation constants.
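To make the notion of iterate averaging concrete, here is a small sketch of a running average whose interpolation coefficient can be chosen freely at each step. It is not GPA, whose update rule the abstract does not spell out. It only shows that a coefficient of 1/t reproduces the uniform average of past iterates, while a constant coefficient weights recent iterates more heavily. The function name `running_average` and the toy iterates are assumptions for illustration.

```python
def running_average(iterates, coefficient):
    """Fold iterates into an average: avg <- (1 - c_t) * avg + c_t * z_t.

    coefficient -- function mapping the 1-indexed step t to c_t in (0, 1]
    The first iterate initializes the average.
    """
    average = None
    for t, z in enumerate(iterates, start=1):
        if average is None:
            average = list(z)
            continue
        c = coefficient(t)
        average = [(1.0 - c) * a + c * zi for a, zi in zip(average, z)]
    return average

# Toy one-dimensional iterates standing in for optimizer weights.
iterates = [[1.0], [2.0], [3.0], [4.0]]
uniform = running_average(iterates, lambda t: 1.0 / t)   # uniform mean of all iterates
constant = running_average(iterates, lambda t: 0.5)      # favors recent iterates
```

Either variant needs only one extra buffer, the running average itself, alongside the current iterate.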

diloco, formulation, optimizer, (15 more...)

arXiv.org Machine Learning

2512.17131

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > California > San Mateo County > Menlo Park (0.04)
Europe > Russia (0.04)
Asia > Russia (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

You can now tweak how warm and enthusiastic ChatGPT's responses are

Engadget | Dec-21-2025, 20:40:04 GMT

OpenAI is letting users decide between more, less or default options to adjust ChatGPT's personality. For anyone who's finding ChatGPT rude or sassy, OpenAI has some welcome news since it's letting users further customize its personality with extra warmth or enthusiasm. You can now adjust specific characteristics in ChatGPT, like warmth, enthusiasm, and emoji use. In a post on X, OpenAI revealed that users can adjust characteristics under new Warm, Enthusiastic, Header & Lists and Emoji options found in the Personalization settings.

enthusiastic chatgpt, openai, term and privacy policy, (8 more...)

Engadget

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Get This All-In-One AI Platform for 86% Off Before the New Year

PCWorld | Dec-21-2025, 08:00:00 GMT

When you purchase through links in our articles, we may earn a small commission. Lifetime access to the 1minAI Advanced Business Plan is now $74.97 (MSRP $540). The end of the year is when people start hunting for upgrades that will actually make life easier in January. If you have spent most of 2024 bouncing between ChatGPT, Gemini, Claude, and your favorite image generator, this holiday deal rolls everything into one place. For $74.97, you can lock in lifetime access to the 1minAI Advanced Business Plan and begin 2025 with every AI tool you could possibly need.

gaming laptop mobile monitor pc, mobile monitor pc, security software storage streaming wi-fi, (7 more...)

PCWorld

Country: North America > United States > California (0.05)

Industry: Information Technology > Security & Privacy (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.51)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.51)
