AITopics | Large Language Model

SoftBank CEO Masayoshi Son (left) and OpenAI CEO Sam Altman attend an event in Tokyo in February 2025. SoftBank's investment gain on OpenAI stood at an estimated $19.8 billion as of December.

SoftBank Group sprang back to a quarterly profit after investment gains from OpenAI neared $20 billion, a promising start for one of CEO Masayoshi Son's signature gambles alongside ByteDance and Alibaba Group Holding. The Tokyo-based company has invested about $34.6 billion in OpenAI, accumulating an 11% stake as of December, and has been in talks to invest as much as $30 billion more in a round that would value the startup at about $750 billion to $830 billion. As of December, SoftBank's investment gain on OpenAI stood at $19.8 billion, the company said Thursday.

large language model, machine learning, natural language, (17 more...)

The Japan Times

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.48)

Genre: Press Release (0.36)

Industry:

Telecommunications (1.00)
Information Technology (1.00)
Media > News (0.31)
Leisure & Entertainment > Sports (0.31)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)


fb7451e43f9c1c35b774bcfad7a5714b-Paper-Conference.pdf

Neural Information Processing Systems, Feb-13-2026, 01:16:11 GMT

arxiv preprint arxiv, generalization, length generalization, (13 more...)

Neural Information Processing Systems

Country: North America > Canada > Ontario > Toronto (0.14)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)


Quantum Circuit Generation via test-time learning with large language models

Macarone-Palmieri, Adriano, Franco, Rosario Lo

arXiv.org Machine Learning, Feb-13-2026

Large language models (LLMs) can generate structured artifacts, but using them as dependable optimizers for scientific design requires a mechanism for iterative improvement under black-box evaluation. Here, we cast quantum circuit synthesis as a closed-loop, test-time optimization problem: an LLM proposes edits to a fixed-length gate list, and an external simulator evaluates the resulting state with the Meyer-Wallach (MW) global entanglement measure. We introduce a lightweight test-time learning recipe that reuses prior high-performing candidates as an explicit memory trace, augments prompts with score-difference feedback, and applies restart-from-the-best sampling to escape potential plateaus. Across fixed 20-qubit settings, the loop without feedback and restart-from-the-best already improves random initial circuits over a range of gate budgets. To raise performance and success rate further, we use the full learning strategy. In the 25-qubit setting, it mitigates a pronounced performance plateau that appears under naive querying. Beyond raw scores, we analyze the structure of synthesized states and find that high-MW solutions can correspond to stabilizer or graph-state-like constructions, but full connectivity is not guaranteed because of the properties of the metric and the prompt design. These results illustrate both the promise and the pitfalls of memory evaluator-guided LLM optimization for circuit synthesis, highlighting the critical role of prior human-made theorems in optimally designing a custom tool to support research.
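The Meyer-Wallach score used as the evaluator in this abstract can be illustrated with a short sketch. It assumes the commonly used purity form of the measure, Q = 2(1 - (1/n) Σ_k Tr ρ_k²), where ρ_k is the reduced state of qubit k; the abstract does not say which form or simulator the authors use. The function names `single_qubit_purity` and `meyer_wallach` are hypothetical, and so is the convention that qubit k corresponds to bit k of the amplitude index.

```python
import math

def single_qubit_purity(state, k):
    """Tr(rho_k^2) for qubit k of a normalized pure state.

    `state` is a list of 2**n amplitudes; qubit k is taken to be bit k
    of the amplitude index (an assumed convention).
    """
    mask = 1 << k
    r00 = 0.0   # <0|rho_k|0>
    r11 = 0.0   # <1|rho_k|1>
    r01 = 0j    # <0|rho_k|1>
    for idx in range(len(state)):
        if idx & mask:
            continue  # visit each (bit k = 0, bit k = 1) pair once
        a0 = state[idx]
        a1 = state[idx | mask]
        r00 += abs(a0) ** 2
        r11 += abs(a1) ** 2
        r01 += a0 * a1.conjugate()
    # Purity of a 2x2 Hermitian matrix
    return r00 ** 2 + r11 ** 2 + 2 * abs(r01) ** 2

def meyer_wallach(state, n_qubits):
    """Meyer-Wallach global entanglement, in the purity form."""
    norm = math.sqrt(sum(abs(a) ** 2 for a in state))
    state = [a / norm for a in state]
    avg_purity = sum(single_qubit_purity(state, k)
                     for k in range(n_qubits)) / n_qubits
    return 2 * (1 - avg_purity)

# Two reference states on 3 qubits
product = [1.0] + [0.0] * 7                 # |000>
ghz = [1.0] + [0.0] * 6 + [1.0]             # |000> + |111>, unnormalized
q_product = meyer_wallach(product, 3)
q_ghz = meyer_wallach(ghz, 3)
```

Under this form, a product state scores 0 and a GHZ state scores 1, so a closed-loop optimizer of the kind described would be pushing candidate circuits toward the upper end of that range.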

cnot, large language model, natural language, (18 more...)

arXiv.org Machine Learning

2602.03466

Country:

North America > United States (0.04)
Europe > Switzerland (0.04)
Europe > Italy > Veneto > Venice (0.04)
Europe > Italy > Sicily > Palermo (0.04)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)


Deriving Neural Scaling Laws from the statistics of natural language

Cagnetta, Francesco, Raventós, Allan, Ganguli, Surya, Wyart, Matthieu

arXiv.org Machine Learning, Feb-13-2026

Despite the fact that experimental neural scaling laws have substantially guided empirical progress in large-scale machine learning, no existing theory can quantitatively predict the exponents of these important laws for any modern LLM trained on any natural language dataset. We provide the first such theory in the case of data-limited scaling laws. We isolate two key statistical properties of language that alone can predict neural scaling exponents: (i) the decay of pairwise token correlations with time separation between token pairs, and (ii) the decay of the next-token conditional entropy with the length of the conditioning context. We further derive a simple formula in terms of these statistics that predicts data-limited neural scaling exponents from first principles without any free parameters or synthetic data models. Our theory exhibits a remarkable match with experimentally measured neural scaling laws obtained from training GPT-2 and LLaMA style models from scratch on two qualitatively different benchmarks, TinyStories and WikiText.
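The second statistic named in this abstract, the decay of next-token conditional entropy with context length, can be illustrated with a naive plug-in estimator. This is only a sketch under stated assumptions: it is not the estimator or the formula used by the authors, it is biased on small samples, and the function name `conditional_entropy` and the toy sequence are hypothetical.

```python
from collections import Counter
import math

def conditional_entropy(tokens, context_len):
    """Plug-in estimate of H(next token | previous `context_len` tokens), in bits."""
    joint = Counter()     # counts of (context, next token)
    contexts = Counter()  # counts of context alone
    for i in range(context_len, len(tokens)):
        ctx = tuple(tokens[i - context_len:i])
        joint[(ctx, tokens[i])] += 1
        contexts[ctx] += 1
    total = sum(joint.values())
    entropy = 0.0
    for (ctx, _), count in joint.items():
        # p(ctx, next) * log2 p(next | ctx)
        entropy -= (count / total) * math.log2(count / contexts[ctx])
    return entropy

# Toy sequence: strictly alternating tokens, so one token of context
# removes all uncertainty about the next one.
toy = list("ab" * 50)
h0 = conditional_entropy(toy, 0)
h1 = conditional_entropy(toy, 1)
```

On real text the estimate would fall off gradually as `context_len` grows rather than dropping to zero after one token, and it is that rate of decay which the abstract identifies as one of the two inputs to the predicted scaling exponent.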

large language model, machine learning, natural language, (17 more...)

arXiv.org Machine Learning

2602.07488

Country: