Understanding Diffusion Models via Code Execution
Diffusion models have achieved remarkable performance in generative modeling, yet their theoretical foundations are often intricate, and the gap between mathematical formulations in papers and practical open-source implementations can be difficult to bridge. Existing tutorials primarily focus on deriving equations, offering limited guidance on how diffusion models actually operate in code. To address this, we present a concise implementation of approximately 300 lines that explains diffusion models from a code-execution perspective. Our minimal example preserves the essential components -- including forward diffusion, reverse sampling, the noise-prediction network, and the training loop -- while removing unnecessary engineering details. This technical report aims to provide researchers with a clear, implementation-first understanding of how diffusion models work in practice and how code and theory correspond. Our code and pre-trained models are available at: https://github.com/disanda/GM/tree/main/DDPM-DDIM-ClassifierFree.
- Asia > China > Chongqing Province > Chongqing (0.04)
- North America > Mexico > Gulf of Mexico (0.04)
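To make the code-execution perspective of the diffusion report above concrete, here is a minimal sketch (not the authors' ~300-line implementation) of the two pieces every DDPM variant shares: the closed-form forward diffusion q(x_t | x_0) and a single noise-prediction training step. The `eps_model` network is a placeholder assumption for any noise-prediction architecture.

```python
import torch
import torch.nn.functional as F

# Linear beta schedule and the cumulative products used by the
# closed-form forward process q(x_t | x_0) (Ho et al., 2020).
T = 1000
betas = torch.linspace(1e-4, 0.02, T)
alphas = 1.0 - betas
alpha_bar = torch.cumprod(alphas, dim=0)

def q_sample(x0, t, noise):
    """Sample x_t ~ N(sqrt(abar_t) * x_0, (1 - abar_t) * I).
    Assumes image-shaped tensors (B, C, H, W)."""
    ab = alpha_bar[t].view(-1, 1, 1, 1)
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise

def training_step(eps_model, x0):
    """One DDPM training step: add noise at a random timestep,
    then regress the network's prediction onto that noise."""
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    x_t = q_sample(x0, t, noise)
    pred = eps_model(x_t, t)  # network predicts epsilon
    return F.mse_loss(pred, noise)
```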
UniRL-Zero: Reinforcement Learning on Unified Models with Joint Language Model and Diffusion Model Experts
Wang, Fu-Yun, Zhang, Han, Gharbi, Michael, Li, Hongsheng, Park, Taesung
We present UniRL-Zero, a unified reinforcement learning (RL) framework that boosts multimodal language model understanding and reasoning, diffusion model multimedia generation, and their beneficial interaction capabilities within a unified model. Our work defines six scenarios for unified-model reinforcement learning, providing systematic baselines for reinforcement learning of unified understanding and generation models. Our code is available at https://github.com/G-UN/UniRL.
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.94)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- (3 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.98)
- (3 more...)
Humanline: Online Alignment as Perceptual Loss
Liu, Sijia, Muennighoff, Niklas, Ethayarajh, Kawin
Online alignment (e.g., GRPO) is generally more performant than offline alignment (e.g., DPO), but why? Drawing on prospect theory from behavioral economics, we propose a human-centric explanation. We prove that online on-policy sampling better approximates the human-perceived distribution of what the model can produce, and that PPO/GRPO-style clipping, originally introduced just to stabilize training, recovers a bias in how humans perceive probability. In this sense, PPO/GRPO already act as perceptual losses. Our theory further suggests that the online/offline dichotomy is itself incidental to maximizing human utility: we can achieve the same effect by selectively training on any data in a manner that mimics human perception, rather than restricting ourselves to online on-policy data. Doing so would allow us to post-train more quickly, cheaply, and flexibly without sacrificing performance. To this end, we propose a design pattern that explicitly incorporates perceptual distortions of probability into objectives like DPO/KTO/GRPO, creating humanline variants of them. Surprisingly, these humanline variants, even when trained with offline off-policy data, can match the performance of their online counterparts on both verifiable and unverifiable tasks.
- North America > United States > Illinois > Cook County > Chicago (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > Middle East > Iraq > Basra Governorate > Basra (0.04)
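The perceptual-distortion ingredient the Humanline abstract refers to can be illustrated with the standard Tversky and Kahneman (1992) probability weighting function from prospect theory, which over-weights small probabilities and under-weights large ones. The sketch below applies it to a DPO-style preference probability; this particular composition is our illustrative assumption, not the paper's stated objective.

```python
import torch

def tk_weight(p, gamma=0.61):
    """Tversky-Kahneman (1992) probability weighting:
    w(p) = p^g / (p^g + (1 - p)^g)^(1/g).
    Over-weights small probabilities, under-weights large ones."""
    num = p ** gamma
    return num / (num + (1.0 - p) ** gamma) ** (1.0 / gamma)

def humanline_style_dpo(margin, beta=0.1, gamma=0.61):
    """Illustrative only: distort the win probability implied by a
    DPO-style reward margin before taking the log loss. The exact
    humanline objectives are not given in the abstract; this gating
    is our assumption."""
    p_win = torch.sigmoid(beta * margin)  # model-implied preference prob.
    return -torch.log(tk_weight(p_win, gamma))
```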
LMR-BENCH: Evaluating LLM Agent's Ability on Reproducing Language Modeling Research
Yan, Shuo, Li, Ruochen, Luo, Ziming, Wang, Zimu, Li, Daoyang, Jing, Liqiang, He, Kaiyu, Wu, Peilin, Michalopoulos, George, Zhang, Yue, Zhang, Ziyang, Zhang, Mian, Chen, Zhiyu, Du, Xinya
Large language model (LLM) agents have demonstrated remarkable potential in advancing scientific discovery. However, their capability on the fundamental yet crucial task of reproducing code from research papers, especially in the NLP domain, remains underexplored. This task poses uniquely complex reasoning challenges: synthesizing abstract concepts and comprehending code repositories with interdependent files. Motivated by this gap, we present LMR-BENCH, a benchmark designed to systematically evaluate the capability of LLM agents on code reproduction from Language Modeling Research. It consists of 28 code reproduction tasks derived from 23 research papers published in top-tier NLP venues over the past five years, spanning nine fundamental categories. Models are provided with a research paper, a code repository containing one or more masked functions, and instructions for implementing these functions. We conduct extensive experiments in standard prompting and LLM agent settings with state-of-the-art LLMs, evaluating unit-test accuracy and performing LLM-based evaluation of code correctness. Experimental results reveal that even the most advanced models still exhibit persistent limitations in scientific reasoning and code synthesis, highlighting critical gaps in LLM agents' ability to autonomously reproduce scientific research.
- Europe > Austria > Vienna (0.14)
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Asia > Singapore (0.05)
- (7 more...)
- Research Report > New Finding (0.68)
- Research Report > Experimental Study (0.68)
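A hypothetical sketch of the unit-test side of an evaluation like LMR-BENCH's: a model-generated implementation of a masked function is dropped into a scratch copy of the repository and the associated test suite is run. The names and protocol here are assumptions, not the benchmark's actual harness.

```python
import pathlib
import shutil
import subprocess
import tempfile

def passes_unit_tests(candidate_code: str, repo_dir: str, target_file: str) -> bool:
    """Write the model-generated code over the masked file in a scratch
    copy of the repository, then run pytest; the task counts as solved
    only if every test passes. (Hypothetical harness.)"""
    workdir = pathlib.Path(tempfile.mkdtemp()) / "repo"
    shutil.copytree(repo_dir, workdir)
    (workdir / target_file).write_text(candidate_code)
    result = subprocess.run(
        ["pytest", "-q"],
        cwd=workdir, capture_output=True, text=True, timeout=600,
    )
    return result.returncode == 0
```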
Discovering Preference Optimization Algorithms with and for Large Language Models
Lu, Chris, Holt, Samuel, Fanconi, Claudio, Chan, Alex J., Foerster, Jakob, van der Schaar, Mihaela, Lange, Robert Tjarko
Offline preference optimization is a key method for enhancing and controlling the quality of Large Language Model (LLM) outputs. Typically, preference optimization is approached as an offline supervised learning task using manually crafted convex loss functions. While these methods are based on theoretical insights, they are inherently constrained by human creativity, so the large search space of possible loss functions remains under-explored. We address this by performing LLM-driven objective discovery to automatically discover new state-of-the-art preference optimization algorithms without (expert) human intervention. Specifically, we iteratively prompt an LLM to propose and implement new preference optimization loss functions based on previously evaluated performance metrics. This process leads to the discovery of previously unknown and performant preference optimization algorithms. We call the best-performing of these Discovered Preference Optimization (DiscoPOP), a novel algorithm that adaptively blends logistic and exponential losses. Experiments demonstrate the state-of-the-art performance of DiscoPOP and its successful transfer to held-out tasks.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- North America > United States > Massachusetts > Hampshire County > Amherst (0.04)
- (2 more...)
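As a rough illustration of what "adaptively blends logistic and exponential losses" can look like in code, here is a sketch of a DiscoPOP-style objective. The sigmoid gating on the log-ratio difference follows our reading of the paper's log-ratio-modulated form; treat the exact modulation and the default hyperparameters as assumptions.

```python
import torch
import torch.nn.functional as F

def discopop_style_loss(pi_logratios, ref_logratios, beta=0.1, tau=0.05):
    """Sketch of a DiscoPOP-style objective: adaptively blend the
    logistic (DPO) loss with an exponential loss, gated by the
    log-ratio difference rho = (log pi_w/pi_l) - (log ref_w/ref_l).
    The gating on rho/tau reflects our reading of the paper; the
    published modulation may differ in detail."""
    rho = pi_logratios - ref_logratios
    logistic = F.softplus(-beta * rho)      # log(1 + exp(-beta * rho))
    exponential = torch.exp(-beta * rho)
    gate = torch.sigmoid(rho / tau)         # adaptive blend weight
    return ((1 - gate) * logistic + gate * exponential).mean()
```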
PyTorch: Everything you need to know in 10 mins (Cuelogic Blog)
PyTorch is increasingly making it easier for developers to build machine learning capabilities into their applications while testing their code in real time. In this PyTorch tutorial, I talk about this new deep learning platform. The latest version brings a lot of new capabilities to the table and enjoys vibrant support from across the industry. It is remarkable that PyTorch is being touted as a serious contender to Google's TensorFlow within just a couple of years of its release. Its popularity is mainly driven by a smoother learning curve and a cleaner interface, which give developers a more intuitive approach to building neural networks.
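That intuitive approach is easiest to see in a minimal sketch: a network is an ordinary Python class, and training is an explicit loop whose every step can be inspected in a debugger. The example below is illustrative, not from the post.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# A network is a plain Python class; forward() is just Python code.
class TinyNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.layers = nn.Sequential(
            nn.Linear(784, 128), nn.ReLU(), nn.Linear(128, 10)
        )

    def forward(self, x):
        return self.layers(x)

model = TinyNet()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
x, y = torch.randn(32, 784), torch.randint(0, 10, (32,))  # dummy batch
loss = F.cross_entropy(model(x), y)
optimizer.zero_grad()
loss.backward()   # eager autograd: gradients are inspectable right here
optimizer.step()
```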