Wisdom of the Ensemble: Improving Consistency of Deep Learning Models

Neural Information Processing Systems

Deep learning classifiers are assisting humans in making decisions, and hence the user's trust in these models is of paramount importance. Trust is often a function of constant behavior. From an AI model's perspective, this means that, given the same input, the user expects the same output, especially for correct outputs, or in other words, consistently correct outputs. This paper studies model behavior in the context of periodic retraining of deployed models, where the outputs from successive generations of the model might not agree on the correct labels assigned to the same input. We formally define the consistency and correct-consistency of a learning model. We prove that the consistency and correct-consistency of an ensemble learner are not less than the average consistency and correct-consistency of its individual learners, and that correct-consistency can be improved with a certain probability by combining learners whose accuracy is not less than the average accuracy of the ensemble's component learners. To validate the theory, we also propose an efficient dynamic snapshot ensemble method and demonstrate its value on three datasets with two state-of-the-art deep learning classifiers.
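
As a concrete illustration of the quantities the abstract defines (this is a minimal sketch, not the paper's code), the following Python computes the consistency and correct-consistency of two successive model generations from their predictions, together with a simple majority-vote ensemble; the function names and toy data are invented for this example.

import numpy as np

def consistency(pred_a, pred_b):
    """Fraction of inputs on which two model generations give the same label."""
    pred_a, pred_b = np.asarray(pred_a), np.asarray(pred_b)
    return np.mean(pred_a == pred_b)

def correct_consistency(pred_a, pred_b, labels):
    """Fraction of inputs on which both generations agree on the correct label."""
    pred_a, pred_b, labels = map(np.asarray, (pred_a, pred_b, labels))
    return np.mean((pred_a == pred_b) & (pred_a == labels))

def ensemble_predict(member_preds):
    """Majority vote over per-member prediction arrays (ties go to the lowest label)."""
    stacked = np.stack(member_preds)          # shape: (n_members, n_inputs)
    n_classes = stacked.max() + 1
    votes = np.apply_along_axis(np.bincount, 0, stacked, minlength=n_classes)
    return votes.argmax(axis=0)

# Toy usage: two generations of a 3-member ensemble on 6 inputs.
labels = [0, 1, 1, 0, 2, 2]
gen1 = [[0, 1, 1, 0, 2, 0], [0, 1, 2, 0, 2, 2], [1, 1, 1, 0, 2, 2]]
gen2 = [[0, 1, 1, 1, 2, 0], [0, 1, 1, 1, 2, 2], [0, 0, 1, 0, 2, 0]]
e1, e2 = ensemble_predict(gen1), ensemble_predict(gen2)
print(consistency(e1, e2), correct_consistency(e1, e2, labels))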



Why Is Anything Conscious?

Bennett, Michael Timothy, Welsh, Sean, Ciaunica, Anna

arXiv.org Artificial Intelligence

We tackle the hard problem of consciousness taking the naturally selected, embodied organism as our starting point. We provide a formalism describing how biological systems self-organise to hierarchically interpret unlabelled sensory information according to valence. Such interpretations imply behavioural policies which are differentiated from each other only by the qualitative aspect of information processing. Natural selection favours systems that intervene in the world to achieve homeostatic and reproductive goals. Quality is a property arising in such systems to link cause to affect to motivate interventions. This produces interoceptive and exteroceptive classifiers and determines priorities. In formalising the seminal distinction between access and phenomenal consciousness, we claim that access consciousness at the human level requires the ability to hierarchically model i) the self, ii) the world/others and iii) the self as modelled by others, and that this requires phenomenal consciousness. Phenomenal without access consciousness is likely common, but the reverse is implausible. To put it provocatively: death grounds meaning, and Nature does not like zombies. We then describe the multilayered architecture of self-organisation from rocks to Einstein, illustrating how our argument applies. Our proposal lays the foundation of a formal science of consciousness, closer to human fact than zombie fiction.


Comment on Is Complexity an Illusion?

Simmons, Gabriel

arXiv.org Artificial Intelligence

The paper "Is Complexity an Illusion?" (Bennett, 2024) provides a formalism for complexity, learning, inference, and generalization, and introduces a formal definition for a "policy". This reply shows that correct policies do not exist for a simple task of supervised multi-class classification, via mathematical proof and exhaustive search. Implications of this result are discussed, as well as possible responses and amendments to the theory.


ImProver: Agent-Based Automated Proof Optimization

Ahuja, Riyaz, Avigad, Jeremy, Tetali, Prasad, Welleck, Sean

arXiv.org Artificial Intelligence

Large language models (LLMs) have been used to generate formal proofs of mathematical theorems in proof assistants such as Lean. However, we often want to optimize a formal proof with respect to various criteria, depending on its downstream use. For example, we may want a proof to adhere to a certain style, or to be readable, concise, or modularly structured. Having suitably optimized proofs is also important for learning tasks, especially since human-written proofs may not be optimal for that purpose. To this end, we study the new problem of automated proof optimization: rewriting a proof so that it is correct and optimizes an arbitrary criterion, such as length or readability. As a first method for automated proof optimization, we present ImProver, a large-language-model agent that rewrites proofs to optimize arbitrary user-defined metrics in Lean. We find that naively applying LLMs to proof optimization falls short, and we incorporate various improvements into ImProver, such as the use of symbolic Lean context in a novel Chain-of-States technique, as well as error correction and retrieval. We test ImProver on rewriting real-world undergraduate, competition, and research-level mathematics theorems, finding that ImProver is capable of rewriting proofs so that they are substantially shorter, more modular, and more readable.
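
To make the optimization loop concrete, here is a rough, hypothetical sketch of the rewrite-verify-select cycle the abstract describes; llm_rewrite and lean_check are stand-ins for an LLM call and a Lean re-check, not ImProver's actual interface.

def metric_length(proof: str) -> float:
    """Example user-defined metric: prefer shorter proofs."""
    return float(len(proof.splitlines()))

def llm_rewrite(proof: str, instruction: str) -> str:
    """Placeholder for an LLM call that proposes a rewritten proof."""
    return proof  # a real system would query a model here

def lean_check(proof: str) -> bool:
    """Placeholder for re-checking the rewritten proof with Lean."""
    return True

def optimize_proof(proof: str, metric, rounds: int = 5) -> str:
    best, best_score = proof, metric(proof)
    for _ in range(rounds):
        candidate = llm_rewrite(best, "make this proof shorter and more readable")
        if not lean_check(candidate):      # reject rewrites that no longer check
            continue
        score = metric(candidate)
        if score < best_score:             # keep only strict improvements
            best, best_score = candidate, score
    return best

print(optimize_proof("by\n  simp\n  ring", metric_length))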


Do Efficient Transformers Really Save Computation?

Yang, Kai, Ackermann, Jan, He, Zhenyu, Feng, Guhao, Zhang, Bohang, Feng, Yunzhen, Ye, Qiwei, He, Di, Wang, Liwei

arXiv.org Machine Learning

As transformer-based language models are trained on increasingly large datasets and with vast numbers of parameters, finding more efficient alternatives to the standard Transformer has become very valuable. While many efficient Transformers and Transformer alternatives have been proposed, none provide theoretical guarantees that they are a suitable replacement for the standard Transformer. This makes it challenging to identify when to use a specific model and what directions to prioritize for further investigation. In this paper, we aim to understand the capabilities and limitations of efficient Transformers, specifically the Sparse Transformer and the Linear Transformer. We focus on their reasoning capability as exhibited by Chain-of-Thought (CoT) prompts and follow previous works to model them as Dynamic Programming (DP) problems. Our results show that while these models are expressive enough to solve general DP tasks, contrary to expectations, they require a model size that scales with the problem size. Nonetheless, we identify a class of DP problems for which these models can be more efficient than the standard Transformer. We confirm our theoretical results through experiments on representative DP tasks, adding to the understanding of efficient Transformers' practical strengths and weaknesses.
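
For readers unfamiliar with the models being compared, the following rough numpy sketch (not from the paper) contrasts standard softmax attention, whose per-layer cost is quadratic in sequence length, with the kernelized reordering used by linear attention, whose cost is linear in sequence length for a fixed head dimension.

import numpy as np

def softmax_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])          # (n, n): quadratic in sequence length
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

def linear_attention(Q, K, V, phi=lambda x: np.maximum(x, 0) + 1.0):
    Qp, Kp = phi(Q), phi(K)                          # positive feature maps of queries/keys
    kv = Kp.T @ V                                    # (d, d): size independent of sequence length
    z = Kp.sum(axis=0)                               # (d,): normalizer statistics
    return (Qp @ kv) / (Qp @ z)[:, None]

n, d = 8, 4
rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(n, d)), rng.normal(size=(n, d)), rng.normal(size=(n, d))
print(softmax_attention(Q, K, V).shape, linear_attention(Q, K, V).shape)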


Bounds on the price of feedback for mistake-bounded online learning

Geneson, Jesse, Tang, Linus

arXiv.org Artificial Intelligence

We improve several worst-case bounds for various online learning scenarios from (Auer and Long, Machine Learning, 1999). In particular, we sharpen an upper bound for delayed ambiguous reinforcement learning by a factor of 2 and an upper bound for learning compositions of families of functions by a factor of 2.41. We also improve a lower bound from the same paper for learning compositions of $k$ families of functions by a factor of $\Theta(\ln{k})$, matching the upper bound up to a constant factor. In addition, we solve a problem from (Long, Theoretical Computer Science, 2020) on the price of bandit feedback with respect to standard feedback for multiclass learning, and we improve an upper bound from (Feng et al., Theoretical Computer Science, 2023) on the price of $r$-input delayed ambiguous reinforcement learning by a factor of $r$, matching a lower bound from the same paper up to the leading term.
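
As background on the setting (this example is illustrative and not taken from the paper), the sketch below runs a halving-style learner over a finite hypothesis class under both standard feedback, where the true label is revealed each round, and bandit feedback, where the learner only sees whether its prediction was correct; the gap in mistake counts is the kind of "price of feedback" the bounds quantify.

def run_online(hypotheses, stream, bandit=False):
    """Majority-vote learner over a finite hypothesis class; returns the number of mistakes."""
    version_space = list(hypotheses)
    mistakes = 0
    for x, y in stream:
        votes = {}
        for h in version_space:
            label = h(x)
            votes[label] = votes.get(label, 0) + 1
        prediction = max(votes, key=votes.get)
        if prediction != y:
            mistakes += 1
        if not bandit or prediction == y:
            # True label known (standard feedback, or a confirmed bandit guess):
            # keep only hypotheses that predict it.
            version_space = [h for h in version_space if h(x) == y]
        else:
            # Bandit feedback on a wrong guess: we only learn the label is not `prediction`.
            version_space = [h for h in version_space if h(x) != prediction]
    return mistakes

# Toy usage: 3-class threshold classifiers on integers, one of which labels the stream.
hypotheses = [lambda x, t=t: 0 if x < t else (1 if x < t + 3 else 2) for t in range(7)]
target = hypotheses[2]
stream = [(x, target(x)) for x in [0, 5, 9, 1, 4, 8, 2, 3]]
print(run_online(hypotheses, stream, bandit=False),
      run_online(hypotheses, stream, bandit=True))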


DP-OPT: Make Large Language Model Your Privacy-Preserving Prompt Engineer

Hong, Junyuan, Wang, Jiachen T., Zhang, Chenhui, Li, Zhangheng, Li, Bo, Wang, Zhangyang

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have emerged as dominant tools for various tasks, particularly when tailored for a specific target by prompt tuning. Nevertheless, concerns surrounding data privacy present obstacles due to the tuned prompts' dependency on sensitive private information. A practical solution is to host a local LLM and optimize a soft prompt privately using data. Yet, hosting a local model becomes problematic when model ownership is protected. Alternative methods, like sending data to the model's provider for training, intensify these privacy issues when facing an untrusted provider. In this paper, we present a novel solution called Differentially-Private Offsite Prompt Tuning (DP-OPT) to address this challenge. Our approach involves tuning a discrete prompt on the client side and then applying it to the desired cloud models. We demonstrate that prompts suggested by LLMs themselves can be transferred without significantly compromising performance. To ensure that the prompts do not leak private information, we introduce the first private prompt generation mechanism, using a differentially-private (DP) ensemble of in-context learning with private demonstrations. With DP-OPT, generating privacy-preserving prompts with Vicuna-7b can yield competitive performance compared to non-private in-context learning on GPT-3.5 or local private prompt tuning. As Large Language Models gain vast knowledge and versatile abilities from large-scale pre-training, prompt engineering has surfaced as the most effective, cost-efficient, and adaptable method to tailor LLMs for a range of downstream applications. In contrast to the resource-heavy optimization of model parameters, prompt engineering merely necessitates API access and iteratively refines prompts based on the validation of training instances. Though manual prompt engineering has achieved impressive performance in various tasks (Petroni et al., 2019; Zhou et al., 2022), it often requires considerable human experience in prompt design and domain knowledge for downstream tasks, including legal judgement (Trautmann et al., 2022), healthcare (Wang et al., 2023b), and art (Oppenlaender et al., 2023). To mitigate the high costs, data-driven prompt tuning was proposed to automate the process. The most prominent example of this is soft prompt tuning, where prompts are characterized as trainable embedding vectors and are refined using a collection of training instances (Houlsby et al., 2019; Roberts et al., 2019; Brown et al., 2020; Chen et al., 2022). However, one major barrier to the application of prompt tuning is data privacy. When searching for a valid prompt for an LLM API, such as ChatGPT, a multitude of training samples must be uploaded for evaluation queries. In privacy-sensitive scenarios, this operation could be prohibited due to two concerns.
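
One ingredient of the mechanism the abstract sketches can be illustrated with a rough, hypothetical noisy-vote aggregation in the spirit of report-noisy-max: each ensemble member conditions on a disjoint shard of private demonstrations and votes for a next token, and calibrated noise is added before releasing the winner. member_vote stands in for an in-context LLM call; none of this is DP-OPT's actual code.

import numpy as np

def member_vote(prompt_so_far, shard, vocab):
    """Placeholder for one ensemble member's next-token choice given its private shard."""
    rng = np.random.default_rng(hash((prompt_so_far, tuple(shard))) % (2**32))
    return rng.choice(vocab)

def dp_next_token(prompt_so_far, shards, vocab, eps=1.0, rng=None):
    rng = rng or np.random.default_rng()
    counts = np.zeros(len(vocab))
    for shard in shards:
        counts[vocab.index(member_vote(prompt_so_far, shard, vocab))] += 1
    # Each private example sits in exactly one shard, so it can change each
    # vote count by at most 1; Laplace noise of scale 1/eps is then added
    # before taking the argmax (report-noisy-max style aggregation).
    noisy = counts + rng.laplace(scale=1.0 / eps, size=len(vocab))
    return vocab[int(np.argmax(noisy))]

# Toy usage with a tiny vocabulary and three private shards.
vocab = ["Classify", "the", "sentiment", ":", "positive", "negative"]
shards = [["demo1", "demo2"], ["demo3"], ["demo4", "demo5"]]
print(dp_next_token("Instruction:", shards, vocab, eps=1.0))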


Shepherd: A Critic for Language Model Generation

Wang, Tianlu, Yu, Ping, Tan, Xiaoqing Ellen, O'Brien, Sean, Pasunuru, Ramakanth, Dwivedi-Yu, Jane, Golovneva, Olga, Zettlemoyer, Luke, Fazel-Zarandi, Maryam, Celikyilmaz, Asli

arXiv.org Artificial Intelligence

As large language models improve, there is increasing interest in techniques that leverage these models' capabilities to refine their own outputs. In this work, we introduce Shepherd, a language model specifically tuned to critique responses and suggest refinements, extending beyond the capabilities of an untuned model to identify diverse errors and provide suggestions to remedy them. At the core of our approach is a high-quality feedback dataset, which we curate from community feedback and human annotations. Even though Shepherd is small (7B parameters), its critiques are either equivalent to or preferred over those from established models, including ChatGPT. Using GPT-4 for evaluation, Shepherd reaches an average win rate of 53-87% compared to competitive alternatives. In human evaluation, Shepherd strictly outperforms other models and on average closely ties with ChatGPT.
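
A minimal, hypothetical sketch of the critique-and-refine usage pattern such a critic enables is shown below; generate, critique, and refine are placeholders for model calls, not Shepherd's actual interface.

def generate(question: str) -> str:
    return f"Draft answer to: {question}"             # stand-in for a base LLM call

def critique(question: str, answer: str) -> str:
    return "The answer is missing a justification."   # stand-in for the critic model

def refine(question: str, answer: str, feedback: str) -> str:
    return answer + f" (revised to address: {feedback})"  # stand-in for a revision call

def answer_with_critic(question: str, rounds: int = 2) -> str:
    answer = generate(question)
    for _ in range(rounds):
        feedback = critique(question, answer)
        if not feedback:                               # empty critique means accept the answer
            break
        answer = refine(question, answer, feedback)
    return answer

print(answer_with_critic("Why is the sky blue?"))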