



Lift Yourself Up: Retrieval-augmented Text Generation with Self-Memory

Neural Information Processing Systems

With direct access to human-written references as memory, retrieval-augmented generation has achieved much progress in a wide range of text generation tasks. Since better memory typically prompts better generation (we define this as the primal problem), the traditional approach to memory retrieval selects the memory that exhibits the highest similarity to the input. However, this method is constrained by the quality of the fixed corpus from which the memory is retrieved. In this paper, by exploring the duality of the primal problem, namely that better generation also prompts better memory, we propose a novel framework, selfmem, which addresses this limitation by iteratively employing a retrieval-augmented generator to create an unbounded memory pool and using a memory selector to choose one output as memory for the subsequent generation round. This enables the model to leverage its own output, referred to as self-memory, for improved generation. We evaluate the effectiveness of selfmem on three distinct text generation tasks: neural machine translation, abstractive text summarization, and dialogue generation, under two generation paradigms: fine-tuned small model and few-shot LLM. Our approach achieves state-of-the-art results in four directions of the JRC-Acquis translation dataset, 50.3 ROUGE-1 on XSum, and 62.9 ROUGE-1 on BigPatent, demonstrating the potential of self-memory in enhancing retrieval-augmented generation models. Furthermore, we conduct thorough analyses of each component of the selfmem framework to identify current system bottlenecks and provide insights for future research.
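The generate-then-select loop described in the abstract can be sketched as follows. This is a minimal illustration only: `generate` and `select` are hypothetical stand-ins for the trained retrieval-augmented generator and memory selector, and the round count is arbitrary.

```python
# Minimal sketch of the selfmem loop: each round, the generator produces a
# candidate pool conditioned on the current memory, and the selector picks
# one candidate to serve as memory for the next round (self-memory).

def generate(source: str, memory: str) -> list[str]:
    # Stand-in for a retrieval-augmented generator: returns candidate
    # outputs conditioned on the source and the current memory.
    return [f"{source}|{memory}|cand{i}" for i in range(3)]

def select(source: str, candidates: list[str]) -> str:
    # Stand-in for a memory selector: chooses the candidate expected to
    # best serve as memory next round (here, lexicographically smallest).
    return min(candidates)

def selfmem(source: str, retrieved_memory: str, rounds: int = 3) -> str:
    memory = retrieved_memory                   # round 0: retrieved memory
    output = memory
    for _ in range(rounds):
        candidates = generate(source, memory)   # grow the memory pool
        output = select(source, candidates)     # pick the self-memory
        memory = output                         # feed it back
    return output
```

The key property the framework exploits is that the memory pool is no longer bounded by a fixed corpus: it is refreshed from the model's own outputs each round.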


Supplementary Material for the Paper " Sampling-Decomposable Generative Adversarial Recommender "

Neural Information Processing Systems

In the appendix, we start with the proofs of Theorem 2.1 and Theorem 2.2 in Section A. Then, we prove the correctness of Proposition 2.2 and Proposition 2.3 in Section B. After that, the detailed derivation of our proposed loss is provided in Section C. At last, we analyze the sensitivity of some important hyperparameters. Before providing the proofs of the theorems, we restate some important notations. We then illustrate the detailed derivation of our approximated loss for learning the discriminator. Figure 1(a) demonstrates the effects of the embedding size, Figure 1(b) shows the effects of the size of the item sample set for learning the discriminator, and Figure 1(c) reports the effects of the sizes of the item and context sample sets for learning the generator.



Boosted CVaR Classification (Supplementary Material)

Neural Information Processing Systems

On the COMPAS dataset, we use a three-layer feed-forward neural network with ReLU activations as the classification model. For optimization we use momentum SGD with learning rate 0.01. The batch size is 128. On the CelebA dataset, we use a ResNet18 as the classification model. The remaining 45000 training samples constitute the training set. The batch size is 128.
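The optimizer configuration mentioned above can be sketched as a plain momentum SGD update. The learning rate 0.01 comes from the text; the momentum coefficient 0.9 is an assumption, since the excerpt does not state it.

```python
# Momentum SGD, as commonly defined: v <- m * v + g ; p <- p - lr * v.
# lr = 0.01 is from the text; momentum = 0.9 is an assumed default.

def momentum_sgd_step(params, grads, velocity, lr=0.01, momentum=0.9):
    """One full update over a list of scalar parameters (in place)."""
    for i, g in enumerate(grads):
        velocity[i] = momentum * velocity[i] + g   # accumulate velocity
        params[i] = params[i] - lr * velocity[i]   # descend along velocity
    return params, velocity
```

In practice this corresponds to a framework optimizer such as `SGD(lr=0.01, momentum=0.9)` applied to the network's parameters each mini-batch of 128 samples.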



Data-Driven Priors in the Maximum Entropy on the Mean Method for Linear Inverse Problems

King-Roskamp, Matthew, Choksi, Rustum, Hoheisel, Tim

arXiv.org Machine Learning

We establish the theoretical framework for implementing the maximum entropy on the mean (MEM) method for linear inverse problems in the setting of approximate (data-driven) priors. We prove a.s. convergence for empirical means and further develop general estimates for the difference between the MEM solutions with different priors $\mu$ and $\nu$ based upon the epigraphical distance between their respective log-moment generating functions. These estimates allow us to establish a rate of convergence in expectation for empirical means. We illustrate our results with denoising on the MNIST and Fashion-MNIST data sets.
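For context, a standard formulation of the MEM method reads as follows (a schematic sketch; the paper's precise setting and conventions may differ):

```latex
% MEM: among distributions absolutely continuous w.r.t. the prior \mu,
% pick the one closest in KL divergence that matches the linear
% measurements b, then report its mean.
\hat{\rho} \;\in\; \operatorname*{arg\,min}_{\rho \ll \mu}\;
  \operatorname{KL}(\rho \,\|\, \mu)
  \quad \text{subject to} \quad A\,\mathbb{E}_{\rho}[x] = b .
% By Fenchel duality, the solution is characterized through the
% log-moment generating function of the prior,
\kappa_{\mu}(y) \;=\; \log \int e^{\langle y,\, x \rangle}\, d\mu(x),
% which is why closeness of \kappa_\mu and \kappa_\nu (here, measured
% epigraphically) controls the distance between the MEM solutions.
```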


Variational formulation based on duality to solve partial differential equations: Use of B-splines and machine learning approximants

Sukumar, N., Acharya, Amit

arXiv.org Artificial Intelligence

Many partial differential equations (PDEs), such as the Navier--Stokes equations in fluid mechanics, inelastic deformation in solids, and transient parabolic and hyperbolic equations, do not have an exact, primal variational structure. Recently, a variational principle based on the dual (Lagrange multiplier) field was proposed. The essential idea in this approach is to treat the given PDE as a constraint, and to invoke an arbitrarily chosen auxiliary potential with strong convexity properties to be optimized. This leads to a convex dual functional that is minimized subject to Dirichlet boundary conditions on the dual variables, with the guarantee that even PDEs that do not possess a variational structure in primal form can be solved via a variational principle. The vanishing of the first variation of the dual functional is, up to Dirichlet boundary conditions on the dual fields, the weak form of the primal PDE problem with the dual-to-primal change of variables incorporated. We derive the dual weak form for the linear, one-dimensional, transient convection-diffusion equation. A Galerkin discretization is used to obtain the discrete equations, with the trial and test functions chosen as linear combinations of either RePU activation functions (a shallow neural network) or B-spline basis functions; the corresponding stiffness matrix is symmetric. For transient problems, a space-time Galerkin implementation is used with tensor-product B-splines as the approximating functions. Numerical results are presented for the steady-state and transient convection-diffusion equations and for transient heat conduction. The proposed method delivers sound accuracy for ODEs and PDEs, and rates of convergence are established in the $L^2$ norm and $H^1$ seminorm for the steady-state convection-diffusion problem.
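The constraint-plus-auxiliary-potential construction described above can be summarized schematically (the precise function spaces, boundary terms, and sign conventions follow the paper):

```latex
% Schematic: treat a (possibly non-variational) primal PDE
%   \mathcal{P}(u) = 0
% as a constraint, adjoin dual fields \lambda, and choose a strongly
% convex auxiliary potential H:
\widehat{S}[u,\lambda] \;=\; \int_{\Omega}
  \Big( H(u) \;-\; \lambda \cdot \mathcal{P}(u) \Big)\, dx .
% Strong convexity of H allows the stationarity condition
% \partial_u \widehat{S} = 0 to be solved for the dual-to-primal map
% u = u(\lambda, \nabla\lambda, \ldots); substituting back yields the
% convex dual functional
S[\lambda] \;=\; \widehat{S}\big[\, u(\lambda),\, \lambda \,\big],
% whose first variation, subject to Dirichlet conditions on \lambda,
% recovers the weak form of the primal PDE.
```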


Representation and Regression Problems in Neural Networks: Relaxation, Generalization, and Numerics

Liu, Kang, Zuazua, Enrique

arXiv.org Artificial Intelligence

In this work, we address three non-convex optimization problems associated with the training of shallow neural networks (NNs) for exact and approximate representation, as well as for regression tasks. Through a mean-field approach, we convexify these problems and, applying a representer theorem, prove the absence of relaxation gaps. We establish generalization bounds for the resulting NN solutions, assessing their predictive performance on test datasets and, analyzing the impact of key hyperparameters on these bounds, propose optimal choices. On the computational side, we examine the discretization of the convexified problems and derive convergence rates. For low-dimensional datasets, these discretized problems are efficiently solvable using the simplex method. For high-dimensional datasets, we propose a sparsification algorithm that, combined with gradient descent for over-parameterized shallow NNs, yields effective solutions to the primal problems.
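The over-parameterized shallow NN regression that the gradient-descent pipeline above operates on can be illustrated with a toy one-hidden-layer ReLU network trained by full-batch gradient descent. Everything here (width, learning rate, step count, data) is illustrative and not taken from the paper.

```python
# Toy shallow NN regression: f(x) = sum_j a_j * relu(w_j * x + b_j),
# fitted to scalar data by full-batch gradient descent on mean squared
# error. A minimal sketch of the over-parameterized setting; all
# hyperparameters are assumptions.

import random

def fit_shallow_nn(xs, ys, width=32, lr=0.01, steps=2000, seed=0):
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(width)]
    b = [rng.uniform(-1, 1) for _ in range(width)]
    a = [rng.uniform(-1, 1) / width for _ in range(width)]

    def predict(x):
        return sum(a[j] * max(0.0, w[j] * x + b[j]) for j in range(width))

    n = len(xs)
    for _ in range(steps):
        ga = [0.0] * width; gw = [0.0] * width; gb = [0.0] * width
        for x, y in zip(xs, ys):
            r = predict(x) - y                    # residual at this point
            for j in range(width):
                pre = w[j] * x + b[j]
                ga[j] += 2 * r * max(0.0, pre) / n
                if pre > 0:                       # ReLU subgradient
                    gw[j] += 2 * r * a[j] * x / n
                    gb[j] += 2 * r * a[j] / n
        for j in range(width):
            a[j] -= lr * ga[j]; w[j] -= lr * gw[j]; b[j] -= lr * gb[j]

    mse = sum((predict(x) - y) ** 2 for x, y in zip(xs, ys)) / n
    return predict, mse
```

The mean-field convexification studied in the paper lifts exactly this kind of finite-width parameterization to a measure over neurons, where the training problem becomes convex and a representer theorem applies.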