

Diffusion-based Reinforcement Learning via Q-weighted Variational Policy Optimization

Neural Information Processing Systems

Diffusion models have garnered widespread attention in Reinforcement Learning (RL) for their powerful expressiveness and multimodality. It has been verified that utilizing diffusion policies can significantly improve the performance of RL algorithms in continuous control tasks by overcoming the limitations of unimodal policies, such as Gaussian policies. Furthermore, the multimodality of diffusion policies also shows the potential of providing the agent with enhanced exploration capabilities. However, existing works mainly focus on applying diffusion policies in offline RL, while their incorporation into online RL has been less investigated. The diffusion model's training objective, known as the variational lower bound, cannot be applied directly in online RL due to the unavailability of 'good' samples (actions).
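To make the coupling between the critic and the denoising objective concrete, here is a minimal sketch of one common way to weight a diffusion policy's epsilon-prediction loss by a positive function of Q(s, a). This illustrates the general idea rather than the paper's exact objective; the networks eps_net and q_net, the softmax weighting, and all tensor shapes are assumptions.

```python
# Hedged sketch: Q-weighted denoising loss for a diffusion policy.
# `eps_net` and `q_net` are placeholder networks, not the paper's.
import torch
import torch.nn.functional as F

def q_weighted_diffusion_loss(eps_net, q_net, states, actions, alphas_bar):
    """Weight each sample's epsilon-prediction loss by a function of Q(s, a)."""
    B = actions.shape[0]
    t = torch.randint(0, len(alphas_bar), (B,))          # random diffusion step
    a_bar = alphas_bar[t].unsqueeze(-1)                  # cumulative noise schedule
    noise = torch.randn_like(actions)
    noisy_actions = a_bar.sqrt() * actions + (1 - a_bar).sqrt() * noise

    eps_pred = eps_net(states, noisy_actions, t)         # predict the injected noise
    per_sample = F.mse_loss(eps_pred, noise, reduction="none").mean(dim=-1)

    with torch.no_grad():                                # Q-values act only as weights
        q = q_net(states, actions).squeeze(-1)
        weights = torch.softmax(q, dim=0) * B            # normalized, mean ~ 1

    return (weights * per_sample).mean()
```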


Acceleration through Optimistic No-Regret Dynamics

Neural Information Processing Systems

We consider the problem of minimizing a smooth convex function by reducing the optimization to computing the Nash equilibrium of a particular zero-sum convex-concave game. Zero-sum games can be solved using online learning dynamics, where a classical technique involves simulating two no-regret algorithms that play against each other; after T rounds, the average iterate is guaranteed to solve the original optimization problem with error decaying as O(log T/T).
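As a runnable illustration of this recipe, the sketch below pits two optimistic Hedge learners against each other on a toy zero-sum matrix game and averages their iterates, which converge toward a Nash equilibrium. The game matrix and the choice of optimistic Hedge are illustrative assumptions, not the paper's construction for smooth convex minimization.

```python
# Hedged sketch: two no-regret (optimistic Hedge) learners whose
# average iterates approximate a Nash equilibrium of a zero-sum game.
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def optimistic_hedge_dynamics(A, T=2000, eta=0.1):
    """Row player minimizes x^T A y; column player maximizes it."""
    n, m = A.shape
    x_cum, y_cum = np.zeros(n), np.zeros(m)    # cumulative realized losses
    x_prev, y_prev = np.zeros(n), np.zeros(m)  # last loss doubles as a prediction
    x_avg, y_avg = np.zeros(n), np.zeros(m)
    for _ in range(T):
        # optimistic step: act as if the previous round's loss repeats
        x = softmax(-eta * (x_cum + x_prev))
        y = softmax(eta * (y_cum + y_prev))
        x_loss, y_loss = A @ y, A.T @ x        # realized loss/gain vectors
        x_cum += x_loss; y_cum += y_loss
        x_prev, y_prev = x_loss, y_loss
        x_avg += x; y_avg += y
    return x_avg / T, y_avg / T

A = np.array([[1.0, -1.0], [-1.0, 1.0]])       # matching pennies
x_bar, y_bar = optimistic_hedge_dynamics(A)
print(x_bar, y_bar)                            # both near (0.5, 0.5)
```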


On the cohesion and separability of average-link for hierarchical agglomerative clustering

Neural Information Processing Systems

Average-link is widely recognized as one of the most popular and effective methods for building hierarchical agglomerative clusterings. The available theoretical analyses show that this method has a much better approximation than other popular heuristics, such as single-linkage and complete-linkage, regarding variants of Dasgupta's cost function [STOC 2016]. However, these analyses do not separate average-link from a random hierarchy, nor are they appealing for metric spaces, since every hierarchical clustering has a 1/2 approximation with regard to the variant of Dasgupta's function that is employed for dissimilarity measures [Moseley and Yang 2020]. In this paper, we present a comprehensive study of the performance of average-link in metric spaces, regarding several natural criteria that capture separability and cohesion and are more interpretable than Dasgupta's cost function and its variants. We also present experimental results with real datasets that, together with our theoretical analyses, suggest that average-link is a better choice than other related methods when both cohesion and separability are important goals.
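For readers who want to try average-link directly, the short example below runs it on a toy metric dataset with SciPy; the synthetic two-blob data is purely illustrative.

```python
# Average-link hierarchical agglomerative clustering on toy data.
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (20, 2)),      # two well-separated blobs
               rng.normal(3, 0.3, (20, 2))])

Z = linkage(X, method="average")                 # merge clusters by mean
                                                 # pairwise distance
labels = fcluster(Z, t=2, criterion="maxclust")  # cut the dendrogram at 2 clusters
print(labels)
```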


ReFIR: Grounding Large Restoration Models with Retrieval Augmentation

Neural Information Processing Systems

Recent advances in diffusion-based Large Restoration Models (LRMs) have significantly improved photo-realistic image restoration by leveraging the internal knowledge embedded within model weights. However, existing LRMs often suffer from the hallucination dilemma, i.e., producing incorrect contents or textures when dealing with severe degradations, due to their heavy reliance on limited internal knowledge. In this paper, we propose an orthogonal solution called the Retrieval-augmented Framework for Image Restoration (ReFIR), which incorporates retrieved images as external knowledge to extend the knowledge boundary of existing LRMs in generating details faithful to the original scene. Specifically, we first introduce the nearest neighbor lookup to retrieve content-relevant high-quality images as reference, after which we propose the cross-image injection to modify existing LRMs to utilize high-quality textures from retrieved images. Thanks to the additional external knowledge, our ReFIR can effectively handle the hallucination challenge and produce faithful results. Extensive experiments demonstrate that ReFIR can achieve not only high-fidelity but also realistic restoration results. Importantly, our ReFIR requires no training and is adaptable to various LRMs.
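The sketch below illustrates only the nearest-neighbor lookup stage (the cross-image injection step modifies the restoration network itself and is omitted). The feature extractor embed is a placeholder assumption; any global image descriptor could stand in for it.

```python
# Hedged sketch: retrieval stage only. `embed` is a stand-in descriptor,
# not the feature extractor used in the paper.
import numpy as np

def embed(image: np.ndarray) -> np.ndarray:
    """Placeholder descriptor: per-channel mean/std statistics."""
    return np.concatenate([image.mean(axis=(0, 1)), image.std(axis=(0, 1))])

def retrieve_references(query, database, k=3):
    """Return the k database images most cosine-similar to the query."""
    q = embed(query)
    feats = np.stack([embed(img) for img in database])
    sims = feats @ q / (np.linalg.norm(feats, axis=1) * np.linalg.norm(q) + 1e-8)
    top = np.argsort(-sims)[:k]
    return [database[i] for i in top]
```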



Combining Generative and Discriminative Models for Hybrid Inference

Neural Information Processing Systems

A graphical model is a structured representation of the data generating process. The traditional method to reason over random variables is to perform inference in this graphical model. However, in many cases the generating process is only a poor approximation of the much more complex true data generating process, leading to suboptimal estimates. The subtleties of the generative process are however captured in the data itself, and we can "learn to infer", that is, learn a direct mapping from observations to explanatory latent variables. In this work we propose a hybrid model that combines graphical inference with a learned inverse model, which we structure as a graph neural network, while the iterative algorithm as a whole is formulated as a recurrent neural network. By using cross-validation we can automatically balance the amount of work performed by graphical inference versus learned inference. We apply our ideas to the Kalman filter, a Gaussian hidden Markov model for time sequences, and show, among other things, that our model can estimate the trajectory of a noisy chaotic Lorenz Attractor much more accurately than either the learned or graphical inference run in isolation.
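To ground the graphical-inference half of the hybrid, here is a minimal linear Kalman filter predict/update cycle in NumPy. The paper's hybrid additionally learns a GNN correction on top of these messages; that learned component is omitted here.

```python
# Standard linear Kalman filter step (the purely graphical baseline).
import numpy as np

def kalman_step(x, P, z, F, H, Q, R):
    """One predict/update cycle for x_{t+1} = F x_t + w, z_t = H x_t + v."""
    x_pred = F @ x                       # predict state mean
    P_pred = F @ P @ F.T + Q             # predict state covariance
    S = H @ P_pred @ H.T + R             # innovation covariance
    K = P_pred @ H.T @ np.linalg.inv(S)  # Kalman gain
    x_new = x_pred + K @ (z - H @ x_pred)
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```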



SVFT: Parameter-Efficient Fine-Tuning with Singular Vectors

Neural Information Processing Systems

Popular parameter-efficient fine-tuning (PEFT) methods, such as LoRA and its variants, freeze pre-trained model weights W and inject learnable matrices ΔW. These ΔW matrices are structured for efficient parameterization, often using techniques like low-rank approximations or scaling vectors. However, these methods typically exhibit a performance gap compared to full fine-tuning. While recent PEFT methods have narrowed this gap, they do so at the expense of additional learnable parameters.
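The sketch below illustrates the SVFT idea as the title and abstract describe it: freeze W, take its singular vectors, and train only a small set of coefficients that mix them into an update ΔW. The diagonal-only mixing shown here is one simple variant chosen for brevity; it is an assumption, not necessarily the paper's exact parameterization.

```python
# Hedged sketch: fine-tune only singular-vector mixing coefficients.
import torch

class SVFTLinear(torch.nn.Module):
    """Freeze W; learn only coefficients that mix its singular vectors."""
    def __init__(self, weight: torch.Tensor):
        super().__init__()
        U, S, Vh = torch.linalg.svd(weight.detach(), full_matrices=False)
        self.register_buffer("W", weight.detach())  # frozen pre-trained weight
        self.register_buffer("U", U)                # frozen left singular vectors
        self.register_buffer("Vh", Vh)              # frozen right singular vectors
        self.s = torch.nn.Parameter(torch.zeros_like(S))  # the only trainable part

    def forward(self, x):
        dW = self.U @ torch.diag(self.s) @ self.Vh  # structured weight update
        return x @ (self.W + dW).T
```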



AI could already be conscious. Are we ready for it?

BBC News

But quite recently, in the real world, there has been a rapid tipping point in thinking on machine consciousness, where credible voices have become concerned that this is no longer the stuff of science fiction. The sudden shift has been prompted by the success of so-called large language models (LLMs), which can be accessed through apps on our phones such as Gemini and ChatGPT. The ability of the latest generation of LLMs to have plausible, free-flowing conversations has surprised even their designers and some of the leading experts in the field. There is a growing view among some thinkers that as AI becomes even more intelligent, the lights will suddenly turn on inside the machines and they will become conscious. Others, such as Prof Anil Seth, who leads the Sussex University team, disagree, describing the view as "blindly optimistic and driven by human exceptionalism".