Rezende, Danilo
Metacognitive Capabilities of LLMs: An Exploration in Mathematical Problem Solving
Didolkar, Aniket, Goyal, Anirudh, Ke, Nan Rosemary, Guo, Siyuan, Valko, Michal, Lillicrap, Timothy, Rezende, Danilo, Bengio, Yoshua, Mozer, Michael, Arora, Sanjeev
Metacognitive knowledge refers to humans' intuitive knowledge of their own thinking and reasoning processes. Today's best LLMs clearly possess some reasoning processes. The paper gives evidence that they also have metacognitive knowledge, including the ability to name the skills and procedures to apply for a given task. We explore this primarily in the context of math reasoning, developing a prompt-guided interaction procedure to get a powerful LLM to assign sensible skill labels to math questions, followed by having it perform semantic clustering to obtain coarser families of skill labels. These coarse skill labels look interpretable to humans. To validate that these skill labels are meaningful and relevant to the LLM's reasoning processes, we perform the following experiments. (a) We ask GPT-4 to assign skill labels to training questions in the math datasets GSM8K and MATH. (b) When using an LLM to solve the test questions, we present it with the full list of skill labels and ask it to identify the skill needed. Then it is presented with randomly selected exemplar solved questions associated with that skill label. This improves accuracy on GSM8K and MATH for several strong LLMs, including code-assisted models. The methodology presented is domain-agnostic, even though this article applies it to math problems.
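A minimal sketch of the three-stage pipeline described above. The `llm` function is a hypothetical stand-in for any chat-completion API, and the prompts, label format, and helper names are illustrative assumptions, not the paper's exact prompts:

```python
import random

def llm(prompt: str) -> str:
    """Hypothetical stand-in for any chat-completion API call."""
    raise NotImplementedError

def label_skills(train_questions):
    # Stage 1: have the model name the skill each training question exercises.
    return {q: llm(f"Name the single math skill needed to solve:\n{q}")
            for q in train_questions}

def cluster_skills(fine_labels):
    # Stage 2: semantic clustering of fine-grained labels into coarser skill
    # families, again delegated to the model.
    unique = sorted(set(fine_labels.values()))
    reply = llm("Group these skill labels into coarse families, "
                "one 'fine -> coarse' mapping per line:\n" + "\n".join(unique))
    coarse = dict(line.split(" -> ", 1) for line in reply.splitlines()
                  if " -> " in line)
    return {q: coarse.get(s, s) for q, s in fine_labels.items()}

def solve_with_skill_exemplars(test_q, skill_of, solved, k=3):
    # Stage 3: identify the coarse skill a test question needs, then prepend
    # k randomly chosen solved exemplars carrying the same skill label.
    # skill_of: question -> coarse label; solved: question -> worked solution.
    skills = sorted(set(skill_of.values()))
    skill = llm(f"Which one of these skills does the question need? {skills}\n"
                f"Question: {test_q}")
    pool = [q for q in solved if skill_of.get(q) == skill] or list(solved)
    shots = random.sample(pool, min(k, len(pool)))
    prompt = "Solve the last question, following the worked examples.\n\n"
    prompt += "\n\n".join(f"{q}\n{solved[q]}" for q in shots)
    return llm(prompt + f"\n\n{test_q}")
```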
DiscoGen: Learning to Discover Gene Regulatory Networks
Ke, Nan Rosemary, Dunn, Sara-Jane, Bornschein, Jorg, Chiappa, Silvia, Rey, Melanie, Lespiau, Jean-Baptiste, Cassirer, Albin, Wang, Jane, Weber, Theophane, Barrett, David, Botvinick, Matthew, Goyal, Anirudh, Mozer, Mike, Rezende, Danilo
Accurately inferring Gene Regulatory Networks (GRNs) is a critical and challenging task in biology. GRNs model the activatory and inhibitory interactions between genes and are inherently causal in nature. To accurately identify GRNs, perturbational data is required. However, most GRN discovery methods only operate on observational data. Recent advances in neural network-based causal discovery have significantly improved the field, including the ability to handle interventional data and gains in performance and scalability. However, applying state-of-the-art (SOTA) causal discovery methods in biology poses challenges, such as noisy data and a large number of samples, so these methods must be adapted before they can be used. In this paper, we introduce DiscoGen, a neural network-based GRN discovery method that can denoise gene expression measurements and handle interventional data. We demonstrate that our model outperforms SOTA neural network-based causal discovery methods.
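A minimal NumPy sketch of one way interventional data can inform GRN edge scoring (a naive linear illustration, not DiscoGen's neural architecture): when gene j is perturbed, its expression no longer reflects its regulators, so those samples are excluded when regressing gene j on the remaining genes, and large absolute weights are read as candidate edges:

```python
import numpy as np

def score_edges(X, interventions):
    """Score directed edges i -> j from expression data.

    X: (n_samples, n_genes) expression matrix.
    interventions: (n_samples,) int array; interventions[s] = j if gene j was
    perturbed in sample s, else -1 for observational samples.
    Returns a (n_genes, n_genes) matrix of |weight| scores, W[i, j] for i -> j.
    """
    n, g = X.shape
    W = np.zeros((g, g))
    for j in range(g):
        # Samples where gene j itself was intervened on carry no information
        # about j's regulators: its mechanism was replaced by the perturbation.
        keep = interventions != j
        others = [i for i in range(g) if i != j]
        A = X[np.ix_(keep, others)]
        y = X[keep, j]
        # Ridge-regularised least squares for stability on noisy data.
        coef = np.linalg.solve(A.T @ A + 1e-2 * np.eye(len(others)), A.T @ y)
        W[others, j] = np.abs(coef)
    return W

# Toy usage: a 3-gene chain 0 -> 1 -> 2, with gene 1 knocked out (set to zero)
# in roughly half the samples. A linear scorer leaves some spurious symmetric
# scores, but the chain structure dominates.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
iv = np.where(rng.random(n) < 0.5, 1, -1)
x1 = np.where(iv == 1, 0.0, 2.0 * x0 + 0.1 * rng.normal(size=n))
x2 = -1.5 * x1 + 0.1 * rng.normal(size=n)
print(score_edges(np.stack([x0, x1, x2], axis=1), iv).round(2))
```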
Integrable Nonparametric Flows
Pfau, David, Rezende, Danilo
We introduce a method for reconstructing an infinitesimal normalizing flow given only an infinitesimal change to a (possibly unnormalized) probability distribution. This reverses the conventional task of normalizing flows -- rather than being given samples from an unknown target distribution and learning a flow that approximates the distribution, we are given a perturbation to an initial distribution and aim to reconstruct a flow that would generate samples from the known perturbed distribution. While this is an underdetermined problem, we find that choosing the flow to be an integrable vector field yields a solution closely related to electrostatics, which can be computed by the method of Green's functions. Unlike conventional normalizing flows, this flow can be represented in an entirely nonparametric manner. We validate this derivation on low-dimensional problems, and discuss potential applications to problems in quantum Monte Carlo and machine learning.
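To make the electrostatic connection concrete, here is a short sketch of the governing equation under the setup above (our notation, not a quote from the paper). The prescribed infinitesimal change $\partial_t p$ and the integrable ansatz $v = \nabla\phi$ turn the continuity equation into a generalised Poisson equation:

```latex
\partial_t p + \nabla \cdot (p\, v) = 0, \qquad v = \nabla \phi
\;\;\Longrightarrow\;\;
\nabla \cdot \bigl( p \,\nabla \phi \bigr) = -\,\partial_t p .
```

Up to sign conventions, this is the equation of electrostatics in a medium with permittivity $p$ and charge density $\partial_t p$, so the potential $\phi$, and hence the flow $v$, can be recovered nonparametrically via the Green's function of this elliptic operator.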
Towards a Definition of Disentangled Representations
Higgins, Irina, Amos, David, Pfau, David, Racaniere, Sebastien, Matthey, Loic, Rezende, Danilo, Lerchner, Alexander
How can intelligent agents solve a diverse set of tasks in a data-efficient manner? The disentangled representation learning approach posits that such an agent would benefit from separating out (disentangling) the underlying structure of the world into disjoint parts of its representation. However, there is no generally agreed-upon definition of disentangling, not least because it is unclear how to formalise the notion of world structure beyond toy datasets with a known ground truth generative process. Here we propose that a principled solution to characterising disentangled representations can be found by focusing on the transformation properties of the world. In particular, we suggest that those transformations that change only some properties of the underlying world state, while leaving all other properties invariant, are what give exploitable structure to any kind of data. Similar ideas have already been successfully applied in physics, where the study of symmetry transformations has revolutionised the understanding of world structure. By connecting symmetry transformations to vector representations using the formalism of group and representation theory we arrive at the first formal definition of disentangled representations. Our new definition is in agreement with many of the current intuitions about disentangling, while also providing principled resolutions to a number of previous points of contention. While this work focuses on formally defining disentangling - as opposed to solving the learning problem - we believe that the shift in perspective to studying data transformations can stimulate the development of better representation learning algorithms.
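As a compact sketch of the resulting definition (a paraphrase in standard group-theoretic notation, not a verbatim statement from the paper): a symmetry group $G$ acting on world states $W$ is assumed to decompose into a direct product, and a representation is called disentangled when it is equivariant and its factors respect that decomposition.

```latex
G = G_1 \times \cdots \times G_n \ \text{acts on } W, \qquad f : W \to Z .
% (i) Equivariance: G also acts on Z, and
f(g \cdot w) = g \cdot f(w) \quad \forall\, g \in G,\; w \in W .
% (ii) Decomposition: Z = Z_1 \times \cdots \times Z_n, with each factor Z_i
% affected only by the action of G_i and invariant to all G_j with j \neq i.
```

For example, if $G$ contains independent horizontal and vertical translations of an object, a disentangled representation stores the two coordinates in separate factors, each changed only by its own translation subgroup.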
Unsupervised Predictive Memory in a Goal-Directed Agent
Wayne, Greg, Hung, Chia-Chun, Amos, David, Mirza, Mehdi, Ahuja, Arun, Grabska-Barwinska, Agnieszka, Rae, Jack, Mirowski, Piotr, Leibo, Joel Z., Santoro, Adam, Gemici, Mevlana, Reynolds, Malcolm, Harley, Tim, Abramson, Josh, Mohamed, Shakir, Rezende, Danilo, Saxton, David, Cain, Adam, Hillier, Chloe, Silver, David, Kavukcuoglu, Koray, Botvinick, Matt, Hassabis, Demis, Lillicrap, Timothy
Animals execute goal-directed behaviours despite the limited range and scope of their sensors. To cope, they explore environments and store memories, maintaining estimates of important information that is not presently available. Recently, progress has been made with artificial intelligence (AI) agents that learn to perform tasks from sensory input, even at a human level, by merging reinforcement learning (RL) algorithms with deep neural networks, and the excitement surrounding these results has led to the pursuit of related ideas as explanations of non-human animal learning. However, we demonstrate that contemporary RL algorithms struggle to solve simple tasks when enough information is concealed from the sensors of the agent, a property called "partial observability". An obvious requirement for handling partially observed tasks is access to extensive memory, but we show memory is not enough; it is critical that the right information be stored in the right format. We develop a model, the Memory, RL, and Inference Network (MERLIN), in which memory formation is guided by a process of predictive modeling. MERLIN facilitates the solution of tasks in 3D virtual reality environments for which partial observability is severe and memories must be maintained over long durations. Our model demonstrates a single learning agent architecture that can solve canonical behavioural tasks in psychology and neurobiology without strong simplifying assumptions about the dimensionality of sensory input or the duration of experiences.
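A toy NumPy sketch of the central idea that memory content should be selected by predictive modelling rather than by task reward (a heavy simplification: MERLIN is a trained recurrent latent-variable agent, and the linear encoder and slot memory here are illustrative assumptions): observations are encoded to latents, written to a slot memory, and read back by content similarity, judged by how well the read reconstructs the underlying state.

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, SLOTS = 16, 32

# Illustrative stand-ins for learned networks: a random linear encoder and its
# pseudo-inverse as decoder. MERLIN's encoder/decoder are deep and trained.
W_enc = rng.normal(size=(OBS_DIM, OBS_DIM)) / np.sqrt(OBS_DIM)
W_dec = np.linalg.pinv(W_enc)

memory = np.zeros((SLOTS, OBS_DIM))
write_ptr = 0

def write(z):
    """Store a latent in the next circular slot."""
    global write_ptr
    memory[write_ptr % SLOTS] = z
    write_ptr += 1

def read(z_query):
    """Content-based read: similarity-weighted average of stored latents."""
    sims = memory @ z_query / (np.linalg.norm(memory, axis=1)
                               * np.linalg.norm(z_query) + 1e-8)
    attn = np.exp(sims) / np.exp(sims).sum()
    return attn @ memory

# A partially observed stream: the agent only ever sees noisy glimpses of a
# fixed hidden state; memory reads aggregate them into a denoised estimate.
hidden = rng.normal(size=OBS_DIM)
errors = []
for t in range(100):
    obs = hidden + 0.5 * rng.normal(size=OBS_DIM)  # noisy partial observation
    z = W_enc @ obs
    # Predictive-modelling criterion: the memory read should reconstruct the
    # underlying state (equivalently, predict the next observation up to noise).
    errors.append(np.linalg.norm(W_dec @ read(z) - hidden))
    write(z)

print(f"reconstruction error: t=0 {errors[0]:.2f} -> t=99 {errors[-1]:.2f}")
```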
Normalizing Flows on Riemannian Manifolds
Gemici, Mevlana C., Rezende, Danilo, Mohamed, Shakir
We consider the problem of density estimation on Riemannian manifolds. Density estimation on manifolds has many applications in fluid mechanics, optics and plasma physics, and it appears often when dealing with angular variables (such as those arising in protein folding, robot limbs and gene expression) and in directional statistics generally. In spite of the multitude of algorithms available for density estimation in Euclidean spaces $\mathbf{R}^n$ that scale to large $n$ (e.g. normalizing flows, kernel methods and variational approximations), most of these methods are not immediately suitable for density estimation on more general Riemannian manifolds. We revisit techniques related to homeomorphisms from differential geometry for projecting densities onto sub-manifolds and use them to generalize the idea of normalizing flows to more general Riemannian manifolds. The resulting algorithm is scalable, simple to implement and suitable for use with automatic differentiation. We demonstrate concrete examples of this method on the n-sphere $\mathbf{S}^n$.
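The change-of-variables identity underlying this construction, sketched here for an embedding map (our notation; the paper's exposition is more general): pushing a base density through a chart $\phi$ introduces a Riemannian volume correction in place of the usual Jacobian determinant.

```latex
% Let \phi : \mathbb{R}^n \to \mathcal{M} \subset \mathbb{R}^m be a smooth
% embedding with Jacobian J_\phi(z) \in \mathbb{R}^{m \times n}. If z \sim p_z,
% the density of y = \phi(z) w.r.t. the volume measure on \mathcal{M} is
p_{\mathcal{M}}(y) \;=\; p_z(z)\,
\det\!\bigl( J_\phi(z)^{\top} J_\phi(z) \bigr)^{-1/2},
\qquad y = \phi(z).
```

When $m = n$ this reduces to the familiar $|\det J|^{-1}$ change-of-variables rule, and composing ordinary Euclidean flows on $z$ before applying $\phi$ yields flows on $\mathcal{M}$, e.g. on the sphere $\mathbf{S}^n$.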