AITopics | Zeynep Akata

Learning What and Where to Draw

Scott E. Reed, Zeynep Akata, Santosh Mohan, Samuel Tenka, Bernt Schiele, Honglak Lee

Neural Information Processing SystemsJun-2-2025, 05:17:26 GMT

Generative Adversarial Networks (GANs) have recently demonstrated the capability to synthesize compelling real-world images, such as room interiors, album covers, manga, faces, birds, and flowers. While existing models can synthesize images based on global constraints such as a class label or caption, they do not provide control over pose or object location. We propose a new model, the Generative Adversarial What-Where Network (GAWWN), that synthesizes images given instructions describing what content to draw in which location. We show high-quality 128 128 image synthesis on the Caltech-UCSD Birds dataset, conditioned on both informal text descriptions and also object location. Our system exposes control over both the bounding box around the bird and its constituent parts. By modeling the conditional distributions over part locations, our system also enables conditioning on arbitrary subsets of parts (e.g.

artificial intelligence, keypoint, machine learning, (18 more...)

Neural Information Processing Systems

Country:

Europe (0.46)
North America > United States > Michigan (0.14)

Industry: Leisure & Entertainment > Sports (0.47)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

Combining Generative and Discriminative Models for Hybrid Inference

Victor Garcia Satorras, Zeynep Akata, Max Welling

Neural Information Processing SystemsJun-1-2025, 07:03:25 GMT

A graphical model is a structured representation of the data generating process. The traditional method to reason over random variables is to perform inference in this graphical model. However, in many cases the generating process is only a poor approximation of the much more complex true data generating process, leading to suboptimal estimations. The subtleties of the generative process are however captured in the data itself and we can "learn to infer", that is, learn a direct mapping from observations to explanatory latent variables. In this work we propose a hybrid model that combines graphical inference with a learned inverse model, which we structure as in a graph neural network, while the iterative algorithm as a whole is formulated as a recurrent neural network. By using cross-validation we can automatically balance the amount of work performed by graphical inference versus learned inference. We apply our ideas to the Kalman filter, a Gaussian hidden Markov model for time sequences, and show, among other things, that our model can estimate the trajectory of a noisy chaotic Lorenz Attractor much more accurately than either the learned or graphical inference run in isolation.

artificial intelligence, deep learning, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Combining Generative and Discriminative Models for Hybrid Inference

Victor Garcia Satorras, Zeynep Akata, Max Welling

Neural Information Processing SystemsMay-26-2025, 01:33:15 GMT

A graphical model is a structured representation of the data generating process. The traditional method to reason over random variables is to perform inference in this graphical model. However, in many cases the generating process is only a poor approximation of the much more complex true data generating process, leading to suboptimal estimations. The subtleties of the generative process are however captured in the data itself and we can "learn to infer", that is, learn a direct mapping from observations to explanatory latent variables. In this work we propose a hybrid model that combines graphical inference with a learned inverse model, which we structure as in a graph neural network, while the iterative algorithm as a whole is formulated as a recurrent neural network. By using cross-validation we can automatically balance the amount of work performed by graphical inference versus learned inference. We apply our ideas to the Kalman filter, a Gaussian hidden Markov model for time sequences, and show, among other things, that our model can estimate the trajectory of a noisy chaotic Lorenz Attractor much more accurately than either the learned or graphical inference run in isolation.

artificial intelligence, deep learning, machine learning, (16 more...)

Neural Information Processing Systems

Country: Europe > Germany > Baden-Württemberg > Tübingen Region > Tübingen (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Add feedback

Supplementary Materials: In-Context Impersonation Reveals Large Language Models' Strengths and Biases

Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata

Neural Information Processing SystemsMay-25-2025, 14:42:48 GMT

large language model, machine learning, persona, (21 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Add feedback

In-Context Impersonation Reveals Large Language Models' Strengths and Biases

Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata

Neural Information Processing SystemsMay-25-2025, 14:42:44 GMT

large language model, machine learning, persona, (18 more...)

Neural Information Processing Systems

Country: North America > United States (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.31)

Add feedback

Supplementary Materials: In-Context Impersonation Reveals Large Language Models' Strengths and Biases

Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata

Neural Information Processing SystemsFeb-11-2025, 14:31:22 GMT

We compare the task expert results with the average of all neutral personas, the average of all domain expert personas, the average of all nondomain expert personas and the random baseline (horizontal line). The first plot shows the average over all STEM tasks, while the remaining plots show the results for each STEM task individually. All 95% confidence intervals are computed over the average task accuracy.

large language model, machine learning, persona, (21 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > New Finding (0.67)

Industry:

Education > Curriculum > Subject-Specific Education (1.00)
Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.82)

Add feedback

In-Context Impersonation Reveals Large Language Models' Strengths and Biases

Leonard Salewski, Stephan Alaniz, Isabel Rio-Torto, Eric Schulz, Zeynep Akata

Neural Information Processing SystemsFeb-11-2025, 14:31:19 GMT

In everyday conversations, humans can take on different roles and adapt their vocabulary to their chosen roles. We explore whether LLMs can take on, that is impersonate, different roles when they generate text in-context. We ask LLMs to assume different personas before solving vision and language tasks. We do this by prefixing the prompt with a persona that is associated either with a social identity or domain expertise. In a multi-armed bandit task, we find that LLMs pretending to be children of different ages recover human-like developmental stages of exploration. In a language-based reasoning task, we find that LLMs impersonating domain experts perform better than LLMs impersonating non-domain experts.

large language model, machine learning, persona, (18 more...)

Neural Information Processing Systems

Country: North America (0.28)

Genre: Research Report > New Finding (1.00)

Industry: Education (0.47)

Technology: