Collaborating Authors

 torralba






Learning Representations from Audio-Visual Spatial Alignment

Neural Information Processing Systems

While these approaches learn high-quality representations for downstream tasks such as action recognition, their training objectives disregard spatial cues naturally occurring in audio and visual signals.




Improved Generalized Planning with LLMs through Strategy Refinement and Reflection

arXiv.org Artificial Intelligence

LLMs have recently been used to generate Python programs representing generalized plans in PDDL planning, i.e., plans that generalize across the tasks of a given PDDL domain. Previous work proposed a framework consisting of three steps: the LLM first generates a summary and then a strategy for the domain, both in natural language, and then implements that strategy as a Python program, which is debugged on example planning tasks. In that work, only one strategy is generated and passed directly to program generation. If the strategy is incorrect, its implementation will therefore result in an incorrect generalized plan. Here, we introduce an approach that generates the strategy in the form of pseudocode and enables automatic debugging of the pseudocode, allowing us to identify and fix errors before the generalized plan itself is generated. Additionally, we extend the Python debugging phase with a reflection step that prompts the LLM to pinpoint the reason for the observed plan failure. Finally, we take inspiration from LLM code generation to produce several program variants and pick the best one. Running experiments on 17 benchmark domains, we show that these extensions substantially improve (and never deteriorate) the quality of the generalized plans. In 12 of the domains, our best Python programs solve all tasks that can be generated with the respective instance generator.
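The last step of the abstract above (produce several program variants and pick the best one) can be illustrated with a minimal sketch. It is not the authors' implementation: the toy counter domain, the three hand-written candidate programs standing in for LLM output, and the names `validate`, `count_solved`, and `pick_best` are all assumptions made for illustration. The selection criterion used here, the number of example tasks solved, is likewise an assumption.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Task = Dict[str, int]              # hypothetical toy task: {"start": ..., "goal": ...}
Plan = List[str]                   # a plan is a sequence of action names
Program = Callable[[Task], Plan]   # a generalized plan maps any task to a plan


def validate(task: Task, plan: Plan) -> bool:
    """Simulate the plan on the toy counter domain and check the goal."""
    pos = task["start"]
    for action in plan:
        if action == "inc":
            pos += 1
        elif action == "dec":
            pos -= 1
        else:
            return False  # unknown action: the plan is invalid
    return pos == task["goal"]


def count_solved(program: Program, tasks: Sequence[Task],
                 check: Callable[[Task, Plan], bool]) -> int:
    """Count the tasks for which the program returns a valid plan."""
    solved = 0
    for task in tasks:
        try:
            plan = program(task)
        except Exception:
            continue  # a crashing program simply fails this task
        if check(task, plan):
            solved += 1
    return solved


def pick_best(programs: Sequence[Program], tasks: Sequence[Task],
              check: Callable[[Task, Plan], bool]) -> Tuple[int, int]:
    """Return (index, score) of the candidate that solves the most tasks."""
    scores = [count_solved(p, tasks, check) for p in programs]
    best = max(range(len(programs)), key=scores.__getitem__)
    return best, scores[best]


# Three hand-written candidates standing in for LLM-generated variants.
def always_inc(task: Task) -> Plan:
    # Incomplete strategy: only works when the goal is not below the start.
    return ["inc"] * (task["goal"] - task["start"])


def crashes(task: Task) -> Plan:
    raise ValueError("unsupported task")


def both_directions(task: Task) -> Plan:
    delta = task["goal"] - task["start"]
    return ["inc"] * delta if delta >= 0 else ["dec"] * (-delta)


TASKS = [{"start": 0, "goal": 3}, {"start": 5, "goal": 2}, {"start": 4, "goal": 4}]

if __name__ == "__main__":
    index, score = pick_best([always_inc, crashes, both_directions], TASKS, validate)
    print(f"best candidate: {index}, solved {score}/{len(TASKS)} tasks")
```

In a real PDDL setting, `validate` would be replaced by a plan validator run against the domain and task files, and the candidates would be programs returned by the LLM rather than hand-written functions.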


Transfer Learning by Borrowing Examples for Multiclass Object Detection

Neural Information Processing Systems

Despite the recent trend of increasingly large datasets for object detection, there still exist many classes with few training examples. To overcome this lack of training data for certain classes, we propose a novel way of augmenting the training data for each class by borrowing and transforming examples from other classes. Our model learns which training instances from other classes to borrow and how to transform the borrowed examples so that they become more similar to instances from the target class. Our experimental results demonstrate that our new object detector, with borrowed and transformed examples, improves upon the current state-of-the-art detector on the challenging SUN09 object detection dataset.


52292e0c763fd027c6eba6b8f494d2eb-Reviews.html

Neural Information Processing Systems

Reviewer response to rebuttal: I have read through the authors' rebuttal and I am happy with the proposed changes. I have not changed my review, as I already recommended this paper for acceptance. Previous Review: In this work, the authors develop a hierarchical generative model for producing and classifying written characters, with the goal of achieving a high level of performance from just one training example. The model is rooted in learning the compositional structure of characters and the causal relationships that dictate how characters are produced. The model is compared to a simpler version of itself that does not represent character strokes, a deep Boltzmann machine approach, and a hierarchical deep learning method.