Tree of Thoughts: Deliberate Problem Solving with Large Language Models
Language models are increasingly being deployed for general problem solving across a wide range of tasks, but are still confined to token-level, left-to-right decision-making processes during inference. This means they can fall short in tasks that require exploration, strategic lookahead, or where initial decisions play a pivotal role. To surmount these challenges, we introduce a new framework for language model inference, Tree of Thoughts (ToT), which generalizes over the popular Chain of Thought approach to prompting language models, and enables exploration over coherent units of text (thoughts) that serve as intermediate steps toward problem solving. ToT allows LMs to perform deliberate decision making by considering multiple different reasoning paths and self-evaluating choices to decide the next course of action, as well as looking ahead or backtracking when necessary to make global choices. Our experiments show that ToT significantly enhances language models' problem-solving abilities on three novel tasks requiring non-trivial planning or search: Game of 24, Creative Writing, and Mini Crosswords. For instance, in Game of 24, while GPT-4 with chain-of-thought prompting only solved 4% of tasks, our method achieved a success rate of 74%.
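The search described in this abstract can be sketched as a breadth-first beam search over "thoughts": propose candidates from each kept state, score them, and keep the top few. In the paper, `propose` and `evaluate` would be language-model calls; below they are toy stand-ins (a reach-the-target-number puzzle), and all function names are illustrative, not the paper's API.

```python
def tree_of_thoughts_bfs(root, propose, evaluate, is_solution,
                         beam_width=3, max_depth=4):
    """BFS-style Tree of Thoughts sketch: expand, score, keep top-k per level."""
    if is_solution(root):
        return root
    frontier = [root]
    for _ in range(max_depth):
        # Generate candidate next "thoughts" from every state in the beam.
        candidates = [c for s in frontier for c in propose(s)]
        for c in candidates:
            if is_solution(c):
                return c
        # Self-evaluation step: keep only the most promising candidates.
        frontier = sorted(candidates, key=evaluate, reverse=True)[:beam_width]
    return None

# Toy stand-ins for LM proposal/evaluation: reach 16 from 1 via +1 or *2.
target = 16
result = tree_of_thoughts_bfs(
    1,
    propose=lambda n: [n + 1, n * 2],
    evaluate=lambda n: -abs(target - n),   # closer to target scores higher
    is_solution=lambda n: n == target,
)
print(result)  # 16 (path 1 -> 2 -> 4 -> 8 -> 16)
```

The beam (`beam_width`) and lookahead horizon (`max_depth`) correspond to the paper's breadth and depth limits; swapping `propose`/`evaluate` for LM prompts recovers the intended use.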
Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models
Humans draw to facilitate reasoning: we draw auxiliary lines when solving geometry problems; we mark and circle when reasoning on maps; we use sketches to amplify our ideas and relieve our limited-capacity working memory. However, such actions are missing in current multimodal language models (LMs). Current chain-of-thought and tool-use paradigms only use text as intermediate reasoning steps. In this work, we introduce Sketchpad, a framework that gives multimodal LMs a visual sketchpad and tools to draw on the sketchpad. The LM conducts planning and reasoning according to the visual artifacts it has drawn.
Thought of Search: Planning with Language Models Through The Lens of Efficiency
Among the most important properties of algorithms investigated in computer science are soundness, completeness, and complexity. These properties, however, are rarely analyzed for the vast collection of recently proposed methods for planning with large language models. In this work, we address this gap. We analyze these properties of using LLMs for planning and highlight that recent trends abandon both soundness and completeness for the sake of inefficiency. We propose a significantly more efficient approach that can, at the same time, maintain both soundness and completeness.
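The efficiency argument here hinges on a classic point: once a correct successor function and goal test exist as code, plain breadth-first search is complete (and sound whenever the successor function is). A minimal sketch, assuming a hand-written water-jug successor function as a stand-in for one an LLM might generate; the names and the puzzle are illustrative, not from the paper.

```python
from collections import deque

CAP_A, CAP_B = 3, 5  # jug capacities for the toy domain

def successors(state):
    """Stand-in successor function: fill, empty, or pour between two jugs."""
    a, b = state
    moves = {(CAP_A, b), (a, CAP_B),  # fill either jug
             (0, b), (a, 0)}          # empty either jug
    t = min(a, CAP_B - b)             # pour A into B
    moves.add((a - t, b + t))
    t = min(b, CAP_A - a)             # pour B into A
    moves.add((a + t, b - t))
    moves.discard(state)
    return moves

def bfs(start, successors, is_goal):
    """Plain BFS: complete, and sound whenever `successors` is sound."""
    seen = {start}
    queue = deque([(start, [start])])
    while queue:
        state, path = queue.popleft()
        if is_goal(state):
            return path
        for nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [nxt]))
    return None

# Measure exactly 4 liters starting from two empty jugs.
path = bfs((0, 0), successors, is_goal=lambda s: 4 in s)
print(path)
```

The contrast with sampling-based LLM planning is that every state here is expanded at most once, and failure to find a plan proves none exists within the state space.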
VLM Agents Generate Their Own Memories: Distilling Experience into Embodied Programs of Thought
Large-scale generative language and vision-language models (LLMs and VLMs) excel in few-shot in-context learning for decision making and instruction following. However, they require high-quality exemplar demonstrations to be included in their context window. In this work, we ask: Can LLMs and VLMs generate their own examples from generic, sub-optimal demonstrations? We propose In-Context Abstraction Learning (ICAL), a method that builds a memory of multimodal experience from sub-optimal demonstrations and human feedback. Given a task demonstration that may contain inefficiencies or mistakes, a VLM abstracts the trajectory into a generalized program by correcting inefficient actions and annotating cognitive abstractions: causal relationships, object state changes, temporal subgoals, and task-relevant visual elements.
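The annotations listed in this abstract suggest a structured memory record per demonstration. A minimal sketch of what such a record might look like, assuming a simple dataclass; the class name, fields, and example values are hypothetical illustrations of the four annotation types, not the paper's actual schema.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class AbstractedTrajectory:
    """Hypothetical ICAL-style memory entry distilled from one demonstration."""
    task: str
    corrected_actions: List[str]                      # inefficiencies fixed
    causal_relations: List[str] = field(default_factory=list)
    object_state_changes: List[str] = field(default_factory=list)
    temporal_subgoals: List[str] = field(default_factory=list)
    visual_elements: List[str] = field(default_factory=list)

example = AbstractedTrajectory(
    task="put the chilled apple on the counter",
    corrected_actions=["open fridge", "take apple", "close fridge",
                       "place apple on counter"],
    causal_relations=["fridge must be open before the apple can be taken"],
    object_state_changes=["apple: in fridge -> on counter"],
    temporal_subgoals=["acquire apple", "place apple"],
    visual_elements=["fridge handle", "counter surface"],
)
print(example.task)
```

Records like this would serve as the in-context exemplars the abstract describes, replacing raw sub-optimal demonstrations in the VLM's context window.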
AI-Powered Brain Implant Smashes Speed Record for Turning Thoughts Into Text
We speak at a rate of roughly 160 words every minute. That speed is incredibly difficult to achieve for speech brain implants. Decades in the making, speech implants use tiny electrode arrays inserted into the brain to measure neural activity, with the goal of transforming thoughts into text or sound. They're invaluable for people who lose their ability to speak due to paralysis, disease, or other injuries. But like a slow-loading web page or audio file, the delay in decoding can make everyday conversations frustrating.
Thoughts about AI Text-to-Image Creation and the Future
Over the last couple of years, the impact of AI on creative work kept coming closer, but it still felt far from actual use. Tech demos and papers gave us glimpses into the black box of AI, but they always felt like tests or gimmicks. The earlier image-recognition experiments, with their occasional racial biases and errors, always had a strange vibe. Google's dreaming AI was impressive in its own right, but it was hard to see a use for it beyond filters. Then there was the seemingly endless stream of GAN morphs, which would have been much more impressive had I understood anything of the tech behind them.
eBay sues Amazon, saying it tried to poach its sellers
The first book sold on Amazon was 'Fluid Concepts and Creative Analogies: Computer Models of the Fundamental Mechanisms of Thought' by Douglas Hofstadter. Bezos chose the name Amazon in reference to the Amazon River, the biggest river in the world, as he hoped Amazon would be the biggest bookstore in the world. The firm later opened up sales of music, movies, consumer electronics, video games, toys, and more. The logo is meant to suggest that Amazon sells every kind of product from A to Z.
- Retail (1.00)
- Information Technology > Services (0.91)
- Media > Publishing (0.79)