Low-Redundant Optimization for Large Language Model Alignment

Chen, Zhipeng, Zhou, Kun, Zhao, Wayne Xin, Wang, Jingyuan, Wen, Ji-Rong

arXiv.org Artificial Intelligence

Large language models (LLMs) still struggle to align with human preferences in complex tasks and scenarios, and are prone to overfitting to unexpected patterns or superficial styles in the training data. We conduct an empirical study that selects only the top 10% most-updated parameters in LLMs for alignment training, and observe improvements in both the convergence process and final performance. This indicates the existence of redundant neurons in LLMs for alignment training. To reduce their influence, we propose a low-redundant alignment method named ALLO, which focuses on optimizing the most relevant neurons with the most useful supervision signals. Concretely, we first identify the neurons related to the human preference data via a gradient-based strategy, then identify the alignment-related key tokens with reward models for computing the loss. Besides, we decompose the alignment process into forgetting and learning stages, where we first forget tokens with unaligned knowledge and then learn aligned knowledge, updating different ratios of neurons in each stage. Experimental results on 10 datasets show the effectiveness of ALLO. Our code and data are available at https://github.com/RUCAIBox/ALLO.
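The core masking idea in the abstract (train only the top fraction of parameters ranked by how strongly the gradient would update them) can be sketched as follows. This is a minimal illustration, not the authors' implementation; all function names here are made up for the example.

```python
# Sketch: keep only the top-`ratio` fraction of parameters (ranked by
# absolute gradient) trainable, and mask updates to the rest.
# Illustrative names only -- not from the ALLO codebase.

def top_k_mask(grads, ratio):
    """Return a 0/1 mask selecting the `ratio` fraction of entries
    with the largest absolute gradient."""
    k = max(1, int(len(grads) * ratio))
    threshold = sorted((abs(g) for g in grads), reverse=True)[k - 1]
    return [1 if abs(g) >= threshold else 0 for g in grads]

def masked_sgd_step(params, grads, mask, lr=0.1):
    """SGD update applied only to the masked (alignment-related) entries."""
    return [p - lr * g * m for p, g, m in zip(params, grads, mask)]

params = [0.5, -0.2, 1.0, 0.3]
grads = [0.01, -0.9, 0.05, 0.4]       # two largest-|g| entries are kept
mask = top_k_mask(grads, ratio=0.5)   # selects positions 1 and 3
params = masked_sgd_step(params, grads, mask)
```

In the paper's setting the same selection would run over neuron-level gradient statistics of an LLM, with different selection ratios for the forgetting and learning stages.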


Allo: A Programming Model for Composable Accelerator Design

Chen, Hongzheng, Zhang, Niansong, Xiang, Shaojie, Zeng, Zhichen, Dai, Mengjia, Zhang, Zhiru

arXiv.org Artificial Intelligence

Special-purpose hardware accelerators are increasingly pivotal for sustaining performance improvements in emerging applications, especially as the benefits of technology scaling continue to diminish. However, designers currently lack effective tools and methodologies to construct complex, high-performance accelerator architectures in a productive manner. Existing high-level synthesis (HLS) tools often require intrusive source-level changes to attain satisfactory quality of results. Despite the introduction of several new accelerator design languages (ADLs) aiming to enhance or replace HLS, their advantages are most evident in relatively simple applications with a single kernel. Existing ADLs prove less effective for realistic hierarchical designs with multiple kernels, even if the design hierarchy is flattened. In this paper, we introduce Allo, a composable programming model for efficient spatial accelerator design. Allo decouples hardware customizations, including compute, memory, communication, and data type, from the algorithm specification, and encapsulates them as a set of customization primitives. Allo preserves the hierarchical structure of an input program by combining customizations from different functions in a bottom-up, type-safe manner. This approach facilitates holistic optimizations that span function boundaries. We conduct comprehensive experiments on commonly used HLS benchmarks and several realistic deep learning models. Our evaluation shows that Allo can outperform state-of-the-art HLS tools and ADLs on all test cases in PolyBench. For the GPT2 model, the inference latency of the Allo-generated accelerator is 1.7x faster than the NVIDIA A100 GPU with 5.4x higher energy efficiency, demonstrating the capability of Allo to handle large-scale designs.
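The decoupling the abstract describes (the algorithm stays a plain function while hardware customizations accumulate separately as composable primitives) can be illustrated with a toy schedule object. This is NOT Allo's actual API; it is a generic sketch of the design idea, with invented names.

```python
# Toy illustration of decoupling: the algorithm is untouched Python,
# while a separate Schedule records customization primitives.
# This is not Allo's real interface -- names are hypothetical.

def gemm(A, B):
    """Plain algorithm specification: dense matrix multiply."""
    n, m, k = len(A), len(B[0]), len(B)
    return [[sum(A[i][t] * B[t][j] for t in range(k)) for j in range(m)]
            for i in range(n)]

class Schedule:
    """Collects hardware customizations without editing the algorithm."""
    def __init__(self, fn):
        self.fn = fn
        self.primitives = []   # ordered log of applied customizations
    def pipeline(self, axis):
        self.primitives.append(("pipeline", axis))
        return self
    def partition(self, buf, factor):
        self.primitives.append(("partition", buf, factor))
        return self

# Customizations compose by chaining; the algorithm gemm never changes.
s = Schedule(gemm).pipeline("j").partition("A", 4)
```

The benefit the paper claims from this separation is that customizations for different functions can then be combined bottom-up across a hierarchical design, rather than being baked into each kernel's source as HLS pragmas.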


Reflexion: Language Agents with Verbal Reinforcement Learning

Shinn, Noah, Cassano, Federico, Berman, Edward, Gopinath, Ashwin, Narasimhan, Karthik, Yao, Shunyu

arXiv.org Artificial Intelligence

Large language models (LLMs) have been increasingly used to interact with external environments (e.g., games, compilers, APIs) as goal-driven agents. However, it remains challenging for these language agents to learn quickly and efficiently from trial and error, as traditional reinforcement learning methods require extensive training samples and expensive model fine-tuning. We propose Reflexion, a novel framework to reinforce language agents not by updating weights, but through linguistic feedback. Concretely, Reflexion agents verbally reflect on task feedback signals, then maintain their own reflective text in an episodic memory buffer to induce better decision-making in subsequent trials. Reflexion is flexible enough to incorporate various types (scalar values or free-form language) and sources (external or internally simulated) of feedback signals, and obtains significant improvements over a baseline agent across diverse tasks (sequential decision-making, coding, language reasoning). For example, Reflexion achieves 91% pass@1 accuracy on the HumanEval coding benchmark, surpassing the previous state-of-the-art GPT-4, which achieves 80%. We also conduct ablation and analysis studies using different feedback signals, feedback incorporation methods, and agent types, and provide insights into how they affect performance.
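The trial-reflect-retry loop described above can be sketched in a few lines. This is a minimal outline of the idea, not the authors' code; `actor`, `evaluator`, and `self_reflect` are hypothetical stand-ins for what would be LLM calls and environment feedback in the real framework.

```python
# Sketch of the Reflexion loop: act, receive feedback, verbally
# reflect on failure, and retry with past reflections in context.

def reflexion_loop(task, actor, evaluator, self_reflect, max_trials=3):
    memory = []  # episodic buffer of free-form reflection text
    attempt = None
    for _ in range(max_trials):
        attempt = actor(task, memory)       # condition on past reflections
        ok, feedback = evaluator(attempt)   # scalar or textual signal
        if ok:
            break
        memory.append(self_reflect(attempt, feedback))
    return attempt, memory
```

Note that no model weights are updated anywhere in the loop; all "learning" lives in the growing `memory` buffer that the actor reads on each new trial.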


The chatbot bubble has officially burst

#artificialintelligence

Allo did all the things you'd expect from a new messaging app: It incorporated richer fonts and multimedia tools, much like competitors iMessage or WhatsApp. But its coup de grâce against its competition was supposed to be The Power of Google Itself, with Search manifesting itself within the app as an omnipresent third wheel that was always listening, always ready to pipe up at a moment's notice to help find a restaurant and book reservations. That third wheel was powered by the Google Assistant, which Rebecca Michael, then Google's head of marketing for messaging products, described as "an ongoing dialog between you and Google that helps you get things done in your world" as she debuted Allo onstage in 2016. What did that mean in a practical sense? More or less, I'd say "let's go out to dinner," and suddenly Allo would prompt me with a pop-up button: "DID SOMEONE SAY ITALIAN FOOD!?!?!?" I suspected Allo might not be long for this world, and two years later, the experiment is over.


Google may shut down its Allo messaging app 'soon'

Engadget

The 'classic' version of Hangouts might not be the only Google chat service on the chopping block. A source talking to 9to5Google claims the company will shut down Allo "soon." While the apparent insider didn't explicitly say why it would switch off the messaging service, it's most likely due to both shifting priorities at the company and a general lack of interest. Virtually the "entire" Allo team has reportedly moved on to Android Messages, and division lead Anil Fulay jumped ship for Facebook in January. There hasn't been much work on Allo since the company said it was "pausing" development earlier in 2018.


8 things you didn't know you could do with Google Assistant

#artificialintelligence

Google Assistant keeps on growing. New features and functionality are constantly appearing, and that's made keeping track of all the service's little quirks and features tougher than ever. You can do much more than just searches these days. What is Google Assistant, and what devices use it? You don't need a Google Home to be the ear in your living room; Google Assistant on your smartphone can also control the various smart doodads dotted around your home.