Redesigning the Transformer Architecture with Insights from Multi-particle Dynamical Systems

Neural Information Processing Systems

The Transformer and its variants have proven to be efficient sequence learners in many different domains. Despite their staggering success, a critical issue has been the enormous number of parameters that must be trained (ranging from $10^7$ to $10^{11}$), along with the quadratic complexity of dot-product attention. In this work, we investigate the problem of approximating the two central components of the Transformer, multi-head self-attention and the point-wise feed-forward transformation, with a reduced parameter space and lower computational complexity. We build upon recent developments in analyzing deep neural networks as numerical solvers of ordinary differential equations. Taking advantage of an analogy between Transformer stages and the evolution of a dynamical system of multiple interacting particles, we formulate a temporal evolution scheme, TransEvolve, to bypass costly dot-product attention over multiple stacked layers. We perform exhaustive experiments with TransEvolve on well-known encoder-decoder as well as encoder-only tasks. We observe that the degree of approximation (or, inversely, the degree of parameter reduction) affects performance differently depending on the task. In the encoder-decoder regime, TransEvolve delivers performance comparable to the original Transformer, while in encoder-only tasks it consistently outperforms the Transformer along with several subsequent variants.
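
To make the analogy in the abstract concrete, here is a minimal sketch in plain Python. It is not the paper's method: it only illustrates the two ideas the abstract relies on, under simplifying assumptions. Tokens are treated as particles, the attention projections are assumed to be identity maps, and the function names `attention_weights` and `euler_step` are hypothetical. The sketch shows that dot-product attention builds an n-by-n weight matrix (the quadratic cost) and that a residual update of the form x + step * F(x) has the shape of one forward Euler step of dx/dt = F(x).

```python
import math


def attention_weights(x):
    """Softmax of scaled dot products between all token pairs.

    For n tokens this returns an n-by-n matrix, which is where the
    quadratic cost of dot-product attention comes from.
    """
    d = len(x[0])
    weights = []
    for query in x:
        logits = [sum(a * b for a, b in zip(query, key)) / math.sqrt(d) for key in x]
        peak = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(value - peak) for value in logits]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


def euler_step(x, step):
    """One residual update x + step * F(x), read as a forward Euler step.

    F(x) is the attention-weighted average of all tokens (identity
    projections are an assumption made only for this illustration).
    """
    w = attention_weights(x)
    n, d = len(x), len(x[0])
    evolved = []
    for i in range(n):
        force = [sum(w[i][j] * x[j][c] for j in range(n)) for c in range(d)]
        evolved.append([x[i][c] + step * force[c] for c in range(d)])
    return evolved


# Three toy "particles" (tokens) in two dimensions.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
weights = attention_weights(tokens)
evolved = euler_step(tokens, step=0.5)
```

Stacking several such steps corresponds to stacking layers, and each step recomputes the full n-by-n weight matrix. That repeated computation is the cost the abstract says the proposed scheme is designed to avoid.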


Liquid AI Is Redesigning the Neural Network

WIRED

Artificial intelligence might now be solving advanced math, performing complex reasoning, and even using personal computers, but today's algorithms could still learn a thing or two from microscopic worms. Liquid AI, a startup spun out of MIT, will today reveal several new AI models based on a novel type of "liquid" neural network that has the potential to be more efficient, less power-hungry, and more transparent than the ones that underpin everything from chatbots to image generators to facial recognition systems. Liquid AI's new models include one for detecting fraud in financial transactions, another for controlling self-driving cars, and a third for analyzing genetic data. The company touted the new models, which it is licensing to outside companies, at an event held at MIT today. The company has received funding from investors that include Samsung and Shopify, both of which are also testing its technology.


How Artificial Intelligence is Redesigning the Retail Industry

#artificialintelligence

In an era of fast fashion, retailers are producing cheap clothes quickly in order to get the latest trends in store. However, this fashion concept comes at an enormous cost to our health, planet, and economy. Dana Thomas, author of 'Fashionopolis: The Price of Fast Fashion and the Future of Clothes,' joined Cheddar to discuss the negative impact of inexpensive clothes and what we can do as consumers to create a more sustainable future.


Redesigning Your Tech Careers For The AI Era

#artificialintelligence

We are no strangers to the way AI is affecting our daily lives, both personally and professionally. We have witnessed both the paranoia about AI taking away jobs and the optimism about AI creating them. While the jury is still out on the long-term impact of AI, we are already seeing the automation of several activities, ranging from driving to radiology. It is becoming evident that no job will remain untouched by AI, though the degree of impact will differ from occupation to occupation. The technology industry, which is the catalyst of this change, will itself be transformed by AI.