Federated Hyperparameter Tuning: Challenges, Baselines, and Connections to Weight-Sharing

Neural Information Processing Systems

Tuning hyperparameters is a crucial but arduous part of the machine learning pipeline. Hyperparameter optimization is even more challenging in federated learning, where models are learned over a distributed network of heterogeneous devices; here, the need to keep data on device and perform local training makes it difficult to efficiently train and evaluate configurations. In this work, we investigate the problem of federated hyperparameter tuning. We first identify key challenges and show how standard approaches may be adapted to form baselines for the federated setting. Then, by making a novel connection to the neural architecture search technique of weight-sharing, we introduce a new method, FedEx, to accelerate federated hyperparameter tuning that is applicable to widely-used federated optimization methods such as FedAvg and recent variants. Theoretically, we show that a FedEx variant correctly tunes the on-device learning rate in the setting of online convex optimization across devices. Empirically, we show that FedEx can outperform natural baselines for federated hyperparameter tuning by several percentage points on the Shakespeare, FEMNIST, and CIFAR-10 benchmarks--obtaining higher accuracy using the same training budget.
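The core idea the abstract describes, tuning hyperparameters via the weight-sharing connection, can be illustrated with an exponentiated-gradient update over a discrete set of candidate configurations: configurations that yield lower validation loss receive exponentially more probability mass over rounds. This is a minimal sketch of that style of update, not the paper's actual FedEx algorithm; the candidate learning rates and loss values below are hypothetical.

```python
import math

def exp_grad_update(probs, losses, eta=1.0):
    # Exponentiated-gradient step: reweight each candidate
    # configuration by exp(-eta * observed validation loss),
    # then renormalize so the probabilities sum to one.
    weights = [p * math.exp(-eta * l) for p, l in zip(probs, losses)]
    z = sum(weights)
    return [w / z for w in weights]

# Three hypothetical candidate on-device learning rates.
configs = [0.01, 0.1, 1.0]
probs = [1 / 3] * 3  # start uniform over the candidates

# Pretend the middle configuration consistently yields the
# lowest validation loss across federated rounds.
for _ in range(5):
    losses = [0.9, 0.2, 1.5]
    probs = exp_grad_update(probs, losses)

best = max(range(3), key=lambda i: probs[i])
```

After a few rounds the distribution concentrates on the best-performing configuration, which is the sense in which such updates "tune" a hyperparameter online.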


O(n) Connections are Expressive Enough: Universal Approximability of Sparse Transformers

Neural Information Processing Systems

Recently, Transformer networks have redefined the state of the art in many NLP tasks. However, these models suffer from quadratic computational cost in the input sequence length $n$ to compute pairwise attention in each layer. This has prompted recent research into sparse Transformers that sparsify the connections in the attention layers. While empirically promising for long sequences, fundamental questions remain unanswered: Can sparse Transformers approximate any arbitrary sequence-to-sequence function, similar to their dense counterparts? How does the sparsity pattern and the sparsity level affect their performance? In this paper, we address these questions and provide a unifying framework that captures existing sparse attention models. We propose sufficient conditions under which we prove that a sparse attention model can universally approximate any sequence-to-sequence function. Surprisingly, our results show that sparse Transformers with only $O(n)$ connections per attention layer can approximate the same function class as the dense model with $n^2$ connections.
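One concrete family of sparse patterns the paper's framework covers is banded (sliding-window) attention, where each position attends only to a fixed-size neighborhood, giving O(n) connections per layer rather than n². The following toy sketch (scalar queries, keys, and values; not the paper's construction) shows the idea:

```python
import math

def banded_attention(q, k, v, window=1):
    # Each position i attends only to positions within `window`
    # of i, so the total number of connections is O(n * window)
    # instead of the dense O(n^2).
    n = len(q)
    out = []
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = [q[i] * k[j] for j in range(lo, hi)]
        m = max(scores)  # subtract max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        out.append(sum(e / z * v[j] for e, j in zip(exps, range(lo, hi))))
    return out

# With uniform scores, each output is just a local average of v.
result = banded_attention([1.0, 1.0, 1.0], [1.0, 1.0, 1.0],
                          [1.0, 2.0, 3.0], window=1)
```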


Thirteenth International Distributed AI Workshop

AI Magazine

This article discusses the Thirteenth International Distributed AI Workshop. An overview of the workshop is given as well as concerns and goals for the technology. The central problem in DAI is how to achieve coordinated action among such agents, so that they can accomplish more as a group than as individuals. The DAI workshop is dedicated to advancing the state of the art in this field. This year's workshop took place on the Olympic Peninsula in Washington State on 28 to 30 July 1994 and included 45 participants from North America, Europe, and the Pacific Rim.



Review of The Computational Beauty of Nature

AI Magazine

The book's basic premise is that these "most interesting computational topics today" are deeply interrelated, and in some heretofore undescribed ways. The text is well crafted, and the scholarship is both broad and deep. The author is clearly a renaissance man as well as a wonderful teacher. He is equally good at succinct summaries and painting the big picture, and he makes particularly effective use of examples. Best of all is his infectious joy about his subject: The text is full of percolations of delight at the beauty of some concept or equation or at the sheer fun of hacking code.


Assembly Sequence Planning

AI Magazine

Assembly plays a fundamental role in the manufacturing of most products. Parts that have been individually formed or machined to meet designed specifications are assembled into a configuration that achieves the functions of the final product or mechanism. The economic importance of assembly as a manufacturing process has led to extensive efforts to improve the efficiency and cost effectiveness of assembly operations. The sequence of mating operations that can be carried out to assemble a group of parts is constrained by the geometric and mechanical properties of the parts, their assembled configuration, and the stability of the resulting subassemblies. An approach to representation and reasoning about these sequences is described here and leads to several alternative explicit and implicit plan representations.
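The constraint structure described above, where geometry and stability restrict which mating operations may precede which others, can be pictured as precedence constraints over parts; an explicit plan representation then amounts to the set of orderings consistent with them. A small hypothetical illustration (the part names and constraints are invented for this sketch):

```python
from itertools import permutations

def valid_sequences(parts, precedence):
    # precedence: set of (a, b) pairs meaning part a must be
    # assembled before part b (e.g. a geometric or stability
    # constraint). Enumerate all orderings that respect them.
    ok = []
    for order in permutations(parts):
        pos = {p: i for i, p in enumerate(order)}
        if all(pos[a] < pos[b] for a, b in precedence):
            ok.append(order)
    return ok

# Hypothetical three-part assembly: the base must be placed
# before both the shaft and the cover.
seqs = valid_sequences(["base", "shaft", "cover"],
                       {("base", "shaft"), ("base", "cover")})
```

Here the explicit representation is the enumerated list of feasible sequences; implicit representations instead keep only the constraint set and check or generate orderings on demand.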



AI helps computers hone the fine art of forgetting

#artificialintelligence

Deep learning is changing the way we use and think about machines. Current incarnations are better than humans at all kinds of tasks, from chess and Go to face recognition and object recognition. In particular, humans have the extraordinary ability to constantly update their memories with the most important knowledge while overwriting information that is no longer useful. The world provides a never-ending source of data, much of which is irrelevant to the tricky business of survival, and most of which is impossible to store in a limited memory. So humans and other creatures have evolved ways to retain important skills while forgetting irrelevant ones.



#artificialintelligence

A globally recognized Top 10 influencer in IoT, machine learning, predictive analytics, and data science. I help businesses achieve digital transformation to become data driven, improve the customer experience, and implement the capabilities and processes to foster a data ecosystem. Moving from data and analytics to IoT was a natural progression: connected devices now generate massive amounts of data from a variety of sources and sensors, and a background in data and analytics gives me a unique perspective on how IoT devices are reshaping the data landscape through a network of connections between people, devices, and the internet.

