
Collaborating Authors: wallach


Review for NeurIPS paper: Deep Reinforcement and InfoMax Learning

Neural Information Processing Systems

Strengths: The deep information maximization objective combined with noise contrastive estimation (InfoNCE) is a fairly new unsupervised learning loss that has yet to be thoroughly explored in deep reinforcement learning. The main value of the paper is the study of the representations learned when optimizing the InfoNCE loss and of how those representations can be used for continual learning. Moreover, the paper introduces a novel architecture that uses the action information as part of the InfoNCE loss. These two ideas are novel and, to my knowledge, have not been presented in the literature before. In terms of significance, there has been growing interest in the representations learned by the InfoNCE loss in the context of reinforcement learning; see, e.g., Oord, Li, and Vinyals (2018) and Anand et al.
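For readers unfamiliar with the loss under review, the core InfoNCE idea can be sketched in a few lines: each anchor embedding should score highest against its own positive among all positives in a batch (in-batch negatives). This is a minimal NumPy sketch of the generic objective, not the paper's specific architecture; the function name and the cosine/temperature choices are illustrative assumptions.

```python
import numpy as np

def info_nce_loss(anchors, positives, temperature=0.1):
    """Generic InfoNCE: cross-entropy over in-batch similarities,
    where the matching (anchor, positive) pair sits on the diagonal."""
    # Cosine-normalize embeddings (a common, but not universal, choice).
    a = anchors / np.linalg.norm(anchors, axis=1, keepdims=True)
    p = positives / np.linalg.norm(positives, axis=1, keepdims=True)
    logits = a @ p.T / temperature                 # (batch, batch) similarities
    logits -= logits.max(axis=1, keepdims=True)    # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))            # NLL of the true pairs
```

In an RL setting, the anchor would be an encoding of the current state (and, per this paper, the action), and the positive an encoding of a future state from the same trajectory.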


Council Post: Three Trends All Leaders Should Know About AI

#artificialintelligence

I recently read a book called The Age of AI. The book was intended to provide an overview of the approaching trends in artificial intelligence (AI). This article is not meant to be a book report, but rather to share what I am learning about AI. I'd like to share three key themes that might improve your next dinner party conversation or spark an idea you can use in your work or industry. I'll divide these three themes into topics, trends, and timing.


APMSqueeze: A Communication Efficient Adam-Preconditioned Momentum SGD Algorithm

Tang, Hanlin, Gan, Shaoduo, Rajbhandari, Samyam, Lian, Xiangru, Liu, Ji, He, Yuxiong, Zhang, Ce

arXiv.org Machine Learning

Adam is an important optimization algorithm for training many large tasks, such as BERT and ImageNet, with both efficiency and accuracy. However, Adam is generally not compatible with information (gradient) compression technology, so communication usually becomes the bottleneck when parallelizing Adam. In this paper, we propose a communication-efficient Adam-preconditioned Momentum SGD algorithm, named APMSqueeze, which compresses gradients through an error-compensated method. The proposed algorithm achieves a convergence efficiency similar to Adam in terms of epochs, but significantly reduces the running time per epoch. In terms of end-to-end performance (including the full-precision pre-conditioning step), APMSqueeze provides up to a $2-10\times$ speed-up, depending on network bandwidth. We also provide theoretical analysis of the convergence and efficiency.
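The error-compensated compression idea at the heart of this line of work can be sketched compactly: the residual lost to compression at one step is added back before compressing the next step, so quantization error is fed back rather than accumulated. This is a generic illustration with a simple 1-bit (sign) compressor, not the paper's exact APMSqueeze update; the function name and the choice of compressor are assumptions.

```python
import numpy as np

def compress_with_error_feedback(update, error_buffer):
    """Error-compensated 1-bit compression: add the residual from the
    previous step before compressing, and return the new residual."""
    corrected = update + error_buffer
    # Sign compression with a scale that preserves the l1 magnitude.
    scale = np.abs(corrected).mean()
    compressed = scale * np.sign(corrected)
    new_error = corrected - compressed          # what the compression lost
    return compressed, new_error

# Toy loop: the residual is carried forward instead of being discarded.
rng = np.random.default_rng(1)
err = np.zeros(4)
for _ in range(100):
    g = rng.normal(size=4)
    c, err = compress_with_error_feedback(g, err)
```

In a data-parallel setting, only the 1-bit `compressed` vector (plus one scale) would cross the network, while each worker keeps its own `error_buffer` locally.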


$\texttt{DeepSqueeze}$: Parallel Stochastic Gradient Descent with Double-Pass Error-Compensated Compression

Tang, Hanlin, Lian, Xiangru, Qiu, Shuang, Yuan, Lei, Zhang, Ce, Zhang, Tong, Liu, Ji

arXiv.org Machine Learning

Communication is a key bottleneck in distributed training. Recently, an error-compensated compression technology was designed specifically for centralized learning and has achieved notable success, showing significant advantages over state-of-the-art compression-based methods in saving communication cost. Since decentralized training has been shown to be superior to traditional centralized training in communication-restricted scenarios, a natural question to ask is how to apply the error-compensated technology to decentralized learning to further reduce the communication cost. However, a trivial extension of compression-based centralized training algorithms does not exist for the decentralized scenario: a key difference between centralized and decentralized training makes this extension extremely non-trivial. In this paper, we propose an elegant algorithmic design that employs error-compensated stochastic gradient descent in the decentralized scenario, named $\texttt{DeepSqueeze}$. Both theoretical analysis and an empirical study show that the proposed $\texttt{DeepSqueeze}$ algorithm outperforms existing compression-based decentralized learning algorithms. To the best of our knowledge, this is the first work to apply error-compensated compression to decentralized learning.
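The decentralized setting the abstract contrasts with can be made concrete with a toy gossip round: each worker sends a compressed, error-compensated copy of its parameters to its ring neighbors and averages what it receives, with no central server. This is an illustrative single-pass sketch under simplifying assumptions (ring topology, uniform averaging, top-1 compression), not the paper's exact double-pass DeepSqueeze scheme; all names are hypothetical.

```python
import numpy as np

def decentralized_step(params, errors, compress):
    """One gossip round on a ring: each worker broadcasts a compressed,
    error-compensated copy of its parameters and averages with neighbors."""
    n = len(params)
    sent = []
    for i in range(n):
        corrected = params[i] + errors[i]
        c = compress(corrected)
        errors[i] = corrected - c        # keep what compression lost, locally
        sent.append(c)
    new_params = []
    for i in range(n):
        left, right = sent[(i - 1) % n], sent[(i + 1) % n]
        new_params.append((sent[i] + left + right) / 3.0)
    return new_params, errors

def top1(v):
    """Keep only the largest-magnitude coordinate (an aggressive compressor)."""
    out = np.zeros_like(v)
    k = np.argmax(np.abs(v))
    out[k] = v[k]
    return out
```

Note the invariant that makes error compensation attractive: the total of parameters plus residuals is conserved across rounds, so nothing is silently lost to compression.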


Taking a hard look in the mirror to examine bias – humanity vs. artificial intelligence (AI)

#artificialintelligence

I'm amazed at how rapidly the process of hiring employees and looking for work is changing. For generations, a big part of that process has been a person's connections. Who you know has been as important as what you know. Services such as LinkedIn were created in part to help us manage our connections and find out who might be able to help us get a foot in the door to pursue a desired position. However, this network effect has always held inherent bias.


Debugging data: Microsoft researchers look at ways to train AI systems to reflect the real world - The AI Blog

#artificialintelligence

Artificial intelligence is already helping people do things like type faster texts and take better pictures, and it's increasingly being used to make even bigger decisions, such as who gets a new job and who goes to jail. That's prompting researchers across Microsoft and throughout the machine learning community to ensure that the data used to develop AI systems reflect the real world, are safeguarded against unintended bias, and are handled in ways that are transparent and respectful of privacy and security. Data is the food that fuels machine learning. It's the representation of the world that is used to train machine learning models, explained Hanna Wallach, a senior researcher in Microsoft's New York research lab. Wallach is a program co-chair of the Annual Conference on Neural Information Processing Systems, taking place from Dec. 4 to Dec. 9 in Long Beach, California.


How can we enhance the privacy, security and ethics of Artificial Intelligence? - ITU News

#artificialintelligence

Twenty years ago, Artificial Intelligence (AI) made headlines when IBM's Deep Blue won a prized chess match against the world's leading (human) chess player, Garry Kasparov. And notably last year, Google's AI beat the top player at the complex game of Go. In the two decades between those two victories, AI has come a long way from the basements and back rooms of Computer Science departments to the forefront of global discussions at the United Nations, such as this week's AI for Good Global Summit in Geneva, Switzerland. Indeed, AI is now poised to impact nearly every area of society. But as we prepare to reap the massive benefits of this "Golden Age" of AI in which 62% of organizations will be using AI technologies, experts have warned that it is critical that privacy, security, and ethical questions are brought to the forefront.


'Moral' Robots: the Future of War or Dystopian Fiction?

AITopics Original Links

The dawn of the 21st century has been called the decade of the drone. Unmanned aerial vehicles, remotely operated by pilots in the United States, rain Hellfire missiles on suspected insurgents in South Asia and the Middle East. Now a small group of scholars is grappling with what some believe could be the next generation of weaponry: lethal autonomous robots. At the center of the debate is Ronald C. Arkin, a Georgia Tech professor who has hypothesized lethal weapons systems that are ethically superior to human soldiers on the battlefield. A professor of robotics and ethics, he has devised algorithms for an "ethical governor" that he says could one day guide an aerial drone or ground robot to either shoot or hold its fire in accordance with internationally agreed-upon rules of war. But some scholars have dismissed Mr. Arkin's ethical governor as "vaporware," arguing that current technology is nowhere near the level of complexity that would be needed for a military robotic system to make life-and-death ethical judgments.


The Women Changing The Face Of AI

#artificialintelligence

In 2005, Hanna Wallach, a machine-learning researcher, found herself bunking with colleagues to attend the Neural Information Processing Systems (NIPS) conference. Wallach had been working in the field since 2001 and had attended numerous conferences, but this was the first time she had roomed with other women who specialized in machine learning, a branch of artificial intelligence that researches how computer programs can learn and grow. As a discipline, it is overwhelmingly male: Wallach estimates that only 13.5% of the entire machine learning field is female. At the conference, Wallach and her roommates, Jennifer Wortman Vaughan, Lisa Wainer, and Angela Yu, began discussing their experiences and commiserating about the lack of female allies. "We couldn't believe that there were four of us [at the conference]," Wallach says.


IAB Reveals Winners of Data Rockstar Awards

#artificialintelligence

IAB (Interactive Advertising Bureau) and its Data Center of Excellence today announced the winners of the inaugural IAB Data Rockstar Awards, celebrating top industry leaders and practitioners who have demonstrated achievement in data science or technology. The top finalists were selected by the IAB Data Center of Excellence Board of Directors and were evaluated based on demonstrated excellence, creativity, or forward-thinking approaches to solving problems in data science, as well as the impact their contributions have made on their company or industry. Chalasani developed a highly efficient, distributed, extreme-scale, single-pass online logistic regression learning system in Scala/Spark, using variants of Stochastic Gradient Descent, capable of handling hundreds of millions of sparse features and billions of training observations. His system incorporates a number of state-of-the-art techniques that do not exist together in any other machine learning system, including adaptive feature scaling, adaptive gradients, feature interactions, and feature hashing. Chalasani's work is central to MediaMath's vision that every addressable interaction between a marketer and a consumer be driven by machine learning optimization against all available, relevant data at that moment, to maximize long-term marketer business outcomes.
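The combination of techniques described above — single-pass online logistic regression, feature hashing, and adaptive per-coordinate gradients — can be illustrated on one machine in a few lines. This is a minimal Python sketch of those generic ideas, not the Scala/Spark system itself; the class name, CRC32 hashing, and hyperparameters are all assumptions for illustration.

```python
import zlib
import numpy as np

class HashedLogReg:
    """Single-pass online logistic regression with feature hashing and
    AdaGrad-style per-coordinate adaptive step sizes."""

    def __init__(self, n_bins=2**18, lr=0.5):
        self.n_bins = n_bins
        self.lr = lr
        self.w = np.zeros(n_bins)      # hashed weight vector
        self.g2 = np.zeros(n_bins)     # running sum of squared gradients

    def _indices(self, features):
        # Hash sparse string features into a fixed-size weight vector.
        return [zlib.crc32(f.encode()) % self.n_bins for f in features]

    def predict(self, features):
        z = sum(self.w[i] for i in self._indices(features))
        return 1.0 / (1.0 + np.exp(-z))

    def update(self, features, label):
        # Gradient of the log loss with respect to the logit.
        grad = self.predict(features) - label
        for i in self._indices(features):
            self.g2[i] += grad * grad
            self.w[i] -= self.lr * grad / (1e-6 + np.sqrt(self.g2[i]))
```

Hashing bounds memory regardless of how many distinct features appear in the stream, and the single `update` call per example is what makes a single-pass, billions-of-observations regime feasible.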