Collaborating Authors


Raising Robovoices

Communications of the ACM

In a critical episode of The Mandalorian, a TV series set in the Star Wars universe, a mysterious Jedi fights his way through a horde of evil robots. As the heroes of the show wait anxiously to learn the identity of their cloaked savior, he lowers his hood, and (spoiler alert) they meet a young Luke Skywalker. Actually, what we see is an animated, de-aged version of the Jedi. Then Luke speaks, in a voice that sounds very much like the 1980s-era rendition of the character, thanks to the use of an advanced machine learning model developed by the voice technology startup Respeecher. "No one noticed that it was generated by a machine," says Dmytro Bielievtsov, chief technology officer at Respeecher.

The Robot Brains Podcast: Eric Horvitz of Microsoft on AI for the greater good on Apple Podcasts


On Episode 15 of Season 2, we're joined by Eric Horvitz, Microsoft's first ever Chief Scientific Officer. His research spans theoretical and practical challenges with developing systems that perceive, learn, and reason. He's the company's top inventor since joining in 1993 with over 300 patents filed. He has been elected Fellow of the Association for the Advancement of Artificial Intelligence (AAAI), Fellow of the National Academy of Engineering (NAE), Fellow of the American Academy of Arts and Sciences, and Fellow of the American Association for the Advancement of Science (AAAS). He was a member of the National Security Commission on AI and he also co-founded important groups like the Partnership on AI, a non-profit organization bringing together Apple, Amazon, Facebook, Google, DeepMind, IBM, and Microsoft to document the quality and impact of AI systems on things like criminal justice, the economy, and media integrity.

Reinforcement Learning: from trial & error to deep Q-learning


My objective with this article is to demystify a few foundational Reinforcement Learning (RL) concepts with hands-on examples. We are going to apply RL to the infamous Glass Bridge challenge from the Netflix series Squid Game episode 7. Although no previous RL knowledge is required, solid Python coding skills and basic machine learning understanding are necessary to follow the content of this article. The code can be found here. In simple words, RL is a computational approach used to achieve a pre-defined goal, which can be winning a chess game, optimizing a medical treatment, or improving a financial trading strategy.
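To make the trial-and-error idea concrete, here is a minimal sketch of tabular Q-learning on a toy version of the Glass Bridge. It is not the article's code: the environment rules (one safe panel per step, a reward of -1 for falling and +1 for crossing), the function names `train_glass_bridge` and `greedy_path`, and all hyperparameter values are assumptions chosen for illustration.

```python
import random


def train_glass_bridge(safe_panels, episodes=2000, alpha=0.5,
                       gamma=0.9, epsilon=0.2, seed=0):
    """Tabular Q-learning on a toy glass bridge (illustrative assumptions).

    safe_panels[i] is 0 (left) or 1 (right): the panel that holds at step i.
    Stepping on the other panel ends the episode with reward -1;
    clearing the last step ends it with reward +1.
    """
    rng = random.Random(seed)
    n_steps = len(safe_panels)
    # One row per bridge step, one column per action (left, right).
    q = [[0.0, 0.0] for _ in range(n_steps)]

    for _ in range(episodes):
        pos = 0
        while pos < n_steps:
            # Epsilon-greedy: explore occasionally, otherwise exploit.
            if rng.random() < epsilon:
                action = rng.randrange(2)
            else:
                action = 0 if q[pos][0] >= q[pos][1] else 1

            if action == safe_panels[pos]:
                next_pos = pos + 1
                done = next_pos == n_steps
                reward = 1.0 if done else 0.0
            else:
                next_pos = pos  # the agent falls; the episode is over
                done = True
                reward = -1.0

            # Q-learning update: move the estimate toward
            # reward + discounted value of the best next action.
            best_next = 0.0 if done else max(q[next_pos])
            q[pos][action] += alpha * (reward + gamma * best_next - q[pos][action])

            if done:
                break
            pos = next_pos
    return q


def greedy_path(q):
    """Return the panel the learned Q-table prefers at each step."""
    return [0 if row[0] >= row[1] else 1 for row in q]


if __name__ == "__main__":
    bridge = [1, 0, 1, 1, 0]  # hypothetical layout, hidden from the agent
    q_table = train_glass_bridge(bridge)
    print(greedy_path(q_table))
```

The agent never sees `safe_panels` directly. It only observes rewards, so the preferred path emerges from repeated falls and occasional successes, which is the trial-and-error loop the article starts from before moving on to deep Q-learning.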

Building Machine Learning Infrastructure at Netflix and beyond


Savin Goyal is CTO and co-founder of Outerbounds, a startup building infrastructure to help teams streamline how they build machine learning applications. Prior to starting Outerbounds, Savin and team worked at Netflix, where they were instrumental in the creation and release of Metaflow, an open source Python framework that addresses some of the challenges data scientists face around scalability and version control. The machine learning universe is really fast moving. So how can we make sure that we're not making a bet that would hinder our progress two or four years down the line? Deep learning is super popular, but tomorrow there could be a new way of doing machine learning.

Netflix's Newest No. 1 Is an Insult to Its Own Subject


It's the moment of truth on Netflix's new baking competition show Is It Cake?. The judges face a display of sneakers, all seemingly inedible, as sneakers generally are. They consult one another, after which they pronounce one of them to be made of cake. The host comes over with a large knife and lowers it onto the chosen sneaker. It sticks into the material: It's a sneaker. He moves to the judges' second guess.

A breakthrough unfolds – DeepMind: The Podcast (Season 2, Episode 1)


In late 2020, DeepMind's AI system, AlphaFold, solved a 50-year-old grand challenge in biology, known as the protein-folding problem. A headline in the journal Nature read, "It will change everything" and the President of the UK's Royal Society called it a "stunning advance [that arrived] decades before many in the field would have predicted". In this episode, Hannah uncovers the inside story of AlphaFold from the people who made it happen and finds out how it could help transform the future of healthcare and medicine. Thank you to everyone who made this season possible! Find Seasons 1 & 2 on YouTube:

Virtual Meeting: Machine Learning in Visual Effects


Autodesk's Will Harris, Foundry's Mathieu Mazerolle and Unity Technologies' Brian Gaffney will discuss how their companies are incorporating machine learning into software tools to make higher quality and more realistic visual effects and boost production speed. Visual Effects Supervisor Ryan Laney will describe the novel way artificial intelligence and machine learning were used to mask the identities of interview subjects in the award-winning HBO documentary Welcome to Chechnya. "Machine learning is poised to transform visual effects production, accelerating workflows and paving the way for a new generation of astonishingly real visual effects," says Barry Goch, who will moderate the discussion. "Will Harris, Mathieu Mazerolle and Brian Gaffney will demonstrate game-changing technologies. Ryan Laney will share his experience in applying machine learning to a real-world production."

Artificial Intelligence at Netflix - Two Current Use-Cases


Netflix launched in 1997 as a mail-based DVD rental business. Alongside the growing US DVD market in the late 1990s and early 2000s, Netflix's business grew and the company went public in 2002. Netflix posted its first profit a year later. In 2007, Netflix introduced its streaming service, and in 2013 the company began producing original content. Today, Netflix is one of the world's largest entertainment services with over 200 million paid memberships spanning 190 countries, according to the company's 2020 Annual Report.

Resolving Camera Position for a Practical Application of Gaze Estimation on Edge Devices

Most gaze estimation research assumes a setup in which the camera captures the eye gaze perfectly. These works do not explicitly specify how to set up the camera correctly for a given position of a person. In this paper, we carry out a study on gaze estimation with a logical camera setup position. We further bring our research into a practical application by using inexpensive edge devices in a realistic scenario. That is, we first set up a shopping environment in which we want to capture customers' gazing behavior. This setup needs an optimal camera position in order to maintain the estimation accuracy of existing gaze estimation research. We then apply state-of-the-art few-shot learning gaze estimation to reduce training sampling in the inference phase. In the experiment, we run our implementation on an NVIDIA Jetson TX2 and achieve a reasonable speed of 12 FPS, which is faster than our reference work, without much degradation of gaze estimation accuracy. The source code is released at

Can OTT platforms succeed with machine learning services? An insight


If you're acquainted with devices and services such as the Amazon Fire Stick, Chromecast, Spotify, YouTube, or Sling TV, you presumably already have a general understanding of what over-the-top (OTT) television is. Many television viewers are rapidly turning away from traditional programming and toward OTT services. Because this shift seems unavoidable, the majority of media firms are publicly welcoming it. But what exactly is OTT? In this article, we'll go over all there is to know about over-the-top television, beginning with its definition and rising significance in the television business.