Europe
Synthesizing the preferred inputs for neurons in neural networks via deep generator networks
Anh Nguyen, Alexey Dosovitskiy, Jason Yosinski, Thomas Brox, Jeff Clune
Deep neural networks (DNNs) have demonstrated state-of-the-art results on many pattern recognition tasks, especially vision classification problems. Understanding the inner workings of such computational brains is both fascinating basic science that is interesting in its own right--similar to why we study the human brain--and will enable researchers to further improve DNNs. One path to understanding how a neural network functions internally is to study what each of its neurons has learned to detect. One such method is called activation maximization (AM), which synthesizes an input (e.g. an image) that highly activates a neuron. Here we dramatically improve the qualitative state of the art of activation maximization by harnessing a powerful, learned prior: a deep generator network (DGN). The algorithm (1) generates qualitatively state-of-the-art synthetic images that look almost real, (2) reveals the features learned by each neuron in an interpretable way, (3) generalizes well to new datasets and somewhat well to different network architectures without requiring the prior to be relearned, and (4) can be considered as a high-quality generative method (in this case, by generating novel, creative, interesting, recognizable images).
Stochastic Three-Composite Convex Minimization
Alp Yurtsever, Bang Cong Vu, Volkan Cevher
We propose a stochastic optimization method for the minimization of the sum of three convex functions, one of which has Lipschitz continuous gradient as well as restricted strong convexity. Our approach is most suitable in the setting where it is computationally advantageous to process smooth term in the decomposition with its stochastic gradient estimate and the other two functions separately with their proximal operators, such as doubly regularized empirical risk minimization problems. We prove the convergence characterization of the proposed algorithm in expectation under the standard assumptions for the stochastic gradient estimate of the smooth term. Our method operates in the primal space and can be considered as a stochastic extension of the three-operator splitting method. Numerical evidence supports the effectiveness of our method in real-world problems.
Launch and Iterate: Reducing Prediction Churn
Mahdi Milani Fard, Quentin Cormier, Kevin Canini, Maya Gupta
Practical applications of machine learning often involve successive training iterations with changes to features and training examples. Ideally, changes in the output of any new model should only be improvements (wins) over the previous iteration, but in practice the predictions may change neutrally for many examples, resulting in extra net-zero wins and losses, referred to as unnecessary churn. These changes in the predictions are problematic for usability for some applications, and make it harder and more expensive to measure if a change is statistically significant positive. In this paper, we formulate the problem and present a stabilization operator to regularize a classifier towards a previous classifier. We use a Markov chain Monte Carlo stabilization operator to produce a model with more consistent predictions without adversely affecting accuracy. We investigate the properties of the proposal with theoretical analysis. Experiments on benchmark datasets for different classification algorithms demonstrate the method and the resulting reduction in churn.
Dual Learning for Machine Translation
Di He, Yingce Xia, Tao Qin, Liwei Wang, Nenghai Yu, Tie-Yan Liu, Wei-Ying Ma
While neural machine translation (NMT) is making good progress in the past two years, tens of millions of bilingual sentence pairs are needed for its training. However, human labeling is very costly. To tackle this training data bottleneck, we develop a dual-learning mechanism, which can enable an NMT system to automatically learn from unlabeled data through a dual-learning game. This mechanism is inspired by the following observation: any machine translation task has a dual task, e.g., English-to-French translation (primal) versus French-to-English translation (dual); the primal and dual tasks can form a closed loop, and generate informative feedback signals to train the translation models, even if without the involvement of a human labeler. In the dual-learning mechanism, we use one agent to represent the model for the primal task and the other agent to represent the model for the dual task, then ask them to teach each other through a reinforcement learning process. Based on the feedback signals generated during this process (e.g., the languagemodel likelihood of the output of a model, and the reconstruction error of the original sentence after the primal and dual translations), we can iteratively update the two models until convergence (e.g., using the policy gradient methods). We call the corresponding approach to neural machine translation dual-NMT. Experiments show that dual-NMT works very well on English French translation; especially, by learning from monolingual data (with 10% bilingual data for warm start), it achieves a comparable accuracy to NMT trained from the full bilingual data for the French-to-English translation task.
Tractable Operations for Arithmetic Circuits of Probabilistic Models
Yujia Shen, Arthur Choi, Adnan Darwiche
We consider tractable representations of probability distributions and the polytime operations they support. In particular, we consider a recently proposed arithmetic circuit representation, the Probabilistic Sentential Decision Diagram (PSDD). We show that PSDDs support a polytime multiplication operator, while they do not support a polytime operator for summing-out variables. A polytime multiplication operator makes PSDDs suitable for a broader class of applications compared to classes of arithmetic circuits that do not support multiplication. As one example, we show that PSDD multiplication leads to a very simple but effective compilation algorithm for probabilistic graphical models: represent each model factor as a PSDD, and then multiply them.
Hassan Took a Bike Ride. Now He's One of the Thousands Missing in Gaza
In a place denied access to basic forensic technology--and where people disappear into Israeli detention--the fate of thousands remains unknown. One of them is an autistic teenager. In the early morning dark, Abeer Skaik turned to her husband, Ali Al-Qatta, and said that today would be the day they would find their son. Ali nodded in silence, and she handed him the stack of flyers. Each bore a photograph of 16-year-old Hassan smiling widely, his shoulders loose, wearing a plain red T-shirt. He is looking directly at the camera, unguarded. On top of the page, in large letters, Abeer had written a single word in bold red ink: --an appeal. Abeer watched as Ali stepped into a car with a few close friends and drove away. They started the 30-kilometer trip south, from al-Tuffah, east of Gaza City, to the European Hospital in Khan Younis. They had heard that a group of people detained by Israel, including children, would be released there. The gate was already crowded. Families stood shoulder to shoulder, wrapped in blankets against the cold, clutching photographs and ID cards. Ali distributed the flyers among his friends. When the buses of released detainees arrived, he and the others moved slowly through the narrow gaps between clusters of people. Some of those who had just been released were being pulled into embraces. Ali waited at the edge of each reunion. "Have you seen my son?" he asked. One after another, people shook their heads.
Don't Listen to Anyone Who Thinks Secession Will Solve Anything
Don't Listen to Anyone Who Thinks Secession Will Solve Anything Americans increasingly fantasize about a divorce between red and blue states--but they dread the thought of civil war. You can't have one without the other. It's become almost like a histamine response: After a shocking national event like the assassination of Charlie Kirk, or Donald Trump's deployment of the military to Los Angeles last June, mentions of the term " civil war " and calls for secession surge online. This kind of talk flared again in January, when two citizens were shot and killed by immigration agents on the streets of Minneapolis, and governor Tim Walz mobilized the Minnesota National Guard to be ready to support local law enforcement. "I mean, is this a Fort Sumter?" Walz said in an interview with The Atlantic, invoking the battle that sparked the Civil War.
Meet the Gods of AI Warfare
In its early days, the AI initiative known as Project Maven had its fair share of skeptics at the Pentagon. Today, many of them are true believers. The rise of AI warfare speaks to the biggest moral and practical question there is: Who--or what--gets to decide to take a human life? And who bears that cost? In 2018, more than 3,000 Google workers protested the company's involvement in "the business of war" after finding out the company was part of Project Maven, then a nascent Pentagon effort to use computer vision to rifle through copious video footage taken in America's overseas drone wars. They feared Project Maven's AI could one day be used for lethal targeting. In my yearslong effort to uncover the full story of Project Maven for my book,, I learned that is exactly what happened, and that the undertaking was just as controversial inside the Pentagon. Today, the tool known as Maven Smart System is being used in US operations against Iran . How the US military's top brass moved from skepticism about the use of AI in war to true believers has a lot to do with a Marine colonel named Drew Cukor. In early September 2024, during the cocktail hour at a private retreat for tech investors and defense leaders, Vice Admiral Frank "Trey" Whitworth found his way to Drew Cukor. Now Project Maven's founding leader and his skeptical successor were standing face-to-face. Three years earlier, Whitworth had been the Pentagon's top military official for intelligence, advising the chairman of the Joint Chiefs of Staff and running one of the most sensitive and potentially lethal parts of any military process: targeting.