WATCH: Wall-climbing robot swarms crawl US Navy warships as China's fleet surges

FOX News

Navy robots from Gecko Robotics will inspect U.S. warships in a $71 million effort to reduce maintenance delays, as only 60% of the fleet remains operational amid China's naval expansion.


How to Set Up Your Own NAS Server for Backups and Content Streaming

WIRED

The app reads your email inbox and your meeting calendar, then gives you a short audio summary. It can help you spend less time scrolling, but of course, there are privacy drawbacks to consider.


SEN WICKER: Ending China's drone dominance with a made-in-America revival

FOX News

America's drone industry lags behind China's, but Congress, backed by Trump's $2.5 billion investment, aims to rebuild U.S. military and commercial drone production by 2027.


Blind Regression: Nonparametric Regression for Latent Variable Models via Collaborative Filtering

Neural Information Processing Systems

We introduce the framework of {\em blind regression} motivated by {\em matrix completion} for recommendation systems: given $m$ users, $n$ movies, and a subset of user-movie ratings, the goal is to predict the unobserved user-movie ratings given the data, i.e., to complete the partially observed matrix. Following the framework of non-parametric statistics, we posit that user $u$ and movie $i$ have features $x_1(u)$ and $x_2(i)$ respectively, and their corresponding rating $y(u,i)$ is a noisy measurement of $f(x_1(u), x_2(i))$ for some unknown function $f$. In contrast with classical regression, the features $x = (x_1(u), x_2(i))$ are not observed, making it challenging to apply standard regression methods to predict the unobserved ratings. Inspired by the classical Taylor's expansion for differentiable functions, we provide a prediction algorithm that is consistent for all Lipschitz functions. In fact, the analysis through our framework naturally leads to a variant of collaborative filtering, shedding insight into the widespread success of collaborative filtering in practice. Assuming each entry is sampled independently with probability at least $\max(m^{-1+\delta},n^{-1/2+\delta})$ with $\delta > 0$, we prove that the expected fraction of our estimates with error greater than $\epsilon$ is less than $\gamma^2 / \epsilon^2$ plus a polynomially decaying term, where $\gamma^2$ is the variance of the additive entry-wise noise term. Experiments with the MovieLens and Netflix datasets suggest that our algorithm provides principled improvements over basic collaborative filtering and is competitive with matrix factorization methods.
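The abstract's Taylor-expansion idea can be made concrete: for an additive anchor user $v$ and anchor movie $j$ whose ratings are observed, the first-order expansion suggests estimating $y(u,i)$ as $y(u,j) + y(v,i) - y(v,j)$, averaged over valid anchors. The sketch below is a hedged, illustrative reading of that idea (function name and the simple uniform averaging are my own choices, not the paper's exact estimator, which weights anchors by empirical variance):

```python
import numpy as np

def blind_regression_predict(R, M, u, i):
    """Estimate the unobserved rating R[u, i] from observed entries only.

    R : (m, n) rating matrix; M : boolean mask of observed entries.
    First-order Taylor sketch: for anchor user v and anchor movie j with
    the needed entries observed,
        f(x1(u), x2(i)) ~ R[u, j] + R[v, i] - R[v, j],
    and we simply average this estimate over all valid (v, j) anchors.
    """
    m, n = R.shape
    estimates = []
    for v in range(m):
        if v == u or not M[v, i]:
            continue
        for j in range(n):
            if j == i or not M[u, j] or not M[v, j]:
                continue
            estimates.append(R[u, j] + R[v, i] - R[v, j])
    return float(np.mean(estimates)) if estimates else float("nan")
```

When $f$ is exactly additive in the latent features, every anchor gives the same (exact) estimate; for general Lipschitz $f$, nearby anchors give approximately correct estimates, which is the intuition behind the consistency result.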


A photo of Iran's bombed schoolgirl graveyard went around the world. Was it real, or AI?

The Guardian

Graves being prepared for the victims of an airstrike on a school in Minab in southern Iran, 2 March 2026. The graves, freshly dug, lie in neat rows of 20 across. More than 60 have already been carved out of the earth, with a few clusters of people gathered around them.


India's outsourcing industry is worth $300bn. Can it survive AI?

BBC News

Indian technology stocks have seen an unprecedented rout over the past few weeks amid fears of artificial intelligence upending the traditional outsourcing model that powers the country's $300bn (£223bn) back-office industry. The sell-off - part of a global correction in traditional software and IT stocks - preceded the market nervousness caused by recent geopolitical uncertainty, and is particularly significant for India. Over the past three-and-a-half decades, India's software industry has created millions of white-collar jobs, spawning a new middle class driven by high ambition and strong purchasing power. This, in turn, has fuelled demand for apartments, cars and restaurants across top-tier cities such as Bengaluru, Hyderabad and Gurugram.


DeepExposure: Learning to Expose Photos with Asynchronously Reinforced Adversarial Learning

Neural Information Processing Systems

Accurate exposure is key to capturing high-quality photos in computational photography, especially on mobile phones, which are limited by the size of their camera modules. Inspired by the luminosity masks often applied by professional photographers, in this paper we develop a novel algorithm for learning local exposures with deep reinforcement adversarial learning. Specifically, we segment an image into sub-images that reflect variations in dynamic-range exposure according to raw low-level features. Based on these sub-images, a local exposure for each sub-image is learned sequentially by a policy network, while the learning reward is designed globally to balance overall exposure. The aesthetic evaluation function is approximated by the discriminator of a generative adversarial network. The reinforcement learning and the adversarial learning are trained collaboratively via asynchronous deterministic policy gradients and generative loss approximation. To further simplify the algorithmic architecture, we also prove the feasibility of using the discriminator as the value function. Furthermore, we employ each local exposure to retouch the raw input image, delivering multiple retouched images under different exposures which are then fused with exposure blending. Extensive experiments verify that our algorithm is superior to state-of-the-art methods in both quantitative accuracy and visual quality.
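The final stage described above, retouching the image under several exposures and fusing the results, can be sketched with a simple exposure-fusion heuristic. This is a minimal illustration, not the paper's method: the mid-tone-preference weights below stand in for the learned policy and discriminator, and the function names are my own:

```python
import numpy as np

def apply_exposure(img, ev):
    # Apply a gain of 2**ev (ev in photographic stops) in linear space,
    # clipping to the valid [0, 1] range.
    return np.clip(img * (2.0 ** ev), 0.0, 1.0)

def exposure_blend(img, evs, sigma=0.2):
    """Fuse several exposure-adjusted renders of `img` (values in [0, 1]).

    Hedged sketch: each render is weighted per pixel by a Gaussian
    preference for mid-tones (a common exposure-fusion heuristic),
    standing in for the learned local-exposure policy of the paper.
    """
    renders = [apply_exposure(img, ev) for ev in evs]
    weights = [np.exp(-((r - 0.5) ** 2) / (2 * sigma ** 2)) + 1e-8
               for r in renders]
    wsum = np.sum(weights, axis=0)
    fused = np.sum([w * r for w, r in zip(weights, renders)], axis=0) / wsum
    return fused
```

The division by the per-pixel weight sum keeps the fused result a convex combination of the renders, so it stays within the valid intensity range.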


SING: Symbol-to-Instrument Neural Generator

Neural Information Processing Systems

Recent progress in deep learning for audio synthesis opens the way to models that directly produce the waveform, shifting away from the traditional paradigm of relying on vocoders or MIDI synthesizers for speech or music generation. Despite their successes, current state-of-the-art neural audio synthesizers such as WaveNet and SampleRNN suffer from prohibitive training and inference times because they are based on autoregressive models that generate audio samples one at a time at a rate of 16 kHz. In this work, we study the more computationally efficient alternative of generating the waveform frame-by-frame with large strides. We present a lightweight neural audio synthesizer for the original task of generating musical notes given a desired instrument, pitch and velocity. Our model is trained end-to-end to generate notes from nearly 1000 instruments with a single decoder, thanks to a new loss function that minimizes the distance between the log spectrograms of the generated and target waveforms. On the generalization task of synthesizing notes for pairs of pitch and instrument not seen during training, SING produces audio with significantly improved perceptual quality compared to a state-of-the-art autoencoder based on WaveNet, as measured by a Mean Opinion Score (MOS), and is about 32 times faster for training and 2,500 times faster for inference.
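The loss described above, a distance between log spectrograms of the generated and target waveforms, can be sketched as follows. This is an illustrative NumPy version under my own assumptions (Hann window, frame size, hop, and an L1 distance; the paper's training uses a differentiable STFT inside the network):

```python
import numpy as np

def log_spectrogram(x, n_fft=256, hop=64, eps=1e-6):
    # Magnitude STFT via a sliding Hann window, then log compression;
    # eps avoids log(0) on silent frames.
    win = np.hanning(n_fft)
    frames = [x[s:s + n_fft] * win
              for s in range(0, len(x) - n_fft + 1, hop)]
    mag = np.abs(np.fft.rfft(np.stack(frames), axis=-1))
    return np.log(mag + eps)

def spectral_loss(generated, target):
    """Mean L1 distance between log spectrograms of two waveforms."""
    return float(np.mean(np.abs(log_spectrogram(generated)
                                - log_spectrogram(target))))
```

Comparing log magnitudes rather than raw waveforms makes the loss insensitive to phase, which is one reason such spectral losses work well for frame-by-frame generation with large strides.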