

Greedy and Random Quasi-Newton Methods with Faster Explicit Superlinear Convergence

Neural Information Processing Systems

In this paper, we follow the work of Rodomanov and Nesterov [19] to study quasi-Newton methods. We focus on the common SR1 and BFGS quasi-Newton methods to establish better explicit (local) superlinear convergence rates. First, based on the greedy quasi-Newton update, which greedily selects the direction that maximizes a certain measure of progress, we improve the convergence rate to a condition-number-free superlinear convergence rate. Second, based on the random quasi-Newton update, which selects the direction randomly from a spherically symmetric distribution, we show the same superlinear convergence rate as above. Our analysis is closely related to the approximation of a given Hessian matrix, unconstrained quadratic objectives, and general strongly convex, smooth, and strongly self-concordant functions.
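The abstract does not give the update formulas, so the following is only a minimal sketch of the Hessian-approximation setting it mentions, under stated assumptions: it uses the textbook SR1 update, assumes the greedy rule picks the coordinate direction with the largest diagonal entry of G - A, and draws random directions from a standard Gaussian as one example of a spherically symmetric distribution. The function names and the initialization G = L * I with L the largest eigenvalue of A are illustrative choices, not taken from the paper.

```python
import numpy as np


def sr1_update(G, A, u):
    """Textbook SR1 update of the approximation G toward A along u."""
    r = (G - A) @ u
    denom = u @ r
    # Skip the update when the curvature gap along u is numerically zero.
    if denom <= 1e-12 * max(1.0, np.linalg.norm(G - A)):
        return G
    return G - np.outer(r, r) / denom


def greedy_direction(G, A):
    """Assumed greedy rule: coordinate vector with the largest diag(G - A)."""
    i = int(np.argmax(np.diag(G - A)))
    e = np.zeros(G.shape[0])
    e[i] = 1.0
    return e


def random_direction(n, rng):
    """Standard Gaussian draw, one spherically symmetric choice."""
    return rng.standard_normal(n)


def approximate_hessian(A, steps, mode="greedy", seed=0):
    """Run `steps` SR1 updates from G = L * I and record trace(G - A)."""
    rng = np.random.default_rng(seed)
    n = A.shape[0]
    L = np.linalg.eigvalsh(A).max()  # assumption: start from G >= A
    G = L * np.eye(n)
    errors = []
    for _ in range(steps):
        if mode == "greedy":
            u = greedy_direction(G, A)
        else:
            u = random_direction(n, rng)
        G = sr1_update(G, A, u)
        errors.append(float(np.trace(G - A)))
    return G, errors


if __name__ == "__main__":
    rng = np.random.default_rng(1)
    M = rng.standard_normal((5, 5))
    A = M @ M.T + np.eye(5)  # a small symmetric positive definite target
    for mode in ("greedy", "random"):
        G, errors = approximate_hessian(A, steps=5, mode=mode)
        print(mode, errors)
```

Each SR1 step with a positive curvature gap lowers the rank of G - A by one, so in this small example the recorded trace error should shrink to roughly zero within n steps. The explicit rates claimed in the abstract concern the more general strongly convex setting and are not reproduced by this sketch.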




344ef5151be171062f42f03e69663ecf-Paper.pdf

Neural Information Processing Systems

Neural Transducer (e.g., RNN-T) has been widely used in automatic speech recognition (ASR) due to its ability to efficiently model monotonic alignments between input and output sequences and to naturally support streaming inputs. Considering that monotonic alignments are also critical to text to speech (TTS) synthesis and that streaming TTS is also an important application scenario, in this work we explore the possibility of applying Transducer to TTS and more. However, this is challenging because it is difficult to trade off the emission (continuous mel-spectrogram prediction) probability and the transition (ASR Transducer predicts a blank token to indicate transition to the next input) probability when calculating the output probability lattice in Transducer, and it is not easy to learn the alignments between text and speech through the output probability lattice. We propose SpeechTransducer (Speech-T for short), a Transformer-based Transducer model that 1) uses a new forward algorithm to separate the transition prediction from the continuous mel-spectrogram prediction when calculating the output probability lattice, and uses a diagonal constraint in the probability lattice to help the alignment learning; 2) supports both full-sentence and streaming TTS by adjusting the look-ahead context; and 3) further supports both TTS and ASR together for the first time, which brings several advantages, including fewer parameters as well as streaming synthesis and recognition in a single model. Experiments on the LJSpeech dataset demonstrate that Speech-T 1) is more robust than the attention-based autoregressive TTS model due to its inherent monotonic alignments between text and speech; 2) naturally supports streaming TTS with good voice quality; and 3) benefits from jointly modeling TTS and ASR in a single network.


Discord Sleuths Gained Unauthorized Access to Anthropic's Mythos

WIRED

Plus: Spy firms tap into a global telecom weakness to track targets, 500,000 UK health records go up for sale on Alibaba, Apple patches a revealing notification bug, and more. As researchers and practitioners debate the impact that new AI models will have on cybersecurity, Mozilla said on Tuesday it used early access to Anthropic's Mythos Preview to find and fix 271 vulnerabilities in its new Firefox 150 browser release. Meanwhile, researchers identified a group of moderately successful North Korean hackers using AI for everything from vibe coding malware to creating fake company websites--stealing up to $12 million in three months. Researchers have finally cracked disruptive malware known as Fast16 that predates Stuxnet and may have been used to target Iran's nuclear program. It was created in 2005 and was likely deployed by the US or an ally.





RAF jets scrambled after Russian drones detected near Nato airspace

BBC News

At least seven people were killed in Russian strikes across Ukraine overnight, including five in the central city of Dnipro, where officials said an apartment building was hit. Ukrainian President Volodymyr Zelensky said the latest attack lasted practically all night, while rescue workers were still searching for survivors under rubble in Dnipro on Saturday morning. British jets were scrambled from Romania during the heavy attack when Russian drones were detected near the border, though the UK Ministry of Defence rejected a report it had shot some down. Meanwhile, Ukraine carried out some of its longest-distance drone strikes deep inside Russian territory. In Yekaterinburg, almost 1,000 miles (1,600km) from Ukraine's border, the governor said six people were injured when a building was struck - while in nearby Chelyabinsk, a local leader said drones targeting an industrial facility were shot down.


With A.I., Anyone Can Be an Influencer

The New Yorker

TikTok and Instagram made it easy to monetize the physical self. Now the social-media-savvy can use A.I. to play with their identity, or overhaul it entirely. A few months ago, a forty-five-year-old homemaker living in Georgia, whom I'll call Robin, started playing around with an A.I. image generator. Growing up, Robin had loved reading; she dabbled in writing, too, but after her first child was born, the habit faded. A.I. offered something different--a kind of world-building that allowed her to project herself into places and situations she'd never inhabited.