
Panchromatic and Multispectral Image Fusion via Alternating Reverse Filtering Network (Supplementary Materials)

Neural Information Processing Systems

The best results are highlighted in bold. Our alternating reverse filtering network outperforms the other state-of-the-art methods on every index. Images in the last row are the MSE residues between the fused results and the ground truth. As the MSE maps show, our model introduces smaller spatial and spectral distortions than the competing methods.
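A residue map of the kind mentioned above can be sketched as follows. This is a minimal illustration, assuming the map is the squared error at each pixel averaged over spectral bands; the excerpt does not state the exact definition, and the function name `mse_residual_map` and the `[band][row][col]` layout are hypothetical.

```python
def mse_residual_map(fused, reference):
    """Per-pixel squared error averaged over spectral bands.

    Both inputs are nested lists indexed as [band][row][col].
    Assumption: the averaging is over bands; the source does not say.
    """
    bands = len(reference)
    rows = len(reference[0])
    cols = len(reference[0][0])
    residual = [[0.0] * cols for _ in range(rows)]
    for b in range(bands):
        for r in range(rows):
            for c in range(cols):
                diff = fused[b][r][c] - reference[b][r][c]
                residual[r][c] += diff * diff / bands
    return residual


# Toy example: two bands, one row, two pixels.
fused = [[[1.0, 2.0]], [[3.0, 4.0]]]
reference = [[[1.0, 0.0]], [[3.0, 2.0]]]
residual = mse_residual_map(fused, reference)
```

A darker (smaller-valued) map indicates a fused image closer to the ground truth at that pixel.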


ColdGANs: Taming Language GANs with Cautious Sampling Strategies Thomas Scialom, Paul-Alexis Dray

Neural Information Processing Systems

Training regimes based on Maximum Likelihood Estimation (MLE) suffer from known limitations, often leading to poorly generated text sequences. At the root of these limitations is the mismatch between training and inference, i.e. the so-called exposure bias, exacerbated by considering only the reference texts as correct, while in practice several alternative formulations could be as good. Generative Adversarial Networks (GANs) can mitigate those limitations but the discrete nature of text has hindered their application to language generation: the approaches proposed so far, based on Reinforcement Learning, have been shown to underperform MLE. Departing from previous works, we analyze the exploration step in GANs applied to text generation, and show how classical sampling results in unstable training. We propose to consider alternative exploration strategies in a GAN framework that we name ColdGANs, where we force the sampling to be close to the distribution modes to get smoother learning dynamics. For the first time, to the best of our knowledge, the proposed language GANs compare favorably to MLE, and obtain improvements over the state-of-the-art on three generative tasks, namely unconditional text generation, question generation, and abstractive summarization.
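One common way to keep sampling close to the distribution modes is to divide the logits by a temperature below 1 before the softmax. The sketch below illustrates that general idea only; treating it as the sampling strategy used in ColdGANs is an assumption, and the function names are hypothetical.

```python
import math
import random


def softmax_with_temperature(logits, temperature=1.0):
    """Softmax over logits scaled by 1/temperature.

    temperature < 1 sharpens the distribution toward its modes;
    temperature = 1 recovers the ordinary softmax.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample_token(logits, temperature=1.0, rng=random):
    """Draw one token index from the temperature-scaled distribution."""
    probs = softmax_with_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


logits = [2.0, 1.0, 0.1]
standard = softmax_with_temperature(logits, temperature=1.0)
cold = softmax_with_temperature(logits, temperature=0.3)
```

With the lower temperature, more probability mass sits on the highest-scoring token, so sampled sequences stray less often into low-probability regions.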


Who's to Blame When AI Agents Screw Up?

WIRED

Over the past year, veteran software engineer Jay Prakash Thakur has spent his nights and weekends prototyping AI agents that could, in the near future, order meals and engineer mobile apps almost entirely on their own. His agents, while surprisingly capable, have also exposed new legal questions that await companies trying to capitalize on Silicon Valley's hottest new technology. Agents are AI programs that can act mostly independently, allowing companies to automate tasks such as answering customer questions or paying invoices. While ChatGPT and similar chatbots can draft emails or analyze bills upon request, Microsoft and other tech giants expect that agents will tackle more complex functions and, most importantly, do it with little human oversight. The tech industry's most ambitious plans involve multi-agent systems, with dozens of agents someday teaming up to replace entire workforces.


C2FAR: Coarse-to-Fine Autoregressive Networks for Precise Probabilistic Forecasting

Neural Information Processing Systems

C2FAR generates a hierarchical, coarse-to-fine discretization of a variable autoregressively; progressively finer intervals of support are generated from a sequence of binned distributions, where each distribution is conditioned on previously-generated coarser intervals. Unlike prior (flat) binned distributions, C2FAR can represent values with exponentially higher precision, for only a linear increase in complexity. We use C2FAR for probabilistic forecasting via a recurrent neural network, thus modeling time series autoregressively in both space and time. C2FAR is the first method to simultaneously handle discrete and continuous series of arbitrary scale and distribution shape. This flexibility enables a variety of time series use cases, including anomaly detection, interpolation, and compression. C2FAR achieves improvements over the state-of-the-art on several benchmark forecasting datasets.
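The precision claim can be made concrete with a small sketch of hierarchical binning. It assumes equal-width bins on a fixed range and shows only the encoding of a value into coarse-to-fine indices; the learned, conditional distributions over those indices that C2FAR models are not shown, and the function names are hypothetical.

```python
def encode(value, low, high, bins, levels):
    """Return one bin index per level, from coarsest to finest."""
    indices = []
    for _ in range(levels):
        width = (high - low) / bins
        idx = min(int((value - low) / width), bins - 1)
        indices.append(idx)
        # Narrow the support to the chosen bin before the next level.
        low = low + idx * width
        high = low + width
    return indices


def decode(indices, low, high, bins):
    """Return the (low, high) interval identified by the index sequence."""
    for idx in indices:
        width = (high - low) / bins
        low = low + idx * width
        high = low + width
    return low, high


# Three levels of 4 bins: 3 * 4 = 12 categories to predict in total,
# but 4 ** 3 = 64 distinguishable intervals at the finest level.
indices = encode(0.7, 0.0, 1.0, bins=4, levels=3)
interval = decode(indices, 0.0, 1.0, bins=4)
```

Here `indices` is `[2, 3, 0]` and `interval` is `(0.6875, 0.703125)`, which contains 0.7. Adding a level adds only `bins` more categories while dividing the interval width by `bins` again.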


Interview with Gillian Hadfield: Normative infrastructure for AI alignment

AIHub

During the 33rd International Joint Conference on Artificial Intelligence (IJCAI), held in Jeju, I had the opportunity to meet with one of the keynote speakers, Gillian Hadfield. We spoke about her interdisciplinary research, career trajectory, path into AI alignment, law, and general thoughts on AI systems.

Transcript: Note: the transcript has been lightly edited for clarity. This is an interview with Professor Gillian Hadfield, who was a keynote speaker at IJCAI 2024. She gave a very insightful talk about normative infrastructures and how they can guide our search for AI alignment.

Kumar Kshitij Patel (KKP): Could you talk a bit about your background and career trajectory? I want our readers to understand how much interdisciplinary work you've done over the years.

Gillian Hadfield (GH): I did a PhD in economics and a law degree, a JD, at Stanford, originally motivated by wanting to think about the big questions about the world. So I read John Rawls' theory of justice when I was an undergraduate, and those are the big questions: how do we organize the world and just institutions? But I was very interested in using more formal methods and social scientific approaches. That's why I decided to do that joint degree. So, this is in the 1980s, in the early days of starting to use a lot of game theory. I studied information theory as a student of Ken Arrow and Paul Milgrom at the economics department at Stanford. I did work on contract theory and bargaining theory, but I was still very interested in going to law school, not to practice law, but to learn about legal institutions and how those work. I was a member of this emerging area of law and economics early in my career, which, of course, was interdisciplinary, using economics to think about law and legal institutions.


Wheeled, rugged robot dog built for extreme industrial missions

FOX News

The machine is designed to inspect industrial sites, respond to disasters, carry out logistics operations and support scientific research. Deep Robotics, a company from China, has unveiled a durable four-legged robot built to operate in extreme environments that humans struggle to traverse. It's called the Lynx M20, and it builds upon the agility of its predecessor, the Lynx robot dog. This versatile machine is designed to handle anything from inspecting industrial sites and responding to disasters to carrying out logistics operations and supporting scientific research. Here's what you need to know.


AI Melania: First lady embarks on 'new frontier' in publishing with audiobook of memoir

FOX News

EXCLUSIVE: First lady Melania Trump is launching an audiobook of her memoir using artificial intelligence (AI) audio technology in multiple languages, Fox News Digital has learned. The first lady released her first memoir, "Melania," last year. This week, she is breaking new ground by releasing "Melania, the Audiobook," which has been "created entirely" with AI. "I am proud to be at the forefront of publishing's new frontier – the intersection of artificial intelligence technology and audio," Trump told Fox News Digital. The first lady said ElevenLabs AI developed "an AI-generated replica of my voice under strict supervision, which will establish an unforgettable connection with my personal story, in multiple languages for listeners worldwide." ElevenLabs AI CEO Mati Staniszewski told Fox News Digital that they are "excited that Melania Trump trusted our technology to power this first-of-its-kind audiobook project."



ICNet: Intra-saliency Correlation Network for Co-Saliency Detection

Neural Information Processing Systems

Model-based methods produce coarse Co-SOD results due to hand-crafted intra- and inter-saliency features. Current data-driven models exploit inter-saliency cues, but undervalue the potential power of intra-saliency cues. In this paper, we propose an Intra-saliency Correlation Network (ICNet) to extract intra-saliency cues from the single image saliency maps (SISMs) predicted by any off-the-shelf SOD method, and obtain inter-saliency cues by correlation techniques. Specifically, we adopt normalized masked average pooling (NMAP) to extract latent intra-saliency categories from the SISMs and semantic features as intra cues. Then we employ a correlation fusion module (CFM) to obtain inter cues by exploiting correlations between the intra cues and single-image features. To improve Co-SOD performance, we propose a category-independent rearranged self-correlation feature (RSCF) strategy. Experiments on three benchmarks show that our ICNet outperforms previous state-of-the-art methods on Co-SOD.
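The pooling step can be illustrated with a rough sketch. It assumes that masked average pooling weights each feature channel by the saliency map and divides by the total saliency, and that "normalized" refers to L2-normalizing the pooled vector. The abstract does not spell out either detail, so both are assumptions, and the function names are hypothetical.

```python
import math


def masked_average_pooling(features, mask, eps=1e-8):
    """Saliency-weighted average of each feature channel.

    features: list of C channels, each an H x W nested list.
    mask: H x W nested list of saliency values in [0, 1].
    """
    mask_sum = sum(sum(row) for row in mask) + eps
    pooled = []
    for channel in features:
        weighted = sum(
            f * m
            for feat_row, mask_row in zip(channel, mask)
            for f, m in zip(feat_row, mask_row)
        )
        pooled.append(weighted / mask_sum)
    return pooled


def l2_normalize(vector, eps=1e-8):
    """Scale a vector to unit length (assumed normalization step)."""
    norm = math.sqrt(sum(v * v for v in vector)) + eps
    return [v / norm for v in vector]


# Toy input: 2 channels on a 2 x 2 grid, only the bottom-right pixel salient.
features = [[[1.0, 2.0], [3.0, 4.0]], [[0.0, 0.0], [0.0, 8.0]]]
mask = [[0.0, 0.0], [0.0, 1.0]]
intra_cue = l2_normalize(masked_average_pooling(features, mask))
```

The resulting vector summarizes what the salient region looks like in feature space, which is the kind of intra cue that can then be correlated against other images' features.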