A Locally Adaptive Normal Distribution

Neural Information Processing Systems

The underlying metric is, however, non-parametric. We develop a maximum likelihood algorithm to infer the distribution parameters that relies on a combination of gradient descent and Monte Carlo integration. We further extend the LAND to mixture models, and provide the corresponding EM algorithm.


RECODE: Reasoning Through Code Generation for Visual Question Answering

arXiv.org Artificial Intelligence

Multimodal Large Language Models (MLLMs) struggle with precise reasoning for structured visuals like charts and diagrams, as pixel-based perception lacks a mechanism for verification. To address this, we propose to leverage derendering -- the process of reverse-engineering visuals into executable code -- as a new modality for verifiable visual reasoning. Specifically, we propose RECODE, an agentic framework that first generates multiple candidate programs to reproduce the input image. It then uses a critic to select the most faithful reconstruction and iteratively refines the code. This process not only transforms an ambiguous perceptual task into a verifiable, symbolic problem, but also enables precise calculations and logical inferences later on. On various visual reasoning benchmarks such as CharXiv, ChartQA, and Geometry3K, RECODE significantly outperforms methods that do not leverage code or only use code for drawing auxiliary lines or cropping. Our work demonstrates that grounding visual perception in executable code provides a new path toward more accurate and verifiable multimodal reasoning.


High-dimensional Analysis of Synthetic Data Selection

arXiv.org Machine Learning

Despite the progress in the development of generative models, their usefulness in creating synthetic data that improve prediction performance of classifiers has been put into question. Besides heuristic principles such as "synthetic data should be close to the real data distribution", it is actually not clear which specific properties affect the generalization error. Our paper addresses this question through the lens of high-dimensional regression. Theoretically, we show that, for linear models, the covariance shift between the target distribution and the distribution of the synthetic data affects the generalization error but, surprisingly, the mean shift does not. Furthermore we prove that, in some settings, matching the covariance of the target distribution is optimal. Remarkably, the theoretical insights from linear models carry over to deep neural networks and generative models. We empirically demonstrate that the covariance matching procedure (matching the covariance of the synthetic data with that of the data coming from the target distribution) performs well against several recent approaches for synthetic data selection, across training paradigms, architectures, datasets and generative models used for augmentation.
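The covariance matching idea in this abstract can be illustrated with a small sketch. The abstract does not describe the paper's actual selection procedure, so everything here is an assumption made for illustration: the helper names (`covariance`, `frobenius_gap`, `select_by_covariance`), the use of the Frobenius norm as the distance, and the toy data.

```python
def covariance(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n = len(rows)
    d = len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    return [
        [
            sum((r[i] - means[i]) * (r[j] - means[j]) for r in rows) / (n - 1)
            for j in range(d)
        ]
        for i in range(d)
    ]


def frobenius_gap(a, b):
    """Frobenius norm of the difference between two same-shaped matrices."""
    return sum(
        (a[i][j] - b[i][j]) ** 2 for i in range(len(a)) for j in range(len(a[0]))
    ) ** 0.5


def select_by_covariance(target_rows, candidate_pools):
    """Pick the candidate pool whose covariance is closest to the target's.

    Hypothetical selection rule: smallest Frobenius distance wins.
    """
    target_cov = covariance(target_rows)
    gaps = {
        name: frobenius_gap(covariance(rows), target_cov)
        for name, rows in candidate_pools.items()
    }
    best = min(gaps, key=gaps.get)
    return best, gaps


# Toy data (made up): one pool only shifts the mean, the other stretches x.
target = [(1.0, 1.0), (2.0, 3.0), (3.0, 2.0), (4.0, 4.0)]
pools = {
    "shifted_mean": [(x + 10.0, y + 10.0) for x, y in target],
    "stretched": [(3.0 * x, y) for x, y in target],
}

best, gaps = select_by_covariance(target, pools)
print(best, gaps)
```

In this toy setup the mean-shifted pool has the same covariance as the target, so it is selected even though its mean is far away. That mirrors the abstract's claim that covariance shift, not mean shift, is what matters for linear models.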


US investigators are using AI to detect child abuse images made by AI

MIT Technology Review

Though artificial intelligence is fueling a surge in synthetic child abuse images, it's also being tested as a way to stop harm to real victims. Generative AI has enabled the production of child sexual abuse images to skyrocket. Now the leading investigator of child exploitation in the US is experimenting with using AI to distinguish AI-generated images from material depicting real victims, according to a new government filing. The Department of Homeland Security's Cyber Crimes Center, which investigates child exploitation across international borders, has awarded a $150,000 contract to San Francisco-based Hive AI for its software, which can identify whether a piece of content was AI-generated. The filing, posted on September 19, is heavily redacted, and Hive cofounder and CEO Kevin Guo told MIT Technology Review that he could not discuss the details of the contract, but confirmed it involves use of the company's AI detection algorithms for child sexual abuse material (CSAM). The filing quotes data from the National Center for Missing and Exploited Children that reported a 1,325% increase in incidents involving generative AI in 2024.


Review for NeurIPS paper: A Contour Stochastic Gradient Langevin Dynamics Algorithm for Simulations of Multi-modal Distributions

Neural Information Processing Systems

My main concern is whether using a flattened surrogate energy in this fashion is suitable for most sampling situations. The main reason is that, by construction, our iterates are not following the true distribution particularly closely; for example, a plot of the samples obtained in the synthetic experiments (figs 2c--d) would look quite different from the original. While this does allow the algorithm to bounce out of local optima, the deviation from the true energy would make samples obtained after convergence less useful. For point estimation situations, we might be able to get away with these samples in cases where the multiple modes of the real energy are roughly symmetric (as in the synthetic Gaussian experiments); it seems that even if we use a 'flattened' energy (which can be thought of as lower peaks with higher elevation between them), the original distribution's symmetry would be essentially preserved and the mean / other point estimates would be close enough. But flattening energies with a skewed distribution of modes might not be as accurate, as the flattened version might have a mean closer to the 'center' of the space, while the original mean would be closer to one of the modes near the periphery (am visualizing a simple 2-d space).
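The concern about skewed modes can be checked numerically with a toy sketch. This is not the algorithm under review: as a stand-in for its flattened surrogate energy, the sketch uses simple tempering, p(x)^(1/T), and works in 1-d rather than the 2-d space the reviewer imagines. The mixture weights, mode locations, temperature, and all function names are assumptions chosen for illustration.

```python
import math


def gaussian(x, mu, sigma):
    """Density of a 1-d normal distribution."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))


def grid_mean(density, xs):
    """Mean of an unnormalized density, approximated on an evenly spaced grid."""
    weights = [density(x) for x in xs]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, xs)) / total


def flatten(density, temperature):
    """Tempered density p(x)^(1/T): lower peaks, higher valleys (assumed stand-in)."""
    return lambda x: density(x) ** (1.0 / temperature)


# Evenly spaced grid on [-10, 10].
xs = [-10.0 + 0.01 * i for i in range(2001)]


def symmetric(x):
    # Two equally weighted modes at -3 and +3.
    return 0.5 * gaussian(x, -3.0, 1.0) + 0.5 * gaussian(x, 3.0, 1.0)


def skewed(x):
    # Same modes, but 90% of the mass sits at +3.
    return 0.9 * gaussian(x, 3.0, 1.0) + 0.1 * gaussian(x, -3.0, 1.0)


sym_true = grid_mean(symmetric, xs)
sym_flat = grid_mean(flatten(symmetric, 5.0), xs)
skew_true = grid_mean(skewed, xs)
skew_flat = grid_mean(flatten(skewed, 5.0), xs)

print("symmetric:", sym_true, sym_flat)
print("skewed:   ", skew_true, skew_flat)
```

Under these assumptions the symmetric mixture keeps its mean at about 0 after flattening, while the skewed mixture's mean (about 2.4 before flattening) is pulled toward the center of the space, which is the effect described in the review.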


Next generation arms race could cause 'extinction' event akin to nuclear war, pandemic: tech chief

FOX News

Artificial intelligence could lead to extinction and should be a global priority on the scale of nuclear war and pandemics, Center for AI Safety chief Dan Hendrycks said. An artificial intelligence arms race between countries and corporations to see who can develop the most powerful AI machines could create an existential threat to humanity, the co-founder of an AI safety nonprofit told Fox News. "AI could pose the risk of extinction, and part of the reason for this is because we're currently locked in an AI arms race," Center for AI Safety Executive Director Dan Hendrycks said. "We're building increasingly powerful technologies, and we don't know how to completely control them or understand them." Sam Altman, CEO of OpenAI, signed the Center for AI Safety's statement saying that AI poses an existential threat to humanity.


In Principle and In Practice

#artificialintelligence

Companies, organizations, and governments around the world are implementing or endorsing principles for artificial intelligence. During the Berkman Klein Center's first fully-virtual Tuesday luncheon, Jessica Fjeld, Assistant Director of BKC's Cyberlaw Clinic and lead author of the Principled AI report, and Ryan Budish, Assistant Research Director at BKC and a member of the Organisation for Economic Co-operation and Development (OECD)'s AI Governance Expert Group that developed the OECD's AI Principles, teamed up to share their experiences. Their discussion was moderated by BKC Executive Director Urs Gasser and also featured commentary from members of the BKC community. Fjeld explained the method of creating the Principled AI report and visualization, which analyzed AI principles from around the world and ultimately identified eight trends among them. "We believe [these themes] are the signs of the earliest emerging consensus for societal norms around how AI can be -- should be -- used," she said.


Bank of Baroda May Introduce Blockchain, AI Technology to Increase Business - Crypto-News India

#artificialintelligence

Bank of Baroda is contemplating integrating blockchain, artificial intelligence (AI), and robotics to increase business. Managing Director and Chief Executive Officer (CEO) PS Jayakumar told MoneyControl, "The third-largest public sector lender has laid out a digital roadmap that will remove the need to worry about small balances in bank accounts. The cost of managing these accounts will be offset by the efficiency of opening bank accounts, and the lender would focus on product delivery." Explaining further, he added, "We want to be able to anticipate what our customers want in a seamless manner. We have introduced tablets and now open almost 4 lakh accounts every month in 10-12 minutes (per account). This makes the balances in the account less relevant."


Ohio test uses drones to monitor highway traffic

Engadget

Highway traffic monitoring is frequently... less than efficient. Fixed cameras can't catch problems beyond their immediate location, while aircraft are both costly and inevitably have to fly back to a distant base to refuel. These systems may soon get a robotic upgrade, though. Ohio State University is leading a pilot program that will use drones for roadway and traffic monitoring along a 35-mile highway stretch (the Smart Mobility Corridor) between Dublin and East Liberty. The dry run will see drones feed tracking data to the Ohio Department of Transportation's Traffic Management Center to complement data from existing systems.


Cancer Research: A Supercomputing Perspective

#artificialintelligence

Cancer, the second-leading cause of death in the U.S. after heart disease, kills more than 500,000 citizens per year, including about 2,000 children. In 2016, then Vice President Joe Biden launched the Cancer Moonshot, saying: "I know that we can help solidify a genuine global commitment to end cancer as we know it today -- and inspire a new generation of scientists to pursue new discoveries and the bounds of human endeavor." The importance of high performance computing (HPC) in cancer research was recognized by the Cancer Moonshot Task Force report, and by then Vice President Joe Biden and Energy Secretary Ernie Moniz. "Supercomputers are key to the Cancer Moonshot," Moniz wrote. "These exceptionally high-powered machines have the potential to greatly accelerate the development of cancer therapies by finding patterns in massive datasets too large for human analysis. Supercomputers can help us better understand the complexity of cancer development, identify novel and effective treatments, and help elucidate patterns in vast and complex data sets that advance our understanding of cancer."