AITopics

2503.19763

Genre: Research Report (0.69)

Industry:

Law > Civil Rights & Constitutional Law (0.73)
Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (0.53)

arXiv.org Artificial Intelligence Feb-3-2025

Generative Data Mining with Longtail-Guided Diffusion

Hayden, David S., Ye, Mao, Garipov, Timur, Meyer, Gregory P., Vondrick, Carl, Chen, Zhao, Chai, Yuning, Wolff, Eric, Srinivasa, Siddhartha S.

It is difficult to anticipate the myriad challenges that a predictive model will encounter once deployed. Common practice entails a reactive, cyclical approach: model deployment, data mining, and retraining. We instead develop a proactive longtail discovery process by imagining additional data during training. In particular, we develop general model-based longtail signals, including a differentiable, single forward pass formulation of epistemic uncertainty that does not impact model parameters or predictive performance but can flag rare or hard inputs. We leverage these signals as guidance to generate additional training data from a latent diffusion model in a process we call Longtail Guidance (LTG). Crucially, we can perform LTG without retraining the diffusion model or the predictive model, and we do not need to expose the predictive model to intermediate diffusion states. Data generated by LTG exhibit semantically meaningful variation, yield significant generalization improvements on image classification benchmarks, and can be analyzed to proactively discover, explain, and address conceptual gaps in a predictive model.

large language model, machine learning, natural language, (18 more...)

2502.0198

Country: North America > United States > California > San Francisco County > San Francisco (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(2 more...)

arXiv.org Artificial Intelligence Dec-19-2024

MRAG: A Modular Retrieval Framework for Time-Sensitive Question Answering

Siyue, Zhang, Yuxiang, Xue, Yiming, Zhang, Xiaobao, Wu, Tuan, Luu Anh, Chen, Zhao

Understanding temporal relations and answering time-sensitive questions is a crucial yet challenging task for question-answering systems powered by large language models (LLMs). Existing approaches either update the parametric knowledge of LLMs with new facts, which is resource-intensive and often impractical, or integrate LLMs with external knowledge retrieval (i.e., retrieval-augmented generation). However, off-the-shelf retrievers often struggle to identify relevant documents that require intensive temporal reasoning. To systematically study time-sensitive question answering, we introduce the TempRAGEval benchmark, which repurposes existing datasets by incorporating temporal perturbations and gold evidence labels. As anticipated, all existing retrieval methods struggle with these temporal reasoning-intensive questions. We further propose Modular Retrieval (MRAG), a training-free framework that includes three modules: (1) Question Processing, which decomposes a question into its main content and a temporal constraint; (2) Retrieval and Summarization, which retrieves evidence and uses LLMs to summarize it according to the main content; and (3) Semantic-Temporal Hybrid Ranking, which scores each evidence summary based on both semantic and temporal relevance. On TempRAGEval, MRAG significantly outperforms baseline retrievers in retrieval performance, leading to further improvements in final answer accuracy.
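The question decomposition and hybrid ranking described in the abstract above can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the paper: the function names, the regex that extracts a year, the Jaccard token overlap standing in for a learned semantic retriever, and the choice to combine the two scores by multiplication. The LLM summarization module is omitted entirely.

```python
import re

def split_question(question):
    """Split a question into main content and an optional year constraint (toy rule)."""
    match = re.search(r"\b(?:in|as of)\s+(\d{4})\b", question)
    if match is None:
        return question.strip(), None
    year = int(match.group(1))
    main = question[:match.start()] + question[match.end():]
    return re.sub(r"\s+", " ", main).strip(), year

def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def semantic_score(main_content, summary):
    """Hypothetical stand-in for a retriever score: Jaccard token overlap."""
    a, b = tokens(main_content), tokens(summary)
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def temporal_score(year, span):
    """1.0 if the year lies inside the evidence span, otherwise decays with the gap."""
    if year is None:
        return 1.0
    start, end = span
    if start <= year <= end:
        return 1.0
    gap = min(abs(year - start), abs(year - end))
    return 1.0 / (1.0 + gap)

def hybrid_rank(question, evidence):
    """Rank (text, (start_year, end_year)) pairs by semantic score times temporal score."""
    main, year = split_question(question)
    scored = [
        (semantic_score(main, text) * temporal_score(year, span), text)
        for text, span in evidence
    ]
    return [text for _, text in sorted(scored, key=lambda pair: pair[0], reverse=True)]

# Made-up evidence, for illustration only.
evidence = [
    ("Alice was the coach of the team", (2010, 2013)),
    ("Bob was the coach of the team", (2014, 2018)),
    ("The stadium was renovated", (2015, 2015)),
]
print(hybrid_rank("Who was the coach of the team in 2015?", evidence))
```

The two coach entries have the same token overlap with the question, so the temporal score decides their relative order, which is the situation a purely semantic retriever cannot resolve.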

large language model, machine learning, natural language, (20 more...)

2412.1554

Country:

North America > United States (1.00)
Europe > United Kingdom (1.00)
Asia (1.00)
Oceania > Australia (0.68)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment > Sports > Soccer (1.00)
Leisure & Entertainment > Sports > Basketball (1.00)
Leisure & Entertainment > Sports > Baseball (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Sep-11-2023

SHIFT3D: Synthesizing Hard Inputs For Tricking 3D Detectors

Chen, Hongge, Chen, Zhao, Meyer, Gregory P., Park, Dennis, Vondrick, Carl, Shrivastava, Ashish, Chai, Yuning

We present SHIFT3D, a differentiable pipeline for generating 3D shapes that are structurally plausible yet challenging to 3D object detectors. In safety-critical applications like autonomous driving, discovering such novel challenging objects can offer insight into unknown vulnerabilities of 3D detectors. By representing objects with a signed distance function (SDF), we show that gradient error signals allow us to smoothly deform the shape or pose of a 3D object in order to confuse a downstream 3D detector. Importantly, the objects generated by SHIFT3D physically differ from the baseline object yet retain a semantically recognizable shape. Our approach provides interpretable failure modes for modern 3D object detectors, and can aid in preemptive discovery of potential safety risks within 3D perception systems before these risks become critical failures.
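For readers unfamiliar with the representation named in the abstract above, the following is a minimal sketch of a signed distance function for a sphere, together with a finite-difference gradient with respect to the query point. It only illustrates what an SDF is and why it varies smoothly with its inputs; it is not the SHIFT3D pipeline, which per the abstract differentiates through a detector. The function names are hypothetical.

```python
import math

def sphere_sdf(point, center, radius):
    """Signed distance to a sphere: negative inside, zero on the surface, positive outside."""
    return math.dist(point, center) - radius

def sdf_gradient(point, center, radius, eps=1e-5):
    """Central finite-difference gradient of the SDF with respect to the query point."""
    grad = []
    for axis in range(3):
        hi = list(point)
        lo = list(point)
        hi[axis] += eps
        lo[axis] -= eps
        diff = sphere_sdf(hi, center, radius) - sphere_sdf(lo, center, radius)
        grad.append(diff / (2.0 * eps))
    return grad

origin = (0.0, 0.0, 0.0)
print(sphere_sdf((2.0, 0.0, 0.0), origin, 1.0))    # point outside the unit sphere
print(sphere_sdf((0.0, 0.0, 0.0), origin, 1.0))    # point at the center
print(sdf_gradient((2.0, 0.0, 0.0), origin, 1.0))  # approximates the outward unit normal
```

Because the distance value changes continuously as the radius, center, or query point changes, shape and pose parameters of an SDF can in principle be adjusted by gradient steps.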

artificial intelligence, machine learning, shift3d, (15 more...)

2309.0581

Country: Asia > Middle East > Israel (0.14)

Genre: Research Report (0.82)

Industry:

Transportation > Ground > Road (0.68)
Information Technology (0.67)
Automobiles & Trucks (0.50)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.35)

arXiv.org Artificial Intelligence Jul-11-2022

Structural Inference of Networked Dynamical Systems with Universal Differential Equations

Koch, James, Chen, Zhao, Tuor, Aaron, Drgona, Jan, Vrabie, Draguna

Networked dynamical systems are common throughout science and engineering; e.g., biological networks, reaction networks, power systems, and the like. For many such systems, nonlinearity drives populations of identical (or near-identical) units to exhibit a wide range of nontrivial behaviors, such as the emergence of coherent structures (e.g., waves and patterns) or otherwise notable dynamics (e.g., synchrony and chaos). In this work, we seek to infer (i) the intrinsic physics of a base unit of a population, (ii) the underlying graphical structure shared between units, and (iii) the coupling physics of a given networked dynamical system given observations of nodal states. These tasks are formulated around the notion of the Universal Differential Equation, whereby unknown dynamical systems can be approximated with neural networks, mathematical terms known a priori (albeit with unknown parameterizations), or combinations of the two. We demonstrate the value of these inference tasks by investigating not only future state predictions but also the inference of system behavior on varied network topologies. The effectiveness and utility of these methods are shown through their application to canonical networked nonlinear coupled oscillators.
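To make the three inference targets in the abstract above concrete, here is a small forward simulation of a Kuramoto network, dθ_i/dt = ω_i + K Σ_j A_ij sin(θ_j − θ_i), integrated with explicit Euler steps. The Kuramoto model is chosen here as an assumed example of a coupled oscillator network; the abstract does not name the specific systems studied. In this sketch ω_i plays the role of the intrinsic physics, A of the graph structure, and the sine term of the coupling physics. The sketch only generates nodal-state data and performs no inference.

```python
import math

def kuramoto_step(theta, omega, adjacency, coupling, dt):
    """One explicit Euler step of the Kuramoto equations on a given graph."""
    n = len(theta)
    updated = []
    for i in range(n):
        interaction = sum(
            adjacency[i][j] * math.sin(theta[j] - theta[i]) for j in range(n)
        )
        updated.append(theta[i] + dt * (omega[i] + coupling * interaction))
    return updated

def simulate(theta0, omega, adjacency, coupling, dt=0.01, steps=2000):
    """Integrate the network forward and return the final phases."""
    theta = list(theta0)
    for _ in range(steps):
        theta = kuramoto_step(theta, omega, adjacency, coupling, dt)
    return theta

def order_parameter(theta):
    """Magnitude of the mean phase vector: 1.0 means fully synchronized phases."""
    n = len(theta)
    real = sum(math.cos(t) for t in theta) / n
    imag = sum(math.sin(t) for t in theta) / n
    return math.hypot(real, imag)

# Three identical oscillators on a complete graph (illustrative values).
adjacency = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]
start = [0.0, 0.5, 1.0]
final = simulate(start, [1.0, 1.0, 1.0], adjacency, coupling=1.0)
print(order_parameter(start), order_parameter(final))
```

Changing `adjacency` while keeping `omega` and the coupling term fixed corresponds to the "varied network topologies" setting mentioned in the abstract.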

artificial intelligence, machine learning, scientific computing, (17 more...)

doi: 10.1063/5.0109093

2207.04962

Country: North America > United States (0.46)

Genre: Research Report (0.40)

Industry: Government > Regional Government (0.46)

Technology:

Information Technology > Scientific Computing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine Learning May-5-2020

Deep learning of physical laws from scarce data

Chen, Zhao, Liu, Yang, Sun, Hao

Harnessing data to discover the underlying governing laws or equations that describe the behavior of complex physical systems can significantly advance our modeling, simulation and understanding of such systems in various science and engineering disciplines. Recent advances in sparse identification show encouraging success in distilling closed-form governing equations from data for a wide range of nonlinear dynamical systems. However, the fundamental bottleneck of this approach lies in the robustness and scalability with respect to data scarcity and noise. This work introduces a novel physics-informed deep learning framework to discover governing partial differential equations (PDEs) from scarce and noisy data for nonlinear spatiotemporal systems. In particular, this approach seamlessly integrates the strengths of deep neural networks for rich representation learning, automatic differentiation and sparse regression to approximate the solution of system variables, compute essential derivatives, as well as identify the key derivative terms and parameters that form the structure and explicit expression of the PDEs. The efficacy and robustness of this method are demonstrated on discovering a variety of PDE systems with different levels of data scarcity and noise. The resulting computational framework shows the potential for closed-form model discovery in practical applications where large and accurate datasets are intractable to capture.
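The sparse-regression ingredient mentioned in the abstract above can be sketched on a toy problem. This is not the paper's framework: the neural network surrogate and automatic differentiation are replaced here by an analytic solution and central finite differences, the system is an ODE (logistic growth, du/dt = u − u²) rather than a PDE, and the fitting routine is a plain sequentially thresholded least squares. All names, the candidate library [1, u, u²], and the threshold are assumptions for illustration.

```python
import math

def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x

def least_squares(rows, targets, active):
    """Least-squares fit over the active library columns via the normal equations."""
    gram = [[sum(r[i] * r[j] for r in rows) for j in active] for i in active]
    rhs = [sum(r[i] * y for r, y in zip(rows, targets)) for i in active]
    return solve_linear(gram, rhs)

def sparse_fit(rows, targets, threshold=0.1, iterations=5):
    """Sequentially thresholded least squares: fit, drop small terms, refit."""
    n_terms = len(rows[0])
    active = list(range(n_terms))
    coeffs = [0.0] * n_terms
    for _ in range(iterations):
        if not active:
            break
        solution = least_squares(rows, targets, active)
        coeffs = [0.0] * n_terms
        for idx, value in zip(active, solution):
            coeffs[idx] = value
        active = [i for i in active if abs(coeffs[i]) >= threshold]
    return [c if abs(c) >= threshold else 0.0 for c in coeffs]

def logistic(t, u0=0.1):
    """Analytic solution of du/dt = u - u^2 with u(0) = u0."""
    return 1.0 / (1.0 + (1.0 / u0 - 1.0) * math.exp(-t))

dt = 0.01
times = [k * dt for k in range(1, 500)]
u = [logistic(t) for t in times]
du = [(logistic(t + dt) - logistic(t - dt)) / (2.0 * dt) for t in times]
library = [[1.0, x, x * x] for x in u]  # candidate terms: 1, u, u^2
coeffs = sparse_fit(library, du)
print(coeffs)  # coefficients for [1, u, u^2]
```

On this noise-free data the constant term is pruned and the remaining coefficients land close to the true values 1 and −1; the abstract's point is that scarce and noisy data make this step much harder, which motivates the paper's physics-informed approach.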

deep learning, equation, upstream oil & gas, (22 more...)

2005.03448

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry: Energy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning Dec-16-2018

Decentralized Computation Offloading for Multi-User Mobile Edge Computing: A Deep Reinforcement Learning Approach

Chen, Zhao, Wang, Xiaodong

Mobile edge computing (MEC) has recently emerged as a promising solution to relieve resource-limited mobile devices of computation-intensive tasks, enabling devices to offload workloads to nearby MEC servers and improve the quality of the computation experience. Nevertheless, in an MEC system consisting of multiple mobile users with stochastic task arrivals and wireless channels, as considered in this paper, it is challenging to design computation offloading policies that minimize the long-term average computation cost in terms of power consumption and buffering delay. A deep reinforcement learning (DRL) based decentralized dynamic computation offloading strategy is investigated to build a scalable MEC system with limited feedback. Specifically, a continuous action space based DRL approach named deep deterministic policy gradient (DDPG) is adopted to learn efficient computation offloading policies independently at each mobile user. Thus, the power for both local execution and task offloading can be adaptively allocated by the policies learned from each user's local observation of the MEC system. Numerical results demonstrate that efficient policies can be learned at each user, and that the proposed DDPG-based decentralized strategy outperforms the conventional deep Q-network (DQN) based discrete power control strategy and several greedy strategies, with reduced computation cost. The power-delay tradeoff is also analyzed for both the DDPG-based and DQN-based strategies.

computation, survey article, telecommunications, (20 more...)

doi: 10.1186/s13638-020-01801-6

1812.07394

Country:

North America > United States > New York (0.14)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)

Genre: Research Report (0.70)

Industry:

Information Technology (0.88)
Telecommunications (0.88)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine Learning Jun-20-2018

Gradient Adversarial Training of Neural Networks

Sinha, Ayan, Chen, Zhao, Badrinarayanan, Vijay, Rabinovich, Andrew

We propose gradient adversarial training, an auxiliary deep learning framework applicable to different machine learning problems. In gradient adversarial training, we leverage a prior belief that in many contexts, simultaneous gradient updates should be statistically indistinguishable from each other. We enforce this consistency using an auxiliary network that classifies the origin of the gradient tensor, and the main network serves as an adversary to the auxiliary network in addition to performing standard task-based training. We demonstrate gradient adversarial training for three different scenarios: (1) as a defense against adversarial examples, we classify gradient tensors and tune them to be agnostic to the class of their corresponding example; (2) for knowledge distillation, we do binary classification of gradient tensors derived from the student or teacher network and tune the student gradient tensor to mimic the teacher's gradient tensor; and (3) for multi-task learning, we classify the gradient tensors derived from different task loss functions and tune them to be statistically indistinguishable. For each of the three scenarios we show the potential of the gradient adversarial training procedure. Specifically, gradient adversarial training increases the robustness of a network to adversarial attacks, is able to better distill the knowledge from a teacher network to a student network compared to soft targets, and boosts multi-task learning by aligning the gradient tensors derived from the task-specific loss functions. Overall, our experiments demonstrate that gradient tensors contain latent information about whatever tasks are being trained, and can support diverse machine learning problems when intelligently guided through adversarialization using an auxiliary network.

deep learning, gradient tensor, neural network, (18 more...)

1806.08028

Country:

North America > United States (0.93)
Europe (0.67)

Genre: Research Report (0.64)

Industry:

Education > Focused Education > Special Education (0.44)
Information Technology > Security & Privacy (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)