Gao, Rui
Data-driven modeling of fluid flow around rotating structures with graph neural networks
Gao, Rui, Cheng, Zhi, Jaiman, Rajeev K.
Graph neural networks, recently introduced into the field of fluid flow surrogate modeling, have been successfully applied to model the temporal evolution of various fluid flow systems. Existing applications, however, are mostly restricted to cases where the domain is time-invariant. The present work extends graph neural network-based modeling to fluid flow around structures rotating about a given axis. Specifically, we propose a graph neural network-based surrogate model for fluid flow in which the mesh co-rotates with the structure. Unlike conventional data-driven approaches that rely on structured Cartesian meshes, our framework operates on unstructured co-rotating meshes, enforcing rotation equivariance of the learned model by leveraging co-rotating polar (2D) and cylindrical (3D) coordinate systems. To model the pressure for systems without Dirichlet pressure boundaries, we propose a novel local directed pressure difference formulation that is invariant to the reference pressure point and value. For flow systems with large mesh sizes, we introduce a scheme to train the network on single or distributed graphics processing units by accumulating the backpropagated gradients from partitions of the mesh. The effectiveness of our proposed framework is examined on two test cases: (i) fluid flow in a 2D rotating mixer, and (ii) the flow past a 3D rotating cube. Our results show that the model achieves stable and accurate rollouts for over 2000 time steps in periodic regimes while capturing accurate short-term dynamics in chaotic flow regimes. In addition, the drag and lift force predictions closely match the CFD calculations, highlighting the potential of the framework for modeling both periodic and chaotic fluid flow around rotating structures.
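The gradient-accumulation scheme mentioned above rests on a simple identity: when the training loss sums over mesh nodes, the gradient over the full mesh equals the sum of gradients computed on disjoint node partitions. A minimal numpy sketch (our illustration with a toy linear model, not the authors' code) makes this concrete:

```python
import numpy as np

rng = np.random.default_rng(0)
n_nodes, n_feat = 1200, 8
X = rng.normal(size=(n_nodes, n_feat))   # node features (hypothetical)
y = rng.normal(size=n_nodes)             # per-node targets
w = rng.normal(size=n_feat)              # parameters of a toy linear model

def grad_sse(Xp, yp, w):
    """Gradient of the summed squared error over one node partition."""
    r = Xp @ w - yp
    return 2.0 * Xp.T @ r

# Full-mesh gradient computed in one shot.
g_full = grad_sse(X, y, w)

# Accumulate gradients over disjoint partitions of the mesh, as one would
# do partition by partition before a single optimizer step.
g_acc = np.zeros_like(w)
for part in np.array_split(np.arange(n_nodes), 4):
    g_acc += grad_sse(X[part], y[part], w)

assert np.allclose(g_full, g_acc)
```

In a deep-learning framework the same idea amounts to calling backward on each partition's loss without zeroing the gradients in between, so memory scales with the partition size rather than the full mesh.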
SFC: Achieve Accurate Fast Convolution under Low-precision Arithmetic
He, Liulu, Zhao, Yufei, Gao, Rui, Du, Yuan, Du, Li
Fast convolution algorithms, including Winograd and FFT, can efficiently accelerate convolution operations in deep models. However, these algorithms depend on high-precision arithmetic to maintain inference accuracy, which conflicts with model quantization. To resolve this conflict and further improve the efficiency of quantized convolution, we propose SFC, a new algebraic transform for fast convolution that extends the Discrete Fourier Transform (DFT) with symbolic computing, in which only additions are required to perform the transformation at specific transform points, avoiding the calculation of irrational numbers and reducing the precision requirement. Additionally, we enhance convolution efficiency by introducing correction terms to convert invalid circular convolution outputs of the Fourier method into effective ones. A numerical error analysis is presented for the first time in this line of work, showing that our algorithms provide a 3.68x multiplication reduction for 3x3 convolution, whereas the Winograd algorithm achieves only a 2.25x reduction with similarly low numerical errors. Experiments carried out on benchmarks and an FPGA show that our new algorithms can further improve the computation efficiency of quantized models while maintaining accuracy, surpassing both quantization-alone methods and existing works on fast convolution quantization.
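The "correction terms" idea can be illustrated on the classical FFT case (a generic sketch of the principle, not SFC itself): a length-n circular convolution corrupts its first k-1 outputs with wrap-around terms, and subtracting those few cross terms recovers valid linear-convolution outputs:

```python
import numpy as np

n, k = 16, 3
rng = np.random.default_rng(1)
x = rng.normal(size=n)   # input signal
h = rng.normal(size=k)   # short convolution kernel

# Length-n circular convolution via the FFT.
y_circ = np.real(np.fft.ifft(np.fft.fft(x) * np.fft.fft(h, n)))

# The first k-1 circular outputs are "invalid": they contain wrap-around
# contributions from the tail of x. A cheap correction subtracts exactly
# those cross terms, turning them into valid linear-convolution outputs.
y = y_circ.copy()
for i in range(k - 1):
    for j in range(i + 1, k):
        y[i] -= h[j] * x[n + i - j]

assert np.allclose(y, np.convolve(x, h)[:n])
```

The appeal is that the correction involves only O(k^2) extra multiply-adds, far fewer than the transform itself, so all n circular outputs become usable instead of only n-k+1.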
Relating-Up: Advancing Graph Neural Networks through Inter-Graph Relationships
Zou, Qi, Yu, Na, Zhang, Daoliang, Zhang, Wei, Gao, Rui
Graph Neural Networks (GNNs) have excelled in learning from graph-structured data, especially in understanding the relationships within a single graph, i.e., intra-graph relationships. Despite their successes, GNNs are limited by neglecting the context of relationships across graphs, i.e., inter-graph relationships. Recognizing the potential to extend this capability, we introduce Relating-Up, a plug-and-play module that enhances GNNs by exploiting inter-graph relationships. This module incorporates a relation-aware encoder and a feedback training strategy. The former enables GNNs to capture relationships across graphs, enriching relation-aware graph representation through collective context. The latter utilizes a feedback loop mechanism for the recursive refinement of these representations, leveraging insights from inter-graph dynamics to guide the feedback loop. The synergy between these two innovations results in a robust and versatile module. Relating-Up enhances the expressiveness of GNNs, enabling them to encapsulate a wider spectrum of graph relationships with greater precision. Our evaluations across 16 benchmark datasets demonstrate that integrating Relating-Up into GNN architectures substantially improves performance, positioning Relating-Up as a formidable choice for a broad spectrum of graph representation learning tasks.
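The core mechanism of a relation-aware encoder can be sketched as attention across graph-level embeddings in a batch; the following numpy toy (hypothetical, with made-up projection matrices, not the paper's implementation) shows how each graph's representation is enriched by its relations to the other graphs:

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(4)
n_graphs, dim = 6, 16
# Per-graph embeddings, e.g. pooled outputs of any GNN backbone.
H = rng.normal(size=(n_graphs, dim))

# Hypothetical learned projections for queries, keys, and values.
Wq = rng.normal(size=(dim, dim)) / np.sqrt(dim)
Wk = rng.normal(size=(dim, dim)) / np.sqrt(dim)
Wv = rng.normal(size=(dim, dim)) / np.sqrt(dim)

scores = (H @ Wq) @ (H @ Wk).T / np.sqrt(dim)  # pairwise graph-graph affinities
A = softmax(scores, axis=-1)                   # inter-graph attention weights
H_rel = H + A @ (H @ Wv)                       # residual relation-aware update

assert H_rel.shape == H.shape
```

Because the update operates only on pooled graph embeddings, such a module can be bolted onto an existing GNN without altering its message-passing layers, which is what "plug-and-play" suggests.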
Non-Convex Robust Hypothesis Testing using Sinkhorn Uncertainty Sets
We present a new framework to address the non-convex robust hypothesis testing problem, wherein the goal is to seek the optimal detector that minimizes the maximum of worst-case type-I and type-II risk functions. The distributional uncertainty sets are constructed to center around the empirical distribution derived from samples, based on the Sinkhorn discrepancy. Given that the objective involves non-convex, non-smooth probabilistic functions that are often intractable to optimize, existing methods resort to approximations rather than exact solutions. To tackle this challenge, we introduce an exact mixed-integer exponential conic reformulation of the problem, which can be solved to global optimality with a moderate amount of input data. Subsequently, we propose a convex approximation, demonstrating its superiority over current state-of-the-art methodologies in the literature. Furthermore, we establish connections between robust hypothesis testing and regularized formulations of non-robust risk functions, offering insightful interpretations. Our numerical study highlights the satisfactory testing performance and computational efficiency of the proposed framework.
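In symbols (our generic notation, not necessarily the paper's), the minimax problem described above reads

```latex
\min_{\pi \in \Pi} \; \max\Bigl\{
  \sup_{\mathbb{P}_1 \in \mathcal{P}_1} \Phi_1(\pi;\mathbb{P}_1),\;
  \sup_{\mathbb{P}_2 \in \mathcal{P}_2} \Phi_2(\pi;\mathbb{P}_2)
\Bigr\},
```

where $\pi$ is the detector, $\Phi_1$ and $\Phi_2$ denote the type-I and type-II risk functions, and each uncertainty set $\mathcal{P}_i$ is a Sinkhorn-discrepancy ball centered at the empirical distribution of the corresponding samples.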
Aleatoric and Epistemic Discrimination: Fundamental Limits of Fairness Interventions
Wang, Hao, He, Luxi, Gao, Rui, Calmon, Flavio P.
Machine learning (ML) models can underperform on certain population groups due to choices made during model development and bias inherent in the data. We categorize sources of discrimination in the ML pipeline into two classes: aleatoric discrimination, which is inherent in the data distribution, and epistemic discrimination, which is due to decisions made during model development. We quantify aleatoric discrimination by determining the performance limits of a model under fairness constraints, assuming perfect knowledge of the data distribution. We demonstrate how to characterize aleatoric discrimination by applying Blackwell's results on comparing statistical experiments. We then quantify epistemic discrimination as the gap between a model's accuracy when fairness constraints are applied and the limit posed by aleatoric discrimination. We apply this approach to benchmark existing fairness interventions and investigate fairness risks in data with missing values. Our results indicate that state-of-the-art fairness interventions are effective at removing epistemic discrimination on standard (overused) tabular datasets. However, when data has missing values, there is still significant room for improvement in handling aleatoric discrimination.
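The decomposition just described can be written compactly (our notation, not necessarily the paper's): with full knowledge of the data distribution $P$, the fairness-constrained performance limit and the epistemic gap of a trained model $\hat h$ are

```latex
\mathrm{acc}^{\star}_{\mathrm{fair}}(P)
  \;=\; \sup_{h \,:\, \mathrm{Disc}(h) \le \alpha}
        \Pr_{(X,Y)\sim P}\bigl[h(X) = Y\bigr],
\qquad
\varepsilon_{\mathrm{epi}}(\hat h)
  \;=\; \mathrm{acc}^{\star}_{\mathrm{fair}}(P) - \mathrm{acc}(\hat h),
```

where $\mathrm{Disc}(h) \le \alpha$ stands for the chosen fairness constraint. Aleatoric discrimination is captured by the limit $\mathrm{acc}^{\star}_{\mathrm{fair}}(P)$ itself, which no intervention can exceed; epistemic discrimination is the remaining gap $\varepsilon_{\mathrm{epi}}$, which better model development can close.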
Predicting fluid-structure interaction with graph neural networks
Gao, Rui, Jaiman, Rajeev K.
We present a rotation equivariant, quasi-monolithic graph neural network framework for the reduced-order modeling of fluid-structure interaction systems. With the aid of an arbitrary Lagrangian-Eulerian formulation, the system states are evolved temporally with two sub-networks. The movement of the mesh is reduced to the evolution of several coefficients via complex-valued proper orthogonal decomposition, and the prediction of these coefficients over time is handled by a single multi-layer perceptron. A finite element-inspired hypergraph neural network is employed to predict the evolution of the fluid state based on the state of the whole system. The structural state is implicitly modeled by the movement of the mesh on the solid-fluid interface; hence it makes the proposed framework quasi-monolithic. The effectiveness of the proposed framework is assessed on two prototypical fluid-structure systems, namely the flow around an elastically-mounted cylinder, and the flow around a hyperelastic plate attached to a fixed cylinder. The proposed framework tracks the interface description and provides stable and accurate system state predictions during roll-out for at least 2000 time steps, and even demonstrates some capability in self-correcting erroneous predictions. The proposed framework also enables direct calculation of the lift and drag forces using the predicted fluid and mesh states, in contrast to existing convolution-based architectures. The proposed reduced-order model via graph neural network has implications for the development of physics-based digital twins concerning moving boundaries and fluid-structure interactions.
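The complex-valued POD step for the mesh motion can be sketched in a few lines of numpy (our illustration on synthetic low-rank data, not the authors' code): node positions $(x, y)$ are packed into a complex vector $x + iy$ per snapshot, and an SVD yields modes whose time-varying coefficients are what the multi-layer perceptron would evolve.

```python
import numpy as np

rng = np.random.default_rng(2)
n_nodes, n_snap, rank = 200, 40, 3

# Synthetic low-rank "mesh motion" so the truncation below is exact.
A = rng.normal(size=(n_nodes, rank)) + 1j * rng.normal(size=(n_nodes, rank))
B = rng.normal(size=(rank, n_snap)) + 1j * rng.normal(size=(rank, n_snap))
Z = A @ B                                # complex snapshot matrix (x + iy)

U, s, Vh = np.linalg.svd(Z, full_matrices=False)
modes = U[:, :rank]                      # complex POD modes
coeffs = np.diag(s[:rank]) @ Vh[:rank]   # time-varying modal coefficients

Z_rec = modes @ coeffs                   # reduced-order reconstruction
assert np.allclose(Z_rec, Z)
```

Packing the two coordinates into one complex number is what makes a planar rotation of the mesh act as a simple complex phase on the snapshots, which is convenient for a rotation-equivariant pipeline.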
A Finite Element-Inspired Hypergraph Neural Network: Application to Fluid Dynamics Simulations
Gao, Rui, Deo, Indu Kant, Jaiman, Rajeev K.
Since analytical solutions are usually not available, numerical solutions on discretized space and time domains are considered for predictive modeling. Leveraging state-of-the-art computational fluid dynamics (CFD) approaches based on finite volume [3] or finite element [4, 5] methods, one could obtain high-fidelity solutions that can be suitable for downstream design optimization and control purposes. However, the cost of performing such simulations is significant, and becomes prohibitively high for complex problems arising from real-world applications. This limitation of traditional CFD approaches has inspired the development of data-driven projection-based reduced-order modeling techniques. Such models are usually used in an offline-online manner. In the offline stage, an approximation of the governing flow dynamics in a low-order linear subspace is constructed based on the fluid flow data collected. This approximation reduces the complexity of the problem in the online stage, making it possible to acquire fast, accurate predictions. Popular methods in this category include proper orthogonal decomposition (POD) [6, 7], dynamic mode decomposition (DMD) [8], along with many variants (e.g., [9, 10, 11]). However, these methods encounter difficulty when applied to scenarios with high Reynolds numbers and convection-dominated problems, as one needs a significantly large number of linear modes to achieve a satisfactory approximation.
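Of the projection-based methods named above, DMD admits a particularly compact sketch (our illustration on synthetic linear dynamics, not code from the paper): given snapshot pairs $(X_0, X_1)$ with $X_1 = A X_0$, DMD fits a best-fit linear operator in a truncated SVD (POD) basis.

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, d = 50, 30, 5                           # state dim, snapshots, latent dim
Q = np.linalg.qr(rng.normal(size=(n, d)))[0]  # synthetic latent subspace
L = 0.95 * np.linalg.qr(rng.normal(size=(d, d)))[0]
A_true = Q @ L @ Q.T                          # low-rank linear dynamics

X = np.empty((n, m + 1))
X[:, 0] = Q @ rng.normal(size=d)
for t in range(m):
    X[:, t + 1] = A_true @ X[:, t]
X0, X1 = X[:, :-1], X[:, 1:]                  # snapshot pairs

U, s, Vh = np.linalg.svd(X0, full_matrices=False)
r = d                                         # truncation rank
Ur, Sr_inv, Vr = U[:, :r], np.diag(1.0 / s[:r]), Vh[:r].T
A_tilde = Ur.T @ X1 @ Vr @ Sr_inv             # reduced DMD operator

# One-step prediction of the last snapshot from the previous one.
x_pred = Ur @ (A_tilde @ (Ur.T @ X0[:, -1]))
```

For this synthetic low-rank system the reduced operator reproduces the dynamics exactly; the difficulty noted in the text is that convection-dominated flows need many more modes $r$ before the truncation error becomes acceptable.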
Adaptive Environment Modeling Based Reinforcement Learning for Collision Avoidance in Complex Scenes
Wang, Shuaijun, Gao, Rui, Han, Ruihua, Chen, Shengduo, Li, Chengyang, Hao, Qi
The major challenges of collision avoidance for robot navigation in crowded scenes lie in accurate environment modeling, fast perceptions, and trustworthy motion planning policies. This paper presents a novel adaptive environment model based collision avoidance reinforcement learning (i.e., AEMCARL) framework for an unmanned robot to achieve collision-free motions in challenging navigation scenarios. The novelty of this work is threefold: (1) developing a hierarchical network of gated-recurrent-unit (GRU) for environment modeling; (2) developing an adaptive perception mechanism with an attention module; (3) developing an adaptive reward function for the reinforcement learning (RL) framework to jointly train the environment model, perception function and motion planning policy. The proposed method is tested with the Gym-Gazebo simulator and a group of robots (Husky and Turtlebot) under various crowded scenes. Both simulation and experimental results have demonstrated the superior performance of the proposed method over baseline methods.
Sinkhorn Distributionally Robust Optimization
Decision-making problems under uncertainty have broad applications in operations research, machine learning, engineering, and economics. When the data involves uncertainty due to measurement error, insufficient sample size, contamination and anomalies, or model misspecification, distributionally robust optimization (DRO) is a promising approach to data-driven optimization, by seeking a minimax robust optimal decision that minimizes the expected loss under the most adverse distribution within a given set of relevant distributions, called the ambiguity set. It provides a principled framework to produce a solution with more promising out-of-sample performance than the traditional sample average approximation (SAA) method for stochastic programming [86]. We refer to [81] for a recent survey on DRO. At the core of DRO is the choice of the ambiguity set. Ideally, a good ambiguity set should take into account the properties of practical applications while maintaining the computational tractability of the resulting DRO formulation; and it should be rich enough to contain all distributions relevant to the decision-making but, at the same time, should not include unnecessary distributions that lead to overly conservative decisions. Various DRO formulations have been proposed in the literature. Among them, the ambiguity set based on the Wasserstein distance has recently received much attention [104, 67, 17, 46]. The Wasserstein distance incorporates the geometry of the sample space, and is thereby suitable for comparing distributions with non-overlapping supports and hedging against data perturbations [46].
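In generic DRO notation (ours, not necessarily the paper's), the minimax problem described above takes the form

```latex
\min_{\theta \in \Theta} \;
\sup_{\mathbb{Q} \in \mathcal{B}_{\rho}(\hat{\mathbb{P}}_n)}
  \mathbb{E}_{\xi \sim \mathbb{Q}}\bigl[\ell(\theta;\xi)\bigr],
\qquad
\mathcal{B}_{\rho}(\hat{\mathbb{P}}_n)
  = \bigl\{\mathbb{Q} \,:\, W_{\varepsilon}(\hat{\mathbb{P}}_n, \mathbb{Q}) \le \rho \bigr\},
```

where $\ell$ is the loss, $\hat{\mathbb{P}}_n$ the empirical distribution of the data, and the ambiguity set $\mathcal{B}_{\rho}$ is a ball in the chosen discrepancy; the Sinkhorn variant replaces the Wasserstein distance with its entropy-regularized counterpart $W_{\varepsilon}$.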
Hierarchical Non-Stationary Temporal Gaussian Processes With $L^1$-Regularization
Zhao, Zheng, Gao, Rui, Särkkä, Simo
This paper is concerned with regularized extensions of hierarchical non-stationary temporal Gaussian processes (NSGPs) in which the parameters (e.g., length-scale) are modeled as GPs. In particular, we consider two commonly used NSGP constructions which are based on explicitly constructed non-stationary covariance functions and stochastic differential equations, respectively. We extend these NSGPs by including $L^1$-regularization on the processes in order to induce sparseness. To solve the resulting regularized NSGP (R-NSGP) regression problem we develop a method based on the alternating direction method of multipliers (ADMM), and we analyze its convergence properties theoretically. We further evaluate the performance of the proposed methods on simulated and real-world datasets.
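The ADMM machinery used for such $L^1$-regularized problems can be illustrated on the simplest instance, lasso-style least squares (an illustrative sketch, not the paper's R-NSGP solver): the problem is split into a quadratic subproblem, a soft-thresholding proximal step for the $L^1$ term, and a dual update.

```python
import numpy as np

# minimize  0.5 * ||A x - b||^2 + lam * ||x||_1   via ADMM splitting.
rng = np.random.default_rng(5)
m, n = 30, 10
A = rng.normal(size=(m, n))
b = rng.normal(size=m)
lam, rho = 0.5, 1.0                          # L1 weight, ADMM penalty

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

x = np.zeros(n); z = np.zeros(n); u = np.zeros(n)
AtA, Atb = A.T @ A, A.T @ b
M = np.linalg.inv(AtA + rho * np.eye(n))     # cached factor for the x-update

for _ in range(500):
    x = M @ (Atb + rho * (z - u))            # quadratic subproblem
    z = soft_threshold(x + u, lam / rho)     # sparsity-inducing proximal step
    u += x - z                               # dual (running residual) update

objective = 0.5 * np.sum((A @ z - b) ** 2) + lam * np.sum(np.abs(z))
```

In the R-NSGP setting the quadratic subproblem is replaced by the GP-regression objective, but the alternation between a smooth update and a soft-thresholding step is the same pattern.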