AITopics

2504.04973

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Artificial Intelligence, Dec-29-2023

XAI for In-hospital Mortality Prediction via Multimodal ICU Data

Li, Xingqiao, Gu, Jindong, Wang, Zhiyong, Yuan, Yancheng, Du, Bo, He, Fengxiang

Predicting in-hospital mortality for intensive care unit (ICU) patients is key to final clinical outcomes. AI models have shown superior accuracy but suffer from a lack of explainability. To address this issue, this paper proposes an eXplainable Multimodal Mortality Predictor (X-MMP), a step toward an efficient, explainable AI solution for predicting in-hospital mortality from multimodal ICU data. We employ multimodal learning in our framework, which can receive heterogeneous inputs from clinical data and make decisions. Furthermore, we introduce an explainable method, namely Layer-Wise Propagation to Transformer, as a proper extension of the LRP method to Transformers, producing explanations over multimodal inputs and revealing the salient features attributed to prediction. Moreover, the contribution of each modality to clinical outcomes can be visualized, helping clinicians understand the reasoning behind decision-making. We construct a multimodal dataset based on MIMIC-III and the MIMIC-III Waveform Database Matched Subset. Comprehensive experiments on benchmark datasets demonstrate that our proposed framework can achieve reasonable interpretation with competitive prediction accuracy. In particular, our framework can be easily transferred to other clinical tasks, which facilitates the discovery of crucial factors in healthcare research.

machine learning, natural language, prediction

2312.17624

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Providers & Services (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

arXiv.org Machine Learning, Dec-24-2023

Understanding Deep Learning via Decision Boundary

Lei, Shiye, He, Fengxiang, Yuan, Yancheng, Tao, Dacheng

This paper finds that neural networks with lower decision boundary (DB) variability have better generalizability. Two new notions, algorithm DB variability and $(\epsilon, \eta)$-data DB variability, are proposed to measure the decision boundary variability from the algorithm and data perspectives. Extensive experiments show significant negative correlations between the decision boundary variability and the generalizability. From the theoretical view, two lower bounds based on algorithm DB variability are proposed and do not explicitly depend on the sample size. We also prove an upper bound of order $\mathcal{O}\left(\frac{1}{\sqrt{m}}+\epsilon+\eta\log\frac{1}{\eta}\right)$ based on data DB variability. The bound is convenient to estimate without the requirement of labels, and does not explicitly depend on the network size, which is usually prohibitively large in deep learning.
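The abstract above does not state the formal definitions of the two variability notions. As a rough, label-free illustration only, the sketch below measures how often classifiers from different training runs disagree on the same unlabeled inputs. This disagreement rate is an assumed proxy, not the paper's definition, and the function names `disagreement_rate` and `mean_pairwise_disagreement` are hypothetical.

```python
def disagreement_rate(preds_a, preds_b):
    """Fraction of inputs on which two classifiers predict different labels."""
    if len(preds_a) != len(preds_b) or not preds_a:
        raise ValueError("prediction lists must be non-empty and equally long")
    return sum(a != b for a, b in zip(preds_a, preds_b)) / len(preds_a)


def mean_pairwise_disagreement(runs):
    """Average disagreement over all pairs of training runs.

    `runs` is a list of prediction lists, one per independently trained
    model, all evaluated on the same inputs. No ground-truth labels are used.
    """
    pairs = [(i, j) for i in range(len(runs)) for j in range(i + 1, len(runs))]
    if not pairs:
        raise ValueError("need at least two runs")
    total = sum(disagreement_rate(runs[i], runs[j]) for i, j in pairs)
    return total / len(pairs)


# Hypothetical predictions from three training runs on four inputs.
runs = [[0, 1, 1, 0], [0, 1, 0, 0], [1, 1, 1, 0]]
variability = mean_pairwise_disagreement(runs)
```

A lower value means the runs place their decision boundaries more consistently on these inputs.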

artificial intelligence, db variability, machine learning

doi: 10.1109/TNNLS.2023.3326654

2206.01515

Country: North America (0.28)

Genre:

Research Report > New Finding (0.68)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Nov-9-2023

Decentralized SGD and Average-direction SAM are Asymptotically Equivalent

Zhu, Tongtian, He, Fengxiang, Chen, Kaixuan, Song, Mingli, Tao, Dacheng

Decentralized stochastic gradient descent (D-SGD) allows collaborative learning on massive devices simultaneously without the control of a central server. However, existing theories claim that decentralization invariably undermines generalization. In this paper, we challenge the conventional belief and present a completely new perspective for understanding decentralized learning. We prove that D-SGD implicitly minimizes the loss function of an average-direction Sharpness-aware minimization (SAM) algorithm under general non-convex non-$\beta$-smooth settings. This surprising asymptotic equivalence reveals an intrinsic regularization-optimization trade-off and three advantages of decentralization: (1) there exists a free uncertainty evaluation mechanism in D-SGD to improve posterior estimation; (2) D-SGD exhibits a gradient smoothing effect; and (3) the sharpness regularization effect of D-SGD does not decrease as total batch size increases, which justifies the potential generalization benefit of D-SGD over centralized SGD (C-SGD) in large-batch scenarios. The code is available at https://github.com/Raiden-Zhu/ICML-2023-DSGD-and-SAM.
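For readers unfamiliar with D-SGD, here is a minimal sketch of one common variant of the update: each node takes a local stochastic gradient step, then averages its parameters with its neighbors through a doubly stochastic mixing matrix. The ring topology, the scalar quadratic losses, and the names `ring_mixing_matrix` and `dsgd` are assumptions made for illustration. This is not the authors' code, which is in the repository linked above.

```python
import random


def ring_mixing_matrix(n):
    """Doubly stochastic gossip matrix for a ring of n nodes.

    Each node puts weight 1/3 on itself and on each of its two neighbors.
    """
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in (i - 1, i, i + 1):
            W[i][j % n] += 1.0 / 3.0
    return W


def dsgd(targets, steps=200, lr=0.1, noise=0.0, seed=0):
    """Run D-SGD where node i minimizes 0.5 * (x - targets[i]) ** 2.

    No central server is involved: nodes only exchange values with
    their ring neighbors. `noise` scales Gaussian gradient noise.
    """
    rng = random.Random(seed)
    n = len(targets)
    W = ring_mixing_matrix(n)
    x = [0.0] * n
    for _ in range(steps):
        # Local (optionally noisy) gradient step on each node.
        stepped = [
            x[i] - lr * ((x[i] - targets[i]) + noise * rng.gauss(0.0, 1.0))
            for i in range(n)
        ]
        # Gossip step: weighted average with ring neighbors.
        x = [sum(W[i][j] * stepped[j] for j in range(n)) for i in range(n)]
    return x


# Five nodes with different local optima; the global optimum is their mean.
node_values = dsgd([1.0, 2.0, 3.0, 4.0, 5.0])
```

With noise disabled, the node average converges to the mean of the targets, while individual nodes stay close to but not exactly at that value because of the constant step size.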

artificial intelligence, machine learning, optimization problem

2306.02913

Country: North America > United States (0.67)

Genre: Research Report (0.63)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.67)

arXiv.org Artificial Intelligence, Nov-8-2023

When to Update Your Model: Constrained Model-based Reinforcement Learning

Ji, Tianying, Luo, Yu, Sun, Fuchun, Jing, Mingxuan, He, Fengxiang, Huang, Wenbing

Designing and analyzing model-based RL (MBRL) algorithms with guaranteed monotonic improvement has been challenging, mainly due to the interdependence between policy optimization and model learning. Existing discrepancy bounds generally ignore the impacts of model shifts, and their corresponding algorithms are prone to performance degradation under drastic model updates. In this work, we first propose a novel and general theoretical scheme for a non-decreasing performance guarantee of MBRL. Our follow-up derived bounds reveal the relationship between model shifts and performance improvement. These discoveries encourage us to formulate a constrained lower-bound optimization problem to permit the monotonicity of MBRL. A further example demonstrates that learning models from a dynamically varying number of explorations benefits the eventual returns. Motivated by these analyses, we design a simple but effective algorithm, CMLO (Constrained Model-shift Lower-bound Optimization), which introduces an event-triggered mechanism that flexibly determines when to update the model. Experiments show that CMLO surpasses other state-of-the-art methods and produces a boost when various policy optimization methods are employed.

artificial intelligence, machine learning, model shift

2210.08349

Country: Asia (0.28)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Sep-30-2023

Boosting Fair Classifier Generalization through Adaptive Priority Reweighing

Hu, Zhihao, Xu, Yiran, Du, Mengnan, Gu, Jindong, Tian, Xinmei, He, Fengxiang

With the increasing penetration of machine learning applications in critical decision-making areas, calls for algorithmic fairness are more prominent. Although various approaches have been proposed to improve algorithmic fairness through learning with fairness constraints, their performance does not generalize well to the test set. A fair algorithm with promising performance and better generalizability is needed. This paper proposes a novel adaptive reweighing method to eliminate the impact of the distribution shifts between training and test data on model generalizability. Most previous reweighing methods assign a unified weight to each (sub)group. In contrast, our method granularly models the distance from the sample predictions to the decision boundary. Our adaptive reweighing method prioritizes samples closer to the decision boundary and assigns them a higher weight to improve the generalizability of fair classifiers. Extensive experiments are performed to validate the generalizability of our adaptive priority reweighing method for accuracy and fairness measures (i.e., equal opportunity, equalized odds, and demographic parity) in tabular benchmarks. We also highlight the performance of our method in improving the fairness of language and vision models. The code is available at https://github.com/che2198/APW.
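To make the idea of prioritizing samples near the decision boundary concrete, the following sketch assigns larger weights to samples whose signed score is close to zero and uses them in a weighted log loss. The exponential weighting, the `alpha` parameter, and the names `priority_weights` and `weighted_log_loss` are assumptions for illustration. The paper's actual weighting rule and its handling of demographic (sub)groups are not given in the abstract and are not reproduced here; the authors' implementation is in the repository linked above.

```python
import math


def priority_weights(scores, alpha=2.0):
    """Per-sample weights that grow as a sample nears the decision boundary.

    `scores` are signed distances to the boundary (for example, logits of a
    binary classifier, where 0 is the boundary). Weights are normalized to
    have mean 1. The exponential form is an illustrative assumption.
    """
    raw = [math.exp(-alpha * abs(s)) for s in scores]
    total = sum(raw)
    return [len(scores) * r / total for r in raw]


def weighted_log_loss(probs, labels, weights):
    """Binary cross-entropy averaged with per-sample weights."""
    eps = 1e-12  # guards against log(0)
    losses = [
        -(y * math.log(p + eps) + (1 - y) * math.log(1.0 - p + eps))
        for p, y in zip(probs, labels)
    ]
    return sum(w * l for w, l in zip(weights, losses)) / len(losses)


# Hypothetical logits: the first sample is closest to the boundary.
scores = [0.1, -0.2, 3.0]
weights = priority_weights(scores)
probs = [1.0 / (1.0 + math.exp(-s)) for s in scores]
loss = weighted_log_loss(probs, [1, 0, 1], weights)
```

In a training loop, the weights would be recomputed as the model's scores change, which is what makes such a scheme adaptive rather than a fixed per-group weight.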

artificial intelligence, fairness, machine learning

2309.08375

Country: Europe > United Kingdom > England (0.46)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)

arXiv.org Machine Learning, Aug-29-2023

Heterogeneous Multi-Task Gaussian Cox Processes

Zhou, Feng, Kong, Quyu, Deng, Zhijie, He, Fengxiang, Cui, Peng, Zhu, Jun

Inhomogeneous Poisson process data defined on a continuous spatio-temporal domain has attracted immense attention recently in a wide variety of applications, including reliability analysis in manufacturing systems (Soleimani et al., 2017), event capture in sensing regions (Mutny and Krause, 2021), crime prediction in urban areas (Shirota and Gelfand, 2017) and disease diagnosis based on medical records (Lasko, 2014). The reliable training of an inhomogeneous Poisson process model critically relies on a large amount of data to avoid overfitting, especially when modeling high-dimensional point processes. However, one challenge is that the available training data is routinely sparse or even partially missing in specific applications. Take manufacturing failure and healthcare analysis as motivating examples: modern manufacturing machines are reliable and fail rarely, and individuals in good health do not visit the hospital very often. Missing-data problems also arise, e.g., event location capture is intermittent for sensing systems because of weather or other related barriers.

artificial intelligence, cox process, machine learning

2308.15364

Country: North America > Canada (0.67)

Genre: Research Report (0.50)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Modeling & Simulation (0.95)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)

arXiv.org Artificial Intelligence, Jul-8-2023

OmniForce: On Human-Centered, Large Model Empowered and Cloud-Edge Collaborative AutoML System

Xue, Chao, Liu, Wei, Xie, Shuai, Wang, Zhenfang, Li, Jiaxing, Peng, Xuyang, Ding, Liang, Zhao, Shanshan, Cao, Qiong, Yang, Yibo, He, Fengxiang, Cai, Bohua, Bian, Rongcheng, Zhao, Yiyan, Zheng, Heliang, Liu, Xiangyang, Liu, Dongkai, Liu, Daqing, Shen, Li, Li, Chang, Zhang, Shijin, Zhang, Yukang, Chen, Guanpu, Chen, Shixiang, Zhan, Yibing, Zhang, Jing, Wang, Chaoyue, Tao, Dacheng

Automated machine learning (AutoML) seeks to build ML models with minimal human effort. While considerable research has been conducted in the area of AutoML in general, aiming to take humans out of the loop when building artificial intelligence (AI) applications, scant literature has focused on how AutoML works well in open-environment scenarios such as the process of training and updating large models, industrial supply chains or the industrial metaverse, where people often face open-loop problems during the search process: they must continuously collect data, update data and models, satisfy the requirements of the development and deployment environment, support massive devices, modify evaluation metrics, etc. Addressing the open-environment issue with pure data-driven approaches requires considerable data, computing resources, and effort from dedicated data engineers, making current AutoML systems and platforms inefficient and computationally intractable. Human-computer interaction is a practical and feasible way to tackle the problem of open-environment AI. In this paper, we introduce OmniForce, a human-centered AutoML (HAML) system that yields both human-assisted ML and ML-assisted human techniques, to put an AutoML system into practice and build adaptive AI in open-environment scenarios. Specifically, we present OmniForce in terms of ML version management; pipeline-driven development and deployment collaborations; a flexible search strategy framework; and widely provisioned and crowdsourced application algorithms, including large models. Furthermore, the (large) models constructed by OmniForce can be automatically turned into remote services in a few minutes; this process is dubbed model as a service (MaaS). Experimental results obtained in multiple search spaces and real-world use cases demonstrate the efficacy and efficiency of OmniForce.

artificial intelligence, machine learning, natural language

2303.00501

Country: Asia (0.28)

Genre:

Workflow (0.68)
Research Report > Promising Solution (0.45)

Industry:

Transportation > Ground > Road (1.00)
Information Technology > Security & Privacy (1.00)
Automobiles & Trucks (0.92)

Technology:

Information Technology > Human Computer Interaction > Interfaces (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)

arXiv.org Artificial Intelligence, Jul-7-2023

$E(2)$-Equivariant Vision Transformer

Xu, Renjun, Yang, Kaifan, Liu, Ke, He, Fengxiang

Vision Transformer (ViT) has achieved remarkable performance in computer vision. However, positional encoding in ViT makes it substantially difficult to learn the intrinsic equivariance in data. Initial attempts have been made to design equivariant ViTs, but this paper proves them defective in some cases. To address this issue, we design a Group Equivariant Vision Transformer (GE-ViT) via a novel, effective positional encoding operator. We prove that GE-ViT meets all the theoretical requirements of an equivariant neural network. Comprehensive experiments are conducted on standard benchmark datasets, demonstrating that GE-ViT significantly outperforms non-equivariant self-attention networks. The code is available at https://github.com/ZJUCDSYangKaifan/GEVit.

artificial intelligence, machine learning, representation

2306.06722

Genre: Research Report (0.64)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Jun-6-2023

InFi: End-to-End Learning to Filter Input for Resource-Efficiency in Mobile-Centric Inference

Yuan, Mu, Zhang, Lan, He, Fengxiang, Tong, Xueting, Song, Miao-Hui, Xu, Zhengyuan, Li, Xiang-Yang

Mobile-centric AI applications have high requirements for resource-efficiency of model inference. Input filtering is a promising approach to eliminate redundancy and thereby reduce the cost of inference. Previous efforts have tailored effective solutions for many applications, but left two essential questions unanswered: (1) theoretical filterability of an inference workload to guide the application of input filtering techniques, thereby avoiding the trial-and-error cost for resource-constrained mobile applications; (2) robust discriminability of feature embedding to allow input filtering to be widely effective for diverse inference tasks and input content. To answer them, we first formalize the input filtering problem and theoretically compare the hypothesis complexity of inference models and input filters to understand the optimization potential. Then we propose the first end-to-end learnable input filtering framework that covers most state-of-the-art methods and surpasses them in feature embedding with robust discriminability. We design and implement InFi, which supports six input modalities and multiple mobile-centric deployments. Comprehensive evaluations confirm our theoretical results and show that InFi outperforms strong baselines in applicability, accuracy, and efficiency. InFi achieves 8.5x throughput and saves 95% bandwidth, while keeping over 90% accuracy, for a video analytics application on mobile platforms.

data mining, machine learning, workload

2209.13873

Country:

Asia > China (0.47)
North America > United States (0.46)

Genre: Research Report > Promising Solution (0.54)

Industry: Information Technology (0.93)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Mobile (1.00)