AITopics | Ge, Hong

Collaborating Authors

Ge, Hong

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Multimodal Human-AI Synergy for Medical Imaging Quality Control: A Hybrid Intelligence Framework with Adaptive Dataset Curation and Closed-Loop Evaluation

Qin, Zhi, Gui, Qianhui, Bian, Mouxiao, Wang, Rui, Ge, Hong, Yao, Dandan, Sun, Ziying, Zhao, Yuan, Zhang, Yu, Shi, Hui, Wang, Dongdong, Song, Chenxin, Ju, Shenghong, Liu, Lihao, He, Junjun, Xu, Jie, Wang, Yuan-Cheng

arXiv.org Artificial IntelligenceMar-10-2025

Medical imaging quality control (QC) is essential for accurate diagnosis, yet traditional QC methods remain labor-intensive and subjective. To address this challenge, in this study, we establish a standardized dataset and evaluation framework for medical imaging QC, systematically assessing large language models (LLMs) in image quality assessment and report standardization. Specifically, we first constructed and anonymized a dataset of 161 chest X-ray (CXR) radiographs and 219 CT reports for evaluation. Then, multiple LLMs, including Gemini 2.0-Flash, GPT-4o, and DeepSeek-R1, were evaluated based on recall, precision, and F1 score to detect technical errors and inconsistencies. Experimental results show that Gemini 2.0-Flash achieved a Macro F1 score of 90 in CXR tasks, demonstrating strong generalization but limited fine-grained performance. DeepSeek-R1 excelled in CT report auditing with a 62.23\% recall rate, outperforming other models. However, its distilled variants performed poorly, while InternLM2.5-7B-chat exhibited the highest additional discovery rate, indicating broader but less precise error detection. These findings highlight the potential of LLMs in medical imaging QC, with DeepSeek-R1 and Gemini 2.0-Flash demonstrating superior performance.

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2503.07032

Country:

North America > United States (0.68)
Asia > China > Jiangsu Province (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Government > Regional Government > North America Government > United States Government (0.46)
Energy > Renewable > Geothermal > Geothermal Energy Systems and Facilities > Geothermal System for Power Generation > Advanced Geothermal System (AGS) (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Numerically Stable Sparse Gaussian Processes via Minimum Separation using Cover Trees

Terenin, Alexander, Burt, David R., Artemev, Artem, Flaxman, Seth, van der Wilk, Mark, Rasmussen, Carl Edward, Ge, Hong

arXiv.org Machine LearningNov-6-2023

Gaussian processes are frequently deployed as part of larger machine learning and decision-making systems, for instance in geospatial modeling, Bayesian optimization, or in latent Gaussian models. Within a system, the Gaussian process model needs to perform in a stable and reliable manner to ensure it interacts correctly with other parts of the system. In this work, we study the numerical stability of scalable sparse approximations based on inducing points. To do so, we first review numerical stability, and illustrate typical situations in which Gaussian process models can be unstable. Building on stability theory originally developed in the interpolation literature, we derive sufficient and in certain cases necessary conditions on the inducing points for the computations performed to be numerically stable. For low-dimensional tasks such as geospatial modeling, we propose an automated method for computing inducing points satisfying these conditions. This is done via a modification of the cover tree data structure, which is of independent interest. We additionally propose an alternative sparse approximation for regression with a Gaussian likelihood which trades off a small amount of performance to further improve stability. We provide illustrative examples showing the relationship between stability of calculations and predictive performance of inducing point methods on spatial tasks.

artificial intelligence, machine learning, modeling & simulation, (20 more...)

arXiv.org Machine Learning

2210.07893

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Data Science (0.92)
(2 more...)

Add feedback

Understanding Sparse Feature Updates in Deep Networks using Iterative Linearisation

Goldwaser, Adrian, Ge, Hong

arXiv.org Machine LearningOct-12-2023

Larger and deeper networks generalise well despite their increased capacity to overfit. Understanding why this happens is theoretically and practically important. One recent approach looks at the infinitely wide limits of such networks and their corresponding kernels. However, these theoretical tools cannot fully explain finite networks as the empirical kernel changes significantly during gradient-descent-based training in contrast to infinite networks. In this work, we derive an iterative linearised training method as a novel empirical tool to further investigate this distinction, allowing us to control for sparse (i.e. infrequent) feature updates and quantify the frequency of feature learning needed to achieve comparable performance. We justify iterative linearisation as an interpolation between a finite analog of the infinite width regime, which does not learn features, and standard gradient descent training, which does. Informally, we also show that it is analogous to a damped version of the Gauss-Newton algorithm -- a second-order method. We show that in a variety of cases, iterative linearised training surprisingly performs on par with standard training, noting in particular how much less frequent feature learning is required to achieve comparable performance. We also show that feature learning is essential for good performance. Since such feature learning inevitably causes changes in the NTK kernel, we provide direct negative evidence for the NTK theory, which states the NTK kernel remains constant during training.

artificial intelligence, iterative linearisation, machine learning, (16 more...)

arXiv.org Machine Learning

2211.12345

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
North America > Canada > Quebec (0.14)
Asia > Middle East > Israel (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.56)

Add feedback

Neural Characteristic Activation Value Analysis for Improved ReLU Network Feature Learning

Chen, Wenlin, Ge, Hong

arXiv.org Machine LearningSep-29-2023

This work examines the characteristic activation values of individual ReLU units in neural networks. We refer to the set of input locations corresponding to such characteristic activation values as the characteristic activation set of a ReLU unit. We draw an explicit connection between the characteristic activation set and learned features in ReLU networks. This connection leads to new insights into how various neural network normalization techniques used in modern deep learning architectures regularize and stabilize stochastic gradient optimization. Utilizing these insights, we propose geometric parameterization for ReLU networks to improve feature learning, which decouples the radial and angular parameters in the hyperspherical coordinate system. We empirically verify its usefulness with less carefully chosen initialization schemes and larger learning rates. We report significant improvements in optimization stability, convergence speed, and generalization performance for various models on a variety of datasets, including the ResNet-50 network on ImageNet.

artificial intelligence, characteristic activation boundary, machine learning, (12 more...)

arXiv.org Machine Learning

2305.15912

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre: Research Report (0.64)

Industry:

Telecommunications > Networks (0.40)
Information Technology > Networks (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback

Beyond Intuition, a Framework for Applying GPs to Real-World Data

Tazi, Kenza, Lin, Jihao Andreas, Viljoen, Ross, Gardner, Alex, John, ST, Ge, Hong, Turner, Richard E.

arXiv.org Artificial IntelligenceJul-17-2023

Gaussian Processes (GPs) offer an attractive method for regression over small, structured and correlated datasets. However, their deployment is hindered by computational costs and limited guidelines on how to apply GPs beyond simple low-dimensional datasets. We propose a framework to identify the suitability of GPs to a given problem and how to set up a robust and well-specified GP model. The guidelines formalise the decisions of experienced GP practitioners, with an emphasis on kernel design and options for computational scalability. The framework is then applied to a case study of glacier elevation change yielding more accurate results at test time.

artificial intelligence, data mining, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2307.03093

Country:

North America > United States (0.68)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.28)

Genre:

Workflow (0.52)
Research Report (0.50)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)
(2 more...)

Add feedback

Bayesian inference and neural estimation of acoustic wave propagation

Huang, Yongchao, He, Yuhang, Ge, Hong

arXiv.org Artificial IntelligenceMay-28-2023

In this work, we introduce a novel framework which combines physics and machine learning methods to analyse acoustic signals. Three methods are developed for this task: a Bayesian inference approach for inferring the spectral acoustics characteristics, a neural-physical model which equips a neural network with forward and backward physical losses, and the non-linear least squares approach which serves as benchmark. The inferred propagation coefficient leads to the room impulse response (RIR) quantity which can be used for relocalisation with uncertainty. The simplicity and efficiency of this framework is empirically validated on simulated data.

artificial intelligence, bayesian inference, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2305.17749

Country: Europe > United Kingdom > England (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Bayesian Learning of Sum-Product Networks

Trapp, Martin, Peharz, Robert, Ge, Hong, Pernkopf, Franz, Ghahramani, Zoubin

arXiv.org Machine LearningMay-26-2019

Sum-product networks (SPNs) are flexible density estimators and have received significant attention, due to their attractive inference properties. While parameter learning in SPNs is well developed, structure learning leaves something to be desired: Even though there is a plethora of SPN structure learners, most of them are somewhat ad-hoc, and based on intuition rather than a clear learning principle. In this paper, we introduce a well-principled Bayesian framework for SPN structure learning. First, we decompose the problem into i) laying out a basic computational graph, and ii) learning the so-called scope function over the graph. The first is rather unproblematic and akin to neural network architecture validation. The second characterises the effective structure of the SPN and needs to respect the usual structural constraints in SPN, i.e. completeness and decomposability. While representing and learning the scope function is rather involved in general, in this paper, we propose a natural parametrisation for an important and widely used special case of SPNs. These structural parameters are incorporated into a Bayesian model, such that simultaneous structure and parameter learning is cast into monolithic Bayesian posterior inference. In various experiments, our Bayesian SPNs often improve test likelihoods over greedy SPN learners. Further, since the Bayesian framework protects against overfitting, we are able to evaluate hyper-parameters directly on the Bayesian model score, waiving the need for a separate validation set, which is especially beneficial in low data regimes. Bayesian SPNs can be applied to heterogeneous domains and can easily be extended to nonparametric formulations. Moreover, our Bayesian approach is the first which consistently and robustly learns SPN structures under missing data.

bayesian inference, health & medicine, sum-product network, (20 more...)

arXiv.org Machine Learning

1905.10884

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre: Research Report > Experimental Study (0.94)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Particle Gibbs for Infinite Hidden Markov Models

Tripuraneni, Nilesh, Gu, Shixiang (Shane), Ge, Hong, Ghahramani, Zoubin

Neural Information Processing SystemsDec-31-2015

Infinite Hidden Markov Models (iHMM's) are an attractive, nonparametric generalization of the classical Hidden Markov Model which can automatically infer the number of hidden states in the system. However, due to the infinite-dimensional nature of the transition dynamics, performing inference in the iHMM is difficult. In this paper, we present an infinite-state Particle Gibbs (PG) algorithm to resample state trajectories for the iHMM. The proposed algorithm uses an efficient proposal optimized for iHMMs, and leverages ancestor sampling to improve the mixing of the standard PG algorithm. Our algorithm demonstrates significant convergence improvements on synthetic and real world data sets.

artificial intelligence, machine learning, sampler, (15 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Dirichlet Fragmentation Processes

Ge, Hong, Gal, Yarin, Ghahramani, Zoubin

arXiv.org Machine LearningSep-15-2015

Tree structures are ubiquitous in data across many domains, and many datasets are naturally modelled by unobserved tree structures. In this paper, first we review the theory of random fragmentation processes [Bertoin, 2006], and a number of existing methods for modelling trees, including the popular nested Chinese restaurant process (nCRP). Then we define a general class of probability distributions over trees: the Dirichlet fragmentation process (DFP) through a novel combination of the theory of Dirichlet processes and random fragmentation processes. This DFP presents a stick-breaking construction, and relates to the nCRP in the same way the Dirichlet process relates to the Chinese restaurant process. Furthermore, we develop a novel hierarchical mixture model with the DFP, and empirically compare the new model to similar models in machine learning. Experiments show the DFP mixture model to be convincingly better than existing state-of-the-art approaches for hierarchical clustering and density modelling.

artificial intelligence, fragmentation process, health & medicine, (18 more...)

arXiv.org Machine Learning

1509.04781

Country: Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)

Genre:

Overview (0.54)
Research Report > Promising Solution (0.34)

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.51)

Add feedback

A Linear-Time Particle Gibbs Sampler for Infinite Hidden Markov Models

Tripuraneni, Nilesh, Gu, Shane, Ge, Hong, Ghahramani, Zoubin

arXiv.org Machine LearningJun-9-2015

Infinite Hidden Markov Models (iHMM's) are an attractive, nonparametric generalization of the classical Hidden Markov Model which can automatically infer the number of hidden states in the system. However, due to the infinite-dimensional nature of transition dynamics performing inference in the iHMM is difficult. In this paper, we present an infinite-state Particle Gibbs (PG) algorithm to resample state trajectories for the iHMM. The proposed algorithm uses an efficient proposal optimized for iHMMs and leverages ancestor sampling to suppress degeneracy of the standard PG algorithm. Our algorithm demonstrates significant convergence improvements on synthetic and real world data sets. Additionally, the infinite-state PG algorithm has linear-time complexity in the number of states in the sampler, while competing methods scale quadratically.

artificial intelligence, machine learning, sampler, (17 more...)

arXiv.org Machine Learning

1505.00428

Country: Europe > United Kingdom (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback