AITopics | Tan, Kay Chen

Collaborating Authors

Tan, Kay Chen

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

scBIT: Integrating Single-cell Transcriptomic Data into fMRI-based Prediction for Alzheimer's Disease Diagnosis

Huang, Yu-An, Hu, Yao, Li, Yue-Chao, Cao, Xiyue, Li, Xinyuan, Tan, Kay Chen, You, Zhu-Hong, Huang, Zhi-An

arXiv.org Artificial IntelligenceFeb-4-2025

Functional MRI (fMRI) and single-cell transcriptomics are pivotal in Alzheimer's disease (AD) research, each providing unique insights into neural function and molecular mechanisms. However, integrating these complementary modalities remains largely unexplored. Here, we introduce scBIT, a novel method for enhancing AD prediction by combining fMRI with single-nucleus RNA (snRNA). scBIT leverages snRNA as an auxiliary modality, significantly improving fMRI-based prediction models and providing comprehensive interpretability. It employs a sampling strategy to segment snRNA data into cell-type-specific gene networks and utilizes a self-explainable graph neural network to extract critical subgraphs. Additionally, we use demographic and genetic similarities to pair snRNA and fMRI data across individuals, enabling robust cross-modal learning. Extensive experiments validate scBIT's effectiveness in revealing intricate brain region-gene associations and enhancing diagnostic prediction accuracy. By advancing brain imaging transcriptomics to the single-cell level, scBIT sheds new light on biomarker discovery in AD research. Experimental results show that incorporating snRNA data into the scBIT model significantly boosts accuracy, improving binary classification by 3.39% and five-class classification by 26.59%. The codes were implemented in Python and have been released on GitHub (https://github.com/77YQ77/scBIT) and Zenodo (https://zenodo.org/records/11599030) with detailed instructions.

artificial intelligence, dataset, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2502.0263

Country: Asia > China (0.68)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Alzheimer's Disease (1.00)
Health & Medicine > Health Care Technology (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

ParetoLens: A Visual Analytics Framework for Exploring Solution Sets of Multi-objective Evolutionary Algorithms

Ma, Yuxin, Zhang, Zherui, Cheng, Ran, Jin, Yaochu, Tan, Kay Chen

arXiv.org Artificial IntelligenceJan-6-2025

In the domain of multi-objective optimization, evolutionary algorithms are distinguished by their capability to generate a diverse population of solutions that navigate the trade-offs inherent among competing objectives. This has catalyzed the ascension of evolutionary multi-objective optimization (EMO) as a prevalent approach. Despite the effectiveness of the EMO paradigm, the analysis of resultant solution sets presents considerable challenges. This is primarily attributed to the high-dimensional nature of the data and the constraints imposed by static visualization methods, which frequently culminate in visual clutter and impede interactive exploratory analysis. To address these challenges, this paper introduces ParetoLens, a visual analytics framework specifically tailored to enhance the inspection and exploration of solution sets derived from the multi-objective evolutionary algorithms. Utilizing a modularized, algorithm-agnostic design, ParetoLens enables a detailed inspection of solution distributions in both decision and objective spaces through a suite of interactive visual representations. This approach not only mitigates the issues associated with static visualizations but also supports a more nuanced and flexible analysis process. The usability of the framework is evaluated through case studies and expert interviews, demonstrating its potential to uncover complex patterns and facilitate a deeper understanding of multi-objective optimization solution sets. A demo website of ParetoLens is available at https://dva-lab.org/paretolens/.

artificial intelligence, evolutionary algorithm, machine learning, (16 more...)

arXiv.org Artificial Intelligence

2501.02857

Genre:

Personal > Interview (0.66)
Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Toward Automated Algorithm Design: A Survey and Practical Guide to Meta-Black-Box-Optimization

Ma, Zeyuan, Guo, Hongshu, Gong, Yue-Jiao, Zhang, Jun, Tan, Kay Chen

arXiv.org Artificial IntelligenceNov-16-2024

In this survey, we introduce Meta-Black-Box-Optimization~(MetaBBO) as an emerging avenue within the Evolutionary Computation~(EC) community, which incorporates Meta-learning approaches to assist automated algorithm design. Despite the success of MetaBBO, the current literature provides insufficient summaries of its key aspects and lacks practical guidance for implementation. To bridge this gap, we offer a comprehensive review of recent advances in MetaBBO, providing an in-depth examination of its key developments. We begin with a unified definition of the MetaBBO paradigm, followed by a systematic taxonomy of various algorithm design tasks, including algorithm selection, algorithm configuration, solution manipulation, and algorithm generation. Further, we conceptually summarize different learning methodologies behind current MetaBBO works, including reinforcement learning, supervised learning, neuroevolution, and in-context learning with Large Language Models. A comprehensive evaluation of the latest representative MetaBBO methods is then carried out, alongside an experimental analysis of their optimization performance, computational efficiency, and generalization ability. Based on the evaluation results, we meticulously identify a set of core designs that enhance the generalization and learning effectiveness of MetaBBO. Finally, we outline the vision for the field by providing insight into the latest trends and potential future directions. Relevant literature will be continuously collected and updated at \url{https://github.com/GMC-DRL/Awesome-MetaBBO}.

evolutionary algorithm, large language model, machine learning, (21 more...)

arXiv.org Artificial Intelligence

2411.00625

Country: Asia > China (0.46)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Energy (1.00)
Transportation > Air (0.61)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Add feedback

HM3: Hierarchical Multi-Objective Model Merging for Pretrained Models

Zhou, Yu, Wu, Xingyu, Wu, Jibin, Feng, Liang, Tan, Kay Chen

arXiv.org Artificial IntelligenceSep-27-2024

Model merging is a technique that combines multiple large pretrained models into a single model with enhanced performance and broader task adaptability. It has gained popularity in large pretrained model development due to its ability to bypass the need for original training data and further training processes. However, most existing model merging approaches focus solely on exploring the parameter space, merging models with identical architectures. Merging within the architecture space, despite its potential, remains in its early stages due to the vast search space and the challenges of layer compatibility. This paper marks a significant advance toward more flexible and comprehensive model merging techniques by modeling the architecture-space merging process as a reinforcement learning task. We train policy and value networks using offline sampling of weight vectors, which are then employed for the online optimization of merging strategies. Moreover, a multi-objective optimization paradigm is introduced to accommodate users' diverse task preferences, learning the Pareto front of optimal models to offer customized merging suggestions. Experimental results across multiple tasks, including text translation, mathematical reasoning, and code generation, validate the effectiveness and superiority of the proposed framework in model merging. The code will be made publicly available after the review process.

machine learning, natural language, reinforcement learning, (18 more...)

arXiv.org Artificial Intelligence

2409.18893

Country: Asia > China (0.28)

Genre: Research Report (0.81)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.88)

Add feedback

Unlock the Power of Algorithm Features: A Generalization Analysis for Algorithm Selection

Wu, Xingyu, Zhong, Yan, Wu, Jibin, Huang, Yuxiao, Wu, Sheng-hao, Tan, Kay Chen

arXiv.org Artificial IntelligenceJun-3-2024

In the algorithm selection research, the discussion surrounding algorithm features has been significantly overshadowed by the emphasis on problem features. Although a few empirical studies have yielded evidence regarding the effectiveness of algorithm features, the potential benefits of incorporating algorithm features into algorithm selection models and their suitability for different scenarios remain unclear. In this paper, we address this gap by proposing the first provable guarantee for algorithm selection based on algorithm features, taking a generalization perspective. We analyze the benefits and costs associated with algorithm features and investigate how the generalization error is affected by different factors. Specifically, we examine adaptive and predefined algorithm features under transductive and inductive learning paradigms, respectively, and derive upper bounds for the generalization error based on their model's Rademacher complexity. Our theoretical findings not only provide tight upper bounds, but also offer analytical insights into the impact of various factors, such as the training scale of problem instances and candidate algorithms, model parameters, feature values, and distributional differences between the training and test data. Notably, we demonstrate how models will benefit from algorithm features in complex scenarios involving many algorithms, and proves the positive correlation between generalization error bound and $\chi^2$-divergence of distributions.

artificial intelligence, data mining, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2405.11349

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Explainable Molecular Property Prediction: Aligning Chemical Concepts with Predictions via Language Models

Wang, Zhenzhong, Lin, Zehui, Lin, Wanyu, Yang, Ming, Zeng, Minggang, Tan, Kay Chen

arXiv.org Artificial IntelligenceMay-31-2024

Providing explainable molecule property predictions is critical for many scientific domains, such as drug discovery and material science. Though transformer-based language models have shown great potential in accurate molecular property prediction, they neither provide chemically meaningful explanations nor faithfully reveal the molecular structure-property relationships. In this work, we develop a new framework for explainable molecular property prediction based on language models, dubbed as Lamole, which can provide chemical concepts-aligned explanations. We first leverage a designated molecular representation -- the Group SELFIES -- as it can provide chemically meaningful semantics. Because attention mechanisms in Transformers can inherently capture relationships within the input, we further incorporate the attention weights and gradients together to generate explanations for capturing the functional group interactions. We then carefully craft a marginal loss to explicitly optimize the explanations to be able to align with the chemists' annotations. We bridge the manifold hypothesis with the elaborated marginal loss to prove that the loss can align the explanations with the tangent space of the data manifold, leading to concept-aligned explanations. Experimental results over six mutagenicity datasets and one hepatotoxicity dataset demonstrate Lamole can achieve comparable classification accuracy and boost the explanation accuracy by up to 14.8%, being the state-of-the-art in explainable molecular property prediction.

explanation, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2405.16041

Genre: Research Report (1.00)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (0.87)
Health & Medicine > Therapeutic Area > Toxicology (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Fast 3D Molecule Generation via Unified Geometric Optimal Transport

Hong, Haokai, Lin, Wanyu, Tan, Kay Chen

arXiv.org Artificial IntelligenceMay-24-2024

This paper proposes a new 3D molecule generation framework, called GOAT, for fast and effective 3D molecule generation based on the flow-matching optimal transport objective. Specifically, we formulate a geometric transport formula for measuring the cost of mapping multi-modal features (e.g., continuous atom coordinates and categorical atom types) between a base distribution and a target data distribution. Our formula is solved within a unified, equivalent, and smooth representation space. This is achieved by transforming the multi-modal features into a continuous latent space with equivalent networks. In addition, we find that identifying optimal distributional coupling is necessary for fast and effective transport between any two distributions. We further propose a flow refinement and purification mechanism for optimal coupling identification. By doing so, GOAT can turn arbitrary distribution couplings into new deterministic couplings, leading to a unified optimal transport path for fast 3D molecule generation. The purification filters the subpar molecules to ensure the ultimate generation performance. We theoretically prove the proposed method indeed reduced the transport cost. Finally, extensive experiments show that GOAT enjoys the efficiency of solving geometric optimal transport, leading to a double speedup compared to the sub-optimal method while achieving the best generation quality regarding validity, uniqueness, and novelty.

artificial intelligence, machine learning, transport cost, (15 more...)

arXiv.org Artificial Intelligence

2405.15252

Country: North America > United States (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.94)

Add feedback

Multi-View Subgraph Neural Networks: Self-Supervised Learning with Scarce Labeled Data

Wang, Zhenzhong, Zeng, Qingyuan, Lin, Wanyu, Jiang, Min, Tan, Kay Chen

arXiv.org Artificial IntelligenceApr-18-2024

While graph neural networks (GNNs) have become the de-facto standard for graph-based node classification, they impose a strong assumption on the availability of sufficient labeled samples. This assumption restricts the classification performance of prevailing GNNs on many real-world applications suffering from low-data regimes. Specifically, features extracted from scarce labeled nodes could not provide sufficient supervision for the unlabeled samples, leading to severe over-fitting. In this work, we point out that leveraging subgraphs to capture long-range dependencies can augment the representation of a node with homophily properties, thus alleviating the low-data regime. However, prior works leveraging subgraphs fail to capture the long-range dependencies among nodes. To this end, we present a novel self-supervised learning framework, called multi-view subgraph neural networks (Muse), for handling long-range dependencies. In particular, we propose an information theory-based identification mechanism to identify two types of subgraphs from the views of input space and latent space, respectively. The former is to capture the local structure of the graph, while the latter captures the long-range dependencies among nodes. By fusing these two views of subgraphs, the learned representations can preserve the topological properties of the graph at large, including the local structure and long-range dependencies, thus maximizing their expressiveness for downstream node classification tasks. Experimental results show that Muse outperforms the alternative methods on node classification tasks with limited labeled data.

artificial intelligence, machine learning, node, (14 more...)

arXiv.org Artificial Intelligence

2404.12569

Country: Asia > China (0.28)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

CausalBench: A Comprehensive Benchmark for Causal Learning Capability of Large Language Models

Zhou, Yu, Wu, Xingyu, Huang, Beicheng, Wu, Jibin, Feng, Liang, Tan, Kay Chen

arXiv.org Artificial IntelligenceApr-9-2024

Causality reveals fundamental principles behind data distributions in real-world scenarios, and the capability of large language models (LLMs) to understand causality directly impacts their efficacy across explaining outputs, adapting to new evidence, and generating counterfactuals. With the proliferation of LLMs, the evaluation of this capacity is increasingly garnering attention. However, the absence of a comprehensive benchmark has rendered existing evaluation studies being straightforward, undiversified, and homogeneous. To address these challenges, this paper proposes a comprehensive benchmark, namely CausalBench, to evaluate the causality understanding capabilities of LLMs. Originating from the causal research community, CausalBench encompasses three causal learning-related tasks, which facilitate a convenient comparison of LLMs' performance with classic causal learning algorithms. Meanwhile, causal networks of varying scales and densities are integrated in CausalBench, to explore the upper limits of LLMs' capabilities across task scenarios of varying difficulty. Notably, background knowledge and structured data are also incorporated into CausalBench to thoroughly unlock the underlying potential of LLMs for long-text comprehension and prior information utilization. Based on CausalBench, this paper evaluates nineteen leading LLMs and unveils insightful conclusions in diverse aspects. Firstly, we present the strengths and weaknesses of LLMs and quantitatively explore the upper limits of their capabilities across various scenarios. Meanwhile, we further discern the adaptability and abilities of LLMs to specific structural networks and complex chain of thought structures. Moreover, this paper quantitatively presents the differences across diverse information sources and uncovers the gap between LLMs' capabilities in causal understanding within textual contexts and numerical domains.

large language model, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2404.06349

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Add feedback

Diffusion-Driven Domain Adaptation for Generating 3D Molecules

Hong, Haokai, Lin, Wanyu, Tan, Kay Chen

arXiv.org Artificial IntelligenceApr-1-2024

Can we train a molecule generator that can generate 3D molecules from a new domain, circumventing the need to collect data? This problem can be cast as the problem of domain adaptive molecule generation. This work presents a novel and principled diffusion-based approach, called GADM, that allows shifting a generative model to desired new domains without the need to collect even a single molecule. As the domain shift is typically caused by the structure variations of molecules, e.g., scaffold variations, we leverage a designated equivariant masked autoencoder (MAE) along with various masking strategies to capture the structural-grained representations of the in-domain varieties. In particular, with an asymmetric encoder-decoder module, the MAE can generalize to unseen structure variations from the target domains. These structure variations are encoded with an equivariant encoder and treated as domain supervisors to control denoising. We show that, with these encoded structural-grained domain supervisors, GADM can generate effective molecules within the desired new domains. We conduct extensive experiments across various domain adaptation tasks over benchmarking datasets. We show that our approach can improve up to 65.6% in terms of success rate defined based on molecular validity, uniqueness, and novelty compared to alternative baselines.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2404.00962

Country:

Europe > France (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.89)
Information Technology > Artificial Intelligence > Natural Language (0.88)

Add feedback