Chen, Yanzhi
On Evaluating LLMs' Capabilities as Functional Approximators: A Bayesian Perspective
Siddiqui, Shoaib Ahmed, Chen, Yanzhi, Heo, Juyeon, Xia, Menglin, Weller, Adrian
Recent works have successfully applied Large Language Models (LLMs) to function modeling tasks. However, the reasons behind this success remain unclear. In this work, we propose a new evaluation framework to comprehensively assess LLMs' function modeling abilities. By adopting a Bayesian perspective of function modeling, we discover that LLMs are relatively weak in understanding patterns in raw data, but excel at utilizing prior knowledge about the domain to develop a strong understanding of the underlying function. Our findings offer new insights about the strengths and limitations of LLMs in the context of function modeling.
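To make the kind of evaluation described above concrete, here is a hedged sketch (not the paper's actual framework) of how one might probe an LLM's function modeling under two conditions: raw (x, y) pairs alone, versus the same pairs accompanied by a domain description, so that understanding of patterns in raw data and use of prior knowledge can be compared. The prompt format, the example function, and the commented-out query_llm hook are illustrative assumptions.

```python
# Hypothetical probe: compare an LLM's prediction from raw (x, y) pairs alone
# vs. the same pairs plus a domain hint. Everything here is illustrative.
import numpy as np

rng = np.random.default_rng(0)

def true_function(x):
    # Example ground-truth function: exponential decay (e.g., drug concentration over time).
    return 10.0 * np.exp(-0.5 * x)

x_obs = np.sort(rng.uniform(0, 8, size=10))
y_obs = true_function(x_obs) + rng.normal(0, 0.1, size=10)
x_query = 5.5

pairs = "\n".join(f"x={x:.2f}, y={y:.2f}" for x, y in zip(x_obs, y_obs))

# Condition A: raw data only -- tests pattern understanding from observations.
prompt_raw = (
    "Below are noisy observations of an unknown function.\n"
    f"{pairs}\n"
    f"Predict y at x={x_query:.2f}. Answer with a single number."
)

# Condition B: same data plus domain context -- tests use of prior knowledge.
prompt_prior = (
    "Below are noisy measurements of a drug's blood concentration (mg/L) over "
    "time (hours); such curves typically decay exponentially.\n"
    f"{pairs}\n"
    f"Predict the concentration at t={x_query:.2f} hours. Answer with a single number."
)

for name, prompt in [("raw-data", prompt_raw), ("with-prior", prompt_prior)]:
    print(f"--- {name} prompt ---\n{prompt}\n")
    # response = query_llm(prompt)  # plug in any LLM client here (hypothetical hook)
```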
Scalable Infomin Learning
Chen, Yanzhi, Sun, Weihao, Li, Yingzhen, Weller, Adrian
The task of infomin learning aims to learn a representation with high utility while being uninformative about a specified target, with the latter achieved by minimising the mutual information between the representation and the target. It has broad applications, ranging from training fair prediction models against protected attributes to unsupervised learning with disentangled representations. Recent works on infomin learning mainly rely on adversarial training, which involves training a neural network to estimate mutual information or a proxy thereof, and is therefore slow and difficult to optimise. Drawing on recent advances in slicing techniques, we propose a new infomin learning approach that uses a novel proxy metric for mutual information. We further derive an accurate, analytically computable approximation to this proxy metric, thereby removing the need to construct neural network-based mutual information estimators. Experiments on algorithmic fairness, disentangled representation learning and domain adaptation verify that our method can effectively remove unwanted information within a limited time budget.
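As a rough illustration of the slicing idea (not the paper's exact proxy metric), the sketch below penalises the correlation between random one-dimensional projections of the representation and of the target, giving a cheap, analytically computable surrogate in place of a neural mutual information estimator. The network sizes and the particular penalty form are assumptions.

```python
# Minimal PyTorch sketch of a slicing-style infomin penalty: project the
# representation and the target onto random 1-D directions and penalise their
# correlation. Illustrates replacing a neural MI estimator with an analytically
# computable proxy; this is NOT the paper's exact proxy metric.
import torch

def sliced_correlation_penalty(z, t, n_slices=50):
    """Mean squared Pearson correlation between random 1-D slices of z and t."""
    dz = torch.randn(z.shape[1], n_slices, device=z.device)
    dt = torch.randn(t.shape[1], n_slices, device=t.device)
    zs = z @ dz                            # (batch, n_slices) sliced representation
    ts = t @ dt                            # (batch, n_slices) sliced target
    zs = (zs - zs.mean(0)) / (zs.std(0) + 1e-8)
    ts = (ts - ts.mean(0)) / (ts.std(0) + 1e-8)
    corr = (zs * ts).mean(0)               # per-slice Pearson correlation
    return (corr ** 2).mean()

# Usage: add the penalty to a task loss so the encoder stays useful for the task
# while becoming uninformative about the protected target t.
encoder = torch.nn.Sequential(torch.nn.Linear(16, 32), torch.nn.ReLU(), torch.nn.Linear(32, 8))
x = torch.randn(128, 16)
t = torch.randn(128, 2)                    # e.g., protected attributes
penalty = sliced_correlation_penalty(encoder(x), t)
print(float(penalty))
```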
Do Concept Bottleneck Models Learn as Intended?
Margeloiu, Andrei, Ashman, Matthew, Bhatt, Umang, Chen, Yanzhi, Jamnik, Mateja, Weller, Adrian
Concept bottleneck models map from raw inputs to concepts, and then from concepts to targets. Such models aim to incorporate pre-specified, high-level concepts into the learning procedure, and have been motivated to meet three desiderata: interpretability, predictability, and intervenability. However, we find that concept bottleneck models struggle to meet these goals. Using post hoc interpretability methods, we demonstrate that concepts do not correspond to anything semantically meaningful in input space, thus calling into question the usefulness of concept bottleneck models in their current form.
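For concreteness, here is a minimal sketch of the input-to-concepts-to-target structure described above; the architecture, loss weighting, and tensor shapes are illustrative assumptions rather than any specific published implementation.

```python
# Minimal PyTorch sketch of a concept bottleneck model: inputs are mapped to a
# vector of pre-specified concepts, and the target is predicted from the
# concepts alone. Shapes and the joint training loss are illustrative.
import torch
import torch.nn as nn

class ConceptBottleneckModel(nn.Module):
    def __init__(self, n_features, n_concepts, n_classes):
        super().__init__()
        self.input_to_concepts = nn.Sequential(
            nn.Linear(n_features, 64), nn.ReLU(), nn.Linear(64, n_concepts)
        )
        self.concepts_to_target = nn.Linear(n_concepts, n_classes)

    def forward(self, x):
        concept_logits = self.input_to_concepts(x)       # predicted concepts
        y_logits = self.concepts_to_target(torch.sigmoid(concept_logits))
        return concept_logits, y_logits

model = ConceptBottleneckModel(n_features=32, n_concepts=10, n_classes=3)
x = torch.randn(8, 32)
c_true = torch.randint(0, 2, (8, 10)).float()            # annotated concepts
y_true = torch.randint(0, 3, (8,))
c_logits, y_logits = model(x)
# Joint objective: concept prediction loss + target prediction loss.
loss = (nn.functional.binary_cross_entropy_with_logits(c_logits, c_true)
        + nn.functional.cross_entropy(y_logits, y_true))
loss.backward()
# Intervenability: at test time, predicted concepts can be replaced by
# expert-corrected values before the concepts-to-target head is applied.
```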
Neural Approximate Sufficient Statistics for Implicit Models
Chen, Yanzhi, Zhang, Dinghuai, Gutmann, Michael, Courville, Aaron, Zhu, Zhanxing
We consider the fundamental problem of how to automatically construct summary statistics for implicit generative models, where evaluating the likelihood function is intractable but sampling or simulating data from the model is possible. The idea is to frame the construction of sufficient statistics as learning a mutual-information-maximizing representation of the data. This representation is computed by a deep neural network trained with a joint statistic-posterior learning strategy. We apply our approach to both traditional approximate Bayesian computation (ABC) and recent neural likelihood approaches, boosting their performance on a range of tasks.
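A hedged sketch of the general recipe follows; the toy simulator, network sizes, and the InfoNCE-style contrastive objective are illustrative assumptions, not the paper's exact joint statistic-posterior strategy. A statistic network is trained to maximise a lower bound on the mutual information between parameters and simulated data, and rejection ABC then uses distances computed on the learned statistics.

```python
# Illustrative sketch: learn a summary-statistic network S(x) with a contrastive
# (InfoNCE-style) MI lower bound between parameters and simulated data, then run
# rejection ABC with distances in statistic space. Details are assumptions.
import torch
import torch.nn as nn

def simulator(theta, n_obs=20):
    # Toy implicit model: Gaussian with unknown mean and log-std, observed only
    # through raw samples (no tractable likelihood used anywhere below).
    mean, log_std = theta[:, :1], theta[:, 1:]
    return mean + log_std.exp() * torch.randn(theta.shape[0], n_obs)

stat_net = nn.Sequential(nn.Linear(20, 64), nn.ReLU(), nn.Linear(64, 2))
theta_net = nn.Sequential(nn.Linear(2, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(list(stat_net.parameters()) + list(theta_net.parameters()), lr=1e-3)

for _ in range(500):
    theta = torch.rand(256, 2) * 2 - 1                    # prior draws
    x = simulator(theta)
    logits = stat_net(x) @ theta_net(theta).T             # matched vs. mismatched pairs
    labels = torch.arange(256)
    loss = nn.functional.cross_entropy(logits, labels)    # InfoNCE-style objective
    opt.zero_grad()
    loss.backward()
    opt.step()

# Rejection ABC with the learned statistics: accept prior draws whose simulated
# statistics fall close to the statistics of the observed data.
x_obs = simulator(torch.tensor([[0.5, -0.3]]))
with torch.no_grad():
    theta = torch.rand(10000, 2) * 2 - 1
    dist = (stat_net(simulator(theta)) - stat_net(x_obs)).norm(dim=1)
    posterior_samples = theta[dist < dist.quantile(0.01)]
print(posterior_samples.mean(0))
```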