AITopics | Xiong, Hao

Xiong, Hao

NesTools: A Dataset for Evaluating Nested Tool Learning Abilities of Large Language Models

Han, Han, Zhu, Tong, Zhang, Xiang, Wu, Mengsong, Xiong, Hao, Chen, Wenliang

arXiv.org Artificial Intelligence | Jan-7-2025

Large language models (LLMs) combined with tool learning have achieved impressive results in real-world applications. During tool learning, LLMs may call multiple tools in nested order, where a later tool call may take the response of an earlier call as its input parameters. However, nested tool learning capabilities remain under-explored, since existing benchmarks lack relevant data instances. To address this problem, we introduce NesTools to bridge the current gap in comprehensive nested tool learning evaluations. NesTools comprises a novel automatic data generation method to construct large-scale nested tool calls with different nesting structures. With manual review and refinement, the dataset is of high quality and closely aligned with real-world scenarios. Therefore, NesTools can serve as a new benchmark to evaluate the nested tool learning abilities of LLMs. We conduct extensive experiments on 22 LLMs and provide in-depth analyses with NesTools, which show that current LLMs still struggle with the complex nested tool learning task.
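To make the idea of a nested call concrete, here is a minimal sketch in which one tool call takes an earlier call's response as its argument. The tool names, the `$<call_id>` reference syntax, and the `run_calls` helper are hypothetical illustrations and are not taken from the NesTools dataset or its data format.

```python
from typing import Any, Callable, Dict, List

# Hypothetical tools for illustration only; not part of NesTools.
def get_city(user_id: int) -> str:
    return {1: "Suzhou", 2: "Sydney"}[user_id]

def get_weather(city: str) -> str:
    return f"sunny in {city}"

TOOLS: Dict[str, Callable[..., Any]] = {
    "get_city": get_city,
    "get_weather": get_weather,
}

def run_calls(calls: List[dict]) -> Dict[str, Any]:
    """Execute calls in order.

    A string argument of the form "$<call_id>" (an assumed convention)
    is replaced by the response of that earlier call, which is what
    makes the second call nested on the first.
    """
    responses: Dict[str, Any] = {}
    for call in calls:
        args = {}
        for name, value in call["args"].items():
            if isinstance(value, str) and value.startswith("$"):
                value = responses[value[1:]]  # resolve reference to an earlier response
            args[name] = value
        responses[call["id"]] = TOOLS[call["tool"]](**args)
    return responses

plan = [
    {"id": "c1", "tool": "get_city", "args": {"user_id": 1}},
    {"id": "c2", "tool": "get_weather", "args": {"city": "$c1"}},
]
result = run_calls(plan)
```

A model that emits the second call without referencing the first call's response, or that orders the calls incorrectly, would fail this kind of task even if each individual tool call is well formed.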

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2410.11805

Country:

Asia (0.68)
North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Health & Medicine (0.68)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.70)

Can Uncertainty Quantification Enable Better Learning-based Index Tuning?

Yu, Tao, Zou, Zhaonian, Xiong, Hao

arXiv.org Artificial Intelligence | Oct-23-2024

Index tuning is crucial for optimizing database performance by selecting optimal indexes based on workload. The key to this process lies in an accurate and efficient benefit estimator. Traditional methods relying on what-if tools often suffer from inefficiency and inaccuracy. In contrast, learning-based models provide a promising alternative but face challenges such as instability, lack of interpretability, and complex management. To overcome these limitations, we adopt a novel approach: quantifying the uncertainty in learning-based models' results, thereby combining the strengths of both traditional and learning-based methods for reliable index tuning. We propose Beauty, the first uncertainty-aware framework that enhances learning-based models with uncertainty quantification and uses what-if tools as a complementary mechanism to improve reliability and reduce management complexity. Specifically, we introduce a novel method that combines AutoEncoder and Monte Carlo Dropout to jointly quantify uncertainty, tailored to the characteristics of benefit estimation tasks. In experiments involving sixteen models, our approach outperformed existing uncertainty quantification methods in the majority of cases. We also conducted index tuning tests on six datasets. By applying the Beauty framework, we eliminated worst-case scenarios and more than tripled the occurrence of best-case scenarios.
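The general idea of gating a learned estimate by its uncertainty can be sketched as follows. This is a toy illustration under stated assumptions, not the Beauty framework: it uses only Monte Carlo Dropout on a linear model (the AutoEncoder component is omitted), and the function names, the `what_if` callback, and the `max_std` threshold are all hypothetical.

```python
import random
import statistics
from typing import Callable, Sequence, Tuple

def mc_dropout_predict(
    weights: Sequence[float],
    features: Sequence[float],
    p_drop: float = 0.2,
    passes: int = 50,
    seed: int = 0,
) -> Tuple[float, float]:
    """Return (mean, std) over several stochastic forward passes.

    Each pass randomly drops input terms of a toy linear model and
    rescales the kept ones (inverted dropout). The spread of the
    outputs serves as a rough uncertainty signal.
    """
    rng = random.Random(seed)
    keep = 1.0 - p_drop
    samples = []
    for _ in range(passes):
        total = 0.0
        for w, x in zip(weights, features):
            if rng.random() < keep:
                total += w * x / keep
        samples.append(total)
    return statistics.fmean(samples), statistics.pstdev(samples)

def estimate_benefit(
    weights: Sequence[float],
    features: Sequence[float],
    what_if: Callable[[Sequence[float]], float],
    max_std: float,
) -> Tuple[float, str]:
    """Use the learned estimate when it is confident, else fall back.

    `what_if` stands in for a slower what-if call; it is a placeholder,
    not a real database API.
    """
    mean, std = mc_dropout_predict(weights, features)
    if std > max_std:
        return what_if(features), "what-if"
    return mean, "model"
```

For example, `estimate_benefit(w, x, what_if=slow_estimator, max_std=0.5)` would return the model's mean prediction when the dropout samples agree closely and defer to `slow_estimator` otherwise; the threshold value here is arbitrary.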

artificial intelligence, configuration, machine learning, (19 more...)

arXiv.org Artificial Intelligence

2410.17748

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report > New Finding (0.68)
Research Report > Promising Solution (0.54)

Industry: Information Technology (0.46)

Technology:

Information Technology > Databases (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Modeling Collaborator: Enabling Subjective Vision Classification With Minimal Human Effort via LLM Tool-Use

Toubal, Imad Eddine, Avinash, Aditya, Alldrin, Neil Gordon, Dlabal, Jan, Zhou, Wenlei, Luo, Enming, Stretcu, Otilia, Xiong, Hao, Lu, Chun-Ta, Zhou, Howard, Krishna, Ranjay, Fuxman, Ariel, Duerig, Tom

arXiv.org Artificial Intelligence | Mar-19-2024

From content moderation to wildlife conservation, the number of applications that require models to recognize nuanced or subjective visual concepts is growing. Traditionally, developing classifiers for such concepts requires substantial manual effort measured in hours, days, or even months to identify and annotate data needed for training. Even with recently proposed Agile Modeling techniques, which enable rapid bootstrapping of image classifiers, users are still required to spend 30 minutes or more of monotonous, repetitive data labeling just to train a single classifier. Drawing on Fiske's Cognitive Miser theory, we propose a new framework that alleviates manual effort by replacing human labeling with natural language interactions, reducing the total effort required to define a concept by an order of magnitude: from labeling 2,000 images to only 100 plus some natural language interactions. Our framework leverages recent advances in foundation models, both large language models and vision-language models, to carve out the concept space through conversation and by automatically labeling training data points. Most importantly, our framework eliminates the need for crowd-sourced annotations. Moreover, our framework ultimately produces lightweight classification models that are deployable in cost-sensitive scenarios. Across 15 subjective concepts and across 2 public image classification datasets, our trained models outperform traditional Agile Modeling as well as state-of-the-art zero-shot classification models like ALIGN, CLIP, CuPL, and large visual question-answering models like PaLI-X.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2403.02626

Country: North America > United States (0.28)

Genre: Research Report (1.00)

Industry:

Transportation (0.68)
Law Enforcement & Public Safety (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Multiple Sclerosis Lesion Inpainting Using Non-Local Partial Convolutions

Xiong, Hao, Tao, Dacheng

arXiv.org Machine Learning | Dec-24-2018

Multiple sclerosis (MS) is an inflammatory demyelinating disease of the central nervous system (CNS) that results in focal injury to the grey and white matter. The presence of white matter lesions biases morphometric analyses such as registration, individual longitudinal measurements and tissue segmentation for brain volume measurements. Lesion inpainting with intensities derived from surrounding healthy tissue represents one approach to alleviate such problems. However, existing methods inpaint lesions based on texture information derived from local surrounding tissue, often leading to inconsistent inpainting and the generation of artifacts such as intensity discrepancy and blurriness. Based on these observations, we propose non-local partial convolutions (NLPC), which integrate a Unet-like network with the non-local module. The non-local module is exploited to capture long-range dependencies between the lesion area and remaining normal-appearing brain regions. Then, the lesion area is filled by referring to normal-appearing regions with more similar features. This method generates inpainted regions that appear more realistic and natural. Our quantitative experimental results also demonstrate the superiority of this technique over existing state-of-the-art inpainting methods.

deep learning, lesion, neural network, (20 more...)

arXiv.org Machine Learning

1901.00055

Country:

Oceania > Australia (0.14)
Europe (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area > Neurology > Multiple Sclerosis (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Multi-Channel Encoder for Neural Machine Translation

Xiong, Hao (Baidu Inc.) | He, Zhongjun (Baidu Inc.) | Hu, Xiaoguang (Baidu Inc.) | Wu, Hua (Baidu Inc.)

AAAI Conferences | Feb-8-2018

Attention-based Encoder-Decoder is an effective architecture for neural machine translation (NMT), which typically relies on recurrent neural networks (RNN) to build the blocks that will later be called by the attentive reader during the decoding process. This encoder design yields a relatively uniform composition of the source sentence, despite the gating mechanism employed in the encoding RNN. On the other hand, we often want the decoder to take pieces of the source sentence at varying levels suiting its own linguistic structure: for example, we may want to take an entity name in its raw form while taking an idiom as a perfectly composed unit. Motivated by this demand, we propose Multi-channel Encoder (MCE), which enhances encoding components with different levels of composition. More specifically, in addition to the hidden state of the encoding RNN, MCE takes 1) the original word embedding for raw encoding with no composition, and 2) a particular design of external memory in Neural Turing Machine (NTM) for more complex composition, while all three encoding strategies are properly blended during decoding. An empirical study on Chinese-English translation shows that our model can improve by 6.52 BLEU points upon a strong open source NMT system: DL4MT. On the WMT14 English-French task, our single shallow system achieves BLEU=38.8, comparable with the state-of-the-art deep models.

deep learning, external memory, neural network, (18 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: Asia > China (0.14)

Genre: Research Report (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

A Diversified Generative Latent Variable Model for WiFi-SLAM

Xiong, Hao (University of Technology, Sydney) | Tao, Dacheng (University of Technology, Sydney)

AAAI Conferences | Feb-14-2017

WiFi-SLAM aims to map WiFi signals within an unknown environment while simultaneously determining the location of a mobile device. This localization method has been extensively used in indoor, space, undersea, and underground environments. For the sake of accuracy, most methods label the signal readings against ground truth locations. However, this is impractical in large environments, where it is hard to collect and maintain the data. Some methods use latent variable models to generate latent-space locations of signal strength data, an advantage being that no prior labeling of signal strength readings and their physical locations is required. However, the generated latent variables cannot cover all wireless signal locations, and WiFi-SLAM performance is significantly degraded. Here we propose the diversified generative latent variable model (DGLVM) to overcome these limitations. By building a positive-definite kernel function, a diversity-encouraging prior is introduced to render the generated latent variables non-overlapping, thus capturing more characteristics of the wireless signal measurements. The defined objective function is then solved by variational inference. Our experiments illustrate that the method performs WiFi localization more accurately than other label-free methods.
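The abstract does not specify the form of the diversity-encouraging prior. One common way to reward non-overlapping points with a positive-definite kernel, assumed here purely for illustration, is the log-determinant of the kernel matrix (as in determinantal point processes). The RBF kernel, the `jitter` term, and all function names below are assumptions rather than the paper's definitions.

```python
import math
from typing import List, Sequence

def rbf_kernel(a: Sequence[float], b: Sequence[float], length_scale: float = 1.0) -> float:
    """Positive-definite RBF kernel between two latent points."""
    sq = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-sq / (2.0 * length_scale ** 2))

def log_det_spd(matrix: List[List[float]]) -> float:
    """Log-determinant of a symmetric positive-definite matrix via Cholesky."""
    n = len(matrix)
    chol = [[0.0] * n for _ in range(n)]
    log_det = 0.0
    for i in range(n):
        for j in range(i + 1):
            s = matrix[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            if i == j:
                if s <= 0.0:
                    raise ValueError("matrix is not positive definite")
                chol[i][i] = math.sqrt(s)
                log_det += 2.0 * math.log(chol[i][i])
            else:
                chol[i][j] = s / chol[j][j]
    return log_det

def diversity_score(points: Sequence[Sequence[float]], jitter: float = 1e-6) -> float:
    """Higher when points are spread out, lower when they overlap.

    The small jitter on the diagonal keeps the matrix numerically
    positive definite when points nearly coincide.
    """
    n = len(points)
    gram = [
        [rbf_kernel(points[i], points[j]) + (jitter if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    return log_det_spd(gram)

spread = [(0.0, 0.0), (3.0, 0.0), (0.0, 3.0)]
clustered = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1)]
```

Under this assumed score, `diversity_score(spread)` is larger than `diversity_score(clustered)`, so adding it to an objective would push latent points apart.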

artificial intelligence, latent variable, machine learning, (17 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country: Oceania > Australia (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.81)

Diversified Dynamical Gaussian Process Latent Variable Model for Video Repair

Xiong, Hao (University of Technology, Sydney) | Liu, Tongliang (University of Technology, Sydney) | Tao, Dacheng (University of Technology, Sydney)

AAAI Conferences | Apr-19-2016

Videos can be conserved on different media. However, storing on media such as films and hard disks can suffer from unexpected data loss, for instance from physical damage. Repair of missing or damaged pixels is essential for video maintenance and preservation. Most methods seek to fill in missing holes by synthesizing similar textures from local or global frames. However, this can introduce incorrect contexts, especially when the missing hole or number of damaged frames is large. Furthermore, simple texture synthesis can introduce artifacts in undamaged and recovered areas. To address the aforementioned problems, we propose the diversified dynamical Gaussian process latent variable model (D2GPLVM), which considers the variety in existing videos and thus introduces a diversity-encouraging prior on the inducing points. The aim is to ensure that the trained inducing points, which are a smaller subset of all observed undamaged frames, are more diverse and robust for context-aware, artifact-free video repair. The defined objective function in our proposed model is initially not analytically tractable and must be solved by variational inference. Finally, experimental testing illustrates the robustness and effectiveness of our method for damaged video repair.

artificial intelligence, machine learning, video, (16 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.71)

Forest-Based Semantic Role Labeling

Xiong, Hao (Chinese Academy of Sciences) | Mi, Haitao (Chinese Academy of Sciences) | Liu, Yang (Chinese Academy of Sciences) | Liu, Qun (Chinese Academy of Sciences)

AAAI Conferences | Jul-15-2010

Parsing plays an important role in semantic role labeling (SRL) because most SRL systems infer semantic relations from 1-best parses. Therefore, parsing errors inevitably lead to labeling mistakes. To alleviate this problem, we propose to use a packed forest, which compactly encodes all parses for a sentence. We design an algorithm to exploit exponentially many parses to learn semantic relations efficiently. Experimental results on the CoNLL-2005 shared task show that using forests achieves an absolute improvement of 1.2% in terms of F1 score over using 1-best parses and 0.6% over using 50-best parses.

proceedings, survey article, text processing, (17 more...)

AAAI Conferences

Twenty-Fourth AAAI Conference on Artificial Intelligence

Country: North America > United States (0.28)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
