
Collaborating Authors

 Xu, Zhao


Parrot: Multilingual Visual Instruction Tuning

arXiv.org Artificial Intelligence

The rapid development of Multimodal Large Language Models (MLLMs) like GPT-4V has marked a significant step towards artificial general intelligence. Existing methods mainly focus on aligning vision encoders with LLMs through supervised fine-tuning (SFT) to endow LLMs with multimodal abilities, but this causes MLLMs' inherent ability to respond in multiple languages to deteriorate progressively as training proceeds. We empirically find that imbalanced SFT datasets, primarily composed of English-centric image-text pairs, lead to significantly reduced performance in non-English languages. This is due to the failure to align the vision encoder and LLM with multilingual tokens during SFT. In this paper, we introduce Parrot, a novel method that uses textual guidance to drive visual token alignment at the language level. Parrot conditions the visual tokens on diverse language inputs and uses a Mixture-of-Experts (MoE) module to promote the alignment of multilingual tokens. Specifically, to enhance the alignment of non-English visual tokens, we compute cross-attention between the initial visual features and the textual embeddings, and feed the result into the MoE router to select the most relevant experts. The selected experts then convert the initial visual tokens into language-specific visual tokens. Moreover, given the current lack of benchmarks for evaluating multilingual capabilities in this field, we collect and release the Massive Multilingual Multimodal Benchmark (MMMB), which covers 6 languages, 15 categories, and 12,000 questions. Our method not only achieves state-of-the-art performance on the multilingual MMBench and MMMB, but also excels across a broad range of multimodal tasks. Both the source code and the training dataset of Parrot will be made publicly available.
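
To make the routing step concrete, a minimal PyTorch sketch of cross-attention-gated expert routing might look as follows; the class name, the soft (rather than top-k) routing, and all dimensions are illustrative assumptions, not Parrot's actual implementation:

# Minimal sketch of cross-attention-gated MoE routing that turns visual tokens
# into language-specific visual tokens. Names, shapes, and hyperparameters are
# illustrative assumptions, not Parrot's actual implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class LanguageAwareMoE(nn.Module):
    def __init__(self, dim: int, num_experts: int = 6, num_heads: int = 8):
        super().__init__()
        # Cross-attention: visual tokens attend to the text embeddings.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        # Router maps the pooled attended features to expert logits.
        self.router = nn.Linear(dim, num_experts)
        # Each expert is a small MLP that re-projects the visual tokens.
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(dim, dim), nn.GELU(), nn.Linear(dim, dim))
            for _ in range(num_experts)
        )

    def forward(self, visual_tokens, text_embeds):
        # visual_tokens: (B, Nv, D); text_embeds: (B, Nt, D)
        attended, _ = self.cross_attn(visual_tokens, text_embeds, text_embeds)
        gate = F.softmax(self.router(attended.mean(dim=1)), dim=-1)          # (B, E)
        expert_out = torch.stack([e(visual_tokens) for e in self.experts], dim=1)  # (B, E, Nv, D)
        # Soft mixture of experts produces language-specific visual tokens.
        return torch.einsum("be,bend->bnd", gate, expert_out)


if __name__ == "__main__":
    moe = LanguageAwareMoE(dim=64)
    v = torch.randn(2, 16, 64)   # toy visual tokens
    t = torch.randn(2, 8, 64)    # toy text embeddings
    print(moe(v, t).shape)       # torch.Size([2, 16, 64])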


LLMLight: Large Language Models as Traffic Signal Control Agents

arXiv.org Artificial Intelligence

Traffic Signal Control (TSC) is a crucial component in urban traffic management, aiming to optimize road network efficiency and reduce congestion. Traditional methods in TSC, primarily based on transportation engineering and reinforcement learning (RL), often exhibit limited generalization across varied traffic scenarios and lack interpretability. This paper presents LLMLight, a novel framework employing Large Language Models (LLMs) as decision-making agents for TSC. Specifically, the framework begins by instructing the LLM with a knowledge-informed prompt that details real-time traffic conditions. Leveraging the advanced generalization capabilities of LLMs, LLMLight engages in a reasoning and decision-making process akin to human intuition for effective traffic control. Moreover, we build LightGPT, a specialized backbone LLM tailored for TSC tasks. By learning nuanced traffic patterns and control strategies, LightGPT enhances the LLMLight framework cost-effectively. Extensive experiments on nine real-world and synthetic datasets showcase the remarkable effectiveness, generalization ability, and interpretability of LLMLight against nine transportation-based and RL-based baselines.
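
To illustrate the prompting step, a minimal sketch of assembling a traffic-state prompt and mapping an LLM's answer back to a signal phase is given below; the lane layout, phase names, and prompt wording are hypothetical and are not LLMLight's actual templates:

# Illustrative sketch of composing a traffic-state prompt for an LLM agent and
# mapping its free-text answer back to a signal phase. The phase names, fields,
# and wording are hypothetical, not LLMLight's actual prompt design.
PHASES = ["NS-through", "NS-left", "EW-through", "EW-left"]


def build_prompt(queue_lengths: dict, waiting_times: dict) -> str:
    lines = ["You are a traffic signal control agent at a four-way intersection.",
             "Current observations per approach:"]
    for approach in sorted(queue_lengths):
        lines.append(f"- {approach}: {queue_lengths[approach]} queued vehicles, "
                     f"avg wait {waiting_times[approach]:.1f}s")
    lines.append(f"Choose exactly one phase from {PHASES} to activate next, "
                 "and briefly justify your choice.")
    return "\n".join(lines)


def parse_decision(llm_answer: str) -> str:
    # Pick the first known phase mentioned in the answer; fall back to a default.
    for phase in PHASES:
        if phase in llm_answer:
            return phase
    return PHASES[0]


if __name__ == "__main__":
    prompt = build_prompt(
        {"north": 12, "south": 9, "east": 2, "west": 3},
        {"north": 45.0, "south": 38.5, "east": 10.2, "west": 12.7},
    )
    print(prompt)
    print(parse_decision("The NS-through phase should be served first."))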


Towards Urban General Intelligence: A Review and Outlook of Urban Foundation Models

arXiv.org Artificial Intelligence

Machine learning techniques are now integral to the advancement of intelligent urban services, playing a crucial role in elevating the efficiency, sustainability, and livability of urban environments. The recent emergence of foundation models such as ChatGPT marks a revolutionary shift in the fields of machine learning and artificial intelligence. Their unparalleled capabilities in contextual understanding, problem solving, and adaptability across a wide range of tasks suggest that integrating these models into urban domains could have a transformative impact on the development of smart cities. Despite growing interest in Urban Foundation Models (UFMs), this burgeoning field faces challenges such as a lack of clear definitions, systematic reviews, and universally applicable solutions. To this end, this paper first introduces the concept of UFMs and discusses the unique challenges involved in building them. We then propose a data-centric taxonomy that categorizes current UFM-related work based on urban data modalities and types. Furthermore, to foster advancement in this field, we present a promising framework aimed at the prospective realization of UFMs, designed to overcome the identified challenges. Additionally, we explore the application landscape of UFMs, detailing their potential impact in various urban contexts. Relevant papers and open-source resources have been collated and are continuously updated at https://github.com/usail-hkust/Awesome-Urban-Foundation-Models.


Applying Large Language Models to Power Systems: Potential Security Threats

arXiv.org Artificial Intelligence

Applying large language models (LLMs) to modern power systems presents a promising avenue for enhancing decision-making and operational efficiency. However, doing so may also introduce security threats that have not yet been fully recognized. To this end, this article analyzes the potential threats incurred by applying LLMs to power systems, emphasizing the need for urgent research and development of countermeasures.


A Text-guided Protein Design Framework

arXiv.org Machine Learning

Meanwhile, there exists a tremendous amount of human-curated knowledge in text form describing proteins' high-level functionalities. Yet whether incorporating such text data can help protein design tasks has not been explored. To bridge this gap, we propose ProteinDT, a multi-modal framework that leverages textual descriptions for protein design. ProteinDT consists of three consecutive steps: ProteinCLAP, which aligns the representations of the two modalities; a facilitator that generates the protein representation from the text modality; and a decoder that creates protein sequences from that representation. To train ProteinDT, we construct a large dataset, SwissProtCLAP, with 441K text-protein pairs. We quantitatively verify the effectiveness of ProteinDT on three challenging tasks: (1) over 90% accuracy for text-guided protein generation; (2) the best hit ratio on 10 zero-shot text-guided protein editing tasks; (3) superior performance on four out of six protein property prediction benchmarks. Machine learning (ML) has recently shown profound potential for protein discovery. These ML tools have quickly been adopted in auxiliary and accelerating roles in scientific pipelines, including but not limited to protein engineering [1], structure prediction [2], structure reconstruction [3], and inverse folding [4].
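
The alignment step can be illustrated with a CLIP/CLAP-style symmetric contrastive (InfoNCE) objective between text and protein embeddings, sketched below with stand-in linear encoders; this shows the general mechanism only and is not ProteinDT's actual model or training code:

# Sketch of a CLIP/CLAP-style contrastive alignment between text and protein
# embeddings, the general idea behind a ProteinCLAP-like alignment step.
# The toy encoders and dimensions are placeholders, not ProteinDT's models.
import torch
import torch.nn as nn
import torch.nn.functional as F


def contrastive_alignment_loss(text_emb, protein_emb, temperature=0.07):
    # Normalize, then treat matched (text, protein) pairs as positives and all
    # other pairs in the batch as negatives (symmetric InfoNCE).
    text_emb = F.normalize(text_emb, dim=-1)
    protein_emb = F.normalize(protein_emb, dim=-1)
    logits = text_emb @ protein_emb.t() / temperature   # (B, B) similarity matrix
    targets = torch.arange(logits.size(0))
    return 0.5 * (F.cross_entropy(logits, targets) +
                  F.cross_entropy(logits.t(), targets))


if __name__ == "__main__":
    torch.manual_seed(0)
    text_encoder = nn.Linear(128, 64)      # stand-in for a text language model
    protein_encoder = nn.Linear(320, 64)   # stand-in for a protein sequence model
    text_feats = torch.randn(8, 128)
    protein_feats = torch.randn(8, 320)
    loss = contrastive_alignment_loss(text_encoder(text_feats),
                                      protein_encoder(protein_feats))
    print(float(loss))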


Artificial Intelligence for Science in Quantum, Atomistic, and Continuum Systems

arXiv.org Artificial Intelligence

Advances in artificial intelligence (AI) are fueling a new paradigm of discoveries in natural sciences. Today, AI has started to advance natural sciences by improving, accelerating, and enabling our understanding of natural phenomena at a wide range of spatial and temporal scales, giving rise to a new area of research known as AI for science (AI4Science). Being an emerging research paradigm, AI4Science is unique in that it is an enormous and highly interdisciplinary area. Thus, a unified and technical treatment of this field is needed yet challenging. This work aims to provide a technically thorough account of a subarea of AI4Science; namely, AI for quantum, atomistic, and continuum systems. These areas aim at understanding the physical world from subatomic (wavefunctions and electron density) and atomic (molecules, proteins, materials, and interactions) to macro (fluids, climate, and subsurface) scales, and form an important subarea of AI4Science. A unique advantage of focusing on these areas is that they largely share a common set of challenges, thereby allowing a unified and foundational treatment. A key common challenge is how to capture physics first principles, especially symmetries, in natural systems by deep learning methods. We provide an in-depth yet intuitive account of techniques to achieve equivariance to symmetry transformations. We also discuss other common technical challenges, including explainability, out-of-distribution generalization, knowledge transfer with foundation and large language models, and uncertainty quantification. To facilitate learning and education, we provide categorized lists of resources that we found to be useful. We strive to be thorough and unified and hope this initial effort may trigger more community interests and efforts to further advance AI4Science.
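
As a concrete, if toy, illustration of the symmetry requirement: a feature built from pairwise distances is invariant under rotations of the input point cloud, which can be verified numerically as below (an illustrative check, not a method from the survey):

# Toy numerical illustration of the symmetry requirement discussed above:
# a feature built from pairwise distances is invariant to rotations of the
# input point cloud. Illustrative only.
import numpy as np


def pairwise_distance_features(coords: np.ndarray) -> np.ndarray:
    diff = coords[:, None, :] - coords[None, :, :]
    return np.linalg.norm(diff, axis=-1)


def random_rotation(rng: np.random.Generator) -> np.ndarray:
    # QR decomposition of a random Gaussian matrix gives a random orthogonal matrix.
    q, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return q


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.standard_normal((5, 3))           # 5 atoms in 3D
    R = random_rotation(rng)
    d_before = pairwise_distance_features(x)
    d_after = pairwise_distance_features(x @ R.T)
    print(np.allclose(d_before, d_after))     # True: distances are rotation-invariant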


Efficient and Equivariant Graph Networks for Predicting Quantum Hamiltonian

arXiv.org Artificial Intelligence

We consider the prediction of the Hamiltonian matrix, which finds use in quantum chemistry and condensed matter physics. Efficiency and equivariance are two important but conflicting factors. In this work, we propose an SE(3)-equivariant network, named QHNet, that achieves both efficiency and equivariance. Our key advance lies in the design of the QHNet architecture, which not only obeys the underlying symmetries but also reduces the number of tensor products by 92%. In addition, QHNet prevents exponential growth of the channel dimension as more atom types are involved. We perform experiments on the MD17 datasets, covering four molecular systems. Experimental results show that QHNet achieves performance comparable to state-of-the-art methods at a significantly faster speed. Moreover, QHNet consumes 50% less memory due to its streamlined architecture. Our code is publicly available as part of the AIRS library (https://github.com/divelab/AIRS).
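
For reference, the equivariance requirement for Hamiltonian prediction is commonly stated as follows (the notation here is schematic and not taken from the paper):

$$\hat{H}\bigl(R \cdot \{\mathbf{r}_i\}\bigr) \;=\; D(R)\, \hat{H}\bigl(\{\mathbf{r}_i\}\bigr)\, D(R)^{\dagger}, \qquad R \in \mathrm{SO}(3),$$

where $\{\mathbf{r}_i\}$ are the atomic coordinates and $D(R)$ is the block-diagonal (Wigner-D) representation of the rotation acting on the orbital basis; a network satisfying this constraint predicts Hamiltonians that transform consistently when the molecule is rotated.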


3D Molecular Geometry Analysis with 2D Graphs

arXiv.org Artificial Intelligence

Ground-state 3D geometries of molecules are essential for many molecular analysis tasks. Modern quantum mechanical methods can compute accurate 3D geometries but are computationally prohibitive. An efficient alternative that computes ground-state 3D molecular geometries directly from 2D graphs is currently lacking. Here, we propose a novel deep learning framework to predict 3D geometries from molecular graphs. To this end, we develop an equilibrium message passing neural network (EMPNN) to better capture ground-state geometries from molecular graphs. To provide a testbed for 3D molecular geometry analysis, we develop a benchmark that includes a large-scale molecular geometry dataset, data splits, and evaluation protocols. Experimental results show that EMPNN can efficiently predict more accurate ground-state 3D geometries than RDKit and other deep learning methods. Results also show that the proposed framework outperforms self-supervised learning methods on property prediction tasks.
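
A generic E(n)-equivariant message-passing step that refines 3D coordinates from graph messages (in the style of EGNN) is sketched below to illustrate the general mechanism; it is not the EMPNN architecture proposed in the paper:

# Generic E(n)-equivariant message-passing step that refines 3D coordinates from
# graph messages (EGNN-style). Illustrative of the general mechanism only; this
# is not the paper's EMPNN architecture.
import torch
import torch.nn as nn


class EquivariantUpdate(nn.Module):
    def __init__(self, hidden: int = 32):
        super().__init__()
        # Edge message from node features and squared distance (all invariants).
        self.msg_mlp = nn.Sequential(nn.Linear(2 * hidden + 1, hidden), nn.SiLU(),
                                     nn.Linear(hidden, hidden))
        self.coord_mlp = nn.Sequential(nn.Linear(hidden, hidden), nn.SiLU(),
                                       nn.Linear(hidden, 1))

    def forward(self, h, x, edge_index):
        src, dst = edge_index                      # (E,), (E,)
        rel = x[src] - x[dst]                      # (E, 3) relative positions
        dist2 = (rel ** 2).sum(-1, keepdim=True)   # (E, 1) invariant edge feature
        m = self.msg_mlp(torch.cat([h[src], h[dst], dist2], dim=-1))
        # Coordinates move along relative vectors, scaled by an invariant weight,
        # which keeps the update equivariant to rotations and translations.
        delta = torch.zeros_like(x).index_add_(0, dst, rel * self.coord_mlp(m))
        return x + delta


if __name__ == "__main__":
    h = torch.randn(4, 32)                                 # node features
    x = torch.randn(4, 3)                                  # initial coordinates
    edges = torch.tensor([[0, 1, 2, 3], [1, 2, 3, 0]])     # toy directed edges
    print(EquivariantUpdate()(h, x, edges).shape)          # torch.Size([4, 3])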


Uncertainty Propagation in Node Classification

arXiv.org Artificial Intelligence

Quantifying the predictive uncertainty of neural networks has recently attracted increasing attention. In this work, we focus on measuring the uncertainty of graph neural networks (GNNs) for the task of node classification. Most existing GNNs model message passing among nodes, and the messages are typically deterministic. Questions naturally arise: Does uncertainty exist in the messages? How can we propagate such uncertainty over a graph together with the messages? To address these questions, we propose a Bayesian uncertainty propagation (BUP) method, which embeds GNNs in a Bayesian modeling framework and models the predictive uncertainty of node classification with the Bayesian confidence of the predictive probability and the uncertainty of messages. Our method introduces a novel uncertainty propagation mechanism inspired by Gaussian models. Moreover, we present an uncertainty-oriented loss for node classification that allows the GNNs to explicitly integrate predictive uncertainty into the learning procedure; consequently, training examples with large predictive uncertainty are penalized. We demonstrate BUP with respect to prediction reliability and out-of-distribution (OOD) prediction. The learned uncertainty is also analyzed in depth, and the relations between uncertainty and graph topology, as well as predictive uncertainty in OOD cases, are investigated with extensive experiments. The empirical results on popular benchmark datasets demonstrate the superior performance of the proposed method.
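
The idea of propagating uncertainty alongside messages can be illustrated with a toy mean-and-variance aggregation over neighbors, sketched below; the independence and linearity assumptions are ours for illustration and do not reproduce the paper's exact BUP model:

# Toy sketch of propagating both a mean and a variance through neighborhood
# aggregation, in the spirit of Gaussian message passing. The independence and
# linearity assumptions are illustrative, not the paper's exact BUP model.
import torch


def gaussian_mean_aggregation(mu, var, adj):
    # mu, var: (N, D) per-node feature means and variances; adj: (N, N) 0/1 adjacency.
    deg = adj.sum(dim=1, keepdim=True).clamp(min=1.0)
    mu_out = adj @ mu / deg
    # For the mean of independent Gaussians, summed variances scale by 1/deg^2.
    var_out = adj @ var / deg.pow(2)
    return mu_out, var_out


if __name__ == "__main__":
    torch.manual_seed(0)
    adj = torch.tensor([[0., 1., 1.],
                        [1., 0., 0.],
                        [1., 0., 0.]])
    mu = torch.randn(3, 4)
    var = torch.rand(3, 4) * 0.1
    mu_out, var_out = gaussian_mean_aggregation(mu, var, adj)
    print(mu_out.shape, var_out.mean().item())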


Learning Sparsity of Representations with Discrete Latent Variables

arXiv.org Artificial Intelligence

Deep latent generative models have attracted increasing attention due to their capacity to combine the strengths of deep learning and probabilistic models in an elegant way. The data representations learned with these models are often continuous and dense. However, in many applications sparse representations are expected, such as learning sparse, high-dimensional embeddings of data in an unsupervised setting, or learning multi-label assignments from thousands of candidate tags in a supervised setting. In some scenarios, there can be a further restriction on the degree of sparsity: the number of non-zero features of a representation cannot exceed a pre-defined threshold $L_0$. In this paper, we propose a sparse deep latent generative model, SDLGM, that explicitly models the degree of sparsity and thus enables learning the sparse structure of the data under the quantified sparsity constraint. The resulting sparsity of a representation is not fixed but adapts to the observation itself under the pre-defined restriction. In particular, we introduce for each observation $i$ an auxiliary random variable $L_i$, which models the sparsity of its representation. The sparse representations are then generated with a two-step sampling process via two Gumbel-Softmax distributions. For inference and learning, we develop an amortized variational method based on a Monte Carlo gradient estimator. The resulting sparse representations are differentiable under backpropagation. Experimental evaluation on multiple datasets for unsupervised and supervised learning problems shows the benefits of the proposed method.
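
The two-step discrete sampling idea can be sketched with Gumbel-Softmax draws, first for the sparsity level and then for the active dimensions; the shapes and the repeated-draw relaxation below are illustrative assumptions, not SDLGM's exact generative process:

# Toy sketch of two-step differentiable discrete sampling with Gumbel-Softmax:
# first draw a sparsity level L_i, then draw which of the D dimensions are active.
# Logit shapes and the repeated-draw relaxation are illustrative, not SDLGM's exact model.
import torch
import torch.nn.functional as F


def sample_sparse_mask(level_logits, feature_logits, tau=0.5):
    # Step 1: relaxed one-hot over possible sparsity levels {1, ..., L0}.
    level_onehot = F.gumbel_softmax(level_logits, tau=tau, hard=True)    # (B, L0)
    levels = level_onehot.argmax(dim=-1) + 1                             # (B,)
    # Step 2: one relaxed categorical draw per active slot selects a feature.
    B, D = feature_logits.shape
    mask = torch.zeros(B, D)
    for k in range(level_logits.size(-1)):
        pick = F.gumbel_softmax(feature_logits, tau=tau, hard=True)      # (B, D)
        active = (levels > k).float().unsqueeze(-1)                      # only the first L_i draws count
        mask = torch.clamp(mask + active * pick, max=1.0)
    return mask


if __name__ == "__main__":
    torch.manual_seed(0)
    level_logits = torch.randn(2, 5)      # up to L0 = 5 active features
    feature_logits = torch.randn(2, 10)   # 10 candidate feature dimensions
    mask = sample_sparse_mask(level_logits, feature_logits)
    print(mask)
    print(mask.sum(dim=-1))               # number of active features per example, at most 5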