AITopics | He, Xiaofeng

Collaborating Authors

He, Xiaofeng

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

From Complex to Simple: Unraveling the Cognitive Tree for Reasoning with Small Language Models

Yan, Junbing, Wang, Chengyu, Zhang, Taolin, He, Xiaofeng, Huang, Jun, Zhang, Wei

arXiv.org Artificial IntelligenceNov-12-2023

Reasoning is a distinctive human capacity, enabling us to address complex problems by breaking them down into a series of manageable cognitive steps. Yet, complex logical reasoning is still cumbersome for language models. Based on the dual process theory in cognitive science, we are the first to unravel the cognitive reasoning abilities of language models. Our framework employs an iterative methodology to construct a Cognitive Tree (CogTree). The root node of this tree represents the initial query, while the leaf nodes consist of straightforward questions that can be answered directly. This construction involves two main components: the implicit extraction module (referred to as the intuitive system) and the explicit reasoning module (referred to as the reflective system). The intuitive system rapidly generates multiple responses by utilizing in-context examples, while the reflective system scores these responses using comparative learning. The scores guide the intuitive system in its subsequent generation step. Our experimental results on two popular and challenging reasoning tasks indicate that it is possible to achieve a performance level comparable to that of GPT-3.5 (with 175B parameters), using a significantly smaller language model that contains fewer parameters (<=7B) than 5% of GPT-3.5.

large language model, machine learning, system, (21 more...)

arXiv.org Artificial Intelligence

2311.06754

Country:

Europe (0.67)
North America > United States > New York (0.14)
North America > United States > Hawaii (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.57)

Add feedback

Learning Knowledge-Enhanced Contextual Language Representations for Domain Natural Language Understanding

Xu, Ruyao, Zhang, Taolin, Wang, Chengyu, Duan, Zhongjie, Chen, Cen, Qiu, Minghui, Cheng, Dawei, He, Xiaofeng, Qian, Weining

arXiv.org Artificial IntelligenceNov-12-2023

Knowledge-Enhanced Pre-trained Language Models (KEPLMs) improve the performance of various downstream NLP tasks by injecting knowledge facts from large-scale Knowledge Graphs (KGs). However, existing methods for pre-training KEPLMs with relational triples are difficult to be adapted to close domains due to the lack of sufficient domain graph semantics. In this paper, we propose a Knowledge-enhanced lANGuAge Representation learning framework for various clOsed dOmains (KANGAROO) via capturing the implicit graph structure among the entities. Specifically, since the entity coverage rates of closed-domain KGs can be relatively low and may exhibit the global sparsity phenomenon for knowledge injection, we consider not only the shallow relational representations of triples but also the hyperbolic embeddings of deep hierarchical entity-class structures for effective knowledge fusion.Moreover, as two closed-domain entities under the same entity-class often have locally dense neighbor subgraphs counted by max point biconnected component, we further propose a data augmentation strategy based on contrastive learning over subgraphs to construct hard negative samples of higher quality. It makes the underlying KELPMs better distinguish the semantics of these neighboring entities to further complement the global semantic sparsity. In the experiments, we evaluate KANGAROO over various knowledge-aware and general NLP tasks in both full and few-shot learning settings, outperforming various KEPLM training paradigms performance in closed-domains significantly.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2311.06761

Country: Asia > China (0.28)

Genre: Research Report (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

SelfOdom: Self-supervised Egomotion and Depth Learning via Bi-directional Coarse-to-Fine Scale Recovery

Qu, Hao, Zhang, Lilian, Hu, Xiaoping, He, Xiaofeng, Pan, Xianfei, Chen, Changhao

arXiv.org Artificial IntelligenceSep-2-2023

Accurately perceiving location and scene is crucial for autonomous driving and mobile robots. Recent advances in deep learning have made it possible to learn egomotion and depth from monocular images in a self-supervised manner, without requiring highly precise labels to train the networks. However, monocular vision methods suffer from a limitation known as scale-ambiguity, which restricts their application when absolute-scale is necessary. To address this, we propose SelfOdom, a self-supervised dual-network framework that can robustly and consistently learn and generate pose and depth estimates in global scale from monocular images. In particular, we introduce a novel coarse-to-fine training strategy that enables the metric scale to be recovered in a two-stage process. Furthermore, SelfOdom is flexible and can incorporate inertial data with images, which improves its robustness in challenging scenarios, using an attention-based fusion module. Our model excels in both normal and challenging lighting conditions, including difficult night scenes. Extensive experiments on public datasets have demonstrated that SelfOdom outperforms representative traditional and learning-based VO and VIO models.

artificial intelligence, estimation, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2211.08904

Country:

Asia > China (0.15)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > Germany (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Information Technology (0.48)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots (1.00)
(2 more...)

Add feedback

GeoGLUE: A GeoGraphic Language Understanding Evaluation Benchmark

Li, Dongyang, Ding, Ruixue, Zhang, Qiang, Li, Zheng, Chen, Boli, Xie, Pengjun, Xu, Yao, Li, Xin, Guo, Ning, Huang, Fei, He, Xiaofeng

arXiv.org Artificial IntelligenceMay-10-2023

With a fast developing pace of geographic applications, automatable and intelligent models are essential to be designed to handle the large volume of information. However, few researchers focus on geographic natural language processing, and there has never been a benchmark to build a unified standard. In this work, we propose a GeoGraphic Language Understanding Evaluation benchmark, named GeoGLUE. We collect data from open-released geographic resources and introduce six natural language understanding tasks, including geographic textual similarity on recall, geographic textual similarity on rerank, geographic elements tagging, geographic composition analysis, geographic where what cut, and geographic entity alignment.

artificial intelligence, benchmark, natural language, (17 more...)

arXiv.org Artificial Intelligence

2305.06545

Country:

Asia > China > Zhejiang Province (0.14)
Asia > China > Guangdong Province (0.14)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Natural Language (1.00)

Add feedback

A 2D Georeferenced Map Aided Visual-Inertial System for Precise UAV Localization

Mao, Jun, Zhang, Lilian, He, Xiaofeng, Qu, Hao, Hu, Xiaoping

arXiv.org Artificial IntelligenceDec-28-2022

Precise geolocalization is crucial for unmanned aerial vehicles (UAVs). However, most current deployed UAVs rely on the global navigation satellite systems (GNSS) or high precision inertial navigation systems (INS) for geolocalization. In this paper, we propose to use a lightweight visual-inertial system with a 2D georeference map to obtain accurate and consecutive geodetic positions for UAVs. The proposed system firstly integrates a micro inertial measurement unit (MIMU) and a monocular camera as odometry to consecutively estimate the navigation states and reconstruct the 3D position of the observed visual features in the local world frame. To obtain the geolocation, the visual features tracked by the odometry are further registered to the 2D georeferenced map. While most conventional methods perform image-level aerial image registration, we propose to align the reconstructed points to the map points in the geodetic frame; this helps to filter out the large portion of outliers and decouples the negative effects from the horizontal angles. The registered points are then used to relocalize the vehicle in the geodetic frame. Finally, a pose graph is deployed to fuse the geolocation from the aerial image registration and the local navigation result from the visual-inertial odometry (VIO) to achieve consecutive and drift-free geolocalization performance. We have validated the proposed method by installing the sensors to a UAV body rigidly and have conducted two flights in different environments with unknown initials. The results show that the proposed method can achieve less than 4m position error in flight at 100m high and less than 9m position error in flight about 300m high.

artificial intelligence, dataset, image understanding, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/IROS47612.2022.9982254.

2107.05851

Country: Asia > China (0.14)

Genre: Research Report > New Finding (0.66)

Industry:

Information Technology > Robotics & Automation (0.34)
Aerospace & Defense > Aircraft (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (0.88)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.66)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.54)

Add feedback

TCL: Transformer-based Dynamic Graph Modelling via Contrastive Learning

Wang, Lu, Chang, Xiaofu, Li, Shuang, Chu, Yunfei, Li, Hui, Zhang, Wei, He, Xiaofeng, Song, Le, Zhou, Jingren, Yang, Hongxia

arXiv.org Artificial IntelligenceMay-17-2021

Dynamic graph modeling has recently attracted much attention due to its extensive applications in many real-world scenarios, such as recommendation systems, financial transactions, and social networks. Although many works have been proposed for dynamic graph modeling in recent years, effective and scalable models are yet to be developed. In this paper, we propose a novel graph neural network approach, called TCL, which deals with the dynamically-evolving graph in a continuous-time fashion and enables effective dynamic node representation learning that captures both the temporal and topology information. Technically, our model contains three novel aspects. First, we generalize the vanilla Transformer to temporal graph learning scenarios and design a graph-topology-aware transformer. Secondly, on top of the proposed graph transformer, we introduce a two-stream encoder that separately extracts representations from temporal neighborhoods associated with the two interaction nodes and then utilizes a co-attentional transformer to model inter-dependencies at a semantic level. Lastly, we are inspired by the recently developed contrastive learning and propose to optimize our model by maximizing mutual information (MI) between the predictive representations of two future interaction nodes. Benefiting from this, our dynamic representations can preserve high-level (or global) semantics about interactions and thus is robust to noisy interactions. To the best of our knowledge, this is the first attempt to apply contrastive learning to representation learning on dynamic graphs. We evaluate our model on four benchmark datasets for interaction prediction and experiment results demonstrate the superiority of our model.

deep learning, neural network, node, (16 more...)

arXiv.org Artificial Intelligence

2105.07944

Country:

Asia (0.29)
North America > United States > California (0.28)

Genre: Research Report > New Finding (0.48)

Industry: Information Technology > Services (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Knowledge-Empowered Representation Learning for Chinese Medical Reading Comprehension: Task, Model and Resources

Zhang, Taolin, Wang, Chengyu, Qiu, Minghui, Yang, Bite, He, Xiaofeng, Huang, Jun

arXiv.org Artificial IntelligenceAug-24-2020

Machine Reading Comprehension (MRC) aims to extract answers to questions given a passage. It has been widely studied recently, especially in open domains. However, few efforts have been made on closed-domain MRC, mainly due to the lack of large-scale training data. In this paper, we introduce a multi-target MRC task for the medical domain, whose goal is to predict answers to medical questions and the corresponding support sentences from medical information sources simultaneously, in order to ensure the high reliability of medical knowledge serving. A high-quality dataset is manually constructed for the purpose, named Multi-task Chinese Medical MRC dataset (CMedMRC), with detailed analysis conducted. We further propose the Chinese medical BERT model for the task (CMedBERT), which fuses medical knowledge into pre-trained language models by the dynamic fusion mechanism of heterogeneous features and the multi-task learning strategy. Experiments show that CMedBERT consistently outperforms strong baselines by fusing context-aware and knowledge-aware token representations.

dataset, deep learning, immunology, (23 more...)

arXiv.org Artificial Intelligence

2008.10327

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Consumer Health (0.68)
Education > Assessment & Standards > Student Performance (0.63)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Learning Robust Representations with Graph Denoising Policy Network

Wang, Lu, Yu, Wenchao, Wang, Wei, Cheng, Wei, Zhang, Wei, Zha, Hongyuan, He, Xiaofeng, Chen, Haifeng

arXiv.org Machine LearningOct-3-2019

--Graph representation learning, aiming to learn low-dimensional representations which capture the geometric dependencies between nodes in the original graph, has gained increasing popularity in a variety of graph analysis tasks, including node classification and link prediction. Existing representation learning methods based on graph neural networks and their variants rely on the aggregation of neighborhood information, which makes it sensitive to noises in the graph, e.g. In this paper, we propose Graph Denoising Policy Network (short for GDPNet) to learn robust representations from noisy graph data through reinforcement learning. GDPNet first selects signal neighborhoods for each node, and then aggregates the information from the selected neighborhoods to learn node representations for the downstream tasks. Specifically, in the signal neighborhood selection phase, GDPNet optimizes the neighborhood for each target node by formulating the process of removing noisy neighborhoods as a Markov decision process and learning a policy with task-specific rewards received from the representation learning phase. In the representation learning phase, GDPNet aggregates features from signal neighbors to generate node representations for downstream tasks, and provides task-specific rewards to the signal neighbor selection phase. These two phases are jointly trained to select optimal sets of neighbors for target nodes with maximum cumulative task-specific rewards, and to learn robust representations for nodes. Note that GDPNet is naturally an inductive model which can leverage both graph structure and the associated node feature information to efficiently generate representations for unseen nodes. Experimental results on node classification task demonstrate the effectiveness of GDNet, outperforming the state-of-the-art graph representation learning methods on several well-studied datasets. Additionally, we show that, with a carefully designed reward function, GDPNet is mathematically equivalent to solving the submodular maximizing problem, which theoretically guarantees the best approximation to the optimal solution with GDPNet.

neural network, optimization problem, representation, (21 more...)

arXiv.org Machine Learning

1910.01784

Country: North America > United States > California (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

Add feedback

Supervised Reinforcement Learning with Recurrent Neural Network for Dynamic Treatment Recommendation

Wang, Lu, Zhang, Wei, He, Xiaofeng, Zha, Hongyuan

arXiv.org Machine LearningJul-4-2018

Dynamic treatment recommendation systems based on large-scale electronic health records (EHRs) become a key to successfully improve practical clinical outcomes. Prior relevant studies recommend treatments either use supervised learning (e.g. matching the indicator signal which denotes doctor prescriptions), or reinforcement learning (e.g. maximizing evaluation signal which indicates cumulative reward from survival rates). However, none of these studies have considered to combine the benefits of supervised learning and reinforcement learning. In this paper, we propose Supervised Reinforcement Learning with Recurrent Neural Network (SRL-RNN), which fuses them into a synergistic learning framework. Specifically, SRL-RNN applies an off-policy actor-critic framework to handle complex relations among multiple medications, diseases and individual characteristics. The "actor" in the framework is adjusted by both the indicator signal and evaluation signal to ensure effective prescription and low mortality. RNN is further utilized to solve the Partially-Observed Markov Decision Process (POMDP) problem due to the lack of fully observed states in real world applications. Experiments on the publicly real-world dataset, i.e., MIMIC-3, illustrate that our model can reduce the estimated mortality, while providing promising accuracy in matching doctors' prescriptions.

attention deficit-hyperactivity disorder, deep learning, vascular disease, (25 more...)

arXiv.org Machine Learning

1807.01473

Country: North America > United States (0.31)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Spectral Relaxation for K-means Clustering

Zha, Hongyuan, He, Xiaofeng, Ding, Chris, Gu, Ming, Simon, Horst D.

Neural Information Processing SystemsDec-31-2002

In K-means clusters are represented by centers of mass of their members, and it can be shown that the K-means algorithm of alternating between assigning cluster membership for each data vector to the nearest cluster center and computing the center of each cluster as the centroid of its member data vectors is equivalent to finding the minimum of a sum-of-squares cost function using coordinate descend. Despite the popularity of K means clustering, one of its major drawbacks is that the coordinate descend search method is prone to local minima. Much research has been done on computing refined initial points and adding explicit constraints to the sum-of-squares cost function for K-means clustering so that the search can converge to better local minimum [1,2]. In this paper we tackle the problem from a different angle: we find an equivalent formulation of the sum-of-squares minimization as a trace maximization problem with special constraints; relaxing the constraints leads to a maximization problem that possesses optimal global solutions. As a byproduct we also have an easily computable lower bound for the minimum of the sum-of-squares cost function. Our work is inspired by [9, 3] where connection to Gram matrix and extension of K means method to general Mercer kernels were investigated. The rest of the paper is organized as follows: in section 2, we derive the equivalent trace maximization formulation and discuss its spectral relaxation. In section 3, we discuss how to assign cluster membership using pivoted QR decomposition, taking into account the special structure of the partial eigenvector matrix. Finally, in section 4, we illustrate the performance of the clustering algorithms using document clustering as an example.

artificial intelligence, machine learning, vector, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Pennsylvania (0.29)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback