AITopics | Xu, Chunyan

Collaborating Authors

Xu, Chunyan

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

MMM-RS: A Multi-modal, Multi-GSD, Multi-scene Remote Sensing Dataset and Benchmark for Text-to-Image Generation

Luo, Jialin, Wang, Yuanzhi, Gu, Ziqi, Qiu, Yide, Yao, Shuaizhen, Wang, Fuyun, Xu, Chunyan, Zhang, Wenhua, Wang, Dan, Cui, Zhen

arXiv.org Artificial IntelligenceOct-26-2024

Recently, the diffusion-based generative paradigm has achieved impressive general image generation capabilities with text prompts due to its accurate distribution modeling and stable training process. However, generating diverse remote sensing (RS) images that are tremendously different from general images in terms of scale and perspective remains a formidable challenge due to the lack of a comprehensive remote sensing image generation dataset with various modalities, ground sample distances (GSD), and scenes. In this paper, we propose a Multi-modal, Multi-GSD, Multi-scene Remote Sensing (MMM-RS) dataset and benchmark for text-to-image generation in diverse remote sensing scenarios. Specifically, we first collect nine publicly available RS datasets and conduct standardization for all samples. To bridge RS images to textual semantic information, we utilize a large-scale pretrained vision-language model to automatically output text prompts and perform hand-crafted rectification, resulting in information-rich text-image pairs (including multi-modal images). In particular, we design some methods to obtain the images with different GSD and various environments (e.g., low-light, foggy) in a single sample. With extensive manual screening and refining annotations, we ultimately obtain a MMM-RS dataset that comprises approximately 2.1 million text-image pairs. Extensive experimental results verify that our proposed MMM-RS dataset allows off-the-shelf diffusion models to generate diverse RS images across various modalities, scenes, weather conditions, and GSD. The dataset is available at https://github.com/ljl5261/MMM-RS.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2410.22362

Country: Asia > China (0.68)

Genre: Research Report > New Finding (0.68)

Industry: Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Gaussian-Induced Convolution for Graphs

Jiang, Jiatao, Cui, Zhen, Xu, Chunyan, Yang, Jian

arXiv.org Machine LearningNov-11-2018

Learning representation on graph plays a crucial role in numerous tasks of pattern recognition. Different from grid-shaped images/videos, on which local convolution kernels can be lattices, however, graphs are fully coordinate-free on vertices and edges. In this work, we propose a Gaussian-induced convolution (GIC) framework to conduct local convolution filtering on irregular graphs. Specifically, an edge-induced Gaussian mixture model is designed to encode variations of subgraph region by integrating edge information into weighted Gaussian models, each of which implicitly characterizes one component of subgraph variations. In order to coarsen a graph, we derive a vertex-induced Gaussian mixture model to cluster vertices dynamically according to the connection of edges, which is approximately equivalent to the weighted graph cut. We conduct our multi-layer graph convolution network on several public datasets of graph classification. The extensive experiments demonstrate that our GIC is effective and can achieve the state-of-the-art results.

deep learning, graph, neural network, (20 more...)

arXiv.org Machine Learning

1811.04393

Country: North America > United States > Texas (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Context-Dependent Diffusion Network for Visual Relationship Detection

Cui, Zhen, Xu, Chunyan, Zheng, Wenming, Yang, Jian

arXiv.org Machine LearningSep-10-2018

Visual relationship detection can bridge the gap between computer vision and natural language for scene understanding of images. Different from pure object recognition tasks, the relation triplets of subject-predicate-object lie on an extreme diversity space, such as \textit{person-behind-person} and \textit{car-behind-building}, while suffering from the problem of combinatorial explosion. In this paper, we propose a context-dependent diffusion network (CDDN) framework to deal with visual relationship detection. To capture the interactions of different object instances, two types of graphs, word semantic graph and visual scene graph, are constructed to encode global context interdependency. The semantic graph is built through language priors to model semantic correlations across objects, whilst the visual scene graph defines the connections of scene objects so as to utilize the surrounding scene information. For the graph-structured data, we design a diffusion network to adaptively aggregate information from contexts, which can effectively learn latent representations of visual relationships and well cater to visual relationship detection in view of its isomorphic invariance to graphs. Experiments on two widely-used datasets demonstrate that our proposed method is more effective and achieves the state-of-the-art performance.

deep learning, neural network, relationship detection, (18 more...)

arXiv.org Machine Learning

doi: 10.1145/3240508.3240668

1809.06213

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Add feedback

When Work Matters: Transforming Classical Network Structures to Graph CNN

Zhao, Wenting, Xu, Chunyan, Cui, Zhen, Zhang, Tong, Jiang, Jiatao, Zhang, Zhenyu, Yang, Jian

arXiv.org Machine LearningJul-7-2018

Numerous pattern recognition applications can be formed as learning from graph-structured data, including social network, protein-interaction network, the world wide web data, knowledge graph, etc. While convolutional neural network (CNN) facilitates great advances in gridded image/video understanding tasks, very limited attention has been devoted to transform these successful network structures (including Inception net, Residual net, Dense net, etc.) to establish convolutional networks on graph, due to its irregularity and complexity geometric topologies (unordered vertices, unfixed number of adjacent edges/vertices). In this paper, we aim to give a comprehensive analysis of when work matters by transforming different classical network structures to graph CNN, particularly in the basic graph recognition problem. Specifically, we firstly review the general graph CNN methods, especially in its spectral filtering operation on the irregular graph data. We then introduce the basic structures of ResNet, Inception and DenseNet into graph CNN and construct these network structures on graph, named as G_ResNet, G_Inception, G_DenseNet. In particular, it seeks to help graph CNNs by shedding light on how these classical network structures work and providing guidelines for choosing appropriate graph network frameworks. Finally, we comprehensively evaluate the performance of these different network structures on several public graph datasets (including social networks and bioinformatic datasets), and demonstrate how different network structures work on graph CNN in the graph recognition task.

deep learning, graph cnn, neural network, (20 more...)

arXiv.org Machine Learning

1807.02653

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.93)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Add feedback

Spatio-Temporal Graph Convolution for Skeleton Based Action Recognition

Li, Chaolong (Southeast University) | Cui, Zhen (Nanjing University of Science and Technology) | Zheng, Wenming (Southeast University) | Xu, Chunyan (Nanjing University of Science and Technology) | Yang, Jian (Nanjing University of Science and Technology)

AAAI ConferencesFeb-8-2018

Variations of human body skeletons may be considered as dynamic graphs, which are generic data representation for numerous real-world applications. In this paper, we propose a spatio-temporal graph convolution (STGC) approach for assembling the successes of local convolutional filtering and sequence learning ability of autoregressive moving average. To encode dynamic graphs, the constructed multi-scale local graph convolution filters, consisting of matrices of local receptive fields and signal mappings, are recursively performed on structured graph data of temporal and spatial domain. The proposed model is generic and principled as it can be generalized into other dynamic models. We theoretically prove the stability of STGC and provide an upper-bound of the signal transformation to be learnt. Further, the proposed recursive model can be stacked into a multi-layer architecture. To evaluate our model, we conduct extensive experiments on four benchmark skeleton-based action datasets, including the large-scale challenging NTU RGB+D. The experimental results demonstrate the effectiveness of our proposed model and the improvement over the state-of-the-art.

action recognition, deep learning, neural network, (18 more...)

AAAI Conferences

Thirty-Second AAAI Conference on Artificial Intelligence

Country: North America > United States > Texas (0.14)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Add feedback

Deep Learning with S-Shaped Rectified Linear Activation Units

Jin, Xiaojie (National University of Singapore) | Xu, Chunyan (Nanjing University of Science and Technology) | Feng, Jiashi (National University of Singapore) | Wei, Yunchao (National University of Singapore) | Xiong, Junjun (Beijing Samsung Telecom) | Yan, Shuicheng (National University of Singapore)

AAAI ConferencesApr-19-2016

Rectified linear activation units are important components for state-of-the-art deep convolutional networks. In this paper, we propose a novel S-shaped rectifiedlinear activation unit (SReLU) to learn both convexand non-convex functions, imitating the multiple function forms given by the two fundamental laws, namely the Webner-Fechner law and the Stevens law, in psychophysics and neural sciences. Specifically, SReLU consists of three piecewise linear functions, which are formulated by four learnable parameters. The SReLU is learned jointly with the training of the whole deep network through back propagation. During the training phase, to initialize SReLU in different layers, we propose a “freezing” method to degenerate SReLU into a predefined leaky rectified linear unit in the initial several training epochs and then adaptively learn the good initial values. SReLU can be universally used in the existing deep networks with negligible additional parameters and computation cost. Experiments with two popular CNN architectures, Network in Network and GoogLeNet on scale-various benchmarks including CIFAR10, CIFAR100, MNIST and ImageNet demonstrate that SReLU achieves remarkable improvement compared to other activation functions.

deep learning, neural network, srelu, (21 more...)

AAAI Conferences

Thirtieth AAAI Conference on Artificial Intelligence

Industry: Health & Medicine (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Generalized Singular Value Thresholding

Lu, Canyi (National University of Singapore) | Zhu, Changbo (National University of Singapore) | Xu, Chunyan (Huazhong University of Science and Technology) | Yan, Shuicheng (National University of Singapore) | Lin, Zhouchen (Peking University)

AAAI ConferencesMar-6-2015

This work studies the Generalized Singular Value Thresholding (GSVT) operator associated with a nonconvex function g defined on the singular values of X. We prove that GSVT can be obtained by performing the proximal operator of g on the singular values since Proxg(.) is monotone when g is lower bounded. If the nonconvex g satisfies some conditions (many popular nonconvex surrogate functions, e.g., lp-norm, 0 < p < 1, of l0-norm are special cases), a general solver to find Proxg(b) is proposed for any b ≥ 0. GSVT greatly generalizes the known Singular Value Thresholding (SVT) which is a basic subroutine in many convex low rank minimization methods. We are able to solve the nonconvex low rank minimization problem by using GSVT in place of SVT.

artificial intelligence, machine learning, singular value, (18 more...)

AAAI Conferences

Twenty-Ninth AAAI Conference on Artificial Intelligence

Country: Asia (0.14)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.47)

Add feedback