
 Wang, Chenguang


Targeted Analysis of High-Risk States Using an Oriented Variational Autoencoder

arXiv.org Artificial Intelligence

Variational autoencoder (VAE) neural networks can be trained to generate power system states that capture both the marginal distribution and the multivariate dependencies of historical data. The coordinates of the latent space codes of VAEs have been shown to correlate with conceptual features of the data, which can be leveraged to synthesize targeted data with desired features. However, the locations of the VAEs' latent space codes that correspond to specific properties are not constrained. Additionally, generating data with specific characteristics may require feeding data with corresponding hard-to-get labels into the generative model for training. In this paper, to make data generation more controllable and efficient, an oriented variational autoencoder (OVAE) is proposed that constrains the link between latent space codes and generated data in the form of a Spearman correlation, which provides increased control over the data synthesis process. On this basis, an importance sampling process is used to sample data in the latent space. Two cases are considered for testing the performance of the OVAE model: a fully labeled data set with approximate information, and an incompletely labeled data set with more accurate information. The experimental results show that, in both cases, the OVAE model correlates latent space codes with the generated data, and the efficiency of generating targeted samples is significantly improved.
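
A minimal sketch of the orientation idea, not the authors' code: one latent coordinate is tied to a scalar property of the decoded sample via a correlation penalty. The paper uses a Spearman (rank) correlation; here a Pearson correlation is used as a differentiable stand-in, and the "total load" risk proxy is an assumption for illustration.

    import torch
    import torch.nn as nn

    class TinyVAE(nn.Module):
        def __init__(self, x_dim=24, z_dim=4):
            super().__init__()
            self.enc = nn.Sequential(nn.Linear(x_dim, 64), nn.ReLU(), nn.Linear(64, 2 * z_dim))
            self.dec = nn.Sequential(nn.Linear(z_dim, 64), nn.ReLU(), nn.Linear(64, x_dim))

        def forward(self, x):
            mu, logvar = self.enc(x).chunk(2, dim=-1)
            z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()
            return self.dec(z), mu, logvar, z

    def pearson(a, b, eps=1e-8):
        a = a - a.mean()
        b = b - b.mean()
        return (a * b).sum() / (a.norm() * b.norm() + eps)

    model = TinyVAE()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    x = torch.rand(256, 24)                      # toy "load state" batch
    for _ in range(5):
        x_hat, mu, logvar, z = model(x)
        recon = (x_hat - x).pow(2).mean()
        kld = -0.5 * (1 + logvar - mu.pow(2) - logvar.exp()).mean()
        # Orientation penalty: the first latent coordinate should increase with
        # the total load of the reconstruction (a hypothetical risk proxy).
        orient = 1.0 - pearson(z[:, 0], x_hat.sum(dim=1))
        loss = recon + 1e-3 * kld + 0.1 * orient
        opt.zero_grad()
        loss.backward()
        opt.step()

Once such an orientation is learned, targeted sampling amounts to drawing latent codes with the first coordinate biased toward the region of interest, which is where the importance sampling step comes in.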


DeepStruct: Pretraining of Language Models for Structure Prediction

arXiv.org Artificial Intelligence

We introduce a method for improving the structural understanding abilities of language models. Unlike previous approaches that finetune the models with task-specific augmentation, we pretrain language models on a collection of task-agnostic corpora to generate structures from text. Our structure pretraining enables zero-shot transfer of the learned knowledge that models have about the structure tasks. We study the performance of this approach on 28 datasets, spanning 10 structure prediction tasks including open information extraction, joint entity and relation extraction, named entity recognition, relation classification, semantic role labeling, event extraction, coreference resolution, factual probe, intent detection, and dialogue state tracking. We further enhance the pretraining with the task-specific training sets. We show that a 10B parameter language model transfers non-trivially to most tasks and obtains state-of-the-art performance on 21 of 28 datasets that we evaluate.


Comprehensive Analysis of Over-smoothing in Graph Neural Networks from Markov Chains Perspective

arXiv.org Artificial Intelligence

The over-smoothing problem is an obstacle to developing deep graph neural networks (GNNs). Although many approaches for mitigating over-smoothing have been proposed, a comprehensive understanding of the problem is still lacking. In this work, we analyze the over-smoothing problem from the Markov chain perspective. We focus on message passing in GNNs and first establish a connection between GNNs and Markov chains on the graph. GNNs are divided into two classes, operator-consistent and operator-inconsistent, based on whether the corresponding Markov chains are time-homogeneous. Next, we attribute the over-smoothing problem to the convergence of an arbitrary initial distribution to a stationary distribution. On this basis, we prove that although previously proposed methods can alleviate over-smoothing, they cannot avoid it. In addition, we characterize the over-smoothing behavior of both types of GNNs in the Markovian sense. On the one hand, operator-consistent GNNs cannot avoid over-smoothing, which occurs at an exponential rate. On the other hand, operator-inconsistent GNNs do not always over-smooth. Further, we investigate the existence of the limiting distribution of the time-inhomogeneous Markov chain, from which we derive a sufficient condition for operator-inconsistent GNNs to avoid over-smoothing. Finally, we design experiments to verify our findings. The results show that the proposed sufficient condition effectively alleviates the over-smoothing problem in operator-inconsistent GNNs and enhances the performance of the model.
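
An illustrative sketch of the Markov chain view, not taken from the paper: repeated message passing with a fixed random-walk operator behaves like a time-homogeneous Markov chain, so node representations collapse toward a stationary mixture regardless of the initial features. The toy graph and features below are made up for demonstration.

    import numpy as np

    A = np.array([[0, 1, 1, 0],
                  [1, 0, 1, 1],
                  [1, 1, 0, 1],
                  [0, 1, 1, 0]], dtype=float)        # toy undirected graph
    P = A / A.sum(axis=1, keepdims=True)             # random-walk transition matrix
    H = np.random.rand(4, 3)                         # arbitrary initial node features

    for k in (1, 10, 100):
        Hk = np.linalg.matrix_power(P, k) @ H        # k rounds of message passing
        spread = Hk.max(axis=0) - Hk.min(axis=0)     # how distinguishable nodes remain
        print(k, spread.round(4))                    # spread shrinks roughly geometrically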


ASP: Learn a Universal Neural Solver!

arXiv.org Artificial Intelligence

Applying machine learning to combinatorial optimization problems (COPs) has the potential to improve both efficiency and accuracy. However, existing learning-based solvers often struggle to generalize when problem distributions and scales change. In this paper, we propose a new approach called ASP: Adaptive Staircase Policy Space Response Oracle, to address these generalization issues and learn a universal neural solver. ASP consists of two components: Distributional Exploration, which enhances the solver's ability to handle unknown distributions using Policy Space Response Oracles, and Persistent Scale Adaption, which improves scalability through curriculum learning. We have tested ASP on several challenging COPs, including the traveling salesman problem (TSP), the vehicle routing problem, and the prize-collecting TSP, as well as real-world instances from TSPLib and CVRPLib. Our results show that, even with the same model size and a weak training signal, ASP helps neural solvers explore and adapt to unseen distributions and varying scales, achieving superior performance. In particular, compared with the same neural solvers under a standard training pipeline, ASP reduces the optimality gap by 90.9% and 47.43% on generated and real-world instances for TSP, and by 19% and 45.57% for CVRP.
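
A hedged sketch of an adaptive-staircase curriculum over instance sizes, loosely in the spirit of the Persistent Scale Adaption component; the thresholds, step size, and the evaluate() stub below are assumptions for illustration, not the paper's values.

    import random

    def evaluate(solver, n):
        """Stub: return the solver's optimality gap on instances of size n."""
        return random.uniform(0.0, 0.1) * (n / 50)   # placeholder metric

    def staircase_curriculum(solver, n=20, n_max=200, step=10, target_gap=0.05, rounds=30):
        sizes = []
        for _ in range(rounds):
            gap = evaluate(solver, n)
            if gap <= target_gap:            # solver copes: move up the staircase
                n = min(n + step, n_max)
            else:                            # solver struggles: step back down
                n = max(n - step, step)
            sizes.append(n)                  # train on size n here in a real loop
        return sizes

    print(staircase_curriculum(solver=None))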


Generating Multivariate Load States Using a Conditional Variational Autoencoder

arXiv.org Artificial Intelligence

For planning of power systems and for the calibration of operational tools, it is essential to analyse system performance in a large range of representative scenarios. When the available historical data is limited, generative models are a promising solution, but modelling high-dimensional dependencies is challenging. In this paper, a multivariate load state generating model based on a conditional variational autoencoder (CVAE) neural network is proposed. Going beyond common CVAE implementations, the model includes stochastic variation of output samples under given latent vectors and co-optimizes the parameters for this output variability. It is shown that this improves the statistical properties of the generated data. The quality of generated multivariate loads is evaluated using univariate and multivariate performance metrics. A generation adequacy case study on the European network is used to illustrate the model's ability to generate realistic tail distributions. The experiments demonstrate that the proposed generator outperforms other data generating mechanisms.
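
A minimal sketch of the co-optimized output variability, under my own assumptions rather than the paper's code: the decoder outputs both a mean and a per-dimension log-variance, trained with a Gaussian negative log-likelihood, so output variation under a given latent vector is itself a learned parameter.

    import torch
    import torch.nn as nn

    class CVAEDecoder(nn.Module):
        def __init__(self, z_dim=8, c_dim=4, x_dim=24):
            super().__init__()
            self.net = nn.Sequential(nn.Linear(z_dim + c_dim, 64), nn.ReLU(),
                                     nn.Linear(64, 2 * x_dim))

        def forward(self, z, c):
            mu, logvar = self.net(torch.cat([z, c], dim=-1)).chunk(2, dim=-1)
            return mu, logvar

    def gaussian_nll(x, mu, logvar):
        # Learning logvar alongside mu is the "co-optimized output variability" part.
        return 0.5 * (logvar + (x - mu).pow(2) / logvar.exp()).mean()

    dec = CVAEDecoder()
    z, c = torch.randn(32, 8), torch.randn(32, 4)     # latent code and condition
    x = torch.rand(32, 24)                            # toy load states
    mu, logvar = dec(z, c)
    loss = gaussian_nll(x, mu, logvar)
    sample = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # stochastic output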


Generating Contextual Load Profiles Using a Conditional Variational Autoencoder

arXiv.org Artificial Intelligence

Generating power system states that have a similar distribution and dependency structure to historical ones is essential for system planning and security assessment, especially when historical data is insufficient. In this paper, we describe a generative model for the load profiles of industrial and commercial customers, based on the conditional variational autoencoder (CVAE) neural network architecture; generating such profiles is challenging due to their highly variable nature. The generated contextual load profiles are conditioned on the month of the year and the typical power exchange with the grid. Moreover, the quality of the generated profiles is evaluated both visually and statistically. The experimental results demonstrate that the proposed CVAE model can capture temporal features of historical load profiles and generate 'realistic' data with satisfactory univariate distributions and multivariate dependencies.
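
A small illustrative sketch, not the authors' code, of how such a context vector could be assembled: a one-hot month plus a normalized typical power exchange, concatenated into the condition fed to the CVAE. The dimensions and scaling are assumptions.

    import numpy as np

    def make_condition(month, power_exchange_kw, max_exchange_kw=500.0):
        one_hot = np.zeros(12)
        one_hot[month - 1] = 1.0                      # month of the year, 1..12
        exch = np.array([power_exchange_kw / max_exchange_kw])
        return np.concatenate([one_hot, exch])        # passed to encoder and decoder

    c = make_condition(month=7, power_exchange_kw=120.0)
    print(c.shape)   # (13,) condition vector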


Zero-Shot Information Extraction as a Unified Text-to-Triple Translation

arXiv.org Artificial Intelligence

We cast a suite of information extraction tasks into a text-to-triple translation framework. Instead of solving each task with task-specific datasets and models, we formalize the task as a translation between task-specific input text and output triples. By taking the task-specific input, we enable a task-agnostic translation by leveraging the latent knowledge that a pre-trained language model has about the task. We further demonstrate that a simple pre-training task of predicting which relational information corresponds to which input text is an effective way to produce task-specific outputs. This enables the zero-shot transfer of our framework to downstream tasks. We study the zero-shot performance of this framework on open information extraction (OIE2016, NYT, WEB, PENN), relation classification (FewRel and TACRED), and factual probe (Google-RE and T-REx). The model transfers non-trivially to most tasks and is often competitive with a fully supervised method without the need for any task-specific training. For instance, we significantly outperform the F1 score of supervised open information extraction without using its training set.
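
A toy illustration of the unified text-to-triple framing; the template below paraphrases the idea and is not the paper's exact input/output format. Different IE tasks share one interface: the source encodes the task-specific text, and the target is always a sequence of triples.

    def to_translation_example(task, text, output_triples):
        source = f"{task}: {text}"
        target = " ; ".join(f"({h} | {r} | {t})" for h, r, t in output_triples)
        return source, target

    examples = [
        to_translation_example(
            "open information extraction",
            "Born in Hawaii, Obama became the 44th U.S. president.",
            [("Obama", "born in", "Hawaii"),
             ("Obama", "became", "the 44th U.S. president")],
        ),
        to_translation_example(
            "relation classification",
            "Obama was born in Hawaii.",
            [("Obama", "place of birth", "Hawaii")],
        ),
    ]
    for src, tgt in examples:
        print(src, "->", tgt)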


Language Models are Open Knowledge Graphs

arXiv.org Artificial Intelligence

This paper shows how to construct knowledge graphs (KGs) from pre-trained language models (e.g., BERT, GPT-2/3), without human supervision. Popular KGs (e.g., Wikidata, NELL) are built in either a supervised or semi-supervised manner, requiring humans to create knowledge. Recent deep language models automatically acquire knowledge from large-scale corpora via pre-training. The stored knowledge has enabled the language models to improve downstream NLP tasks, e.g., answering questions, and writing code and articles. In this paper, we propose an unsupervised method to cast the knowledge contained within language models into KGs. We show that KGs are constructed with a single forward pass of the pre-trained language models (without fine-tuning) over the corpora. We demonstrate the quality of the constructed KGs by comparing to two KGs (Wikidata, TAC KBP) created by humans. Our KGs also provide open factual knowledge that is new in the existing KGs. Our code and KGs will be made publicly available.
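
A simplified sketch of the "single forward pass, no fine-tuning" idea using the Hugging Face transformers library. The triple-scoring step here (averaging attention between two given entity mentions and the tokens between them) is a rough approximation for illustration only, not the paper's actual matching procedure.

    import torch
    from transformers import AutoTokenizer, AutoModel

    tok = AutoTokenizer.from_pretrained("bert-base-uncased")
    model = AutoModel.from_pretrained("bert-base-uncased", output_attentions=True)

    sentence = "Obama was born in Hawaii."
    inputs = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        attn = model(**inputs).attentions            # tuple over layers: [1, heads, seq, seq]

    # Average attention over layers and heads, then score tokens between the
    # head entity ("obama") and tail entity ("hawaii") as relation candidates.
    A = torch.stack(attn).mean(dim=(0, 2))[0]        # [seq, seq]
    tokens = tok.convert_ids_to_tokens(inputs["input_ids"][0])
    head, tail = tokens.index("obama"), tokens.index("hawaii")
    scores = {tokens[i]: float(A[tail, i] + A[i, head]) for i in range(head + 1, tail)}
    relation = max(scores, key=scores.get)
    print(("obama", relation, "hawaii"))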


GluonCV and GluonNLP: Deep Learning in Computer Vision and Natural Language Processing

arXiv.org Machine Learning

We present GluonCV and GluonNLP, the deep learning toolkits for computer vision and natural language processing based on Apache MXNet (incubating). These toolkits provide state-of-the-art pre-trained models, training scripts, and training logs to facilitate rapid prototyping and promote reproducible research. We also provide modular APIs with flexible building blocks to enable efficient customization. Leveraging the MXNet ecosystem, the deep learning models in GluonCV and GluonNLP can be deployed onto a variety of platforms with different programming languages. Benefiting from being open source under the Apache 2.0 license, GluonCV and GluonNLP have attracted 100 contributors worldwide on GitHub. Models of GluonCV and GluonNLP have been downloaded more than 1.6 million times in fewer than 10 months.
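
A short usage sketch of the two model zoos, following the public GluonCV and GluonNLP documentation; exact model names and keyword arguments may differ across toolkit versions.

    from gluoncv import model_zoo
    import gluonnlp as nlp

    # Computer vision: a pre-trained ImageNet classifier in one call.
    net = model_zoo.get_model("ResNet50_v1d", pretrained=True)

    # NLP: a pre-trained BERT encoder plus its vocabulary.
    bert, vocab = nlp.model.get_model(
        "bert_12_768_12",
        dataset_name="book_corpus_wiki_en_uncased",
        pretrained=True,
        use_classifier=False,
        use_decoder=False,
    )
    print(type(net).__name__, type(bert).__name__)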


Language Models with Transformers

arXiv.org Artificial Intelligence

The Transformer architecture is superior to RNN-based models in computational efficiency. Recently, GPT and BERT have demonstrated the efficacy of Transformer models on various NLP tasks using pre-trained language models on large-scale corpora. Surprisingly, these Transformer architectures are suboptimal for language modeling itself. Neither self-attention nor the positional encoding in the Transformer is able to efficiently incorporate the word-level sequential context crucial to language modeling. In this paper, we explore effective Transformer architectures for language modeling, including adding additional LSTM layers to better capture the sequential context while keeping the computation efficient. We propose Coordinate Architecture Search (CAS) to find an effective architecture through iterative refinement of the model. Experimental results on PTB, WikiText-2, and WikiText-103 show that CAS achieves perplexities between 20.42 and 34.11 on all problems, i.e., an average improvement of 12.0 perplexity units over state-of-the-art LSTMs.
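
A minimal PyTorch sketch of one fixed architecture in the family CAS searches over (the search itself is not shown, and the layer counts and insertion point are assumptions): an LSTM layer inserted between blocks of Transformer layers to re-introduce word-level sequential context.

    import torch
    import torch.nn as nn

    class TransformerWithLSTM(nn.Module):
        def __init__(self, vocab=10000, d_model=256, n_layers=4, lstm_after=2):
            super().__init__()
            self.embed = nn.Embedding(vocab, d_model)
            layer = lambda: nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
            self.lower = nn.ModuleList(layer() for _ in range(lstm_after))
            self.lstm = nn.LSTM(d_model, d_model, batch_first=True)   # inserted layer
            self.upper = nn.ModuleList(layer() for _ in range(n_layers - lstm_after))
            self.out = nn.Linear(d_model, vocab)

        def forward(self, tokens):
            h = self.embed(tokens)
            for blk in self.lower:
                h = blk(h)
            h, _ = self.lstm(h)              # sequential context between blocks
            for blk in self.upper:
                h = blk(h)
            return self.out(h)

    model = TransformerWithLSTM()
    logits = model(torch.randint(0, 10000, (2, 16)))   # [batch, seq, vocab]
    print(logits.shape)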