AITopics

2007.04785

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

arXiv.org Machine Learning, Jul-7-2020

ASGN: An Active Semi-supervised Graph Neural Network for Molecular Property Prediction

Hao, Zhongkai, Lu, Chengqiang, Hu, Zheyuan, Wang, Hao, Huang, Zhenya, Liu, Qi, Chen, Enhong, Lee, Cheekong

Molecular property prediction (e.g., energy) is an essential problem in chemistry and biology. Unfortunately, many supervised learning methods suffer from the scarcity of labeled molecules in chemical space, because such property labels are generally obtained by Density Functional Theory (DFT) calculation, which is extremely computationally costly. An effective solution is to incorporate unlabeled molecules in a semi-supervised fashion. However, learning semi-supervised representations for large numbers of molecules is challenging: the representation must jointly capture molecular essence and structure, and representation learning can conflict with property learning. Here we propose a novel framework called Active Semi-supervised Graph Neural Network (ASGN) that incorporates both labeled and unlabeled molecules. Specifically, ASGN adopts a teacher-student framework. In the teacher model, we propose a novel semi-supervised learning method to learn a general representation that jointly exploits information from molecular structure and molecular distribution. Then, in the student model, we target the property prediction task to deal with the learning loss conflict. Finally, we propose a novel active learning strategy based on molecular diversity to select informative data throughout the learning of the whole framework. We conduct extensive experiments on several public datasets. Experimental results show the remarkable performance of our ASGN framework.
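The abstract does not spell out how the diversity-based active learning step works. As a rough illustration only, one common way to pick a diverse batch is greedy k-center (farthest-point) selection over learned embeddings. The function name `k_center_greedy`, the toy 2-D embeddings, and the choice of Euclidean distance are all assumptions made for this sketch, not details taken from the paper.

```python
import math


def k_center_greedy(embeddings, labeled_idx, budget):
    """Greedy farthest-point selection (hypothetical helper, not the paper's code).

    embeddings  : list of equal-length numeric tuples, one per molecule
    labeled_idx : non-empty list of indices that already have labels
    budget      : number of new molecules to select for labeling
    """
    # Distance from every point to its nearest already-labeled point.
    min_dist = [
        min(math.dist(e, embeddings[j]) for j in labeled_idx)
        for e in embeddings
    ]
    picked = []
    for _ in range(budget):
        # Take the point that is currently farthest from the covered set.
        i = max(range(len(embeddings)), key=lambda k: min_dist[k])
        picked.append(i)
        # The new point now covers its neighbourhood; shrink distances.
        min_dist = [
            min(d, math.dist(e, embeddings[i]))
            for d, e in zip(min_dist, embeddings)
        ]
    return picked


# Toy 2-D "embeddings": two tight clusters plus one isolated point.
points = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
selected = k_center_greedy(points, labeled_idx=[0], budget=2)
print(selected)
```

With a single labeled point at the origin, the sketch first selects the isolated point and then one representative of the far cluster, rather than a near-duplicate of the labeled molecule.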

artificial intelligence, molecule, neural network

doi: 10.1145/3394486.3403117

2007.03196

Country:

Asia (0.46)
North America > United States > Wisconsin (0.14)

Genre: Research Report > New Finding (0.48)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.68)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Neural Information Processing Systems, Feb-14-2020, 19:57:31 GMT

Neural Architecture Optimization

Luo, Renqian, Tian, Fei, Qin, Tao, Chen, Enhong, Liu, Tie-Yan

Automatic neural architecture design has shown its potential in discovering powerful neural network architectures. Existing methods, whether based on reinforcement learning or evolutionary algorithms (EA), conduct architecture search in a discrete space, which is highly inefficient. In this paper, we propose a simple and efficient method for automatic neural architecture design based on continuous optimization. We call this new approach neural architecture optimization (NAO). There are three key components in our proposed approach: (1) An encoder embeds/maps neural network architectures into a continuous space.

architecture, artificial intelligence, neural network

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Systems & Languages > Problem-Independent Architectures (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Machine Learning, Jan-2-2020

Deep Technology Tracing for High-tech Companies

Wu, Han, Zhang, Kun, Lv, Guangyi, Liu, Qi, Yu, Runlong, Zhao, Weihao, Chen, Enhong, Ma, Jianhui

Technological change and innovation are vitally important, especially for high-tech companies. However, the factors influencing their future research and development (R&D) trends are complicated and varied, which makes technology tracing for high-tech companies quite difficult. To this end, in this paper, we develop a novel data-driven solution, the Deep Technology Forecasting (DTF) framework, to automatically find the most likely technology directions for each high-tech company. Specifically, DTF consists of three components: Potential Competitor Recognition (PCR), Collaborative Technology Recognition (CTR), and the Deep Technology Tracing (DTT) neural network. PCR and CTR aim to capture competitive relations among enterprises and collaborative relations among technologies, respectively. DTT is designed to model dynamic interactions between companies and technologies with the above relations involved. Finally, we evaluate our DTF framework on real-world patent data, and the experimental results clearly demonstrate that DTF can accurately forecast the future technology emphasis of companies by exploiting hybrid factors.

company and technology, deep learning, intellectual property & technology law

doi: 10.1109/ICDM.2019.00180

2001.08606

Country: North America > United States (0.47)

Genre: Research Report (0.64)

Industry:

Law > Intellectual Property & Technology Law (1.00)
Information Technology (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Artificial Intelligence, Oct-27-2019

Long-term Joint Scheduling for Urban Traffic

Liang, Xianfeng, Wu, Likang, Chen, Joya, Liu, Yang, Yu, Runlong, Hou, Min, Wu, Han, Ye, Yuyang, Liu, Qi, Chen, Enhong

Recently, traffic congestion in modern cities has become a growing worry for residents. As presented in the Baidu traffic report, the commuting stress index has reached a surprising 1.973 in Beijing during rush hours, which results in longer trip times and increased vehicular queueing. Previous works have demonstrated that reasonable scheduling, e.g., rebalancing bike-sharing systems and optimizing bus transportation, can significantly improve traffic efficiency with little resource consumption. However, two disadvantages still restrict their performance: (1) they only consider a single scheduling step over a short time, ignoring the layout after the first repositioning, and (2) they focus on a single mode of transport, so the multi-modal characteristics of urban public transportation remain largely under-exploited. In this paper, we propose an efficient and economical multi-modal traffic scheduling scheme named JLRLS, based on spatio-temporal prediction, which adopts reinforcement learning to obtain an optimal long-term and joint schedule. In JLRLS, we combine multiple modes of transportation and schedule each according to its own characteristics, which potentially helps the system reach optimal performance. Our example implementation in PaddlePaddle is available at https://github.com/bigdata-ustc/Long-term-Joint-Scheduling, with an explanatory video at https://youtu.be/t5M2wVPhTyk.

deep learning, neural network, scheduling

1910.12283

Country: Asia > China > Beijing > Beijing (0.25)

Genre: Research Report (0.50)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Data Science (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

arXiv.org Machine Learning, Sep-24-2019

Understanding and Improving One-shot Neural Architecture Optimization

Luo, Renqian, Qin, Tao, Chen, Enhong

The ability to accurately rank candidate architectures is the key to the performance of neural architecture search (NAS). One-shot NAS was proposed to cut the expense, but it shows inferior performance compared with conventional NAS and is not adequately stable. We find that the ranking correlation between architectures under one-shot training and the same architectures under stand-alone training is poor, which misleads the algorithm in its search for better architectures. We conjecture that this is owing to the gaps between one-shot training and stand-alone complete training. In this work, we empirically investigate several main factors that lead to these gaps and hence to the weak ranking correlation. We then propose NAO-V2 to alleviate such gaps, in which we: (1) Increase the average number of updates for each individual architecture to a relatively adequate extent. (2) Encourage more updates for large and complex architectures than for small and simple ones by sampling architectures in proportion to their model sizes. (3) Make the one-shot training of the supernet independent at each iteration. Comprehensive experiments verify that our proposed method is effective and robust. It leads to a more stable search in which all the top architectures perform well compared to baseline methods. The final discovered architecture shows significant improvements over baselines, with a test error rate of 2.60% on CIFAR-10 and top-1 accuracy of 74.4% on ImageNet under the mobile setting. Code and model checkpoints are publicly available at https://github.com/renqianluo/NAO_pytorch.
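Point (2) of the abstract, sampling architectures in proportion to their model sizes, can be sketched with weighted random sampling. This is an illustrative reading, not the released NAO_pytorch code: the architecture names, the parameter counts, and the helper names `sampling_probabilities` and `sample_architectures` are hypothetical.

```python
import random


def sampling_probabilities(param_counts):
    """Probability of drawing each architecture, proportional to its size."""
    total = sum(param_counts.values())
    return {name: size / total for name, size in param_counts.items()}


def sample_architectures(param_counts, num_samples, seed=0):
    """Draw architectures with replacement, weighted by parameter count."""
    rng = random.Random(seed)  # fixed seed only to make the demo repeatable
    names = list(param_counts)
    weights = [param_counts[n] for n in names]
    return rng.choices(names, weights=weights, k=num_samples)


# Hypothetical candidate architectures and their sizes (millions of parameters).
sizes = {"small": 1.0, "medium": 3.0, "large": 6.0}
probs = sampling_probabilities(sizes)
draws = sample_architectures(sizes, num_samples=10_000)
counts = {name: draws.count(name) for name in sizes}
print(probs)
print(counts)
```

Under this weighting the largest candidate is drawn about six times as often as the smallest, so it receives correspondingly more training updates within the shared supernet.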

architecture, deep learning, neural network

1909.10815

Country: Asia > China (0.28)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

arXiv.org Artificial Intelligence, Aug-28-2019

STMARL: A Spatio-Temporal Multi-Agent Reinforcement Learning Approach for Traffic Light Control

Wang, Yanan, Xu, Tong, Niu, Xin, Tan, Chang, Chen, Enhong, Xiong, Hui

The development of intelligent traffic light control systems is essential for smart transportation management. While some efforts have been made to optimize individual traffic lights in isolation, related studies have largely ignored the fact that traffic lights at multiple intersections influence one another spatially and that current traffic light control depends temporally on historical traffic status. To that end, in this paper, we propose a novel Spatio-Temporal Multi-Agent Reinforcement Learning (STMARL) framework that effectively captures the spatio-temporal dependency of multiple related traffic lights and controls these traffic lights in a coordinated way. Specifically, we first construct the traffic light adjacency graph based on the spatial structure among traffic lights. Then, historical traffic records are integrated with the current traffic status via a Recurrent Neural Network structure. Moreover, based on the temporally dependent traffic information, we design a Graph Neural Network based model to represent relationships among multiple traffic lights, and the decision for each traffic light is made in a distributed way by the deep Q-learning method. Finally, experimental results on both synthetic and real-world data demonstrate the effectiveness of our STMARL framework, which also provides an insightful understanding of the influence mechanism among multi-intersection traffic lights.
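The first step described above, building a traffic light adjacency graph from the spatial layout, can be illustrated on a toy grid. The sketch below assumes a rectangular grid of intersections and uses plain mean aggregation over neighbours as a stand-in for the Graph Neural Network; the function names, the grid layout, and the queue lengths are assumptions, and the recurrent and deep Q-learning parts are not shown.

```python
def grid_adjacency(rows, cols):
    """Adjacency list for intersections laid out on a rows x cols grid.

    Each intersection is linked to its up/down/left/right neighbours.
    """
    adj = {}
    for r in range(rows):
        for c in range(cols):
            node = r * cols + c
            nbrs = []
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    nbrs.append(rr * cols + cc)
            adj[node] = nbrs
    return adj


def aggregate(adj, queue_len):
    """One round of mean aggregation over each node and its neighbours."""
    out = {}
    for node, nbrs in adj.items():
        vals = [queue_len[node]] + [queue_len[n] for n in nbrs]
        out[node] = sum(vals) / len(vals)
    return out


adj = grid_adjacency(2, 2)
# Hypothetical queue lengths (vehicles waiting) at each intersection.
queues = {0: 4.0, 1: 0.0, 2: 8.0, 3: 0.0}
smoothed = aggregate(adj, queues)
print(adj)
print(smoothed)
```

After one aggregation round, each intersection's feature reflects congestion at its neighbours, which is the kind of spatial context a per-intersection agent could then condition its action on.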

deep learning, ground transportation, neural network

1908.10577

Country: North America > United States (0.30)

Genre: Research Report (0.82)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

arXiv.org Machine Learning, Aug-23-2019

Interpretable Cognitive Diagnosis with Neural Network for Intelligent Educational Systems

Wang, Fei, Liu, Qi, Chen, Enhong, Huang, Zhenya

In intelligent education systems, one key issue is to discover students' proficiency levels on specific knowledge concepts, which is called cognitive diagnosis. Existing approaches usually mine the student exercising process with manually designed functions, which are usually linear and not sufficient to capture complex relations between students and exercises. In this paper, we propose a general Neural Cognitive Diagnosis (NeuralCD) framework, which incorporates neural networks to learn the complex interactions between the factor vectors of students and exercises. The interpretability of the factor vectors is guaranteed by the monotonicity assumption borrowed from educational psychology. We provide the NeuralCDM model as an implementation example of the framework. Further, we explore the text content to improve NeuralCDM and show the extensibility of NeuralCD, and we demonstrate the generality of NeuralCD by proving how it covers some traditional diagnostic models. Extensive experimental results on real-world datasets show the effectiveness of the NeuralCD framework in terms of both accuracy and interpretability.

computer based training, deep learning, student

1908.08733

Genre: Research Report (0.64)

Industry:

Education > Educational Technology > Educational Software (0.47)
Education > Educational Setting (0.46)
Education > Policy & Governance > Governance (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Artificial Intelligence, Jun-2-2019

Budgeted Policy Learning for Task-Oriented Dialogue Systems

Zhang, Zhirui, Li, Xiujun, Gao, Jianfeng, Chen, Enhong

This paper presents a new approach that extends Deep Dyna-Q (DDQ) by incorporating Budget-Conscious Scheduling (BCS) to best utilize a fixed, small number of user interactions (the budget) for learning task-oriented dialogue agents. BCS consists of (1) a Poisson-based global scheduler that allocates the budget over different stages of training; (2) a controller that decides at each training step whether the agent is trained using real or simulated experiences; and (3) a user goal sampling module that generates the experiences that are most effective for policy learning. Experiments on a movie-ticket booking task with simulated and real users show that our approach leads to significant improvements in success rate over the state-of-the-art baselines given the fixed budget.
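The abstract names a Poisson-based global scheduler but gives no formula. One plausible reading, shown here purely as a sketch, is to split a fixed interaction budget across training stages in proportion to a Poisson probability mass function and then round to whole interactions. The rate `lam`, the stage indexing from 1, and the largest-remainder rounding are assumptions of this example and are not taken from the paper.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for a Poisson distribution with rate lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


def allocate_budget(total_budget, num_stages, lam):
    """Split an integer budget across stages, weighted by a Poisson pmf.

    Hypothetical scheduler: stage k (1-based) gets a share proportional
    to poisson_pmf(k, lam); rounding leftovers go to the stages with the
    largest fractional parts so the shares sum exactly to total_budget.
    """
    weights = [poisson_pmf(k, lam) for k in range(1, num_stages + 1)]
    norm = sum(weights)
    raw = [total_budget * w / norm for w in weights]
    alloc = [math.floor(x) for x in raw]
    leftover = total_budget - sum(alloc)
    order = sorted(range(num_stages), key=lambda i: raw[i] - alloc[i], reverse=True)
    for i in order[:leftover]:
        alloc[i] += 1
    return alloc


# Example: 100 real user interactions spread over 5 training stages.
plan = allocate_budget(total_budget=100, num_stages=5, lam=3.0)
print(plan)
```

With a rate near the middle of the stage range, this weighting spends fewer interactions in the first and last stages and more in the middle ones; a different rate would shift where the budget is concentrated.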

agent, deep learning, neural network

1906.00499

Genre: Research Report (1.00)

Industry: Media > Film (0.35)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.69)

arXiv.org Machine Learning, May-26-2019

Transcribing Content from Structural Images with Spotlight Mechanism

Yin, Yu, Huang, Zhenya, Chen, Enhong, Liu, Qi, Zhang, Fuzheng, Xie, Xing, Hu, Guoping

Transcribing content from structural images, e.g., writing notes from music scores, is a challenging task because not only must the content objects be recognized, but the internal structure must also be preserved. Existing image recognition methods mainly work on images with simple content (e.g., text lines with characters), but are not capable of identifying those with more complex content (e.g., structured symbols), which often follow a fine-grained grammar. To this end, in this paper, we propose a hierarchical Spotlight Transcribing Network (STN) framework that follows a two-stage "where-to-what" solution. Specifically, we first decide "where-to-look" through a novel spotlight mechanism that focuses on different areas of the original image following its structure. Then, we decide "what-to-write" by developing a GRU-based network that uses the spotlight areas to transcribe the content accordingly. Moreover, we propose two implementations on the basis of STN, i.e., STNM and STNR, where the spotlight movement follows the Markov property and recurrent modeling, respectively. We also design a reinforcement method to refine the framework by self-improving the spotlight mechanism. We conduct extensive experiments on many structural image datasets, where the results clearly demonstrate the effectiveness of the STN framework.

deep learning, neural network, structural image

doi: 10.1145/3219819.3219962

1905.10954

Country:

Asia (0.28)
North America > United States (0.28)

Genre: Research Report > New Finding (0.46)

Industry:

Education (1.00)
Leisure & Entertainment (0.88)
Media > Music (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)