
Collaborating Authors

Lu, Wei


Deep-Learning-Enabled Simulated Annealing for Topology Optimization

arXiv.org Machine Learning

Topology optimization, which distributes material in a domain, requires stochastic optimizers to solve highly complicated problems. However, solving such problems can require millions of finite element calculations involving hundreds of design variables or more, whose computational cost is huge and often unacceptable. To speed up computation, here we report a method that integrates deep learning into a stochastic optimization algorithm. A Deep Neural Network (DNN) learns and substitutes for the objective function by forming a loop with Generative Simulated Annealing (GSA). In each iteration, GSA uses the DNN to evaluate the objective function and obtain an optimized solution, based on which new training data are generated; the DNN thus improves its accuracy, and GSA can accordingly improve its solution in the next iteration, until convergence. Our algorithm was tested on compliance minimization problems and reduced computational time by over two orders of magnitude. This approach sheds light on solving large multi-dimensional optimization problems.
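
The DNN-in-the-loop annealing scheme described above can be sketched compactly. In the toy version below, scikit-learn's MLPRegressor stands in for the DNN, scipy's dual_annealing (an implementation of generalized simulated annealing) stands in for the GSA component, and a cheap synthetic function replaces the expensive finite-element compliance evaluation; all names and hyperparameters are illustrative assumptions, not the paper's setup.

```python
# Minimal sketch: a neural surrogate substitutes the objective inside a
# simulated-annealing loop, and the expensive "true" objective is only
# called a few times per outer iteration to generate fresh training data.
import numpy as np
from sklearn.neural_network import MLPRegressor
from scipy.optimize import dual_annealing

rng = np.random.default_rng(0)
n_vars = 8
bounds = [(0.0, 1.0)] * n_vars

def true_objective(x):
    """Stand-in for the expensive finite-element evaluation."""
    return np.sum((x - 0.3) ** 2) + 0.1 * np.sin(5 * x).sum()

# Initial training set from random designs.
X = rng.uniform(0, 1, size=(200, n_vars))
y = np.array([true_objective(x) for x in X])

surrogate = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000)

best_x, best_f = None, np.inf
for it in range(5):                      # outer DNN <-> annealing loop
    surrogate.fit(X, y)                  # (re)train the surrogate
    # The annealer only ever queries the cheap surrogate.
    res = dual_annealing(
        lambda x: float(surrogate.predict(x.reshape(1, -1))[0]),
        bounds, maxiter=200, seed=it)
    f_true = true_objective(res.x)       # one expensive check per iteration
    if f_true < best_f:
        best_x, best_f = res.x, f_true
    # Enrich the training data near the current optimum, refit next round.
    X_new = np.clip(res.x + 0.05 * rng.standard_normal((50, n_vars)), 0, 1)
    y_new = np.array([true_objective(x) for x in X_new])
    X, y = np.vstack([X, X_new]), np.concatenate([y, y_new])
    print(f"iter {it}: surrogate min ~ {res.fun:.4f}, true f = {f_true:.4f}")

print("best design:", np.round(best_x, 3), "objective:", round(best_f, 4))
```

The essential structure is the alternation: the annealer explores using only cheap surrogate calls, while the handful of true evaluations per outer iteration concentrate new training data around the current optimum.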


Read Beyond the Lines: Understanding the Implied Textual Meaning via a Skim and Intensive Reading Model

arXiv.org Artificial Intelligence

The nonliteral interpretation of a text is hard for machine models to understand due to its high context-sensitivity and heavy usage of figurative language. In this study, inspired by human reading comprehension, we propose a novel, simple, and effective deep neural framework, called the Skim and Intensive Reading Model (SIRM), for figuring out implied textual meaning. The proposed SIRM consists of two main components, namely the skim reading component and the intensive reading component. N-gram features are quickly extracted by the skim reading component, a combination of several convolutional neural networks, as skim (entire) information. The intensive reading component enables a hierarchical investigation of both local (sentence) and global (paragraph) representations, encapsulating the current embedding and the contextual information with a dense connection. More specifically, the contextual information includes the near-neighbor information and the skim information mentioned above. Finally, besides the normal training loss function, we employ an adversarial loss function as a penalty over the skim reading component to eliminate noisy information arising from special figurative words in the training data. To verify the effectiveness, robustness, and efficiency of the proposed architecture, we conduct extensive comparative experiments on several sarcasm benchmarks and an industrial spam dataset with metaphors. Experimental results indicate that (1) the proposed model, which benefits from context modeling and consideration of figurative language, outperforms existing state-of-the-art solutions with a comparable parameter scale and training speed; (2) the SIRM yields superior robustness in terms of parameter-size sensitivity; and (3) compared with ablation and addition variants, the final framework is sufficiently efficient.
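
A minimal sketch of this two-component layout may help fix ideas. The PyTorch module below combines a multi-kernel CNN "skim" encoder with a hierarchical sentence/paragraph "intensive" encoder whose inputs are densely concatenated with the skim vector; the adversarial loss is omitted, and all dimensions and layer choices are illustrative assumptions rather than the paper's configuration.

```python
# Illustrative skim + intensive reading forward pass (not the paper's exact model).
import torch
import torch.nn as nn

class SIRMSketch(nn.Module):
    def __init__(self, vocab=10000, emb=128, hid=128, n_classes=2):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        # Skim reading: several CNNs with different n-gram widths.
        self.convs = nn.ModuleList(
            [nn.Conv1d(emb, hid, k, padding=k // 2) for k in (2, 3, 4)])
        # Intensive reading: sentence-level then paragraph-level RNNs.
        self.sent_rnn = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
        self.para_rnn = nn.GRU(2 * hid + 3 * hid, hid, batch_first=True)
        self.classify = nn.Linear(hid + 3 * hid, n_classes)

    def forward(self, tokens):            # tokens: (batch, sents, words)
        b, s, w = tokens.shape
        x = self.embed(tokens.view(b * s, w))            # (b*s, w, emb)
        # Skim: max-pooled n-gram features over the whole text.
        flat = self.embed(tokens.view(b, s * w)).transpose(1, 2)
        skim = torch.cat([c(flat).max(dim=2).values for c in self.convs], -1)
        # Intensive, local: one vector per sentence.
        sent_out, _ = self.sent_rnn(x)
        sent_vec = sent_out.max(dim=1).values.view(b, s, -1)
        # Dense connection: append skim info to every sentence vector.
        dense_in = torch.cat(
            [sent_vec, skim.unsqueeze(1).expand(-1, s, -1)], -1)
        # Intensive, global: paragraph representation over sentences.
        _, h = self.para_rnn(dense_in)
        return self.classify(torch.cat([h[-1], skim], -1))

logits = SIRMSketch()(torch.randint(0, 10000, (4, 6, 20)))
print(logits.shape)  # torch.Size([4, 2])
```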


The Learning of Fuzzy Cognitive Maps With Noisy Data: A Rapid and Robust Learning Method With Maximum Entropy

arXiv.org Machine Learning

Numerous learning methods for fuzzy cognitive maps (FCMs), such as Hebbian-based and population-based methods, have been developed for modeling and simulating dynamic systems. However, these methods face several obvious limitations. Most of them are extremely time-consuming when learning large-scale FCMs with hundreds of nodes. Furthermore, the FCMs learned by those algorithms lack robustness when the experimental data contain noise. In addition, a reasonable distribution of the weights is rarely considered in these algorithms, which can reduce the performance of the resulting FCM. In this article, a straightforward, rapid, and robust method is proposed to learn FCMs from noisy data, especially large-scale FCMs. The crux of the proposed algorithm is to equivalently transform the learning problem of FCMs into a classic constrained convex optimization problem, in which the least-squares term ensures the robustness of the learned FCM and the maximum entropy term regularizes the distribution of its weights. A series of experiments covering two frequently used activation functions (the sigmoid and hyperbolic tangent functions) are performed on both synthetic datasets with noise and real-world datasets. The experimental results show that the proposed method is rapid and robust against noisy data and that the learned weights are better distributed. In addition, the FCMs learned by the proposed method exhibit superior performance in comparison with existing methods.
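
To make the optimization concrete: because the sigmoid activation is invertible, each row of the weight matrix can be fitted by least squares in logit space, with an entropy-style regularizer on the weights. The toy sketch below follows that recipe; mapping weights in [-1, 1] to a distribution via p = (w + 1)/2 is an illustrative assumption, not necessarily the paper's exact formulation.

```python
# Toy FCM learning: least squares in logit space plus a maximum-entropy
# regularizer, solved row by row as a bounded convex-style problem.
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit, logit

rng = np.random.default_rng(1)
n, T = 6, 80
W_true = np.clip(rng.normal(0, 0.5, (n, n)), -1, 1)

# Simulate a noisy FCM trajectory: x(t+1) = sigmoid(W x(t)) + noise.
X = np.empty((T, n)); X[0] = rng.uniform(0.2, 0.8, n)
for t in range(T - 1):
    X[t + 1] = np.clip(expit(W_true @ X[t]) + rng.normal(0, 0.01, n),
                       1e-3, 1 - 1e-3)

lam = 0.05  # strength of the entropy regularizer

def fit_row(i):
    A, b = X[:-1], logit(X[1:, i])       # linear system in logit space
    def objective(w):
        p = np.clip((w + 1) / 2, 1e-9, 1)   # assumed weight-to-distribution map
        p = p / p.sum()
        entropy = -np.sum(p * np.log(p))
        return np.sum((A @ w - b) ** 2) - lam * entropy  # maximize entropy
    return minimize(objective, np.zeros(n), bounds=[(-1, 1)] * n).x

W_learned = np.vstack([fit_row(i) for i in range(n)])
print("mean abs error vs. ground truth:",
      np.abs(W_learned - W_true).mean().round(3))
```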


cw2vec: Learning Chinese Word Embeddings with Stroke n-gram Information

AAAI Conferences

We propose cw2vec, a novel method for learning Chinese word embeddings. It is based on our observation that exploiting stroke-level information is crucial for improving the learning of Chinese word embeddings. Specifically, we design a minimalist approach to exploit such features, using stroke n-grams, which capture semantic and morphological information of Chinese words. Through qualitative analysis, we demonstrate that our model is able to extract semantic information that cannot be captured by existing methods. Empirical results on word similarity, word analogy, text classification, and named entity recognition tasks show that the proposed approach consistently outperforms state-of-the-art approaches such as word-based word2vec and GloVe, character-based CWE, component-based JWE, and pixel-based GWE.
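
The stroke n-gram features at the heart of cw2vec are easy to illustrate. In the sketch below, each character maps to a sequence of stroke-class codes 1-5 (horizontal, vertical, left-falling, right-falling, turning), the codes of a word's characters are concatenated, and n-gram windows are slid over the sequence; the two-entry stroke table is hand-made for illustration, whereas a real system would use a full stroke dictionary.

```python
# Stroke n-gram extraction for a Chinese word (illustrative stroke table).
STROKES = {           # character -> stroke-class sequence
    "大": [1, 3, 4],  # horizontal, left-falling, right-falling
    "人": [3, 4],     # left-falling, right-falling
}

def stroke_ngrams(word, n_min=3, n_max=12):
    """Concatenate stroke codes of all characters, then slide n-gram windows."""
    seq = [s for ch in word for s in STROKES[ch]]
    grams = []
    for n in range(n_min, min(n_max, len(seq)) + 1):
        grams += ["".join(map(str, seq[i:i + n]))
                  for i in range(len(seq) - n + 1)]
    return grams

print(stroke_ngrams("大人"))
# ['134', '343', '434', '1343', '3434', '13434']
```

In the full model, each stroke n-gram receives its own embedding and a word is scored against its context words through its n-grams, in the spirit of fastText's character n-gram scheme.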


Learning Latent Opinions for Aspect-level Sentiment Classification

AAAI Conferences

Aspect-level sentiment classification aims at detecting the sentiment expressed towards a particular target in a sentence. Based on the observation that the sentiment polarity is often related to specific spans in the given sentence, it is possible to make use of such information for better classification. On the other hand, such information can also serve as justifications for the predictions. We propose a segmentation-attention-based LSTM model that can effectively capture the structural dependencies between the target and the sentiment expressions with a linear-chain conditional random field (CRF) layer. The model simulates the human process of inferring sentiment information when reading: given a target, humans tend to search for surrounding relevant text spans in the sentence before making an informed decision on the underlying sentiment. We perform sentiment classification on publicly available datasets of online reviews across different languages from SemEval tasks and social comments from Twitter. Extensive experiments show that our model achieves state-of-the-art performance while extracting interpretable sentiment expressions.
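
The span-selection idea can be sketched as follows: a BiLSTM reads the sentence, each token receives a "relevant to the target" gate, and the gated token states are pooled for classification. The paper computes the selection variables exactly with a linear-chain CRF layer; the sketch below uses independent per-token gates purely to stay self-contained, and all dimensions are illustrative.

```python
# Simplified target-conditioned span selection for aspect-level sentiment.
import torch
import torch.nn as nn

class SoftSpanClassifier(nn.Module):
    def __init__(self, vocab=5000, emb=100, hid=100, n_classes=3):
        super().__init__()
        self.embed = nn.Embedding(vocab, emb)
        self.lstm = nn.LSTM(emb, hid, batch_first=True, bidirectional=True)
        self.gate = nn.Linear(2 * hid + emb, 1)   # token state + target vector
        self.out = nn.Linear(2 * hid, n_classes)

    def forward(self, sentence, target):          # (b, T) tokens, (b,) target ids
        h, _ = self.lstm(self.embed(sentence))    # (b, T, 2*hid)
        tgt = self.embed(target).unsqueeze(1).expand(-1, h.size(1), -1)
        p = torch.sigmoid(self.gate(torch.cat([h, tgt], -1)))   # (b, T, 1)
        pooled = (p * h).sum(1) / p.sum(1).clamp(min=1e-6)      # soft span pooling
        return self.out(pooled)

model = SoftSpanClassifier()
logits = model(torch.randint(0, 5000, (2, 12)), torch.randint(0, 5000, (2,)))
print(logits.shape)  # torch.Size([2, 3])
```

Replacing the independent gates with CRF marginals would restore the structural dependency between adjacent selection decisions that the paper exploits.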


Li

AAAI Conferences

In this paper, we focus on the task of extracting named entities together with their associated sentiment information in a joint manner.


Susanto

AAAI Conferences

We propose a neural graphical model for parsing natural language sentences into their logical representations. The graphical model is based on hybrid tree structures that jointly represent both sentences and semantics. Learning and decoding are done using efficient dynamic programming algorithms. The model is trained in a discriminative setting, which allows us to incorporate a rich set of features. Hybrid tree structures have been shown to achieve state-of-the-art results on standard semantic parsing datasets. In this work, we propose a novel model that incorporates rich, nonlinear featurization through a feedforward neural network. The error signals are computed with respect to the conditional random fields (CRF) objective using an inside-outside algorithm and are then backpropagated to the neural network. We demonstrate that by combining the strengths of exact global inference in hybrid tree models and the power of neural networks to extract high-level features, our model achieves new state-of-the-art results on standard benchmark datasets across different languages.
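
The training recipe (neural scores plugged into an exact CRF objective, with error signals backpropagated into the network) can be shown compactly on a linear chain, where the inside algorithm reduces to the standard forward recursion; the tree-structured inside-outside computation in the paper follows the same pattern. The sketch below is an illustration of that recipe, not the paper's hybrid-tree model.

```python
# Neural CRF on a linear chain: exact log-partition via the forward
# (inside) recursion, with gradients flowing into the feedforward net.
import torch
import torch.nn as nn

n_tags, emb, T = 4, 16, 6
embed = nn.Embedding(100, emb)
ff = nn.Sequential(nn.Linear(emb, 32), nn.Tanh(), nn.Linear(32, n_tags))
trans = nn.Parameter(torch.zeros(n_tags, n_tags))   # tag-transition scores

def crf_nll(tokens, tags):
    """Exact negative log-likelihood of a tag sequence under the neural CRF."""
    emit = ff(embed(tokens))                         # (T, n_tags) neural scores
    # Score of the gold path.
    gold = emit[torch.arange(T), tags].sum() + trans[tags[:-1], tags[1:]].sum()
    # Log-partition via the forward (inside) recursion in log space.
    alpha = emit[0]
    for t in range(1, T):
        alpha = torch.logsumexp(alpha.unsqueeze(1) + trans, dim=0) + emit[t]
    return torch.logsumexp(alpha, dim=0) - gold

tokens = torch.randint(0, 100, (T,))
tags = torch.randint(0, n_tags, (T,))
opt = torch.optim.Adam(
    list(embed.parameters()) + list(ff.parameters()) + [trans], lr=0.1)
for step in range(50):                # overfit one example as a sanity check
    opt.zero_grad()
    loss = crf_nll(tokens, tags)      # error signals reach ff via autograd
    loss.backward()
    opt.step()
print("final NLL:", loss.item())
```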


Cao

AAAI Conferences

We present a novel approach to learning word embeddings that explores subword information (character n-grams, roots/affixes, and inflections) and captures the structural information of a word's context with convolutional feature learning. Specifically, we introduce a convolutional neural network architecture that allows us to measure the structural information of context words and to incorporate subword features conveying semantic, syntactic, and morphological information. To assess the effectiveness of our model, we conduct extensive experiments on the standard word similarity and word analogy tasks. We show improvements over existing state-of-the-art methods for learning word embeddings, including skip-gram, GloVe, char n-gram, and DSSM.
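
A minimal sketch of the two ingredients reads as follows: the target word is represented by a sum of character n-gram embeddings (subword information), its context window is encoded by a 1-D convolution (structural information), and a dot product scores the pair. Hashing n-grams into a fixed table is a simplification borrowed from fastText, not necessarily the paper's setup.

```python
# Subword word vector + convolutional context encoder, scored by dot product.
import torch
import torch.nn as nn

emb, n_buckets, vocab = 64, 50000, 10000
ngram_table = nn.Embedding(n_buckets, emb)
ctx_embed = nn.Embedding(vocab, emb)
ctx_conv = nn.Conv1d(emb, emb, kernel_size=3, padding=1)

def char_ngrams(word, n_min=3, n_max=5):
    w = f"<{word}>"                       # boundary markers, fastText-style
    return [w[i:i + n] for n in range(n_min, n_max + 1)
            for i in range(len(w) - n + 1)]

def word_vector(word):
    """Sum of hashed character n-gram embeddings (subword information)."""
    ids = torch.tensor([hash(g) % n_buckets for g in char_ngrams(word)])
    return ngram_table(ids).sum(0)

def score(word, context_ids):
    """Dot-product score between subword vector and convolved context."""
    ctx = ctx_embed(context_ids).t().unsqueeze(0)     # (1, emb, window)
    ctx_vec = torch.tanh(ctx_conv(ctx)).max(dim=2).values.squeeze(0)
    return word_vector(word) @ ctx_vec

print(score("playing", torch.tensor([5, 17, 99, 4])).item())
```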