AITopics | Wang, Di

Collaborating Authors

Wang, Di

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Single-Path Mobile AutoML: Efficient ConvNet Design and NAS Hyperparameter Optimization

Stamoulis, Dimitrios, Ding, Ruizhou, Wang, Di, Lymberopoulos, Dimitrios, Priyantha, Bodhi, Liu, Jie, Marculescu, Diana

arXiv.org Machine LearningJul-1-2019

Can we reduce the search cost of Neural Architecture Search (NAS) from days down to only few hours? NAS methods automate the design of Convolutional Networks (ConvNets) under hardware constraints and they have emerged as key components of AutoML frameworks. However, the NAS problem remains challenging due to the combinatorially large design space and the significant search time (at least 200 GPU-hours). In this work, we alleviate the NAS search cost down to less than 3 hours, while achieving state-of-the-art image classification results under mobile latency constraints. We propose a novel differentiable NAS formulation, namely Single-Path NAS, that uses one single-path over-parameterized ConvNet to encode all architectural decisions based on shared convolutional kernel parameters, hence drastically decreasing the search overhead. Single-Path NAS achieves state-of-the-art top-1 ImageNet accuracy (75.62%), hence outperforming existing mobile NAS methods in similar latency settings (~80ms). In particular, we enhance the accuracy-runtime trade-off in differentiable NAS by treating the Squeeze-and-Excitation path as a fully searchable operation with our novel single-path encoding. Our method has an overall cost of only 8 epochs (24 TPU-hours), which is up to 5,000x faster compared to prior work. Moreover, we study how different NAS formulation choices affect the performance of the designed ConvNets. Furthermore, we exploit the efficiency of our method to answer an interesting question: instead of empirically tuning the hyperparameters of the NAS solver (as in prior work), can we automatically find the hyperparameter values that yield the desired accuracy-runtime trade-off? We open-source our entire codebase at: https://github.com/dstamoulis/single-path-nas.

convnet, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

1907.00959

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

Add feedback

Single-Path NAS: Device-Aware Efficient ConvNet Design

Stamoulis, Dimitrios, Ding, Ruizhou, Wang, Di, Lymberopoulos, Dimitrios, Priyantha, Bodhi, Liu, Jie, Marculescu, Diana

arXiv.org Machine LearningMay-10-2019

Can we automatically design a Convolutional Network (ConvNet) with the highest image classification accuracy under the latency constraint of a mobile device? Neural Architecture Search (NAS) for ConvNet design is a challenging problem due to the combinatorially large design space and search time (at least 200 GPU-hours). To alleviate this complexity, we propose Single-Path NAS, a novel differentiable NAS method for designing device-efficient ConvNets in less than 4 hours. 1. Novel NAS formulation: our method introduces a single-path, over-parameterized ConvNet to encode all architectural decisions with shared convolutional kernel parameters. 2. NAS efficiency: Our method decreases the NAS search cost down to 8 epochs (30 TPU-hours), i.e., up to 5,000x faster compared to prior work. 3. On-device image classification: Single-Path NAS achieves 74.96% top-1 accuracy on ImageNet with 79ms inference latency on a Pixel 1 phone, which is state-of-the-art accuracy compared to NAS methods with similar latency (<80ms).

artificial intelligence, arxiv preprint arxiv, neural network, (17 more...)

arXiv.org Machine Learning

1905.04159

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Add feedback

Single-Path NAS: Designing Hardware-Efficient ConvNets in less than 4 Hours

Stamoulis, Dimitrios, Ding, Ruizhou, Wang, Di, Lymberopoulos, Dimitrios, Priyantha, Bodhi, Liu, Jie, Marculescu, Diana

arXiv.org Machine LearningApr-5-2019

Can we automatically design a Convolutional Network (ConvNet) with the highest image classification accuracy under the runtime constraint of a mobile device? Neural architecture search (NAS) has revolutionized the design of hardware-efficient ConvNets by automating this process. However, the NAS problem remains challenging due to the combinatorially large design space, causing a significant searching time (at least 200 GPU-hours). To alleviate this complexity, we propose Single-Path NAS, a novel differentiable NAS method for designing hardware-efficient ConvNets in less than 4 hours. Our contributions are as follows: 1. Single-path search space: Compared to previous differentiable NAS methods, Single-Path NAS uses one single-path over-parameterized ConvNet to encode all architectural decisions with shared convolutional kernel parameters, hence drastically decreasing the number of trainable parameters and the search cost down to few epochs. 2. Hardware-efficient ImageNet classification: Single-Path NAS achieves 74.96% top-1 accuracy on ImageNet with 79ms latency on a Pixel 1 phone, which is state-of-the-art accuracy compared to NAS methods with similar constraints (<80ms). 3. NAS efficiency: Single-Path NAS search cost is only 8 epochs (30 TPU-hours), which is up to 5,000x faster compared to prior work. 4. Reproducibility: Unlike all recent mobile-efficient NAS methods which only release pretrained models, we open-source our entire codebase at: https://github.com/dstamoulis/single-path-nas.

convnet, deep learning, neural network, (18 more...)

arXiv.org Machine Learning

1904.02877

Country: North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Differentially Private High Dimensional Sparse Covariance Matrix Estimation

Wang, Di, Xu, Jinhui

arXiv.org Machine LearningJan-18-2019

In this paper, we study the problem of estimating the covariance matrix under differential privacy,where the underlying covariance matrix is assumed to be sparse and of high dimensions. Our approach can be easily extendedto local differential privacy. Experiments on the synthetic datasets show consistent results with our theoretical claims. Keywords: Differential privacy, sparse covariance estimation, high dimensional statistics 1. Introduction Machine Learning and Statistical Estimation have made profound impact in recent years to many applied domains such as social sciences, genomics, and medicine. During theirapplications, a frequently encountered challenge is how to deal with the high dimensionality of the datasets, especially for those in genomics, educational and psychological research.A commonly adopted strategy for dealing with such an issue is to assume that the underlying structures of parameters are sparse.

artificial intelligence, ln 1, machine learning, (15 more...)

arXiv.org Machine Learning

1901.06413

Country: North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.89)

Add feedback

Empirical Risk Minimization in Non-interactive Local Differential Privacy Revisited

Wang, Di, Gaboardi, Marco, Xu, Jinhui

Neural Information Processing SystemsDec-31-2018

In this paper, we revisit the Empirical Risk Minimization problem in the non-interactive local model of differential privacy. In the case of constant or low dimensions ($p\ll n$), we first show that if the loss function is $(\infty, T)$-smooth, we can avoid a dependence of the sample complexity, to achieve error $\alpha$, on the exponential of the dimensionality $p$ with base $1/\alpha$ ({\em i.e.,} $\alpha^{-p}$), which answers a question in \cite{smith2017interaction}. Our approach is based on polynomial approximation. Then, we propose player-efficient algorithms with $1$-bit communication complexity and $O(1)$ computation cost for each player. The error bound is asymptotically the same as the original one. With some additional assumptions, we also give an efficient algorithm for the server. In the case of high dimensions ($n\ll p$), we show that if the loss function is a convex generalized linear function, the error can be bounded by using the Gaussian width of the constrained set, instead of $p$, which improves the one in \cite{smith2017interaction}.

artificial intelligence, loss function, machine learning, (18 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Empirical Risk Minimization in Non-interactive Local Differential Privacy Revisited

Wang, Di, Gaboardi, Marco, Xu, Jinhui

Neural Information Processing SystemsDec-31-2018

artificial intelligence, loss function, machine learning, (14 more...)

Neural Information Processing Systems

Country: North America > United States > New York (0.28)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)

Add feedback

Differentially Private Empirical Risk Minimization in Non-interactive Local Model via Polynomial of Inner Product Approximation

Wang, Di, Smith, Adam, Xu, Jinhui

arXiv.org Machine LearningDec-17-2018

In this paper, we study the Empirical Risk Minimization problem in the non-interactive Local Differential Privacy (LDP) model. First, we show that for the hinge loss function, there is an $(\epsilon, \delta)$-LDP algorithm whose sample complexity for achieving an error of $\alpha$ is only linear in the dimensionality $p$ and quasi-polynomial in other terms. Then, we extend the result to any $1$-Lipschitz generalized linear convex loss functions by showing that every such function can be approximated by a linear combination of hinge loss functions and some linear functions. Finally, we apply our technique to the Euclidean median problem and show that its sample complexity needs only to be quasi-polynomial in $p$, which is the first result with a sub-exponential sample complexity in $p$ for non-generalized linear loss functions. Our results are based on a technique, called polynomial of inner product approximation, which may be applicable to other problems.

artificial intelligence, loss function, machine learning, (15 more...)

arXiv.org Machine Learning

1812.06825

Country: North America > United States > New York (0.28)

Genre: Research Report (0.70)

Industry: Information Technology > Security & Privacy (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.94)

Add feedback

An Affect-Rich Neural Conversational Model with Biased Attention and Weighted Cross-Entropy Loss

Zhong, Peixiang, Wang, Di, Miao, Chunyan

arXiv.org Artificial IntelligenceNov-16-2018

Affect conveys important implicit information in human communication. Having the capability to correctly express affect during human-machine conversations is one of the major milestones in artificial intelligence. In recent years, extensive research on open-domain neural conversational models has been conducted. However, embedding affect into such models is still under explored. In this paper, we propose an end-to-end affect-rich open-domain neural conversational model that produces responses not only appropriate in syntax and semantics, but also with rich affect. Our model extends the Seq2Seq model and adopts VAD (Valence, Arousal and Dominance) affective notations to embed each word with affects. In addition, our model considers the effect of negators and intensifiers via a novel affective attention mechanism, which biases attention towards affect-rich words in input sentences. Lastly, we train our model with an affect-incorporated objective function to encourage the generation of affect-rich words in the output responses. Evaluations based on both perplexity and human evaluations show that our model outperforms the state-of-the-art baseline model of comparable size in producing natural and affect-rich responses.

affect-rich word, deep learning, neural network, (21 more...)

arXiv.org Artificial Intelligence

1811.07078

Country: Asia > Singapore (0.14)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)

Add feedback

Texar: A Modularized, Versatile, and Extensible Toolkit for Text Generation

Hu, Zhiting, Shi, Haoran, Yang, Zichao, Tan, Bowen, Zhao, Tiancheng, He, Junxian, Wang, Wentao, Yu, Xingjiang, Qin, Lianhui, Wang, Di, Ma, Xuezhe, Liu, Hector, Liang, Xiaodan, Zhu, Wanrong, Sachan, Devendra Singh, Xing, Eric P.

arXiv.org Artificial IntelligenceSep-4-2018

We introduce Texar, an open-source toolkit aiming to support the broad set of text generation tasks that transforms any inputs into natural language, such as machine translation, summarization, dialog, content manipulation, and so forth. With the design goals of modularity, versatility, and extensibility in mind, Texar extracts common patterns underlying the diverse tasks and methodologies, creates a library of highly reusable modules and functionalities, and allows arbitrary model architectures and algorithmic paradigms. In Texar, model architecture, losses, and learning processes are fully decomposed. Modules at high concept level can be freely assembled or plugged in/swapped out. These features make Texar particularly suitable for researchers and practitioners to do fast prototyping and experimentation, as well as foster technique sharing across different text generation tasks. We provide case studies to demonstrate the use and advantage of the toolkit. Texar is released under Apache license 2.0 at https://github.com/asyml/texar.

arxiv preprint arxiv, deep learning, neural network, (19 more...)

arXiv.org Artificial Intelligence

1809.00794

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models (0.68)

Add feedback

Capacity Releasing Diffusion for Speed and Locality

Wang, Di, Fountoulakis, Kimon, Henzinger, Monika, Mahoney, Michael W., Rao, Satish

arXiv.org Artificial IntelligenceJun-10-2018

Diffusions and related random walk procedures are of central importance in many areas of machine learning, data analysis, and applied mathematics. Because they spread mass agnostically at each step in an iterative manner, they can sometimes spread mass "too aggressively," thereby failing to find the "right" clusters. We introduce a novel Capacity Releasing Diffusion (CRD) Process, which is both faster and stays more local than the classical spectral diffusion process. As an application, we use our CRD Process to develop an improved local algorithm for graph clustering. Our local graph clustering method can find local clusters in a model of clustering where one begins the CRD Process in a cluster whose vertices are connected better internally than externally by an $O(\log^2 n)$ factor, where $n$ is the number of nodes in the cluster. Thus, our CRD Process is the first local graph clustering algorithm that is not subject to the well-known quadratic Cheeger barrier. Our result requires a certain smoothness condition, which we expect to be an artifact of our analysis. Our empirical evaluation demonstrates improved results, in particular for realistic social graphs where there are moderately good---but not very good---clusters.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

1706.05826

Country:

North America > United States (0.46)
Europe > Austria (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Government (0.46)
Education (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)

Add feedback