 Kumar, Sandeep


TanhSoft -- a family of activation functions combining Tanh and Softplus

arXiv.org Artificial Intelligence

Artificial neural networks (ANNs) have occupied the center stage in the realm of deep learning in the recent past. ANNs are made up of several hidden layers, and each hidden layer consists of several neurons. At each neuron, an affine linear map is composed with a nonlinear function known as an activation function. During the training of an ANN, the linear map is optimized, whereas the activation function is usually fixed at the beginning along with the architecture of the ANN. There has been increasing interest in developing a methodical understanding of activation functions, in particular with regard to the construction of novel activation functions and the identification of mathematical properties that lead to better learning [1]. An activation function is considered good if it can increase the learning rate and lead to better convergence, which in turn yields more accurate results. At the early stage of deep learning research, researchers used shallow networks (fewer hidden layers), and tanh or sigmoid were used as activation functions. As the research progressed and deeper networks (multiple hidden layers) came into fashion to tackle challenging tasks, the Rectified Linear Unit (ReLU) ([2], [3], [4]) emerged as the most popular activation function. Despite its simplicity, deep neural networks with ReLU have learned many complex and highly nonlinear functions with high accuracy.
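To make the building blocks concrete, the sketch below implements tanh, softplus, and ReLU, plus one illustrative product-form combination of Tanh and Softplus. The `tanhsoft_like` function is an assumption for illustration only; the paper defines a parameterized family whose exact form may differ.

```python
import numpy as np

def tanh_act(x):
    # Bounded, saturating activation used in early shallow networks.
    return np.tanh(x)

def softplus(x):
    # Smooth approximation of ReLU: log(1 + exp(x)), computed stably.
    return np.maximum(x, 0.0) + np.log1p(np.exp(-np.abs(x)))

def relu(x):
    # Rectified Linear Unit, the most popular activation in deep networks.
    return np.maximum(x, 0.0)

def tanhsoft_like(x, alpha=1.0):
    # Illustrative combination of Tanh and Softplus (product form).
    # NOTE: assumed example; not the exact TanhSoft parameterization.
    return np.tanh(alpha * x) * softplus(x)

x = np.linspace(-5.0, 5.0, 11)
print(tanhsoft_like(x))
```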


Multimodal Differential Network for Visual Question Generation

arXiv.org Artificial Intelligence

Generating natural questions from an image is a semantic task that requires using visual and language modalities to learn multimodal representations. Images can have multiple visual and language contexts that are relevant for generating questions, namely places, captions, and tags. In this paper, we propose the use of exemplars for obtaining the relevant context. We obtain this by using a Multimodal Differential Network to produce natural and engaging questions. The generated questions show a remarkable similarity to natural questions, as validated by a human study. Further, we observe that the proposed approach substantially improves over state-of-the-art benchmarks on the quantitative metrics (BLEU, METEOR, ROUGE, and CIDEr).

1 Introduction: To understand the progress towards multimedia vision and language understanding, a visual Turing test was proposed by Geman et al. (2015) that was aimed at visual question answering (Antol et al., 2015). Visual Dialog (Das et al., 2017) is a natural extension of VQA. Current dialog systems, as evaluated in Chattopadhyay et al. (2017), show that when trained between bots, AI-AI dialog systems improve, but this improvement does not carry over to Human-AI dialog. This is because the questions generated by bots are not natural (human-like) and therefore do not lead to improved human dialog. It is therefore imperative that improving the quality of questions will enable dialog agents to perform well in human interactions. Further, Ganju et al. (2017) show that unanswered questions can be used for improving VQA, image captioning, and object classification. An interesting line of work in this respect is that of Mostafazadeh et al. (2016), where the authors propose the challenging task of generating natural questions for an image. One aspect that is central to a question is the context that is relevant for generating it. As can be seen in Figure 1, an image with a person on a skateboard would result in questions related to the event.
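On the evaluation side, the paper reports BLEU, METEOR, ROUGE, and CIDEr. A minimal sketch of scoring a generated question against reference questions with sentence-level BLEU (using NLTK; the sentences here are hypothetical and the paper's exact evaluation protocol is not reproduced) is:

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

# Hypothetical generated question and human reference questions.
generated = "what trick is the person doing on the skateboard".split()
references = [
    "what trick is the skateboarder performing".split(),
    "is the person doing a trick on the skateboard".split(),
]

# Sentence-level BLEU with smoothing, since short questions often have
# zero higher-order n-gram overlaps.
score = sentence_bleu(references, generated,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```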


Structured Graph Learning Via Laplacian Spectral Constraints

arXiv.org Machine Learning

Learning a graph with a specific structure is essential for interpretability and identification of the relationships among data. It is well known that structured graph learning from observed samples is an NP-hard combinatorial problem. In this paper, we first show that for a set of important graph families it is possible to convert the structural constraints into eigenvalue constraints on the graph Laplacian matrix. Then we introduce a unified graph learning framework that integrates the spectral properties of the Laplacian matrix with Gaussian graphical modeling and is capable of learning the structures of a large class of graph families. The proposed algorithms are provably convergent and practically amenable for large-scale semi-supervised and unsupervised graph-based learning tasks. Extensive numerical experiments with both synthetic and real data sets demonstrate the effectiveness of the proposed methods. An R package containing code for all the experimental results is available at https://cran.r-project.org/package=spectralGraphTopology.
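To see why Laplacian eigenvalues encode structure (an informal illustration, not the paper's algorithm): the multiplicity of the zero eigenvalue of a graph Laplacian equals the number of connected components, so constraining the eigenvalues can enforce, for example, a k-component structure.

```python
import numpy as np

def laplacian(adjacency):
    # Combinatorial Laplacian L = D - A of an undirected weighted graph.
    degrees = adjacency.sum(axis=1)
    return np.diag(degrees) - adjacency

# Adjacency matrix of a small graph with two connected components:
# nodes {0, 1, 2} form a triangle, nodes {3, 4} share a single edge.
A = np.array([
    [0, 1, 1, 0, 0],
    [1, 0, 1, 0, 0],
    [1, 1, 0, 0, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 1, 0],
], dtype=float)

eigvals = np.linalg.eigvalsh(laplacian(A))
# Two (numerically) zero eigenvalues correspond to two connected components.
n_components = int(np.sum(eigvals < 1e-8))
print(eigvals.round(4), "components:", n_components)
```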


Distributed Inexact Successive Convex Approximation ADMM: Analysis-Part I

arXiv.org Machine Learning

In this two-part work, we propose an algorithmic framework for solving non-convex problems whose objective function is the sum of a number of smooth component functions plus a convex (possibly non-smooth) and/or smooth (possibly non-convex) regularization function. The proposed algorithm incorporates ideas from several existing approaches such as the alternating direction method of multipliers (ADMM), successive convex approximation (SCA), distributed and asynchronous algorithms, and inexact gradient methods. Different from a number of existing approaches, however, the proposed framework is flexible enough to incorporate a class of non-convex objective functions, allow distributed operation with and without a fusion center, and include variance-reduced methods as special cases. Remarkably, the proposed algorithms are robust to uncertainties arising from random, deterministic, and adversarial sources. Part I of the paper develops two variants of the algorithm under very mild assumptions and establishes first-order convergence rate guarantees. The proof developed here allows for generic errors and delays, paving the way for different variance-reduced, asynchronous, and stochastic implementations, which are outlined and evaluated in Part II.
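As background on one of the named ingredients (not the proposed framework itself), the sketch below shows textbook consensus ADMM for minimizing a sum of component functions f_i, with each node holding a local copy x_i driven toward a shared consensus variable z. The quadratic f_i and the penalty value rho are illustrative choices made only so that the local update has a closed form.

```python
import numpy as np

# Consensus ADMM for min_x sum_i f_i(x), with f_i(x) = 0.5 * a_i * (x - b_i)^2.
# The closed-form minimizer is sum(a_i * b_i) / sum(a_i); ADMM should match it.
rng = np.random.default_rng(0)
N, rho, iters = 5, 1.0, 200
a = rng.uniform(0.5, 2.0, N)   # local curvatures
b = rng.normal(0.0, 1.0, N)    # local targets
x = np.zeros(N)                # local primal copies x_i
u = np.zeros(N)                # scaled dual variables u_i
z = 0.0                        # consensus variable

for _ in range(iters):
    # Local x_i-update: proximal step on f_i at (z - u_i), closed form here.
    x = (a * b + rho * (z - u)) / (a + rho)
    # Consensus z-update: average of x_i + u_i (no regularizer on z).
    z = np.mean(x + u)
    # Dual ascent on the consensus constraints x_i = z.
    u = u + x - z

print("ADMM consensus:", z, " closed form:", np.sum(a * b) / np.sum(a))
```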


A Unified Framework for Structured Graph Learning via Spectral Constraints

arXiv.org Machine Learning

Graph learning from data represents a canonical problem that has received substantial attention in the literature. However, insufficient work has been done on incorporating prior structural knowledge into the learning of underlying graphical models from data. Learning a graph with a specific structure is essential for interpretability and for identifying the relationships among data. Useful structured graphs include the multi-component graph, bipartite graph, connected graph, sparse graph, and regular graph. In general, structured graph learning is an NP-hard combinatorial problem; therefore, designing a general tractable optimization method is extremely challenging. In this paper, we introduce a unified graph learning framework that integrates Gaussian graphical models and spectral graph theory. To impose a particular structure on a graph, we first show how to formulate the combinatorial constraints as an analytical property of the graph matrix. Then we develop an optimization framework that leverages graph learning with specific structures via spectral constraints on graph matrices. The proposed algorithms are provably convergent, computationally efficient, and practically amenable for numerous graph-based tasks. Extensive numerical experiments with both synthetic and real data sets illustrate the effectiveness of the proposed algorithms. The code for all the simulations is made available as an open source repository.
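A toy illustration of the spectral-constraint idea, and only a heuristic sketch rather than the paper's optimization method: to push a symmetric matrix estimate toward the spectral signature of a k-component graph, one can set its k smallest eigenvalues to zero and reassemble the matrix from the modified spectrum.

```python
import numpy as np

def project_to_k_components(L, k):
    # Eigendecompose a symmetric estimate and force the k smallest eigenvalues
    # to zero, the spectral signature of a k-component Laplacian.
    # Heuristic projection for illustration; not the paper's full algorithm.
    eigvals, eigvecs = np.linalg.eigh(L)
    eigvals[:k] = 0.0
    return eigvecs @ np.diag(eigvals) @ eigvecs.T

# A symmetric test matrix standing in for a noisy Laplacian estimate.
rng = np.random.default_rng(1)
M = rng.normal(size=(6, 6))
L_noisy = (M + M.T) / 2 + 6 * np.eye(6)
L_proj = project_to_k_components(L_noisy, k=2)
print(np.linalg.eigvalsh(L_proj).round(4))  # two zero eigenvalues by design
```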


Learning Semantic Sentence Embeddings using Pair-wise Discriminator

arXiv.org Artificial Intelligence

In this paper, we propose a method for obtaining sentence-level embeddings. While the problem of learning word-level embeddings is very well studied, sentence-level embeddings are less explored; we obtain them with a simple method in the context of solving the paraphrase generation task. If we use a sequential encoder-decoder model for generating a paraphrase, we would like the generated paraphrase to be semantically close to the original sentence. One way to ensure this is by adding constraints that force true paraphrase embeddings to be close and unrelated candidate sentence embeddings to be far apart. This is achieved with a sequential pair-wise discriminator that shares weights with the encoder and is trained with a suitable loss function. Our loss function penalizes paraphrase sentence embeddings whose distance is too large. This loss is used in combination with a sequential encoder-decoder network. We also validated our method by evaluating the obtained embeddings on a sentiment analysis task. The proposed method yields semantic embeddings and outperforms the state-of-the-art on paraphrase generation and sentiment analysis on standard datasets. These results are also shown to be statistically significant.
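A minimal sketch of the kind of pair-wise constraint described above, written here as a margin-based loss that pulls a sentence embedding toward its true paraphrase and pushes it away from an unrelated sentence. The distance function, margin value, and random embeddings are illustrative assumptions, not the paper's exact discriminator or architecture.

```python
import torch
import torch.nn.functional as F

def pairwise_discriminator_loss(anchor, paraphrase, unrelated, margin=1.0):
    """Margin loss on sentence embeddings: true paraphrases should be close,
    unrelated sentences far apart. An illustrative stand-in for the paper's
    pair-wise discriminator objective."""
    d_pos = F.pairwise_distance(anchor, paraphrase)   # should be small
    d_neg = F.pairwise_distance(anchor, unrelated)    # should be large
    return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()

# Toy usage with random "sentence embeddings" (batch of 4, dimension 128).
emb_dim = 128
anchor = torch.randn(4, emb_dim)
paraphrase = anchor + 0.1 * torch.randn(4, emb_dim)  # close to the anchor
unrelated = torch.randn(4, emb_dim)                   # far from the anchor
print(pairwise_discriminator_loss(anchor, paraphrase, unrelated))
```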