
Collaborating Authors

 Gupta, Gaurav


ProtSi: Prototypical Siamese Network with Data Augmentation for Few-Shot Subjective Answer Evaluation

arXiv.org Artificial Intelligence

Subjective answer evaluation is a time-consuming and tedious task, and the quality of the evaluation is heavily influenced by a variety of subjective personal characteristics. Machine evaluation, by contrast, can help educators save time while also ensuring that evaluations are fair and realistic. However, most existing methods based on conventional machine learning and natural language processing techniques are hampered by a lack of annotated answers and poor model interpretability, making them unsuitable for real-world use. To address these challenges, we propose the ProtSi Network, a semi-supervised architecture that, for the first time, applies few-shot learning to subjective answer evaluation. To evaluate students' answers against similarity prototypes, the ProtSi Network simulates the natural process by which an evaluator scores answers, combining a Siamese Network (consisting of BERT and encoder layers) with a Prototypical Network. We employ ProtAugment, an unsupervised diverse paraphrasing model, to prevent overfitting and enable effective few-shot text classification. By integrating contrastive learning, the discriminative text issue can be mitigated. Experiments on the Kaggle Short Scoring Dataset demonstrate that the ProtSi Network outperforms the most recent baseline models in terms of accuracy and quadratic weighted kappa.
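
To make the prototype-based scoring idea concrete, the minimal sketch below (an illustration using PyTorch and HuggingFace Transformers, not the authors' released code) embeds a few labeled answers with BERT, averages them into per-grade prototypes, and scores new answers by distance to each prototype; all function and variable names are assumptions for exposition.

```python
# Illustrative sketch, not the authors' code: score answers by distance to
# per-grade prototypes built from BERT embeddings of a few labeled examples.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

def embed(texts):
    # Mean-pool the last hidden states as a simple sentence embedding.
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**batch).last_hidden_state        # (N, T, H)
    mask = batch["attention_mask"].unsqueeze(-1)            # (N, T, 1)
    return (hidden * mask).sum(dim=1) / mask.sum(dim=1)     # (N, H)

def prototype_scores(support_texts, support_labels, query_texts, num_classes):
    support = embed(support_texts)                           # few labeled answers
    queries = embed(query_texts)                             # unscored answers
    labels = torch.tensor(support_labels)
    prototypes = torch.stack(
        [support[labels == c].mean(dim=0) for c in range(num_classes)]
    )
    # Negative Euclidean distance to each prototype serves as the class score;
    # scores.argmax(dim=1) is the predicted grade band for each query answer.
    return -torch.cdist(queries, prototypes)
```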


STORM: Foundations of End-to-End Empirical Risk Minimization on the Edge

arXiv.org Machine Learning

Empirical risk minimization is perhaps the most influential idea in statistical learning, with applications to nearly all scientific and technical domains in the form of regression and classification models. To analyze massive streaming datasets in distributed computing environments, practitioners increasingly prefer to deploy regression models on the edge rather than in the cloud. By keeping data on edge devices, we minimize the energy, communication, and data security risks associated with the model. Although it would be equally advantageous to train models at the edge, a common assumption is that the model was originally trained in the cloud, since training typically requires substantial computation and memory. To this end, we propose STORM, an online sketch for empirical risk minimization. STORM compresses a data stream into a tiny array of integer counters. This sketch is sufficient to estimate a variety of surrogate losses over the original dataset. We provide rigorous theoretical analysis and show that STORM can estimate a carefully chosen surrogate loss for the least-squares objective. In an exhaustive experimental comparison of linear regression models on real-world datasets, we find that STORM allows accurate regression models to be trained.
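
The flavor of the data structure can be illustrated as follows. The snippet below is a rough, assumed approximation of "a tiny array of integer counters" updated by hashing streamed (x, y) pairs with random hyperplanes; STORM's actual construction and its surrogate-loss estimators are specified in the paper, so treat this only as a sketch of the idea.

```python
# Illustrative sketch only (assumed structure, not the STORM construction):
# stream (x, y) pairs into a small grid of integer counters via signed
# random-projection hashing, so bucket counts can later be combined into
# approximate losses without revisiting the raw data.
import numpy as np

class StreamCounterSketch:
    def __init__(self, dim, rows=10, bits=8, seed=0):
        rng = np.random.default_rng(seed)
        # One set of random hyperplanes per row; each hashes the joint (x, y)
        # point (dim features plus the scalar target) to a bucket index.
        self.planes = rng.standard_normal((rows, bits, dim + 1))
        self.counts = np.zeros((rows, 2 ** bits), dtype=np.int64)

    def _buckets(self, x, y):
        z = np.append(x, y)                                 # joint point to hash
        sign_bits = (self.planes @ z > 0).astype(np.int64)  # (rows, bits)
        return sign_bits @ (2 ** np.arange(sign_bits.shape[1]))

    def update(self, x, y):
        rows = np.arange(self.counts.shape[0])
        self.counts[rows, self._buckets(x, y)] += 1         # integer counters only

    def collision_count(self, x, y):
        # Average count in the buckets a point hashes to: a kernel-density-like
        # quantity that a downstream surrogate-loss estimator could reuse.
        rows = np.arange(self.counts.shape[0])
        return self.counts[rows, self._buckets(x, y)].mean()
```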


Learning in Confusion: Batch Active Learning with Noisy Oracle

arXiv.org Machine Learning

We study the problem of training machine learning models incrementally via active learning with access to imperfect or noisy oracles. We specifically consider the batch active learning setting, in which multiple samples are selected at once, as opposed to a single sample as in classical settings, so as to reduce the training overhead. Our approach bridges uniform random sampling and score-based importance sampling of clusters when selecting a batch of new samples. Experiments on benchmark image classification datasets (MNIST, SVHN, and CIFAR10) show improvements over existing active learning strategies. We also introduce an extra denoising layer in deep networks to make active learning robust to label noise, and show significant improvements.
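
A minimal sketch of the cluster-level sampling step, under assumed names (this is not the paper's implementation): cluster the unlabeled pool, score each cluster by model uncertainty, and draw the batch from a mixture of the uniform and score-proportional distributions.

```python
# Illustrative sketch with assumed names (not the paper's implementation):
# pick a batch by mixing uniform and uncertainty-weighted sampling over
# clusters of the unlabeled pool.
import numpy as np
from sklearn.cluster import KMeans

def select_batch(embeddings, uncertainty, batch_size, n_clusters=50, mix=0.5, seed=0):
    """mix=0 -> uniform over clusters; mix=1 -> purely score-based sampling."""
    rng = np.random.default_rng(seed)
    labels = KMeans(n_clusters=n_clusters, n_init=10, random_state=seed).fit_predict(embeddings)

    # Per-cluster importance score: mean model uncertainty of its members.
    cluster_scores = np.array([uncertainty[labels == c].mean() for c in range(n_clusters)])
    score_probs = cluster_scores / cluster_scores.sum()
    probs = (1 - mix) / n_clusters + mix * score_probs       # bridge the two extremes

    chosen = []
    for c in rng.choice(n_clusters, size=batch_size, p=probs):
        members = np.flatnonzero(labels == c)
        chosen.append(rng.choice(members))                    # one point per cluster draw
    return np.array(chosen)                                   # indices into the pool
```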


Learning Latent Fractional Dynamics with Unknown Unknowns

arXiv.org Machine Learning

Despite significant effort in understanding complex systems (CS), we lack a theory for the modeling, inference, analysis, and efficient control of time-varying complex networks (TVCNs) in uncertain environments. From brain activity dynamics to the microbiome, and even chromatin interactions within the genome architecture, many such TVCNs exhibit a pronounced spatio-temporal fractality. Moreover, for many TVCNs only limited information (e.g., a few variables) is accessible for modeling, which hampers the ability of analytical tools to uncover the true degrees of freedom and infer the CS model, its hidden states, and their parameters. Another fundamental limitation is understanding and unveiling unknown drivers of the dynamics that can sporadically excite the network in ways that straightforward modeling cannot capture, owing to our inability to model non-stationary processes. Towards addressing these challenges, in this paper we consider the problem of learning fractional dynamical complex networks under unknown unknowns (i.e., hidden drivers) and partial observability (i.e., only partial data are available). More precisely, we consider a generalized modeling approach for TVCNs consisting of discrete-time fractional dynamical equations and propose an iterative framework to determine the network parameterization and predict the state of the system. We showcase the performance of the proposed framework in the context of task classification using real electroencephalogram data.
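
As a concrete illustration of the kind of discrete-time fractional dynamics referred to here, the sketch below simulates the commonly used Grünwald-Letnikov form Δ^α x[k+1] = A x[k] + B u[k]; the model form and all names are assumptions for exposition, and the paper's iterative estimation framework with unknown inputs is not reproduced.

```python
# Illustrative sketch, assuming the standard discrete-time fractional model
#   Delta^alpha x[k+1] = A x[k] + B u[k],
# with the Grunwald-Letnikov fractional difference applied per state channel.
import numpy as np

def gl_coeffs(alpha, horizon):
    # psi(alpha, 0) = 1;  psi(alpha, j) = psi(alpha, j-1) * (j - 1 - alpha) / j
    psi = np.ones((horizon + 1, alpha.size))
    for j in range(1, horizon + 1):
        psi[j] = psi[j - 1] * (j - 1 - alpha) / j
    return psi                                               # (horizon+1, n)

def simulate(A, B, alpha, u, x0):
    n, horizon = len(x0), u.shape[0]
    psi = gl_coeffs(np.asarray(alpha, dtype=float), horizon)
    x = np.zeros((horizon + 1, n))
    x[0] = x0
    for k in range(horizon):
        # The fractional difference couples x[k+1] to all past states (long memory).
        memory = sum(psi[j] * x[k + 1 - j] for j in range(1, k + 2))
        x[k + 1] = A @ x[k] + B @ u[k] - memory
    return x
```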


Data-driven Perception of Neuron Point Process with Unknown Unknowns

arXiv.org Machine Learning

Identification of patterns from discrete time-series data for statistical inference, threat detection, social opinion dynamics, and brain activity prediction has gained momentum recently. Beyond the sheer data size, the associated challenges include, for example, (i) missing data needed to construct a closed time-varying complex network, and (ii) contributions from unknown sources that are not probed. To this end, the current work focuses on a statistical neuron system model with multiple covariates and unknown inputs. Previous research on neuron activity analysis is mainly limited to effects from the spiking history of the target neuron and its interactions with other neurons in the system, while ignoring the influence of unknown stimuli. We propose to use unknown unknowns, which describe the effect of unknown stimuli, undetected neuron activities, and all other hidden sources of error. Maximum likelihood estimation with a fixed-point iteration method is implemented. The fixed-point iterations converge quickly, and the proposed methods can be efficiently parallelized, offering a computational advantage especially when the input spike trains span a long time horizon. The developed framework provides intuition about the meaning of having extra degrees of freedom in the data to support the need for unknowns. The proposed algorithm is applied to simulated spike trains and to real-world experimental data from mouse somatosensory, mouse retina, and cat retina recordings. The model shows a successful increase in system likelihood with respect to the conditional intensity function, and it also converges across iterations. Results suggest that the neural connection model with unknown unknowns can efficiently estimate the statistical properties of the process by increasing the network likelihood.
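
For readers unfamiliar with point-process likelihoods, the short sketch below shows an assumed, discretized conditional-intensity model with spike-history covariates plus an explicit term for unprobed ("unknown unknown") inputs, together with the log-likelihood that an estimator such as the fixed-point iteration would maximize; it is illustrative only, not the authors' estimator.

```python
# Illustrative sketch (assumed discretized point-process model, not the authors'
# estimator): a conditional intensity driven by spike history plus an explicit
# "unknown unknowns" term, and the log-likelihood an estimator would maximize.
import numpy as np

def conditional_intensity(beta0, history_weights, spike_history, unknown_input):
    # lambda_k = exp(baseline + weighted spike history + effect of unprobed sources)
    return np.exp(beta0 + spike_history @ history_weights + unknown_input)

def log_likelihood(spike_counts, lam, dt):
    # Discretized point-process log-likelihood: sum_k [ n_k * log(lam_k * dt) - lam_k * dt ]
    return np.sum(spike_counts * np.log(lam * dt) - lam * dt)

# Fitting would alternate between updating (beta0, history_weights) and the
# unknown-input term so as to increase this likelihood, in the spirit of the
# fixed-point iterations described above.
```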