AITopics | Dube, Parijat

Collaborating Authors

Dube, Parijat

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Test Time Learning for Time Series Forecasting

Christou, Panayiotis, Chen, Shichu, Chen, Xupeng, Dube, Parijat

arXiv.org Artificial IntelligenceNov-30-2024

Time-series forecasting has seen significant advancements with the introduction of token prediction mechanisms such as multi-head attention. However, these methods often struggle to achieve the same performance as in language modeling, primarily due to the quadratic computational cost and the complexity of capturing long-range dependencies in time-series data. State-space models (SSMs), such as Mamba, have shown promise in addressing these challenges by offering efficient solutions with linear RNNs capable of modeling long sequences with larger context windows. However, there remains room for improvement in accuracy and scalability. We propose the use of Test-Time Training (TTT) modules in a parallel architecture to enhance performance in long-term time series forecasting. Through extensive experiments on standard benchmark datasets, we demonstrate that TTT modules consistently outperform state-of-the-art models, including the Mamba-based TimeMachine, particularly in scenarios involving extended sequence and prediction lengths. Our results show significant improvements in Mean Squared Error (MSE) and Mean Absolute Error (MAE), especially on larger datasets such as Electricity, Traffic, and Weather, underscoring the effectiveness of TTT in capturing long-range dependencies. Additionally, we explore various convolutional architectures within the TTT framework, showing that even simple configurations like 1D convolution with small filters can achieve competitive results. This work sets a new benchmark for time-series forecasting and lays the groundwork for future research in scalable, high-performance forecasting models.

data mining, machine learning, natural language, (21 more...)

arXiv.org Artificial Intelligence

2409.14012

Country: North America > United States > New York (0.28)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (0.67)
Banking & Finance (0.67)
Energy (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

ConvNLP: Image-based AI Text Detection

Jambunathan, Suriya Prakash, Shankarnarayan, Ashwath, Dube, Parijat

arXiv.org Artificial IntelligenceJul-9-2024

The potentials of Generative-AI technologies like Large Language models (LLMs) to revolutionize education are undermined by ethical considerations around their misuse which worsens the problem of academic dishonesty. LLMs like GPT-4 and Llama 2 are becoming increasingly powerful in generating sophisticated content and answering questions, from writing academic essays to solving complex math problems. Students are relying on these LLMs to complete their assignments and thus compromising academic integrity. Solutions to detect LLM-generated text are compute-intensive and often lack generalization. This paper presents a novel approach for detecting LLM-generated AI-text using a visual representation of word embedding. We have formulated a novel Convolutional Neural Network called ZigZag ResNet, as well as a scheduler for improving generalization, named ZigZag Scheduler. Through extensive evaluation using datasets of text generated by six different state-of-the-art LLMs, our model demonstrates strong intra-domain and inter-domain generalization capabilities. Our best model detects AI-generated text with an impressive average detection rate (over inter- and intra-domain test data) of 88.35%. Through an exhaustive ablation study, our ZigZag ResNet and ZigZag Scheduler provide a performance improvement of nearly 4% over the vanilla ResNet. The end-to-end inference latency of our model is below 2.5ms per sentence. Our solution offers a lightweight, computationally efficient, and faster alternative to existing tools for AI-generated text detection, with better generalization performance. It can help academic institutions in their fight against the misuse of LLMs in academic settings. Through this work, we aim to contribute to safeguarding the principles of academic integrity and ensuring the trustworthiness of student work in the era of advanced LLMs.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2407.07225

Country:

North America > United States > New York (0.14)
Europe > Middle East > Malta (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.70)

Add feedback

RCD-SGD: Resource-Constrained Distributed SGD in Heterogeneous Environment via Submodular Partitioning

He, Haoze, Dube, Parijat

arXiv.org Artificial IntelligenceSep-18-2023

The convergence of SGD based distributed training algorithms is tied to the data distribution across workers. Standard partitioning techniques try to achieve equal-sized partitions with per-class population distribution in proportion to the total dataset. Partitions having the same overall population size or even the same number of samples per class may still have Non-IID distribution in the feature space. In heterogeneous computing environments, when devices have different computing capabilities, even-sized partitions across devices can lead to the straggler problem in distributed SGD. We develop a framework for distributed SGD in heterogeneous environments based on a novel data partitioning algorithm involving submodular optimization. Our data partitioning algorithm explicitly accounts for resource heterogeneity across workers while achieving similar class-level feature distribution and maintaining class balance. Based on this algorithm, we develop a distributed SGD framework that can accelerate existing SOTA distributed training algorithms by up to 32%.

algorithm, artificial intelligence, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2211.00839

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.95)

Add feedback

Beyond Black Box AI-Generated Plagiarism Detection: From Sentence to Document Level

Quidwai, Mujahid Ali, Li, Chunhui, Dube, Parijat

arXiv.org Artificial IntelligenceJun-13-2023

The increasing reliance on large language models (LLMs) in academic writing has led to a rise in plagiarism. Existing AI-generated text classifiers have limited accuracy and often produce false positives. We propose a novel approach using natural language processing (NLP) techniques, offering quantifiable metrics at both sentence and document levels for easier interpretation by human evaluators. Our method employs a multi-faceted approach, generating multiple paraphrased versions of a given question and inputting them into the LLM to generate answers. By using a contrastive loss function based on cosine similarity, we match generated sentences with those from the student's response. Our approach achieves up to 94% accuracy in classifying human and AI text, providing a robust and adaptable solution for plagiarism detection in academic settings. This method improves with LLM advancements, reducing the need for new model training or reconfiguration, and offers a more transparent way of evaluating and detecting AI-generated text.

artificial intelligence, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2306.08122

Genre:

Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry:

Education > Educational Technology > Educational Software > Computer-Aided Assessment (0.61)
Transportation > Air (0.41)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.89)

Add feedback

G2L: A Geometric Approach for Generating Pseudo-labels that Improve Transfer Learning

Kender, John R., Bhattacharjee, Bishwaranjan, Dube, Parijat, Belgodere, Brian

arXiv.org Artificial IntelligenceJul-7-2022

Transfer learning is a deep-learning technique that ameliorates the problem of learning when human-annotated labels are expensive and limited. In place of such labels, it uses instead the previously trained weights from a well-chosen source model as the initial weights for the training of a base model for a new target dataset. We demonstrate a novel but general technique for automatically creating such source models. We generate pseudo-labels according to an efficient and extensible algorithm that is based on a classical result from the geometry of high dimensions, the Cayley-Menger determinant. This G2L (``geometry to label'') method incrementally builds up pseudo-labels using a greedy computation of hypervolume content. We demonstrate that the method is tunable with respect to expected accuracy, which can be forecast by an information-theoretic measure of dataset similarity (divergence) between source and target. The results of 280 experiments show that this mechanical technique generates base models that have similar or better transferability compared to a baseline of models trained on extensively human-annotated ImageNet1K labels, yielding an overall error decrease of 0.43\%, and an error decrease in 4 out of 5 divergent datasets tested.

artificial intelligence, deep learning, machine learning, (20 more...)

arXiv.org Artificial Intelligence

2207.03554

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

Adversarial training in communication constrained federated learning

Shah, Devansh, Dube, Parijat, Chakraborty, Supriyo, Verma, Ashish

arXiv.org Artificial IntelligenceMar-1-2021

Federated learning enables model training over a distributed corpus of agent data. However, the trained model is vulnerable to adversarial examples, designed to elicit misclassification. We study the feasibility of using adversarial training (AT) in the federated learning setting. Furthermore, we do so assuming a fixed communication budget and non-iid data distribution between participating agents. We observe a significant drop in both natural and adversarial accuracies when AT is used in the federated setting as opposed to centralized training. We attribute this to the number of epochs of AT performed locally at the agents, which in turn effects (i) drift between local models; and (ii) convergence time (measured in number of communication rounds). Towards this end, we propose FedDynAT, a novel algorithm for performing AT in federated setting. Through extensive experimentation we show that FedDynAT significantly improves both natural and adversarial accuracy, as well as model convergence time by reducing the model drift.

artificial intelligence, fedavgat, neural network, (18 more...)

arXiv.org Artificial Intelligence

2103.01319

Country: North America > United States > Illinois (0.14)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Add feedback

Detection of data drift and outliers affecting machine learning model performance over time

Ackerman, Samuel, Farchi, Eitan, Raz, Orna, Zalmanovici, Marcel, Dube, Parijat

arXiv.org Machine LearningDec-16-2020

A trained ML model is deployed on another `test' dataset where target feature values (labels) are unknown. Drift is distribution change between the training and deployment data, which is concerning if model performance changes. For a cat/dog image classifier, for instance, drift during deployment could be rabbit images (new class) or cat/dog images with changed characteristics (change in distribution). We wish to detect these changes but can't measure accuracy without deployment data labels. We instead detect drift indirectly by nonparametrically testing the distribution of model prediction confidence for changes. This generalizes our method and sidesteps domain-specific feature representation. We address important statistical issues, particularly Type-1 error control in sequential testing, using Change Point Models (CPMs; see Adams and Ross 2012). We also use nonparametric outlier methods to show the user suspicious observations for model diagnosis, since the before/after change confidence distributions overlap significantly. In experiments to demonstrate robustness, we train on a subset of MNIST digit classes, then insert drift (e.g., unseen digit class) in deployment data in various settings (gradual/sudden changes in the drift proportion). A novel loss function is introduced to compare the performance (detection delay, Type-1 and 2 errors) of a drift detector under different levels of drift class contamination.

artificial intelligence, detection, instructional theory, (19 more...)

arXiv.org Machine Learning

2012.09258

Genre: Research Report > Experimental Study (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.47)

Add feedback

Sequential Drift Detection in Deep Learning Classifiers

Ackerman, Samuel, Dube, Parijat, Farchi, Eitan

arXiv.org Machine LearningJul-31-2020

We utilize neural network embeddings to detect data drift by formulating the drift detection within an appropriate sequential decision framework. This enables control of the false alarm rate although the statistical tests are repeatedly applied. Since change detection algorithms naturally face a tradeoff between avoiding false alarms and quick correct detection, we introduce a loss function which evaluates an algorithm's ability to balance these two concerns, and we use it in a series of experiments.

deep learning, detection, neural network, (17 more...)

arXiv.org Machine Learning

2007.16109

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Add feedback

P2L: Predicting Transfer Learning for Images and Semantic Relations

Bhattacharjee, Bishwaranjan, Codella, Noel, Kender, John R., Huo, Siyu, Watson, Patrick, Glass, Michael R., Dube, Parijat, Hill, Matthew, Belgodere, Brian

arXiv.org Artificial IntelligenceAug-20-2019

Transfer learning enhances learning across tasks, by leveraging previously learned representations -- if they are properly chosen. We describe an efficient method to accurately estimate the appropriateness of a previously trained model for use in a new learning task. We use this measure, which we call "Predict To Learn" ("P2L"), in the two very different domains of images and semantic relations, where it predicts, from a set of "source" models, the one model most likely to produce effective transfer for training a given "target" model. We validate our approach thoroughly, by assembling a collection of candidate source models, then fine-tuning each candidate to perform each of a collection of target tasks, and finally measuring how well transfer has been enhanced. Across 95 tasks within multiple domains (images classification and semantic relations), the P2L approach was able to select the best transfer learning model on average, while the heuristic of choosing model trained with the largest data set selected the best model in only 55 cases. These results suggest that P2L captures important information in common between source and target tasks, and that this shared informational structure contributes to successful transfer learning more than simple data size.

neural network, text processing, transfer learning, (20 more...)

arXiv.org Artificial Intelligence

1908.0763

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Improving Transferability of Deep Neural Networks

Dube, Parijat, Bhattacharjee, Bishwaranjan, Petit-Bois, Elisabeth, Hill, Matthew

arXiv.org Machine LearningJul-30-2018

Learning from small amounts of labeled data is a challenge in the area of deep learning. This is currently addressed by Transfer Learning where one learns the small data set as a transfer task from a larger source dataset. Transfer Learning can deliver higher accuracy if the hyperparameters and source dataset are chosen well. One of the important parameters is the learning rate for the layers of the neural network. We show through experiments on the ImageNet22k and Oxford Flowers datasets that improvements in accuracy in range of 127% can be obtained by proper choice of learning rates. We also show that the images/label parameter for a dataset can potentially be used to determine optimal learning rates for the layers to get the best overall accuracy. We additionally validate this method on a sample of real-world image classification tasks from a public visual recognition API.

accuracy, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

1807.11459

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback