
Collaborating Authors

Das, Arijit


Natural GaLore: Accelerating GaLore for memory-efficient LLM Training and Fine-tuning

arXiv.org Artificial Intelligence

Training LLMs presents significant memory challenges due to the growing size of data, weights, and optimizer states. Techniques such as data and model parallelism, gradient checkpointing, and offloading strategies address this issue but are often infeasible due to hardware constraints. To mitigate memory usage, alternative methods like Parameter-Efficient Fine-Tuning (PEFT) and GaLore approximate weights or optimizer states. PEFT methods, such as LoRA, have gained popularity for fine-tuning LLMs, though they require a full-rank warm start. In contrast, GaLore allows full-parameter learning while being more memory-efficient. This work introduces Natural GaLore, a simple drop-in replacement for AdamW, which efficiently applies the inverse Empirical Fisher Information Matrix to low-rank gradients using Woodbury's Identity. We demonstrate that incorporating second-order information speeds up optimization significantly, especially when the iteration budget is limited. Empirical pretraining of 60M, 130M, 350M, and 1.1B parameter Llama models on C4 data demonstrates significantly lower perplexity than GaLore without additional memory overhead. By fine-tuning RoBERTa on the GLUE benchmark using Natural GaLore, we achieve an average score of 86.05%, significantly narrowing the gap to full fine-tuning at 86.28%. Furthermore, fine-tuning the TinyLlama 1.1B model for function calling using the TinyAgent framework shows that Natural GaLore, achieving 83.09% accuracy on the TinyAgent dataset, significantly outperforms 16-bit LoRA at 80.06% and even surpasses GPT-4-Turbo by 4%, all while using 30% less memory. All code to reproduce the results is available at: https://github.com/selfsupervised-ai/Natural-GaLore.git
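
As a rough illustration of the linear algebra involved, the sketch below applies the inverse of a regularized empirical Fisher matrix to a gradient vector via Woodbury's identity, so that only a small k-by-k system is ever solved. The factorization F ≈ λI + GGᵀ, the function name, and the damping value are our own illustrative assumptions, not the paper's actual implementation.

```python
import numpy as np

def apply_inverse_empirical_fisher(v, G, lam=1e-3):
    """Compute (lam*I + G @ G.T)^{-1} @ v without forming a d x d matrix.

    v   : (d,)   gradient vector (e.g., a flattened low-rank gradient)
    G   : (d, k) columns are recent gradient samples, so the empirical
                 Fisher is approximated here as lam*I + G @ G.T
    lam : damping term (illustrative value)

    Woodbury's identity gives
      (lam*I_d + G G^T)^{-1} = (1/lam) * (I_d - G (lam*I_k + G^T G)^{-1} G^T),
    so only a k x k linear system is solved.
    """
    k = G.shape[1]
    small = lam * np.eye(k) + G.T @ G        # (k, k) Gram-matrix system
    coeff = np.linalg.solve(small, G.T @ v)  # (k,)
    return (v - G @ coeff) / lam             # preconditioned gradient
```

Because G has only k columns (the low-rank dimension), the cost is O(dk^2 + k^3) rather than the O(d^3) of a direct inverse, which is what makes this kind of second-order preconditioning affordable.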


Online Learning under Haphazard Input Conditions: A Comprehensive Review and Analysis

arXiv.org Artificial Intelligence

The domain of online learning has experienced multifaceted expansion owing to its prevalence in real-life applications. Nonetheless, this progression operates under the assumption that the input feature space of the streaming data remains constant. In this survey paper, we address the topic of online learning in the context of haphazard inputs, explicitly forgoing that assumption. We discuss, classify, evaluate, and compare the methodologies that are adept at modeling haphazard inputs, additionally providing the corresponding code implementations and their carbon footprints. Moreover, we classify the datasets related to the field of haphazard inputs and introduce evaluation metrics specifically designed for imbalanced datasets. The code of each methodology can be found at https://github.com/Rohit102497/HaphazardInputsReview
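
To make the setting concrete, here is a purely illustrative sketch (not any specific method from the survey) of an online binary classifier whose per-feature weights are created lazily, so that examples may arrive with arbitrary, changing feature sets:

```python
from collections import defaultdict
import math

class HaphazardOnlineLearner:
    """Toy online logistic classifier tolerant of a varying feature space.

    Each example is a dict {feature_name: value}; features may appear,
    vanish, or reappear between time steps. Weights are created lazily
    on first observation of a feature.
    """

    def __init__(self, lr=0.1):
        self.lr = lr
        self.w = defaultdict(float)  # per-feature weights, created on demand
        self.b = 0.0

    def predict_proba(self, x):
        # Score using only the features present in this example.
        z = self.b + sum(self.w[f] * v for f, v in x.items())
        return 1.0 / (1.0 + math.exp(-z))

    def update(self, x, y):
        # Logistic-loss SGD step on the observed features only.
        g = self.predict_proba(x) - y
        self.b -= self.lr * g
        for f, v in x.items():
            self.w[f] -= self.lr * g * v

# Streaming usage: the feature set differs at every time step.
learner = HaphazardOnlineLearner()
stream = [({"a": 1.0, "b": 0.5}, 1), ({"b": 0.2, "c": 1.3}, 0), ({"c": 0.7}, 1)]
for x, y in stream:
    learner.update(x, y)
```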


Securing Social Spaces: Harnessing Deep Learning to Eradicate Cyberbullying

arXiv.org Artificial Intelligence

In today's digital world, cyberbullying is a serious problem that can harm the mental and physical health of people who use social media. This paper explains how serious cyberbullying is and how it affects individuals exposed to it. It also stresses the importance of finding better ways to detect cyberbullying so that online spaces can be safer, and argues that more accurate tools for spotting cyberbullying will be valuable in the future. Our paper introduces a deep learning-based approach, primarily employing BERT and BiLSTM architectures, to effectively address cyberbullying. This approach is designed to analyse large volumes of posts and predict potential instances of cyberbullying in online spaces. Our results demonstrate the superiority of the hateBERT model, an extension of BERT focused on hate speech detection, among the five models evaluated, achieving an accuracy rate of 89.16%. This research is a significant contribution to "Computational Intelligence for Social Transformation," promising a safer and more inclusive digital landscape.
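
For readers unfamiliar with the combination, below is a minimal PyTorch sketch of a BERT encoder feeding a BiLSTM classification head, assuming the publicly available GroNLP/hateBERT checkpoint; the hidden size and other details are illustrative guesses, not the paper's configuration.

```python
import torch
import torch.nn as nn
from transformers import AutoModel, AutoTokenizer

class BertBiLSTMClassifier(nn.Module):
    """BERT (here hateBERT) encoder followed by a BiLSTM head for
    binary cyberbullying detection; hyperparameters are illustrative."""

    def __init__(self, checkpoint="GroNLP/hateBERT", hidden=128):
        super().__init__()
        self.encoder = AutoModel.from_pretrained(checkpoint)
        self.lstm = nn.LSTM(
            input_size=self.encoder.config.hidden_size,
            hidden_size=hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 2)  # bullying / not bullying

    def forward(self, input_ids, attention_mask):
        # Token-level contextual embeddings from the BERT encoder.
        tokens = self.encoder(input_ids=input_ids,
                              attention_mask=attention_mask).last_hidden_state
        # Run a BiLSTM over the token sequence and classify from the
        # concatenated final forward/backward hidden states.
        _, (h_n, _) = self.lstm(tokens)
        pooled = torch.cat([h_n[0], h_n[1]], dim=-1)
        return self.head(pooled)

tokenizer = AutoTokenizer.from_pretrained("GroNLP/hateBERT")
batch = tokenizer(["example post"], return_tensors="pt",
                  padding=True, truncation=True)
logits = BertBiLSTMClassifier()(batch["input_ids"], batch["attention_mask"])
```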


Analysis and Detection of Multilingual Hate Speech Using Transformer Based Deep Learning

arXiv.org Artificial Intelligence

Hate speech is harmful content that directly attacks or promotes hatred against members of groups or individuals based on actual or perceived aspects of identity, such as race, religion, or sexual orientation. This can affect social life on social media platforms, as hateful content shared through them can harm both individuals and communities. As the prevalence of hate speech increases online, so does the demand for automated detection as an NLP task. In this work, the proposed method uses a transformer-based model to detect hate speech on social media platforms such as Twitter, Facebook, WhatsApp, and Instagram. The proposed model is language-independent and has been tested on Italian, English, German, and Bengali. Its success rate for hate speech detection is higher than existing baseline and state-of-the-art models, with accuracies of 89% on the Bengali dataset, 91% on English, 91% on German, and 77% on Italian. The proposed algorithm shows substantial improvement over the benchmark method.

Key Words and Phrases: Hate Speech Detection, BERT, Deep Learning, NLP, Transformers

1. INTRODUCTION

Online social media has allowed information to be disseminated faster than ever. This has also allowed bad actors to exploit it for nefarious purposes such as spreading propaganda, fake news, and hate speech. Online hate speech is a genuine issue with potentially harmful consequences for both individuals and the community: it can be used to stir up conflict, promote prejudice, and create a hostile climate for vulnerable groups.
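
As an illustration of the language-independent setup, the sketch below fine-tunes a multilingual transformer for binary hate-speech classification. The bert-base-multilingual-cased backbone, learning rate, and placeholder examples are our assumptions, not necessarily the paper's exact configuration.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

# A shared multilingual subword vocabulary lets one model score posts
# in Bengali, English, German, and Italian alike.
checkpoint = "bert-base-multilingual-cased"
tokenizer = AutoTokenizer.from_pretrained(checkpoint)
model = AutoModelForSequenceClassification.from_pretrained(
    checkpoint, num_labels=2)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

# A single training batch can mix languages (placeholder examples).
texts = ["I hate you", "Ich hasse dich"]
labels = torch.tensor([1, 1])
batch = tokenizer(texts, return_tensors="pt", padding=True, truncation=True)

model.train()
out = model(**batch, labels=labels)  # built-in cross-entropy loss
out.loss.backward()
optimizer.step()
optimizer.zero_grad()
```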


A Variational Bayes Approach to Decoding in a Phase-Uncertain Digital Receiver

arXiv.org Machine Learning

This paper presents a Bayesian approach to symbol and phase inference in a phase-unsynchronized digital receiver. It primarily extends [Quinn 2011] to the multi-symbol case, using the variational Bayes (VB) approximation to deal with the combinatorial complexity of the phase inference in this case. The work provides a fully Bayesian extension of the EM-based framework underlying current turbo-synchronization methods, since it induces a von Mises prior on the time-invariant phase parameter. As a result, we achieve tractable iterative algorithms with improved robustness in low-SNR regimes compared to current EM-based approaches. As a corollary to our analysis, we also discover the importance of prior regularization in elegantly tackling the significant problem of phase ambiguity.
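
To sketch the structure of the inference (in our own notation, not the paper's), the VB factorization and the role of the von Mises prior can be summarized as follows:

```latex
% Illustrative sketch only: y_t are observations, s_t the unknown
% symbols, and \phi the time-invariant unknown phase. The joint
% posterior is approximated by a mean-field factorization
%   p(s_{1:T}, \phi \mid y_{1:T}) \approx q(s_{1:T}) \, q(\phi),
% with the standard VB coordinate updates
\begin{align}
  \ln q(\phi)    &= \mathbb{E}_{q(s)}\big[\ln p(y_{1:T}, s_{1:T}, \phi)\big] + \mathrm{const}, \\
  \ln q(s_{1:T}) &= \mathbb{E}_{q(\phi)}\big[\ln p(y_{1:T}, s_{1:T}, \phi)\big] + \mathrm{const}.
\end{align}
% For an AWGN channel with y_t = s_t e^{j\phi} + n_t, the expected
% log-likelihood is linear in \cos\phi and \sin\phi, so q(\phi) is a
% von Mises density,
%   q(\phi) \propto \exp\{\kappa \cos(\phi - \mu)\},
% and a von Mises prior is conjugate: its own (\kappa_0, \mu_0) term
% simply adds in, which is the prior regularization that resolves the
% phase ambiguity mentioned above.
```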