 stemming


Stemming -- The Evolution and Current State with a Focus on Bangla

arXiv.org Artificial Intelligence

Bangla, the seventh most widely spoken language worldwide with 300 million native speakers, faces digital under-representation due to limited resources and a lack of annotated datasets. Stemming, a critical preprocessing step in language analysis, is essential for low-resource, highly inflectional languages like Bangla, because it reduces the complexity of algorithms and models by significantly reducing the number of distinct word forms they need to consider. This paper conducts a comprehensive survey of stemming approaches, emphasizing the importance of handling morphological variants effectively. Exploring the landscape of Bangla stemming reveals a significant gap in the existing literature. The paper highlights the discontinuity from previous research and the scarcity of accessible implementations for replication. Furthermore, it critiques the evaluation methodologies, stressing the need for more relevant metrics. The paper also acknowledges the challenges posed by Bangla's rich morphology and diverse dialects. To address these challenges, it suggests directions for Bangla stemmer development. It concludes by advocating for robust Bangla stemmers and continued research in the field to enhance language analysis and processing.
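As a rough illustration of how stemming shrinks the number of distinct word forms a model has to handle, here is a minimal sketch of a suffix-stripping stemmer. It uses English words and a hypothetical rule list (`SUFFIXES`); it is not a Bangla stemmer and not any algorithm from the surveyed paper.

```python
# Toy suffix-stripping stemmer (illustrative only).
# SUFFIXES is a hypothetical rule list, checked in the order given.
SUFFIXES = ("ing", "ed", "es", "s")


def toy_stem(word: str) -> str:
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


words = ["walk", "walks", "walked", "walking",
         "jump", "jumps", "jumped", "jumping"]
stems = {toy_stem(w) for w in words}

# Eight surface forms collapse to two stems.
print(len(set(words)), "distinct words ->", len(stems), "distinct stems")
print(sorted(stems))
```

A real stemmer for a highly inflectional language needs far richer rules than this, which is part of the difficulty the abstract describes.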


Stemming vs Lemmatization in NLP: Must-Know Differences

#artificialintelligence

This article was published as a part of the Data Science Blogathon. In the field of Natural Language Processing (NLP), lemmatization and stemming are text normalization techniques. These techniques are used to prepare words, text, and documents for further processing. Languages such as English and Hindi contain many words that are derived from one another. A language that contains such derived words is called an inflected language. For instance, the word "historical" is derived from the word "history".


Natural Language Processing for Absolute Beginners

#artificialintelligence

Before diving into the definition of natural language processing, it is important to explore why it came into existence. Our personal computers communicate in a language known as machine language. Unlike human natural language, machine language uses a series of zeroes and ones, called bits, to communicate with the outside world, and it is hard for humans to read. To bridge this gap, machines needed to act and talk like humans, and hence NLP was invented to offer intelligent human-to-machine interaction. Natural language processing is a subfield of artificial intelligence, computer science, and linguistics that gives computers the ability to understand, process, and extract significant insights from human natural languages.


Lemmatization In Natural Language Processing -- NLP

#artificialintelligence

In my previous article I discussed 'stemming', a process in which a given word is chopped down to its root. If you haven't read my previous article on stemming, I suggest you read it before going any further with this one. Lemmatization is similar to stemming, with one difference: stemming chops a word down without caring whether the result is a real word, whereas lemmatization always returns a word that has a dictionary meaning. The word returned by lemmatization is called a 'lemma'.
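The contrast can be sketched in a few lines. This is a toy illustration, not the code from the article: `crude_stem` strips suffixes blindly, while `lemmatize` looks words up in `LEMMAS`, a hypothetical mini-lexicon standing in for a real dictionary.

```python
# Hypothetical mini-lexicon mapping inflected forms to dictionary words.
LEMMAS = {
    "studies": "study",
    "studying": "study",
    "mice": "mouse",
}


def crude_stem(word: str) -> str:
    """Chop off the first matching suffix; the result may not be a real word."""
    for suffix in ("ing", "ies", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def lemmatize(word: str) -> str:
    """Return the dictionary form if known, otherwise the word unchanged."""
    return LEMMAS.get(word, word)


for w in ("studies", "studying", "mice"):
    print(f"{w}: stem={crude_stem(w)}, lemma={lemmatize(w)}")
```

Here the stemmer turns "studies" into "stud", which is not the intended dictionary word, while the lookup returns "study". The price is that a lemmatizer needs a lexicon, which a pure suffix stripper does not.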


Text Processing: A Step by Step Guide through Twitter Sentimental Analysis - YOUR DATA GUY

#artificialintelligence

According to Taweh Beysolow, "Natural Language Processing (NLP) is a subfield of computer science that is focused on allowing computers to understand language in a 'natural' way, as humans do." NLP has evolved rapidly, gaining traction in its applications in artificial intelligence (AI). In this project, we will explore one of the most exciting NLP applications: we will build a machine learning model that can categorize tweets as positive (pro-vaccine), negative (anti-vaccine) or neutral. Stay tuned and let's jump into the project.


What Is Natural Language Processing and How Does It Work?

#artificialintelligence

Have you ever wondered how virtual assistants like Siri and Cortana work? How do they understand what you're saying? Well, part of the answer is natural language processing. This interesting field of artificial intelligence has led to some huge breakthroughs over the last few years, but how exactly does it work? Read on to learn more about natural language processing, how it works, and how it's being used to make our lives more convenient.


Topic Modelling

#artificialintelligence

Natural language processing, as done here with the nltk library, means cutting, extracting and transforming text into new data so that we can get good insights from it. The library only handles the languages it has resources for, so it cannot understand anything beyond what it contains. If you want to process another language, you have to add that language to the existing library. For example, NLP is used in email spam filtering: the mail text is converted into new data that the system can understand, and a model is built on it to predict whether a mail is spam or not. NLP is used mainly in text processing, and it makes many kinds of tasks easier.
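The spam filtering example above can be sketched without any library. This is a minimal illustration under stated assumptions, not the article's code: the training mails in `TRAIN` are made up, the text is converted to word counts, and the model is a small naive Bayes classifier with add-one smoothing, chosen here only because it is short.

```python
import math
from collections import Counter

# Hypothetical toy training data: (mail text, label).
TRAIN = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("meeting agenda for monday", "ham"),
    ("please review the project report", "ham"),
]


def tokenize(text):
    """Convert raw text into lowercase word tokens."""
    return text.lower().split()


def train(examples):
    """Count words per label; these counts are the whole model."""
    word_counts = {}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts.setdefault(label, Counter()).update(tokenize(text))
    vocab = set()
    for counts in word_counts.values():
        vocab.update(counts)
    return word_counts, label_counts, vocab


def predict(text, model):
    """Pick the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocab = model
    total = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        score = math.log(label_counts[label] / total)
        denom = sum(word_counts[label].values()) + len(vocab)
        for token in tokenize(text):
            score += math.log((word_counts[label][token] + 1) / denom)
        scores[label] = score
    return max(scores, key=scores.get)


model = train(TRAIN)
print(predict("free prize money", model))
print(predict("project meeting on monday", model))
```

With only four training mails this is a demonstration of the text-to-counts-to-prediction flow, not a usable filter.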


The Basics Of Natural Language Processing in 10 Minutes

#artificialintelligence

Do you want to learn natural language processing as quickly as possible? Perhaps that is why you are here, like me. To install Jupyter Notebook, open your cmd (terminal) and type pip install notebook; after that, type jupyter notebook to run it, and you will see your notebook open at http://127.0.0.1:8888/ followed by a token. NLTK: it is a Python library that can be used to perform all the NLP tasks (stemming, lemmatization, etc.). Before learning anything, let's first understand NLP. Natural language refers to the way we humans communicate with each other, and processing is basically formatting the data into an understandable form.


Stemming of words in Natural Language Processing, what is it? - The Tech Check

#artificialintelligence

Stemming is one of the most common data pre-processing operations in almost all Natural Language Processing (NLP) projects. If you're new to this space, you may not know exactly what it is even though you have come across the word. You might also confuse stemming with lemmatization, a similar operation. In this post, we'll see what exactly stemming is, with a few examples here and there. I hope I'll be able to explain this process in simple words for you.