
Collaborating Authors

gensim


GenSim: A General Social Simulation Platform with Large Language Model based Agents

Tang, Jiakai, Gao, Heyang, Pan, Xuchen, Wang, Lei, Tan, Haoran, Gao, Dawei, Chen, Yushuo, Chen, Xu, Lin, Yankai, Li, Yaliang, Ding, Bolin, Zhou, Jingren, Wang, Jun, Wen, Ji-Rong

arXiv.org Artificial Intelligence

With the rapid advancement of large language models (LLMs), recent years have witnessed many promising studies on leveraging LLM-based agents to simulate human social behavior. While prior work has demonstrated significant potential across various domains, much of it has focused on specific scenarios involving a limited number of agents, and has lacked the ability to adapt when errors occur during simulation. To overcome these limitations, we propose a novel LLM-agent-based simulation platform called GenSim, which: (1) abstracts a set of general functions to simplify the simulation of customized social scenarios; (2) supports one hundred thousand agents to better simulate large-scale populations in real-world contexts; (3) incorporates error-correction mechanisms to ensure more reliable and long-term simulations. To evaluate our platform, we assess both the efficiency of large-scale agent simulations and the effectiveness of the error-correction mechanisms. To our knowledge, GenSim represents an initial step toward a general, large-scale, and correctable social simulation platform based on LLM agents, promising to further advance the field of social science.


A Comparative Study of Text Embedding Models for Semantic Text Similarity in Bug Reports

Patil, Avinash, Han, Kihwan, Jadon, Aryan

arXiv.org Artificial Intelligence

Bug reports are an essential aspect of software development, and it is crucial to identify and resolve them quickly to ensure the consistent functioning of software systems. Retrieving similar bug reports from an existing database can help reduce the time and effort required to resolve bugs. In this paper, we compared the effectiveness of semantic textual similarity methods for retrieving similar bug reports based on a similarity score. We explored several embedding models such as TF-IDF (baseline), FastText, Gensim, BERT, and ADA. We used the Software Defects Data containing bug reports for various software projects to evaluate the performance of these models. Our experimental results showed that BERT generally outperformed the rest of the models in terms of recall, followed by ADA, Gensim, FastText, and TF-IDF. Our study provides insights into the effectiveness of different embedding methods for retrieving similar bug reports and highlights the impact of selecting the appropriate one for this task. Our code is available on GitHub.
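The TF-IDF baseline the abstract mentions can be sketched without any libraries. The following is a minimal illustration (not the authors' actual code, and the sample reports are invented): each term is weighted by its frequency within a report and its rarity across reports, and reports are then compared by cosine similarity.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute a sparse TF-IDF weight dict for each tokenized document."""
    n = len(docs)
    df = Counter()                      # document frequency of each term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append({t: (tf[t] / len(doc)) * math.log(n / df[t]) for t in tf})
    return vectors

def cosine(u, v):
    """Cosine similarity between two sparse term-weight dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

reports = [
    "app crashes on startup after update".split(),
    "application crashes immediately on startup".split(),
    "login button unresponsive on settings page".split(),
]
vecs = tfidf_vectors(reports)
# The two crash reports share rare terms, so they score higher with each
# other than with the unrelated login report.
```

Real systems would add tokenization, stemming, and stopword handling, which is where the learned embeddings compared in the paper tend to pull ahead.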


GitHub - RaRe-Technologies/gensim: Topic Modelling for Humans

#artificialintelligence

Gensim is a Python library for topic modelling, document indexing and similarity retrieval with large corpora. The target audience is the natural language processing (NLP) and information retrieval (IR) community. If this feature list left you scratching your head, you can first read more about the Vector Space Model and unsupervised document analysis on Wikipedia. This software depends on NumPy and SciPy, two Python packages for scientific computing. You must have them installed prior to installing gensim.
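For readers unfamiliar with the Vector Space Model mentioned above, here is a dependency-free sketch of the bag-of-words step that gensim performs with its corpora.Dictionary and doc2bow utilities. The function names mirror gensim's for readability, but this is a stdlib imitation, not gensim's own code.

```python
from collections import Counter

def build_dictionary(tokenized_docs):
    """Assign a stable integer id to every distinct token, in first-seen order."""
    ids = {}
    for doc in tokenized_docs:
        for token in doc:
            ids.setdefault(token, len(ids))
    return ids

def doc2bow(doc, ids):
    """Represent a tokenized document as sorted sparse (token_id, count) pairs."""
    counts = Counter(ids[t] for t in doc if t in ids)
    return sorted(counts.items())

docs = [["human", "computer", "interaction"],
        ["computer", "survey", "interaction", "interaction"]]
dictionary = build_dictionary(docs)
bow = doc2bow(docs[1], dictionary)   # sparse vector for the second document
```

These sparse count vectors are the input that topic models such as LDA and similarity indexes then operate on.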


Hands-on intro to Language Processing (NLP)

#artificialintelligence

This article discusses three techniques that practitioners can use to start working effectively with natural language processing (NLP). It will also give good visibility to people interested in getting a sense of what NLP is about -- if you are an expert, please feel free to connect, comment, or suggest. At erreVol, we leverage similar tools to extract useful insights from transcripts of earnings reports of public corporations -- the interested reader can try the platform. Note that we will present lines of code for readers interested in replicating or using what is presented below; otherwise, feel free to skip those technical lines, as the reading should remain seamless.


Getting started with Gensim for basic NLP tasks – Analytics India Magazine

#artificialintelligence

Gensim is an open-source Python package for natural language processing with a special focus on topic modelling.


Using NLP to improve your Resume - KDnuggets

#artificialintelligence

Now you can read an overall summary of the job role and your existing Resume! Did you miss anything about the job role that is highlighted in the summary? Small nuanced details can help you sell yourself. Does your summarized document make sense and bring out your essential qualities? Perhaps a concise summary alone is not sufficient.
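As a rough illustration of the kind of extractive summarization the article relies on, here is a minimal word-frequency summarizer in plain Python. It is a simplified stand-in for the actual tooling, and the stopword list and sample text are invented for the example: sentences are scored by how frequent their words are across the whole text, and the top sentences are kept in order.

```python
import re
from collections import Counter

STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "is", "for", "on", "with", "my"}

def summarize(text, n_sentences=2):
    """Keep the n_sentences whose words are most frequent overall,
    preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z']+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)
    ranked = sorted(
        range(len(sentences)),
        key=lambda i: -sum(freq[w] for w in re.findall(r"[a-z']+", sentences[i].lower())),
    )
    keep = sorted(ranked[:n_sentences])
    return " ".join(sentences[i] for i in keep)

text = ("Gensim builds topic models. Topic models summarize large corpora. "
        "My cat naps daily.")
summary = summarize(text, 2)   # drops the off-topic cat sentence
```

Production summarizers (e.g. TextRank-based ones) use a sentence graph instead of raw counts, but the intuition is the same: frequent, central content survives.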


Learn NLP the Stanford Way -- Lesson 2

#artificialintelligence

In the previous post, we introduced NLP. To find word meanings with the Python programming language, we used the NLTK package and worked our way into word embeddings using the gensim package and Word2vec. Since we only touched the Word2vec technique from a 10,000-foot overview, we are now going to dive deeper into the training method used to create a Word2vec model. Word2vec (Mikolov et al., 2013) [1][2] is not a singular technique or algorithm. It is actually a family of neural network architectures and optimization techniques that can learn good embeddings from large datasets.
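Before looking at the neural network itself, it helps to see the training data that the skip-gram variant of Word2vec consumes: (center, context) word pairs drawn from a sliding window over the text. The sketch below generates those pairs; it deliberately omits the subsampling and negative sampling that full Word2vec training also uses.

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs for skip-gram:
    each word is paired with every neighbor within `window` positions."""
    pairs = []
    for i, center in enumerate(tokens):
        lo = max(0, i - window)
        hi = min(len(tokens), i + window + 1)
        for j in range(lo, hi):
            if j != i:
                pairs.append((center, tokens[j]))
    return pairs

sentence = ["the", "quick", "brown", "fox"]
pairs = skipgram_pairs(sentence, window=1)
# window=1 yields 6 pairs: each interior word pairs with both neighbors,
# each boundary word with one.
```

The network is then trained so that the center word's embedding predicts its context words, which is what pushes words appearing in similar contexts close together in the vector space.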


Learn NLP the Stanford way -- Lesson 1

#artificialintelligence

The AI area of Natural Language Processing, or NLP, through its gigantic language models -- yes, GPT-3, I'm watching you -- presents what is perceived as a revolution in machines' capabilities to perform the most distinct language tasks. Due to that, public perception is split: some believe these new language models will pave the way to a Skynet type of technology, while others dismiss them as hype-fueled technologies that will sit on dusty shelves, or HDD drives, in little to no time. Motivated by this, I'm creating this series of stories that approaches NLP from scratch in a friendly way. To join me, you'll need a little experience with Python and Jupyter Notebooks, and for the most part, I won't even ask you to have anything installed on your machine. This series will differ dramatically from the Stanford course in the depth to which we'll approach statistics and calculus.


Time-based Sequence Model for Personalization and Recommendation Systems

Ishkhanov, Tigran, Naumov, Maxim, Chen, Xianjie, Zhu, Yan, Zhong, Yuan, Azzolini, Alisson Gusatti, Sun, Chonglin, Jiang, Frank, Malevich, Andrey, Xiong, Liang

arXiv.org Machine Learning

Recommendation systems play an important role in many e-commerce applications as well as search and ranking services [6, 15, 21, 26, 30, 31, 41, 48]. There are two main strategies for performing recommendations: content filtering and collaborative filtering. In content filtering, the user creates a profile based on their interests, while human experts create a profile for each product. An algorithm matches the two profiles and recommends the closest matches to the user. For example, this approach is taken by the Pandora Music Genome Project [29]. In collaborative filtering, recommendations are based only on the user's past behavior, from which future behavior is derived. The advantage of this approach is that it requires no external information and is not domain specific. The challenge is that, in the beginning, very few user-item interactions are available. For instance, this cold-start problem is addressed by Netflix by asking users for a few favorite movies when they create their profile for the first time [27].
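The collaborative-filtering idea described above can be sketched in a few lines: score each item a user has not seen by how similar the users who did interact with it are to the target user. This toy example (invented data and a simple user-based cosine similarity, purely illustrative and not the paper's time-based sequence model):

```python
import math

# Toy binary user-item interaction matrix.
interactions = {
    "alice": {"movie_a": 1, "movie_b": 1, "movie_c": 0},
    "bob":   {"movie_a": 1, "movie_b": 1, "movie_c": 1},
    "carol": {"movie_a": 0, "movie_b": 0, "movie_c": 1},
}

def cosine(u, v):
    """Cosine similarity between two users' interaction vectors."""
    dot = sum(u[i] * v[i] for i in u)
    nu = math.sqrt(sum(x * x for x in u.values()))
    nv = math.sqrt(sum(x * x for x in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def recommend(user):
    """Score unseen items by the similarity of the users who saw them."""
    scores = {}
    for other, items in interactions.items():
        if other == user:
            continue
        sim = cosine(interactions[user], items)
        for item, seen in items.items():
            if seen and not interactions[user][item]:
                scores[item] = scores.get(item, 0.0) + sim
    return max(scores, key=scores.get) if scores else None

suggestion = recommend("alice")   # bob is similar to alice and saw movie_c
```

The cold-start problem is visible even here: a brand-new user has an all-zero vector, so every similarity is zero and nothing can be ranked, which is why services bootstrap with a few explicit favorites.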


Python Libraries for Natural Language Processing

#artificialintelligence

Natural Language Processing is considered one of the many critical aspects of making intelligent systems. By training your solution with data gathered from the real world, you can make it faster and more relevant to users, generating crucial insights about your customer base. In this article, we will be taking a look at how Python offers some of the most useful and powerful libraries for bringing the power of Natural Language Processing into your project, and where exactly they fit in. Often recognized as a professional-grade Python library for advanced Natural Language Processing, spaCy excels at incredibly large-scale information extraction tasks. Built using Python and Cython, spaCy combines the best of both languages, the convenience of Python and the speed of Cython, to deliver one of the best-in-class NLP experiences. Stanford CoreNLP is a suite of tools built for implementing Natural Language Processing in your project.