A software security engineer has identified 12 Python libraries uploaded to the official Python Package Index (PyPI) that contained malicious code. The 12 packages were discovered in two separate scans by a security engineer who goes by the name Bertus online, and they were removed from PyPI well before this article's publication. All of the packages were built, and worked, following a similar pattern: their creator(s) copied the code of a popular package and published it as a new library under a slightly modified name. For example, four packages (diango, djago, dajngo, djanga) were misspellings of Django, the name of a very popular Python framework.
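The naming pattern described above can be screened for with a simple string-similarity check. The following is a minimal sketch using only the standard library's difflib; it is not the method Bertus used for the scans. The allowlist POPULAR_PACKAGES, the helper name find_lookalike, and the 0.8 similarity cutoff are all assumptions chosen for illustration.

```python
import difflib

# Hypothetical allowlist of well-known package names (an assumption for this
# sketch); a practical check would use a much larger list.
POPULAR_PACKAGES = ["django", "requests", "numpy", "flask"]


def find_lookalike(name, known=POPULAR_PACKAGES, cutoff=0.8):
    """Return the popular package that `name` closely resembles, or None.

    An exact match is treated as the genuine package and is not flagged.
    The cutoff is an illustrative threshold, not a tuned value.
    """
    candidate = name.lower()
    if candidate in known:
        return None
    matches = difflib.get_close_matches(candidate, known, n=1, cutoff=cutoff)
    return matches[0] if matches else None


# The four misspellings named in the article, plus two unflagged controls.
for suspect in ["diango", "djago", "dajngo", "djanga", "django", "scrapy"]:
    print(suspect, "->", find_lookalike(suspect))
```

A check like this only flags names for manual review. A similar name is not proof of malicious intent, and a malicious package can also use a name that resembles nothing on the list.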
Python is one of the world's most popular and widely used high-level, general-purpose languages. Many large organizations use Python for software development because of its versatile features, and it offers extensive library support. Scrapy is a widely used Python web scraping library for building crawling programs.
NLTK is a leading platform for building Python programs to work with human language data. It provides easy-to-use interfaces to lexical resources such as WordNet. It also has text processing libraries for classification, tokenization, stemming, tagging, parsing, and semantic reasoning. Pattern has tools for natural language processing, such as part-of-speech taggers, n-gram search, sentiment analysis, and WordNet. It also supports machine learning with a vector space model, clustering, and SVM. TextBlob is a Python library for processing textual data. It provides a simple API for diving into common natural language processing tasks such as part-of-speech tagging, noun phrase extraction, sentiment analysis, classification, translation, and more. Gensim is a Python library for topic modelling, document indexing, and similarity retrieval with large corpora.
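To make two of the tasks named above concrete, here is a minimal sketch of tokenization and n-gram counting written with the standard library only. It does not use the API of NLTK, Pattern, TextBlob, or Gensim; the function names tokenize and ngrams and the sample sentence are assumptions for illustration, and the regular-expression tokenizer is far cruder than what those libraries provide.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return every run of n consecutive tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


# Illustrative sample text (an assumption, not taken from any corpus).
sample = "The quick brown fox jumps over the lazy dog. The quick brown cat sleeps."
tokens = tokenize(sample)

# Count how often each pair of adjacent words (bigram) occurs.
bigram_counts = Counter(ngrams(tokens, 2))
print(bigram_counts.most_common(2))
```

The libraries listed above add what this sketch lacks, such as language-aware tokenizers, stemming, tagging, and trained models.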