documentarray
Building an AI-powered PDF Search Engine with Python: Part 1
With neural search seeing rapid adoption, more people are looking at using it for indexing and searching through their unstructured data. I know several folks already building PDF search engines powered by AI, so I figured I'd give it a stab too. How hard could it possibly be? This is just a rough-and-ready roadmap, so stay tuned to see how things really pan out. If you want to follow along at home (and maybe fix a few of my bugs!), check the repo. I want to build a search engine for a dataset of arbitrary PDFs.
docarray.md
For data scientists and engineers, speed matters as much as accuracy. For accuracy, we built Finetuner, which lets you finetune neural networks to achieve top performance on downstream tasks. As for speed, Jina was already fast, but now it's even faster. DocArray was created to address the shortcomings of existing data structures, especially for ML and data science tasks. Here is a comparison of DocArray with other data structures.
Building a neural-search-powered chatbot
When most people think search, they think of a standard search box. Type words in, smack the search button, and pages of luscious results come back. But search is buried elsewhere too. Those customer support chatbots you know and love? They're search too. But instead of returning pages of results, they return only the single most relevant hit, and do so in a conversational UI.
Advancing Neural Search with Jina 2.0
To understand the basics of neural search and how it differs from conventional search, please go through my previous blog on "Next-gen powered by Jina". It explains how Jina, a cloud-native, open-source company, is pioneering the field of neural search. It builds on the idea of semantic search and explains the basic building blocks of the Jina framework required to build intelligent search applications. As a recap, the idea behind neural search is to leverage state-of-the-art deep neural networks to intelligently retrieve contextually and semantically relevant information from heaps of data. A neural search system can go way beyond simple text search by letting you search through all kinds of data, including images, video, audio, and even PDFs.
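To make the recap a bit more concrete, here is a minimal sketch of the retrieval step in plain Python. It is not Jina's API. The three-dimensional vectors are made-up stand-ins for the embeddings a deep neural network would produce, and the names `cosine_similarity`, `search`, and `index` are hypothetical helpers chosen for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_a * norm_b)


def search(query_vec, index, top_k=1):
    """Return the top_k (score, document) pairs, best match first."""
    scored = [(cosine_similarity(query_vec, vec), doc) for doc, vec in index.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Assumption: hand-written toy embeddings. In a real system a neural
# network would encode each document (text, image, PDF page, ...) into a vector.
index = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "What are your opening hours?": [0.1, 0.9, 0.1],
    "Where can I download my invoice as a PDF?": [0.0, 0.2, 0.9],
}

# Assumption: pretend this is the embedding of the query "I forgot my login".
query_vec = [0.8, 0.2, 0.1]

results = search(query_vec, index, top_k=1)
for score, doc in results:
    print(f"{score:.2f}  {doc}")
```

The query shares no keywords with the password document, yet it still ranks first because the two vectors point in a similar direction. With `top_k=1` you get the single best hit; raise it to get a ranked list of results.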