AITopics | Information Retrieval

Information Retrieval

Our accustomed systems for retrieving particular bits of information no longer meet the needs of many people. Searching traditional indexes of print publications has been aided by computerized databases, but it still usually requires time-consuming serial searching of one database after another, followed by other methods of searching for internet sources. And what if the information being sought is a sound bite? A video clip? Yesterday's e-mail exchange between respected scientists? Artificial intelligence may hold the key to information retrieval in an age where widely different formats contain the information being sought, and the universe of knowledge is simply too big and growing too rapidly for successful searching to proceed at a human's slow speed.

WooCommerce Onpage SEO

#artificialintelligence, Nov-8-2021, 20:46:06 GMT

The idea of this WooCommerce Training Program is to implement techniques without paid tools and start optimizing your website for search engines to increase traffic and sales. You will learn different techniques for keyword research and keyword implementation, how to write titles and meta descriptions, how to optimize website file pages, how to rename your files for search engine optimization, and how to connect your website with Search Console. Overall, you will learn the digital experience elements that impact on-page SEO and how to implement them in WooCommerce. After completing this training program, you should be able to perform keyword research based on user intent and priority and implement the keywords in your titles, headings, and content such as product descriptions.

keyword research, product description, woocommerce onpage seo, (4 more...)

Genre: Instructional Material > Course Syllabus & Notes (0.51)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.64)

How to Use Data Science for Search Engine Optimization

#artificialintelligence, Nov-7-2021, 18:55:46 GMT

Data science is one of the hottest topics in the market today and one of the fields that has revolutionized the world. It combines two chief technologies, big data and artificial intelligence, and uses them to examine and process datasets. It also uses machine learning, which helps to strengthen artificial intelligence. Data science has improved and modernized every industry it has touched, including marketing, finance, social media, and SEO.

algorithm, data science, seo specialist, (8 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

SPANN: Highly-efficient Billion-scale Approximate Nearest Neighbor Search

Chen, Qi, Zhao, Bing, Wang, Haidong, Li, Mingqin, Liu, Chuanjie, Li, Zengzhong, Yang, Mao, Wang, Jingdong

arXiv.org Artificial Intelligence, Nov-5-2021

In-memory algorithms for approximate nearest neighbor search (ANNS) have achieved great success for fast high-recall search, but are extremely expensive when handling very large-scale databases. Thus, there is increasing demand for hybrid ANNS solutions that combine a small amount of memory with inexpensive solid-state drives (SSDs). In this paper, we present a simple but efficient memory-disk hybrid indexing and search system, named SPANN, that follows the inverted index methodology. It stores the centroid points of the posting lists in memory and the large posting lists on disk. We guarantee both disk-access efficiency (low latency) and high recall by effectively reducing the number of disk accesses and retrieving high-quality posting lists. In the index-building stage, we adopt a hierarchical balanced clustering algorithm to balance the length of posting lists and augment each posting list by adding the points in the closure of the corresponding cluster. In the search stage, we use a query-aware scheme to dynamically prune access to unnecessary posting lists. Experimental results demonstrate that SPANN is 2× faster than the state-of-the-art ANNS solution DiskANN at reaching the same 90% recall quality with the same memory cost on three billion-scale datasets. It can reach 90% recall@1 and recall@10 in around one millisecond with only 32GB of memory. Code is available at: https://github.com/microsoft/SPTAG.
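The memory-disk split described in this abstract can be illustrated with a toy sketch. This is not the SPANN implementation: the function names, the `closure_eps` and `prune_eps` parameters, and the use of a plain dictionary to stand in for on-disk posting lists are all assumptions made for illustration, and the clustering step that would produce the centroids is omitted.

```python
import math


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def build_index(points, centroids, closure_eps=0.0):
    """Build posting lists keyed by centroid id.

    Each point goes to its nearest centroid and, as a stand-in for the
    closure idea, to any other centroid that is almost as close.
    The dict returned here plays the role of the on-disk posting lists.
    """
    postings = {cid: [] for cid in range(len(centroids))}
    for pid, point in enumerate(points):
        dists = [dist(point, c) for c in centroids]
        nearest = min(dists)
        for cid, d in enumerate(dists):
            if d <= (1.0 + closure_eps) * nearest:
                postings[cid].append(pid)
    return postings


def search(query, points, centroids, postings, k=1, max_lists=2, prune_eps=0.5):
    """Return (top-k point ids, visited centroid ids).

    Centroids are scanned in memory; a posting list is only read if its
    centroid is within (1 + prune_eps) of the closest centroid distance.
    """
    ranked = sorted(range(len(centroids)), key=lambda cid: dist(query, centroids[cid]))
    best = dist(query, centroids[ranked[0]])
    visited = [
        cid for cid in ranked[:max_lists]
        if dist(query, centroids[cid]) <= (1.0 + prune_eps) * best
    ]
    candidates = {pid for cid in visited for pid in postings[cid]}
    top = sorted(candidates, key=lambda pid: dist(query, points[pid]))[:k]
    return top, visited
```

With a query that sits close to one centroid, the pruning rule skips the other posting list entirely, which is the effect the abstract attributes to its query-aware scheme; a query midway between centroids would read both lists.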

dataset, latency, vector, (16 more...)

2111.08566

Country: Asia > Afghanistan > Parwan Province > Charikar (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.66)

Outlining and Filling: Hierarchical Query Graph Generation for Answering Complex Questions over Knowledge Graph

Chen, Yongrui, Li, Huiying, Qi, Guilin, Wu, Tianxing, Wang, Tenggou

arXiv.org Artificial Intelligence, Nov-1-2021

Query graph building aims to build correct executable SPARQL over the knowledge graph for answering natural language questions. Although recent approaches perform well by NN-based query graph ranking, more complex questions bring three new challenges: complicated SPARQL syntax, a huge search space for ranking, and noisy query graphs with local ambiguity. This paper handles these challenges. Initially, we regard common complicated SPARQL syntax as sub-graphs comprising vertices and edges and propose a new unified query graph grammar to accommodate them. Subsequently, we propose a new two-stage approach to build query graphs. In the first stage, the top-k related instances (entities, relations, etc.) are collected by simple strategies as the candidate instances. In the second stage, a graph generation model performs hierarchical generation. It first outlines a graph structure whose vertices and edges are empty slots, and then fills the appropriate instances into the slots, thereby completing the query graph. Our approach decomposes the unbearable search space of entire query graphs into affordable sub-spaces of operations, while leveraging global structural information to eliminate local ambiguity. The experimental results demonstrate that our approach greatly improves on the state of the art on the hardest KGQA benchmarks and performs well on complex questions.
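The outline-then-fill idea in this abstract can be made concrete with a small sketch. It is only an illustration of the two steps, not the paper's model: the slot layout, the function names, and the example predicates are hypothetical, and the choice of which instance fills which slot is supplied by hand here, whereas the paper describes a learned generation model making that choice.

```python
def outline(num_triples):
    """Step one: a query structure whose vertices and edges are empty slots."""
    return [{"s": None, "p": None, "o": None} for _ in range(num_triples)]


def fill(skeleton, assignments):
    """Step two: place candidate instances into the slots of each triple.

    Returns a new list and raises if any slot is left empty.
    """
    filled = []
    for slots, chosen in zip(skeleton, assignments):
        triple = dict(slots)
        triple.update(chosen)
        if any(value is None for value in triple.values()):
            raise ValueError("query graph still has an unfilled slot")
        filled.append(triple)
    return filled


def to_sparql(triples, target="?x"):
    """Render the completed triples as a basic SPARQL SELECT query."""
    body = " ".join(f"{t['s']} {t['p']} {t['o']} ." for t in triples)
    return f"SELECT {target} WHERE {{ {body} }}"
```

Separating the structure from its contents is what lets the search proceed over small sets of operations rather than over whole query graphs.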

denote, outlining, query graph, (15 more...)

2111.00732

Country:

North America > United States > Connecticut (0.04)
North America > United States > South Dakota (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)

Genre: Research Report > New Finding (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)

Clinical Evidence Engine: Proof-of-Concept For A Clinical-Domain-Agnostic Decision Support Infrastructure

Hou, Bojian, Zhang, Hao, Ladizhinsky, Gur, Yang, Stephen, Kuleshov, Volodymyr, Wang, Fei, Yang, Qian

arXiv.org Artificial Intelligence, Oct-31-2021

Abstruse learning algorithms and complex datasets increasingly characterize modern clinical decision support systems (CDSS). As a result, clinicians cannot easily or rapidly scrutinize the CDSS recommendation when facing a difficult diagnosis or treatment decision in practice. Over-trust or under-trust are frequent. Prior research has explored supporting such assessments by explaining DST data inputs and algorithmic mechanisms. This paper explores a different approach: Providing precisely relevant, scientific evidence from biomedical literature. We present a proof-of-concept system, Clinical Evidence Engine, to demonstrate the technical and design feasibility of this approach across three domains (cardiovascular diseases, autism, cancer). Leveraging Clinical BioBERT, the system can effectively identify clinical trial reports based on lengthy clinical questions (e.g., "risks of catheter infection among adult patients in intensive care unit who require arterial catheters, if treated with povidone iodine-alcohol"). This capability enables the system to identify clinical trials relevant to diagnostic/treatment hypotheses -- a clinician's or a CDSS's. Further, Clinical Evidence Engine can identify key parts of a clinical trial abstract, including patient population (e.g., adult patients in intensive care unit who require arterial catheters), intervention (povidone iodine-alcohol), and outcome (risks of catheter infection). This capability opens up the possibility of enabling clinicians to 1) rapidly determine the match between a clinical trial and a clinical question, and 2) understand the result and contexts of the trial without extensive reading. We demonstrate this potential by illustrating two example use scenarios of the system. We discuss the idea of designing DST explanations not as specific to a DST or an algorithm, but as a domain-agnostic decision support infrastructure.

clinical evidence engine, clinician, literature, (11 more...)

2111.00621

Country:

North America > United States > New York > New York County > New York City (0.05)
Oceania > Australia > Queensland > Brisbane (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(5 more...)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Therapeutic Area > Neurology > Autism (0.90)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Decision Support Systems (1.00)
Information Technology > Information Management (0.95)
(5 more...)

DSC-IITISM at FinCausal 2021: Combining POS tagging with Attention-based Contextual Representations for Identifying Causal Relationships in Financial Documents

Haldar, Gunjan, Mittal, Aman, Gupta, Pradyumna

arXiv.org Artificial Intelligence, Oct-31-2021

Causality detection draws plenty of attention in the field of Natural Language Processing and linguistics research. It has essential applications in information retrieval, event prediction, question answering, financial analysis, and market research. In this study, we explore several methods to identify and extract cause-effect pairs in financial documents using transformers. For this purpose, we propose an approach that combines POS tagging with the BIO scheme, which can be integrated with modern transformer models to address this challenge of identifying causality in a given text. Our best methodology achieves an F1-Score of 0.9551, and an Exact Match Score of 0.8777 on the blind test in the FinCausal-2021 Shared Task at the FinCausal 2021 Workshop.
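For readers unfamiliar with the BIO scheme mentioned in this abstract, the following sketch shows how cause and effect spans map to per-token tags and back. The label names `CAUSE` and `EFFECT`, the function names, and the example sentence are assumptions for illustration; the abstract does not specify the paper's label set, and the POS features and transformer model are omitted.

```python
def bio_encode(tokens, spans):
    """Turn labeled token spans into BIO tags.

    `spans` maps a label to (start, end) token indices, end exclusive.
    The first token of a span gets B-<label>, the rest get I-<label>,
    and every token outside a span gets O.
    """
    tags = ["O"] * len(tokens)
    for label, (start, end) in spans.items():
        tags[start] = f"B-{label}"
        for i in range(start + 1, end):
            tags[i] = f"I-{label}"
    return tags


def bio_decode(tokens, tags):
    """Recover (label, text) spans from a BIO tag sequence."""
    spans = []
    label, words = None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = tag[2:], [token]
        elif tag.startswith("I-") and label == tag[2:]:
            words.append(token)
        else:
            if label is not None:
                spans.append((label, " ".join(words)))
            label, words = None, []
    if label is not None:
        spans.append((label, " ".join(words)))
    return spans
```

A sequence tagger trained on such tags predicts one tag per token, and the decode step turns its predictions into the cause-effect pair that the shared task scores.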

fincausal 2021, prediction, transformer model, (12 more...)

2111.0049

Country:

Europe > United Kingdom > England > Lancashire > Lancaster (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > India > Jharkhand > Dhanbad (0.04)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.74)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.55)

Privacy in Open Search: A Review of Challenges and Solutions

Sousa, Samuel, Guetl, Christian, Kern, Roman

arXiv.org Artificial Intelligence, Oct-24-2021

Privacy is of worldwide concern regarding activities and processes that include sensitive data. For this reason, many countries and territories have recently approved regulations controlling the extent to which organizations may exploit data provided by people. Artificial intelligence areas, such as machine learning and natural language processing, have already successfully employed privacy-preserving mechanisms to safeguard data privacy in a vast number of applications. Information retrieval (IR) is likewise prone to privacy threats, such as attacks and unintended disclosures of documents and search history, which may compromise the security of users and be penalized by data protection laws. This work aims at highlighting and discussing open challenges for privacy in the recent literature on IR, focusing on tasks featuring user-generated text data. Our contribution is threefold: firstly, we present an overview of privacy threats to IR tasks; secondly, we discuss applicable privacy-preserving mechanisms which may be employed in solutions to restrain privacy hazards; finally, we offer insights on the tradeoffs between privacy preservation and utility performance for IR tasks.

application, information, privacy, (15 more...)

2110.1072

Country:

Europe > Austria > Styria > Graz (0.05)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.92)

Knowledge Graph informed Fake News Classification via Heterogeneous Representation Ensembles

Koloski, Boshko, Stepišnik-Perdih, Timen, Robnik-Šikonja, Marko, Pollak, Senja, Škrlj, Blaž

arXiv.org Artificial Intelligence, Oct-20-2021

Increasing amounts of freely available data, in both textual and relational form, allow exploration of richer document representations, potentially improving model performance and robustness. An emerging problem in the modern era is fake news detection -- many easily available pieces of information are not necessarily factually correct, and can lead to wrong conclusions or be used for manipulation. In this work we explore how different document representations, ranging from simple symbolic bag-of-words to contextual, neural language model-based ones, can be used for efficient fake news identification. One of the key contributions is a set of novel document representation learning methods based solely on knowledge graphs, i.e. extensive collections of (grounded) subject-predicate-object triplets. We demonstrate that knowledge graph-based representations already achieve performance competitive with conventionally accepted representation learners. Furthermore, when combined with existing contextual representations, knowledge graph-based document representations can achieve state-of-the-art performance. To our knowledge this is the first larger-scale evaluation of how knowledge graph-based representations can be systematically incorporated into the process of fake news classification.

document representation, knowledge graph, representation, (13 more...)

2110.10457

Country:

North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.05)
(19 more...)

Genre: Research Report (1.00)

Industry:

Media > News (1.00)
Government (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.98)
(2 more...)

Brave ditches Google for its own privacy-centric search engine

#artificialintelligence, Oct-19-2021, 21:28:06 GMT

Brave Browser has replaced Google with its own no-tracking, privacy-centric Brave Search as the default search engine for new users in five regions. Brave is an open-source Chromium-based browser that focuses on user privacy by automatically blocking ads and tracking scripts and removing the privacy-invasive functions built into Chromium. Historically, Brave used Google as its default search engine when searching from the address bar. However, Google is known for tracking users' activities, behavior, and interests, making it a poor fit for a privacy-centric browser. Today, Brave announced that its privacy-focused Brave Search has now become the default search engine for new users in the United States, Canada, and the United Kingdom.

brave search, google, search engine, (10 more...)

Country:

North America > United States (0.26)
North America > Canada (0.26)
Europe > United Kingdom (0.26)
(2 more...)

Industry:

Information Technology > Security & Privacy (0.54)
Information Technology > Services (0.38)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)

Brave's privacy-first search engine is now built in to its browser

Engadget, Oct-19-2021, 16:00:39 GMT

Brave is very confident in its privacy-centric search engine -- so much so that it's giving Google the boot. As of today (October 19th), Brave will use the engine as its browser's default search tool, replacing Google in the US, UK and Canada. Your browser will keep its existing search engine settings, and you can always pick Google or another competitor if you're so inclined. The change in defaults is available across desktop releases as well as Android and iOS. Brave Search is effectively billed as the anti-Google engine.

browser, google, privacy-first search engine, (1 more...)

Country:

North America > United States (0.28)
North America > Canada (0.28)
Europe > Germany (0.08)
Europe > France (0.08)

Technology:

Information Technology > Information Management > Search (0.89)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.89)
