AITopics

2109.05685

Country:

North America > Canada > Alberta > Census Division No. 2 > Lethbridge County > Lethbridge (0.14)
North America > United States > District of Columbia > Washington (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.72)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.62)

arXiv.org Artificial Intelligence Sep-11-2021

Sequential Modelling with Applications to Music Recommendation, Fact-Checking, and Speed Reading

Hansen, Christian

Sequential modelling entails making sense of sequential data, which naturally occurs in a wide array of domains. One example is systems that interact with users, log user actions and behaviour, and make recommendations of items of potential interest to users on the basis of their previous interactions. In such cases, the sequential order of user interactions is often indicative of what the user is interested in next. Similarly, for systems that automatically infer the semantics of text, capturing the sequential order of words in a sentence is essential, as even a slight re-ordering could significantly alter its original meaning. This thesis makes methodological contributions to, and presents new investigations of, sequential modelling in two application areas: systems that recommend music tracks to listeners, and systems that process text semantics in order to automatically fact-check claims or "speed read" text for efficient further classification.

automatic identification and verification, state-of-the-art speed reading model, veracity prediction model, (17 more...)

2109.06736

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Denmark > Capital Region > Copenhagen (0.04)
North America > United States > New York > New York County > New York City (0.04)
(31 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.92)
Research Report > Promising Solution (0.67)

Industry:

Media > News (1.00)
Media > Music (1.00)
Leisure & Entertainment (1.00)
(3 more...)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(8 more...)

#artificialintelligence Sep-10-2021, 23:56:30 GMT

Snapchat's Scan feature can identify dogs, plants, clothes, and more

Snapchat's camera has to date mostly been associated with sending disappearing messages and goofy AR effects, like a virtual dancing hot dog. But what if it did things for you, like suggest ways to make your videos look and sound better? Or show you a similar shirt based on the one you're looking at? Starting Thursday, a feature called Scan is being upgraded and placed front and center in the app's camera, letting it identify a range of things in the real world, like clothes or dog breeds. Scan's prominent placement in Snapchat means that the company is slowly becoming not just a messaging app, but a visual search engine.

camera shortcut, scan feature, snapchat, (11 more...)

Industry: Information Technology > Services (0.98)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)

#artificialintelligence Sep-9-2021, 23:05:46 GMT

Practical Entity Resolution on AWS to Reconcile Data in the Real World

This post was co-written with Mamoon Chowdry, Solutions Architect, previously at AWS. Businesses and organizations from many industries often struggle to ensure that their data is accurate. Data often has to match people or things exactly in the real world, such as a customer name, an address, or a company. Matching our data is important to validate it, de-duplicate it, or link records in different systems together. Know Your Customer (KYC) regulations also mean that we must be confident in who or what our data is referring to. We must match millions of records from different data sources.

practical entity resolution, reference data, vector, (10 more...)

Industry: Retail > Online (0.40)

Technology: Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.43)

#artificialintelligence Sep-9-2021, 01:40:29 GMT

Top 30 NLP Use Cases: Comprehensive Guide for 2021

Natural language processing (NLP) is a subfield of AI and linguistics which enables computers to understand, interpret and manipulate human language. Although NLP faces challenges arising from the complexity of human language, these have not hindered its growth. The global NLP market was estimated at $5B in 2018 and is expected to reach $43B by 2025, and this exponential growth can mostly be attributed to the vast range of NLP use cases across industries today. You may already be familiar with many NLP applications such as autocorrection, translation, or chatbots. However, NLP is also the cornerstone of numerous applications we use every day without even noticing.

algorithm, nlp, recognition, (17 more...)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.72)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.47)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)

arXiv.org Artificial Intelligence Sep-8-2021

Biomedical Question Answering: A Survey of Approaches and Challenges

Jin, Qiao, Yuan, Zheng, Xiong, Guangzhi, Yu, Qianlan, Ying, Huaiyuan, Tan, Chuanqi, Chen, Mosha, Huang, Songfang, Liu, Xiaozhong, Yu, Sheng

Professionals as well as the general public need effective assistance to access, understand and consume complex biomedical concepts. For example, doctors always want to be aware of up-to-date clinical evidence for the diagnosis and treatment of diseases under the scheme of Evidence-based Medicine [165], and the general public is becoming increasingly interested in learning about their own health conditions on the Internet [54]. Traditionally, Information Retrieval (IR) systems, such as PubMed, have been used to meet such information needs. However, classical IR is still not efficient enough [71, 77, 99, 164]. For instance, Russell-Rose and Chamberlain [164] reported that it requires 4 expert hours to answer complex medical queries using search engines. Compared with retrieval systems that typically return a list of relevant documents for the users to read, Question Answering (QA) systems that provide direct answers to users' questions are more straightforward and intuitive. In general, QA itself is a challenging benchmark Natural Language Processing (NLP) task for evaluating the abilities of intelligent systems to understand a question, retrieve and utilize relevant materials and generate its answer. With the rapid development of computing hardware, modern QA models, especially those based on deep learning [30, 31, 42, 146, 171], achieve comparable or even better performance than humans on many benchmark datasets [67, 83, 154, 155, 215] and have been successfully adopted in general domain search engines and conversational assistants [150, 236]. The Text REtrieval Conference (TREC) QA Track triggered modern QA research [197], at a time when QA models were mostly based on IR.

computational linguistic, machine learning, question answering, (18 more...)

doi: 10.1145/3490238

2102.05281

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Canada > Alberta (0.14)
(27 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Question Answering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence Sep-7-2021

An N-gram based approach to auto-extracting topics from research articles

Zhu, Linkai, Huang, Maoyi, Chen, Maomao, Wang, Wennan

A lot of manual work goes into identifying a topic for an article. With a large volume of articles, the manual process can be exhausting. Our approach aims to address this issue by automatically extracting topics from the text of large numbers of articles. This approach takes into account the efficiency of the process. Based on existing N-gram analysis, our research examines how often certain words appear in documents in order to support automatic topic extraction. In order to improve efficiency, we apply custom filtering standards to our research. Additionally, we delete as many noncritical or irrelevant phrases as possible. In this way, we can ensure we are selecting unique keyphrases for each article, which capture its core idea. For our research, we chose to center on the autonomous vehicle domain, since the research is relevant to our daily lives. We have to convert the PDF versions of most of the research papers into editable types of files such as TXT. This is because most of the research papers are only in PDF format. To test our proposed idea of automating, numerous articles on robotics have been selected. Next, we evaluate our approach by comparing the result with that obtained manually.

extraction, n-gram analysis, topic extraction, (15 more...)
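
The frequency-based N-gram idea described in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the stop-word list, the rule of dropping N-grams that start or end with a stop word, and the function names `tokenize`, `extract_ngrams`, and `top_keyphrases` are all assumptions made for illustration, since the abstract does not specify its "custom filtering standards".

```python
import re
from collections import Counter

# Hypothetical stop-word list; the abstract does not say which words are filtered.
STOP_WORDS = {"the", "a", "an", "of", "in", "on", "for", "and", "to", "is", "are", "with"}


def tokenize(text):
    """Lowercase the text and keep alphabetic tokens only."""
    return re.findall(r"[a-z]+", text.lower())


def extract_ngrams(tokens, n):
    """Return every contiguous run of n tokens as a tuple."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def top_keyphrases(text, n=2, k=3):
    """Count n-grams and return the k most frequent as (phrase, count) pairs.

    Assumed filter: discard n-grams whose first or last token is a stop word.
    """
    grams = extract_ngrams(tokenize(text), n)
    kept = [g for g in grams if g[0] not in STOP_WORDS and g[-1] not in STOP_WORDS]
    return [(" ".join(g), c) for g, c in Counter(kept).most_common(k)]


if __name__ == "__main__":
    sample = (
        "Autonomous vehicles use lidar. Autonomous vehicles use cameras. "
        "The lidar of autonomous vehicles is costly."
    )
    print(top_keyphrases(sample, n=2, k=1))
```

Note that this sketch lets N-grams span sentence boundaries; a fuller pipeline would likely split sentences first and would need a PDF-to-text step, as the abstract mentions.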

2110.11879

Country:

Asia > Macao (0.14)
North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
Europe > Sweden > Vaestra Goetaland > Gothenburg (0.05)
(3 more...)

Genre: Research Report > Experimental Study (0.34)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.69)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.50)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)

arXiv.org Artificial Intelligence Sep-4-2021

Representation Learning for Efficient and Effective Similarity Search and Recommendation

Hansen, Casper

How data is represented and operationalized is critical for building computational solutions that are both effective and efficient. A common approach is to represent data objects as binary vectors, denoted hash codes, which require little storage and enable efficient similarity search through direct indexing into a hash table or through similarity computations in an appropriate space. Due to the limited expressibility of hash codes, compared to real-valued representations, a core open challenge is how to generate hash codes that well capture semantic content or latent properties using a small number of bits, while ensuring that the hash codes are distributed in a way that does not reduce their search efficiency. State-of-the-art methods use representation learning for generating such hash codes, focusing on neural autoencoder architectures where semantics are encoded into the hash codes by learning to reconstruct the original inputs of the hash codes. This thesis addresses the above challenge and makes a number of contributions to representation learning that (i) improve effectiveness of hash codes through more expressive representations and a more effective similarity measure than the current state of the art, namely the Hamming distance, and (ii) improve efficiency of hash codes by learning representations that are especially suited to the choice of search method. The contributions are empirically validated on several tasks related to similarity search and recommendation.

learning representation, unsupervised neural generative semantic hashing, unsupervised semantic hashing, (15 more...)
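
The two search modes mentioned in the abstract above, direct indexing into a hash table and ranking by Hamming distance, can be sketched as follows. This only illustrates the baseline setting the thesis starts from, not its proposed methods; the 8-bit codes, the document names, and the function names `hamming_distance`, `build_table`, and `nearest_by_hamming` are hypothetical.

```python
from collections import defaultdict


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two hash codes differ."""
    return bin(a ^ b).count("1")


def build_table(codes):
    """Map each hash code to the list of items that share it (direct indexing)."""
    table = defaultdict(list)
    for name, code in codes.items():
        table[code].append(name)
    return table


def nearest_by_hamming(query, codes, k=1):
    """Rank items by Hamming distance to the query code; ties broken by name."""
    ranked = sorted(codes, key=lambda name: (hamming_distance(query, codes[name]), name))
    return ranked[:k]


if __name__ == "__main__":
    # Hypothetical 8-bit hash codes for three documents.
    codes = {"doc_a": 0b10110010, "doc_b": 0b10110011, "doc_c": 0b01001101}
    table = build_table(codes)
    print(table[0b10110010])                      # exact-match lookup
    print(nearest_by_hamming(0b10110000, codes, k=2))
```

An exact-match lookup is a single dictionary access, while the ranked search here scans every code; practical systems typically probe nearby buckets instead of scanning.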

2109.01815

Country:

North America > United States > California > San Francisco County > San Francisco (0.13)
Europe > Denmark > Capital Region > Copenhagen (0.05)
Europe > Slovenia > Central Slovenia > Municipality of Ljubljana > Ljubljana (0.04)
(12 more...)

Genre:

Research Report > Promising Solution (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Music (0.67)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (1.00)
(5 more...)

#artificialintelligence Sep-3-2021, 17:01:01 GMT

Scrape Search Engine Results in Real-time with Zenserp

If you have a project or service that requires scraping search results for data, you might be interested in this API that can streamline the process. Zenserp is able to get real-time data from search results on the major search platforms. Its simple API has scalable options that make it a good fit for projects of any size. You can try Zenserp for free to see how powerful this API is. Get detailed scrape results from APIs for specific situations.

api, scrape search engine result, zenserp, (8 more...)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Architecture > Real Time Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.42)

arXiv.org Artificial Intelligence Sep-3-2021

An Exploratory Study on Utilising the Web of Linked Data for Product Data Mining

Zhang, Ziqi, Song, Xingyi

The Linked Open Data practice has led to a significant growth of structured data on the Web in the last decade. Such structured data describe real-world entities in a machine-readable way, and have created an unprecedented opportunity for research in the field of Natural Language Processing. However, there is a lack of studies on how such data can be used, for what kind of tasks, and to what extent they can be useful for these tasks. This work focuses on the e-commerce domain to explore methods of utilising such structured data to create language resources that may be used for product classification and linking. We process billions of structured data points in the form of RDF n-quads, to create multi-million words of product-related corpora that are later used in three different ways for creating language resources: training word embedding models, continued pre-training of BERT-like language models, and training Machine Translation models that are used as a proxy to generate product-related keywords. Our evaluation on an extensive set of benchmarks shows word embeddings to be the most reliable and consistent method to improve the accuracy on both tasks (with up to 6.9 percentage points in macro-average F1 on some datasets). The other two methods, however, are not as useful. Our analysis shows that this could be due to a number of reasons, including the biased domain representation in the structured data and lack of vocabulary coverage. We share our datasets and discuss how our lessons learned could be taken forward to inform future research in this direction.

classification, dataset, language resource, (15 more...)

2109.01411

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > United Kingdom (0.04)
North America > United States > New Mexico > Santa Fe County > Santa Fe (0.04)
(6 more...)

Genre:

Overview (0.92)
Research Report > New Finding (0.92)

Industry: Information Technology > Services > e-Commerce Services (0.48)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(7 more...)