AITopics

doi: 10.1016/j.jbi.2022.104215

2101.00146

Country:

Oceania > Australia > New South Wales > Sydney (0.34)
Oceania > Australia > Queensland > Brisbane (0.04)
Oceania > Australia > New South Wales > Kensington (0.04)
(3 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Health Care Technology > Medical Record (1.00)
Health & Medicine > Health Care Providers & Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial Intelligence, Oct-3-2022

Unsupervised Search Algorithm Configuration using Query Performance Prediction

Roitman, Haggai

Search engine configuration can be quite difficult for inexpert developers. Instead, an auto-configuration approach can be used to speed up development time. Yet, such an automatic process usually requires relevance labels to train a supervised model. In this work, we suggest a simple solution based on query performance prediction that requires no relevance labels but only a sample of queries in a given domain. Using two example use cases, we demonstrate the merits of our solution.
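As a rough illustration of the label-free idea in this abstract, the sketch below picks a search configuration using only a sample of queries. The predictor (spread of the top-k retrieval scores, a common post-retrieval heuristic), the toy retrievers, and every name here are assumptions made for illustration; the abstract does not say which predictor or search engine the paper uses.

```python
import statistics
from typing import Callable, Dict, List

# Hypothetical interface: a configuration is any function that maps a query
# to a list of retrieval scores, best result first.
Retriever = Callable[[str], List[float]]


def predicted_quality(scores: List[float], k: int = 3) -> float:
    """Assumed post-retrieval predictor: spread of the top-k scores."""
    top = scores[:k]
    return statistics.pstdev(top) if len(top) > 1 else 0.0


def choose_config(configs: Dict[str, Retriever], queries: List[str]) -> str:
    """Return the configuration name with the highest mean predicted quality."""
    def mean_prediction(retrieve: Retriever) -> float:
        return statistics.mean(predicted_quality(retrieve(q)) for q in queries)

    return max(configs, key=lambda name: mean_prediction(configs[name]))


# Toy stand-ins for two engine configurations (made-up scores, no real engine).
configs: Dict[str, Retriever] = {
    "flat": lambda q: [1.0, 1.0, 1.0],    # cannot separate results
    "peaked": lambda q: [3.0, 1.0, 0.5],  # one result clearly stands out
}
best = choose_config(configs, ["heart failure", "asthma treatment"])
print(best)
```

Raw score spread is only comparable between configurations that score on a similar scale, so a real setup would likely normalize scores or use a scale-free predictor.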

artificial intelligence, information retrieval, natural language, (16 more...)

2210.00767

Country:

North America > United States > New York > New York County > New York City (0.06)
Asia > Middle East > Israel (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Information Management > Search (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.52)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.45)

#artificialintelligence, Oct-1-2022, 14:21:57 GMT

Computer Vision - Richard Szeliski

As humans, we perceive the three-dimensional structure of the world around us with apparent ease. Think of how vivid the three-dimensional percept is when you look at a vase of flowers sitting on the table next to you. You can tell the shape and translucency of each petal through the subtle patterns of light and shading that play across its surface and effortlessly segment each flower from the background of the scene (Figure 1.1). Looking at a framed group portrait, you can easily count (and name) all of the people in the picture and even guess at their emotions from their facial appearance. Perceptual psychologists have spent decades trying to understand how the visual system works and, even though they can devise optical illusions to tease apart some of its principles (Figure 1.3), a complete solution to this puzzle remains elusive (Marr 1982; Palmer 1999; Livingstone 2008).

canada government, diagnostic medicine, pattern recognition, (51 more...)

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.45)
North America > United States > New Jersey (0.45)
Europe > Spain (0.45)
(38 more...)

Genre:

Workflow (1.00)
Summary/Review (1.00)
Research Report > New Finding (1.00)
(4 more...)

Industry:

Transportation > Ground > Road (1.00)
Semiconductors & Electronics (1.00)
Media > Television (1.00)
(14 more...)

Technology:

Information Technology > Software > Programming Languages (1.00)
Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Security & Privacy (1.00)
(36 more...)

Woodruff, David P., Zhang, Fred, Zhang, Qiuyi

Optimal Query Complexities for Dynamic Trace Estimation

arXiv.org Artificial Intelligence, Sep-30-2022

We consider the problem of minimizing the number of matrix-vector queries needed for accurate trace estimation in the dynamic setting where our underlying matrix is changing slowly, such as during an optimization process. Specifically, for any $m$ matrices $A_1,...,A_m$ with consecutive differences bounded in Schatten-$1$ norm by $\alpha$, we provide a novel binary tree summation procedure that simultaneously estimates all $m$ traces up to $\epsilon$ error with $\delta$ failure probability with an optimal query complexity of $\widetilde{O}\left(m \alpha\sqrt{\log(1/\delta)}/\epsilon + m\log(1/\delta)\right)$, improving the dependence on both $\alpha$ and $\delta$ from Dharangutte and Musco (NeurIPS, 2021). Our procedure works without additional norm bounds on $A_i$ and can be generalized to a bound for the $p$-th Schatten norm for $p \in [1,2]$, giving a complexity of $\widetilde{O}\left(m \alpha\left(\sqrt{\log(1/\delta)}/\epsilon\right)^p +m \log(1/\delta)\right)$. By using novel reductions to communication complexity and information-theoretic analyses of Gaussian matrices, we provide matching lower bounds for static and dynamic trace estimation in all relevant parameters, including the failure probability. Our lower bounds (1) give the first tight bounds for Hutchinson's estimator in the matrix-vector product model with Frobenius norm error even in the static setting, and (2) are the first unconditional lower bounds for dynamic trace estimation, resolving open questions of prior work.
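For orientation, here is a minimal sketch of Hutchinson's estimator, which the abstract names, together with a naive difference-based baseline for a slowly changing sequence of matrices. This is not the paper's binary tree summation procedure; the function names, query counts, and toy matrices are illustrative assumptions.

```python
import random
from typing import List

Matrix = List[List[float]]


def matvec(a: Matrix, x: List[float]) -> List[float]:
    """One matrix-vector query: returns a @ x."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def hutchinson(a: Matrix, num_queries: int, rng: random.Random) -> float:
    """Average z^T A z over random sign vectors z; unbiased for tr(A)."""
    n = len(a)
    total = 0.0
    for _ in range(num_queries):
        z = [rng.choice((-1.0, 1.0)) for _ in range(n)]
        az = matvec(a, z)
        total += sum(zi * yi for zi, yi in zip(z, az))
    return total / num_queries


def dynamic_traces(mats: List[Matrix], first_queries: int,
                   delta_queries: int, rng: random.Random) -> List[float]:
    """Naive baseline: tr(A_i) = tr(A_{i-1}) + tr(A_i - A_{i-1}).

    The difference is materialized here only for brevity; in a pure
    matrix-vector query model one would compute A_i z - A_{i-1} z instead.
    """
    estimates = [hutchinson(mats[0], first_queries, rng)]
    for prev, cur in zip(mats, mats[1:]):
        diff = [[c - p for c, p in zip(cr, pr)] for cr, pr in zip(cur, prev)]
        estimates.append(estimates[-1] + hutchinson(diff, delta_queries, rng))
    return estimates


rng = random.Random(0)
a1 = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
a2 = [[1.5, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 3.0]]
estimates = dynamic_traces([a1, a2], first_queries=8, delta_queries=2, rng=rng)
print(estimates)  # diagonal inputs make every query exact: [6.0, 6.5]
```

The baseline spends few queries on each small difference, but its errors accumulate along the sequence, which is the kind of cost a more careful summation scheme aims to control.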

artificial intelligence, machine learning, natural language, (18 more...)

2209.15219

Country:

North America > United States > Texas > Brazos County > College Station (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.72)

arXiv.org Artificial Intelligence, Sep-29-2022

Detecting Small Query Graphs in A Large Graph via Neural Subgraph Search

Bai, Yunsheng, Xu, Derek, Sun, Yizhou, Wang, Wei

Recent advances have shown the success of using reinforcement learning and search to solve NP-hard graph-related tasks, such as Traveling Salesman Optimization, Graph Edit Distance computation, etc. However, it remains unclear how one can efficiently and accurately detect the occurrences of a small query graph in a large target graph, which is a core operation in graph database search, biomedical analysis, social group finding, etc. This task is called Subgraph Matching, which essentially performs a subgraph isomorphism check between a query graph and a large target graph. One promising approach to this classical problem is the "learning-to-search" paradigm, where a reinforcement learning (RL) agent is designed with a learned policy to guide a search algorithm to quickly find the solution without any solved instances for supervision. However, for the specific task of Subgraph Matching, though the query graph, given by the user as input, is usually small, the target graph is often orders of magnitude larger. This poses challenges to the neural network design and can lead to solution and reward sparsity. The proposed neural subgraph search approach introduces two innovations to tackle these challenges: (1) a novel encoder-decoder neural network architecture to dynamically compute the matching information between the query and the target graphs at each search state; (2) a novel look-ahead loss function for training the policy network. The approach can significantly improve subgraph matching performance. With the growing amount of graph data that naturally arises in many domains, solving graph-related tasks via machine learning has gained increasing attention.
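To make the Subgraph Matching task concrete, the following is a plain backtracking search with no learning involved. It is only a sketch of the classical search that a learned policy would guide, not the paper's method; the graph representation, the degree-based ordering, and all names are assumptions.

```python
from typing import Dict, Iterator, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets


def subgraph_matches(query: Graph, target: Graph) -> Iterator[Dict[int, int]]:
    """Yield every injective map from query nodes to target nodes that
    sends each query edge to a target edge (non-induced matching)."""
    # Fixed heuristic: match high-degree query nodes first.
    order = sorted(query, key=lambda u: -len(query[u]))

    def extend(mapping: Dict[int, int], depth: int) -> Iterator[Dict[int, int]]:
        if depth == len(order):
            yield dict(mapping)
            return
        u = order[depth]
        used = set(mapping.values())
        # A learned policy would rank these candidates; here we try them all.
        for v in target:
            if v in used or len(target[v]) < len(query[u]):
                continue
            # Already-matched neighbors of u must map to neighbors of v.
            if all(mapping[w] in target[v] for w in query[u] if w in mapping):
                mapping[u] = v
                yield from extend(mapping, depth + 1)
                del mapping[u]

    yield from extend({}, 0)


# Toy example: find a triangle inside a triangle with one extra tail node.
triangle: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
target: Graph = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
matches = list(subgraph_matches(triangle, target))
print(len(matches))  # 6: one node set {0, 1, 2}, counted once per ordering
```

Exhaustive candidate enumeration like this grows quickly with target size, which is why guiding the choice of the next candidate matters on large target graphs.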

machine learning, natural language, node, (19 more...)

2207.10305

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Hofstätter, Sebastian, Chen, Jiecao, Raman, Karthik, Zamani, Hamed

FiD-Light: Efficient and Effective Retrieval-Augmented Text Generation

Retrieval-augmented generation models offer many benefits over standalone language models: besides a textual answer to a given query they provide provenance items retrieved from an updateable knowledge base. However, they are also more complex systems and need to handle long inputs. In this work, we introduce FiD-Light to strongly increase the efficiency of the state-of-the-art retrieval-augmented FiD model, while maintaining the same level of effectiveness. Our FiD-Light model constrains the information flow from the encoder (which encodes passages separately) to the decoder (using concatenated encoded representations). Furthermore, we adapt FiD-Light with re-ranking capabilities through textual source pointers, to improve the top-ranked provenance precision. Our experiments on a diverse set of seven knowledge intensive tasks (KILT) show FiD-Light consistently improves the Pareto frontier between query latency and effectiveness. FiD-Light with source pointing sets substantial new state-of-the-art results on six KILT tasks for combined text generation and provenance retrieval evaluation, while maintaining reasonable efficiency. Enabling machine learning models to access information contained in parametric or non-parametric storage (i.e., retrieval-enhanced machine learning) can lead to efficiency and/or effectiveness improvements in a wide range of learning tasks (Zamani et al., 2022). 
For example, retrieval-augmented generation (Lewis et al., 2020), which is the focus of this paper, has a manifold of benefits over closed-loop language modelling in knowledge intensive tasks: Answers can be grounded in (multiple) specific pieces of information which enables clear attribution (Dehghani et al., 2019; Rashkin et al., 2021; Lamm et al., 2021); the knowledge base can easily be managed, updated, and swapped (Izacard et al., 2022); the decomposition of retrieval and generation module offers clear efficiency-effectiveness tradeoff controls; and the data structure of combined retrieval and text generation enables many insightful failure analyses. However, with these benefits also come downsides, such as a higher system complexity with higher training and inference cost.

information retrieval, large language model, machine learning, (18 more...)

2209.1429

Country:

Asia > Middle East > Jordan (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Massachusetts > Hampshire County > Amherst (0.04)

Genre:

Research Report > New Finding (0.93)
Research Report > Experimental Study (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)
(2 more...)

SHiFT: An Efficient, Flexible Search Engine for Transfer Learning

Renggli, Cedric, Yao, Xiaozhe, Kolar, Luka, Rimanic, Luka, Klimovic, Ana, Zhang, Ce

Transfer learning can be seen as a data- and compute-efficient alternative to training models from scratch. The emergence of rich model repositories, such as TensorFlow Hub, enables practitioners and researchers to unleash the potential of these models across a wide range of downstream tasks. As these repositories keep growing exponentially, efficiently selecting a good model for the task at hand becomes paramount. By carefully comparing various selection and search strategies, we realize that no single method outperforms the others, and hybrid or mixed strategies can be beneficial. Therefore, we propose SHiFT, the first downstream task-aware, flexible, and efficient model search engine for transfer learning. These properties are enabled by a custom query language SHiFT-QL together with a cost-based decision maker, which we empirically validate. Motivated by the iterative nature of machine learning development, we further support efficient incremental executions of our queries, which requires a careful implementation when jointly used with our optimizations.

information retrieval, machine learning, natural language, (17 more...)

2204.01457

Country:

Europe > Switzerland > Zürich > Zürich (0.04)
Europe > Romania > Sud - Muntenia Development Region > Giurgiu County > Giurgiu (0.04)

Genre: Research Report > New Finding (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.83)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

DAMO-NLP at NLPCC-2022 Task 2: Knowledge Enhanced Robust NER for Speech Entity Linking

Huang, Shen, Zhai, Yuchen, Long, Xinwei, Jiang, Yong, Wang, Xiaobin, Zhang, Yin, Xie, Pengjun

Speech Entity Linking aims to recognize and disambiguate named entities in spoken languages. Conventional methods suffer gravely from the unfettered speech styles and the noisy transcripts generated by ASR systems. In this paper, we propose a novel approach called Knowledge Enhanced Named Entity Recognition (KENER), which focuses on improving robustness through painlessly incorporating proper knowledge in the entity recognition stage and thus improving the overall performance of entity linking. KENER first retrieves candidate entities for a sentence without mentions, and then utilizes the entity descriptions as extra information to help recognize mentions. The candidate entities retrieved by a dense retrieval module are especially useful when the input is short or noisy. Moreover, we investigate various data sampling strategies and design effective loss functions, in order to improve the quality of retrieved entities in both recognition and disambiguation stages. Lastly, a linking with filtering module is applied as the final safeguard, making it possible to filter out wrongly-recognized mentions. Our system achieves 1st place in Track 1 and 2nd place in Track 2 of NLPCC-2022 Shared Task 2. Keywords: Entity Linking, Robust NER

artificial intelligence, information retrieval, natural language, (17 more...)

2209.13187

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)

Multilingual Search with Subword TF-IDF

Wangperawong, Artit

Multilingual search can be achieved with subword tokenization. The accuracy of traditional TF-IDF approaches depends on manually curated tokenization, stop words and stemming rules, whereas subword TF-IDF (STF-IDF) can offer higher accuracy without such heuristics. Moreover, multilingual support can be incorporated inherently as part of the subword tokenization model training. XQuAD evaluation demonstrates the advantages of STF-IDF: superior information retrieval accuracy of 85.4% for English and over 80% for 10 other languages without any heuristics-based preprocessing. The software to reproduce these results is open-sourced as a part of Text2Text: https://github.com/artitw/text2text
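The general idea of TF-IDF over subword units can be sketched in a few lines. Note the substitution: overlapping character trigrams stand in for the trained subword tokenization model described in the abstract, and the smoothed IDF formula, function names, and toy documents are all assumptions. This sketch does not use the Text2Text library.

```python
import math
from collections import Counter
from typing import Dict, List, Tuple

Vector = Dict[str, float]


def subword_tokens(text: str, n: int = 3) -> List[str]:
    """Stand-in tokenizer (assumption): overlapping character n-grams."""
    s = "_".join(text.lower().split())
    return [s[i:i + n] for i in range(max(len(s) - n + 1, 1))]


def build_index(docs: List[str]) -> Tuple[Dict[str, float], List[Vector]]:
    """Compute smoothed IDF weights and one TF-IDF vector per document."""
    counts = [Counter(subword_tokens(d)) for d in docs]
    df = Counter(tok for c in counts for tok in c)
    idf = {t: math.log((1 + len(docs)) / (1 + n)) + 1.0 for t, n in df.items()}
    vectors = [{t: tf * idf[t] for t, tf in c.items()} for c in counts]
    return idf, vectors


def cosine(u: Vector, v: Vector) -> float:
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def search(query: str, idf: Dict[str, float], vectors: List[Vector]) -> int:
    """Return the index of the best-matching document."""
    tf = Counter(subword_tokens(query))
    q = {t: c * idf[t] for t, c in tf.items() if t in idf}
    scores = [cosine(q, v) for v in vectors]
    return max(range(len(vectors)), key=scores.__getitem__)


docs = [
    "the cat sat on the mat",
    "el gato se sentó en la alfombra",
    "die Katze saß auf der Matte",
]
idf, vectors = build_index(docs)
print(search("gato alfombra", idf, vectors))  # 1
print(search("Katze Matte", idf, vectors))    # 2
```

No stop-word list or stemmer is involved: the same code path handles all three languages because matching happens on units smaller than words.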

information retrieval, machine learning, tokenization, (19 more...)

2209.14281

Country: North America > United States (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.98)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)

Daily Mail - Science & tech, Sep-27-2022, 09:45:23 GMT

Google creates 'NASA DART' search engine Easter Egg to celebrate launch of test mission

Google has created a browser 'Easter Egg' of a spacecraft crashing into the web browser when a user searches 'NASA Dart', to celebrate the success of the planetary defence test. The graphic shows a probe shooting across the DART-related search results, before it collides and disappears in a cloud of dust, leaving the page askew. The demonstration is triggered by the search terms 'NASA DART', 'DART', 'DART probe' or 'double asteroid redirection test', the full name of the mission. NASA tweeted about the Easter Egg earlier today, telling followers: 'Your Google search could reveal something smashing! Search for "NASA DART" on to see a demonstration of browser, uh, planetary defense.'

asteroid, dart, dimorpho, (14 more...)

Country:

North America > United States > Maryland (0.05)
North America > United States > California (0.05)

Industry:

Government > Space Agency (1.00)
Government > Regional Government > North America Government > United States Government (1.00)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.41)