AITopics

2301.10904

Country:

North America > United States > Virginia (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > San Diego County > La Jolla (0.04)
Africa > Sudan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.71)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (0.46)

Ozmen, Muberra, Cotnareanu, Joseph, Coates, Mark

Substituting Data Annotation with Balanced Updates and Collective Loss in Multi-label Text Classification

arXiv.org Artificial Intelligence Sep-24-2023

Multi-label text classification (MLTC) is the task of assigning multiple labels to a given text, and has a wide range of application domains. Most existing approaches require an enormous amount of annotated data to learn a classifier and/or a set of well-defined constraints on the label space structure, such as hierarchical relations, which may be complicated to provide as the number of labels increases. In this paper, we study the MLTC problem in annotation-free and scarce-annotation settings in which the magnitude of available supervision signals is linear in the number of labels. Our method follows three steps: (1) mapping input text into a set of preliminary label likelihoods by natural language inference using a pre-trained language model, (2) calculating a signed label dependency graph from label descriptions, and (3) updating the preliminary label likelihoods with message passing along the label dependency graph, driven by a collective loss function that injects the information of expected label frequency and average multi-label cardinality of predictions. The experiments show that the proposed framework achieves effective performance under low-supervision settings, adds almost imperceptible computational and memory overhead to the use of the pre-trained language model, and outperforms that model's initial performance by 70% in terms of example-based F1 score.
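The abstract reports its gain in example-based F1, so a small sketch of that metric may help readers unfamiliar with it. It uses the standard definition (per-example F1 between gold and predicted label sets, averaged over examples). The function name, the toy labels, and the convention for two empty sets are illustrative assumptions, not taken from the paper.

```python
from typing import List, Set


def example_based_f1(true_labels: List[Set[str]], predicted_labels: List[Set[str]]) -> float:
    """Average the per-example F1 between gold and predicted label sets."""
    scores = []
    for gold, pred in zip(true_labels, predicted_labels):
        if not gold and not pred:
            # Assumed convention: two empty label sets count as a perfect match.
            scores.append(1.0)
            continue
        overlap = len(gold & pred)
        # F1 for one example: 2 * |gold ∩ pred| / (|gold| + |pred|)
        scores.append(2 * overlap / (len(gold) + len(pred)))
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical toy data: three documents with gold and predicted label sets.
gold = [{"sports", "politics"}, {"health"}, {"tech", "finance"}]
pred = [{"sports"}, {"health"}, {"tech", "travel"}]
score = example_based_f1(gold, pred)
print(round(score, 4))
```

Because the score is averaged per example rather than per label, a document with many labels weighs the same as a document with one.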

classification, information retrieval, machine learning, (19 more...)

2309.13543

Country: North America > Canada > Quebec > Montreal (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.89)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (0.75)

Mistry, Jimit, Arzeno, Natalia M.

Document Understanding for Healthcare Referrals

arXiv.org Artificial Intelligence Sep-22-2023

Reliance on scanned documents and fax communication for healthcare referrals leads to high administrative costs and errors that may affect patient care. In this work we propose a hybrid model leveraging LayoutLMv3 along with domain-specific rules to identify key patient, physician, and exam-related entities in faxed referral documents. We explore some of the challenges in applying a document understanding model to referrals, which have formats varying by medical practice, and evaluate model performance using MUC-5 metrics to obtain appropriate metrics for the practical use case. Our analysis shows the addition of domain-specific rules to the transformer model yields greatly increased precision and F1 scores, suggesting a hybrid model trained on a curated dataset can increase efficiency in referral management.

annotation, entity type, prediction, (17 more...)

2309.13184

Country: North America > United States > Maryland > Baltimore (0.04)

Genre: Research Report (0.64)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.47)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)

Washington Post - Technology News Sep-21-2023, 17:05:08 GMT

Google pays Apple billions a year to use its search engine. Now executives must testify.

Judge Amit Mehta will consider legal questions that are both wonky and technical. Being a monopoly is not illegal, but abusing such power to quash competition is. The Justice Department is arguing that Google strong-armed Apple and other smartphone makers into these deals. "The exclusive default was not Apple's choice," Dintzer told the court last week, citing Apple's interest in also dealing with Yahoo, before Google demanded exclusivity.

apple, search engine

Industry: Law (0.81)

Technology:

Information Technology > Communications > Mobile (0.83)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.40)

PubMed and Beyond: Biomedical Literature Search in the Age of Artificial Intelligence

Jin, Qiao, Leaman, Robert, Lu, Zhiyong

Biomedical research yields a wealth of information, much of which is only accessible through the literature. Consequently, literature search is an essential tool for building on prior knowledge in clinical and biomedical research. Although recent improvements in artificial intelligence have expanded functionality beyond keyword-based search, these advances may be unfamiliar to clinicians and researchers. In response, we present a survey of literature search tools tailored to both general and specific information needs in biomedicine, with the objective of helping readers efficiently fulfill their information needs. We first examine the widely used PubMed search engine, discussing recent improvements and continued challenges. We then describe literature search tools catering to five specific information needs: 1. Identifying high-quality clinical research for evidence-based medicine. 2. Retrieving gene-related information for precision medicine and genomics. 3. Searching by meaning, including natural language questions. 4. Locating related articles with literature recommendation. 5. Mining literature to discover associations between concepts such as diseases and genetic variants. Additionally, we cover practical considerations and best practices for choosing and using these tools. Finally, we provide a perspective on the future of literature search engines, considering recent breakthroughs in large language models such as ChatGPT. In summary, our survey provides a comprehensive view of biomedical literature search functionalities with 36 publicly available tools.

engine, literature, search engine, (15 more...)

2307.09683

Country:

North America > United States (0.14)
Europe (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.96)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Maman, Ben, Zeitler, Johannes, Müller, Meinard, Bermano, Amit H.

Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis

Generating multi-instrument music from symbolic music representations is an important task in Music Information Retrieval (MIR). A central but still largely unsolved problem in this context is musically and acoustically informed control in the generation process. As the main contribution of this work, we propose enhancing control of multi-instrument synthesis by conditioning a generative model on a specific performance and recording environment, thus allowing for better guidance of timbre and style. Building on state-of-the-art diffusion-based music generative models, we introduce performance conditioning - a simple tool that instructs the generative model to synthesize music with the style and timbre of specific instruments taken from specific performances. Our prototype is evaluated using uncurated performances with diverse instrumentation and achieves state-of-the-art FAD realism scores while allowing novel timbre and style control. Our project page, including samples and demonstrations, is available at benadar293.github.io/midipm

conditioning, instrument, performance conditioning, (14 more...)

2309.12283

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Oceania > Australia (0.04)
North America > United States > Maryland > Baltimore (0.04)
(8 more...)

Genre: Research Report (0.82)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.34)

Robust Approximation Algorithms for Non-monotone $k$-Submodular Maximization under a Knapsack Constraint

Ha, Dung T. K., Pham, Canh V., Tran, Tan D., Hoang, Huan X.

The problem of non-monotone $k$-submodular maximization under a knapsack constraint (kSMK) over a ground set of size $n$ arises in many machine learning applications, such as data summarization and information propagation. However, existing algorithms for the problem struggle with the non-monotone case and with quickly returning a good solution when the data is large. This paper introduces two deterministic approximation algorithms for the problem that competitively improve the query complexity of existing algorithms. Our first algorithm, LAA, returns an approximation ratio of $1/19$ within $O(nk)$ query complexity. The second one, RLA, improves the approximation ratio to $1/5-\epsilon$ in $O(nk)$ queries, where $\epsilon$ is an input parameter. Our algorithms are the first to provide constant approximation ratios within only $O(nk)$ query complexity for the non-monotone objective. They therefore need fewer queries than state-of-the-art ones by a factor of $\Omega(\log n)$. Besides the theoretical analysis, we have evaluated our proposed algorithms experimentally on two instances of the problem: Influence Maximization and Sensor Placement. The results confirm that our algorithms ensure the same theoretical quality as cutting-edge techniques and significantly reduce the number of queries.

algorithm, constraint, query complexity, (13 more...)

2309.12025

Country:

Asia > Vietnam > Hanoi > Hanoi (0.14)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(4 more...)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (0.77)

BitCoin: Bidirectional Tagging and Supervised Contrastive Learning based Joint Relational Triple Extraction Framework

He, Luyao, Zhang, Zhongbao, Su, Sen, Chen, Yuxin

Relational triple extraction (RTE) is an essential task in information extraction and knowledge graph construction. Despite recent advancements, existing methods still exhibit certain limitations. They employ only generalized pre-trained models and do not consider the specificity of RTE tasks. Moreover, existing tagging-based approaches typically decompose the RTE task into two subtasks, first identifying subjects and then identifying objects and relations. They focus solely on extracting relational triples from subject to object, neglecting that once the extraction of a subject fails, the model also fails to extract all triples associated with that subject. To address these issues, we propose BitCoin, an innovative Bidirectional tagging and supervised Contrastive learning based joint relational triple extraction framework. Specifically, we design a supervised contrastive learning method that considers multiple positives per anchor rather than restricting it to just one positive. Furthermore, a penalty term is introduced to prevent excessive similarity between the subject and object. Our framework implements taggers in two directions, enabling triple extraction from subject to object and from object to subject. Experimental results show that BitCoin achieves state-of-the-art results on the benchmark datasets and significantly improves the F1 score on Normal, SEO, EPO, and multiple relation extraction tasks.
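The phrase "multiple positives per anchor" matches the general shape of supervised contrastive learning. The sketch below shows a generic loss of that kind in plain Python: every other sample with the same label as the anchor is treated as a positive. It is not the BitCoin formulation; the penalty term mentioned in the abstract is not modeled, and the function name, temperature value, and toy vectors are assumptions made for illustration.

```python
import math
from typing import List, Sequence


def supervised_contrastive_loss(
    embeddings: Sequence[Sequence[float]],
    labels: Sequence[str],
    temperature: float = 0.1,  # assumed value, not from the paper
) -> float:
    """Generic supervised contrastive loss with multiple positives per anchor."""
    # L2-normalise so dot products are cosine similarities (vectors assumed non-zero).
    normed: List[List[float]] = []
    for vec in embeddings:
        norm = math.sqrt(sum(x * x for x in vec))
        normed.append([x / norm for x in vec])

    n = len(normed)
    losses = []
    for i in range(n):
        positives = [j for j in range(n) if j != i and labels[j] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Scaled similarity of the anchor to every other sample.
        sims = {
            j: sum(a * b for a, b in zip(normed[i], normed[j])) / temperature
            for j in range(n)
            if j != i
        }
        # Log-sum-exp over all non-anchor samples, shifted for numerical stability.
        max_sim = max(sims.values())
        log_denom = max_sim + math.log(sum(math.exp(s - max_sim) for s in sims.values()))
        # Average the log-probability over all positives of this anchor.
        losses.append(-sum(sims[p] - log_denom for p in positives) / len(positives))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical toy embeddings forming two tight clusters.
vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
loss_matching = supervised_contrastive_loss(vectors, ["a", "a", "b", "b"])
loss_mismatched = supervised_contrastive_loss(vectors, ["a", "b", "a", "b"])
print(loss_matching < loss_mismatched)
```

When the labels agree with the clusters, the loss is small; when they cut across the clusters, it is large, which is the signal the training objective uses to pull same-label representations together.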

bitcoin, extraction, relation, (13 more...)

2309.11853

Country:

North America > United States > New York (0.05)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Banking & Finance > Trading (0.85)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.46)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.35)

arXiv.org Artificial Intelligence Sep-19-2023

OpenMSD: Towards Multilingual Scientific Documents Similarity Measurement

Gao, Yang, Ma, Ji, Korotkov, Ivan, Hall, Keith, Alon, Dana, Metzler, Don

We develop and evaluate multilingual scientific documents similarity measurement models in this work. Such models can be used to find related works in different languages, which can help multilingual researchers find and explore papers more efficiently. We propose the first multilingual scientific documents dataset, Open-access Multilingual Scientific Documents (OpenMSD), which has 74M papers in 103 languages and 778M citation pairs. With OpenMSD, we pretrain science-specialized language models, and explore different strategies to derive "related" paper pairs to fine-tune the models, including using a mixture of citation, co-citation, and bibliographic-coupling pairs. To further improve the models' performance for non-English papers, we explore the use of generative language models to enrich the non-English papers with English summaries. This allows us to leverage the models' English capabilities to create better representations for non-English papers. Our best model significantly outperforms strong baselines by 7-16% (in mean average precision).
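The three pair types named in the abstract have standard bibliometric definitions, which the following sketch applies to a small citation edge list. Two papers form a co-citation pair when some third paper cites both, and a bibliographic-coupling pair when both cite the same paper. How OpenMSD actually builds, filters, or mixes its pairs is not stated in the abstract, so the function name and toy graph here are assumptions.

```python
from collections import defaultdict
from itertools import combinations
from typing import Iterable, Set, Tuple

Pair = Tuple[str, str]


def related_pairs(citations: Iterable[Pair]) -> Tuple[Set[Pair], Set[Pair], Set[Pair]]:
    """Derive direct-citation, co-citation, and bibliographic-coupling pairs.

    `citations` holds (citing_paper, cited_paper) edges. Pairs are returned
    as alphabetically sorted tuples so each unordered pair appears once.
    """
    cites = defaultdict(set)     # paper -> papers it cites
    cited_by = defaultdict(set)  # paper -> papers that cite it
    direct: Set[Pair] = set()
    for citing, cited in citations:
        cites[citing].add(cited)
        cited_by[cited].add(citing)
        direct.add(tuple(sorted((citing, cited))))

    # Co-citation: two papers appear together in one paper's reference list.
    co_citation: Set[Pair] = set()
    for references in cites.values():
        co_citation.update(combinations(sorted(references), 2))

    # Bibliographic coupling: two papers share at least one reference.
    coupling: Set[Pair] = set()
    for citers in cited_by.values():
        coupling.update(combinations(sorted(citers), 2))

    return direct, co_citation, coupling


# Hypothetical toy graph: A and B both cite C and D; E cites A.
edges = [("A", "C"), ("A", "D"), ("B", "C"), ("B", "D"), ("E", "A")]
direct, co_citation, coupling = related_pairs(edges)
print(sorted(co_citation), sorted(coupling))
```

Co-citation and coupling pairs connect papers that never cite each other directly, which is one plausible reason to mix them with citation pairs when fine-tuning a similarity model.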

dataset, openmsd, proceedings, (14 more...)

2309.10539

Country:

Asia > China > Hong Kong (0.04)
North America > United States > Washington > King County > Seattle (0.04)
Asia > Singapore (0.04)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.46)

The New Yorker Sep-18-2023, 15:00:00 GMT

Best Healthy Search-Engine-Optimized Steamed-Carrots Recipe for Healthy Steamed Carrots

My name is Kelly, and I'm a part-time food blogger, part-time Instagram influencer, part-time spin instructor, and one-quarter-time mom to my two ungrateful stepsons, Richard and Robert. But do you know what I am full-time? A lover of healthy steamed carrots and gorgeous search-engine-optimized food content. Before I give you my absolute favorite recipe for the best healthy steamed carrots that the whole biological family will love, here's a mini-memoir to whet your appetite and improve my Google rankings: Looking for a delicious and healthy recipe for the whole healthy family that loves steamed carrots? This delicious and healthy recipe for healthy and delicious steamed carrots was passed down from my tyrannical great-great-grandmother Ida to my thieving great-grandmother Ethel to my conniving grandmother Sharon to my perpetually disappointed mother, Elaine.

best healthy search-engine-optimized steamed-carrot recipe, recipe, richard and robert, (4 more...)

Technology:

Information Technology > Communications > Social Media (0.79)
Information Technology > Information Management > Search (0.63)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.63)