Baumel, Tal
In-Context Learning on a Budget: A Case Study in Named Entity Recognition
Berger, Uri, Baumel, Tal, Stanovsky, Gabriel
Few-shot in-context learning (ICL) typically assumes access to large annotated training sets. However, in many real-world scenarios, such as domain adaptation, there is only a limited budget to annotate a small number of samples, with the goal of maximizing downstream performance. We study various methods for selecting samples to annotate within a predefined budget, focusing specifically on the named entity recognition (NER) task, which has real-world applications, is expensive to annotate, and is relatively understudied in ICL setups. Across different models and datasets, we find that a relatively small pool of annotated samples can achieve results comparable to using the entire training set. Moreover, we discover that random selection of samples for annotation yields surprisingly good performance. Finally, we observe that a diverse annotation pool is correlated with improved performance. We hope that future work adopts our realistic paradigm, which takes the annotation budget into account.
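To make the budgeted-selection setup concrete, here is a minimal sketch of the random-selection baseline the abstract highlights: draw a fixed-size annotation pool at random and assemble a few-shot NER prompt from the annotated examples. The function names, prompt format, and toy data are illustrative assumptions, not the paper's actual implementation.

```python
import random

def select_annotation_pool(unlabeled_sentences, budget, seed=0):
    """Randomly pick `budget` sentences to send for human annotation.

    Random selection is reported as a surprisingly strong baseline;
    other strategies (e.g. diversity-based) could be plugged in here.
    """
    rng = random.Random(seed)
    return rng.sample(unlabeled_sentences, k=min(budget, len(unlabeled_sentences)))

def build_ner_prompt(annotated_examples, query_sentence):
    """Assemble a few-shot NER prompt from the annotated pool."""
    parts = []
    for sentence, entities in annotated_examples:
        tagged = ", ".join(f"{text} ({label})" for text, label in entities)
        parts.append(f"Sentence: {sentence}\nEntities: {tagged}")
    parts.append(f"Sentence: {query_sentence}\nEntities:")
    return "\n\n".join(parts)

# Toy usage: in practice the annotations come from human labelers.
pool = select_annotation_pool(["Alice met Bob in Paris.", "ACME hired Dana."], budget=1)
annotated = [(pool[0], [("Paris", "LOC")])]
print(build_ner_prompt(annotated, "John works at a hospital in London."))
```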
Federated Multilingual Models for Medical Transcript Analysis
Manoel, Andre, Garcia, Mirian Hipolito, Baumel, Tal, Su, Shize, Chen, Jialei, Miller, Dan, Karmon, Danny, Sim, Robert, Dimitriadis, Dimitrios
Federated Learning (FL) is a novel machine learning approach that allows the model trainer to access more data samples by training the model across multiple decentralized data sources while data access constraints are in place. Models trained this way can achieve significantly higher performance than can be obtained by training on a single data source. As part of FL's promise, none of the training data is ever transmitted to any central location, ensuring that sensitive data remains local and private. These characteristics make FL well suited for large-scale applications in healthcare, where a variety of compliance constraints restrict how data may be handled, processed, and stored. Despite the apparent benefits of federated learning, the heterogeneity of the local data distributions poses significant challenges, and these challenges are even more pronounced with multilingual data providers. In this paper we present a federated learning system for training a large-scale multilingual model suitable for fine-tuning on downstream tasks such as medical entity tagging. Our work represents one of the first such production-scale systems, capable of training across multiple highly heterogeneous data providers and achieving levels of accuracy that could not otherwise be achieved by central training with public data. Finally, we show that the global model's performance can be further improved by a training step performed locally.
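The abstract does not spell out the aggregation procedure, so the following is only a minimal FedAvg-style sketch in PyTorch of the general idea: each client trains locally, only weights are sent back and averaged, and a final local step can adapt the averaged model. The function name, hyperparameters, and classification objective are assumptions, not the paper's production system.

```python
import copy
import torch

def federated_averaging(global_model, client_loaders, rounds=10, local_epochs=1, lr=1e-3):
    """FedAvg-style loop: local training on each client, then weight averaging.

    Raw records never leave the clients; only model weights are aggregated.
    """
    for _ in range(rounds):
        client_states = []
        for loader in client_loaders:
            local_model = copy.deepcopy(global_model)
            optimizer = torch.optim.SGD(local_model.parameters(), lr=lr)
            local_model.train()
            for _ in range(local_epochs):
                for x, y in loader:
                    optimizer.zero_grad()
                    loss = torch.nn.functional.cross_entropy(local_model(x), y)
                    loss.backward()
                    optimizer.step()
            client_states.append(local_model.state_dict())
        # Average the client weights parameter by parameter.
        avg_state = {
            k: torch.stack([s[k].float() for s in client_states]).mean(dim=0)
            for k in client_states[0]
        }
        global_model.load_state_dict(avg_state)
    return global_model
```

An extra round of local fine-tuning on each client after aggregation would mirror the paper's observation that a locally performed training step further improves the global model.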
Multi-Label Classification of Patient Notes: Case Study on ICD Code Assignment
Baumel, Tal (Ben-Gurion University) | Nassour-Kassis, Jumana (Ben-Gurion University) | Cohen, Raphael (Chorus.ai) | Elhadad, Michael (Ben-Gurion University) | Elhadad, Noémie (Columbia University)
The automatic coding of clinical documentation according to diagnosis codes is a useful task in the Electronic Health Record, but a challenging one due to the large number of codes and the length of patient notes. We investigate four models for assigning multiple ICD codes to discharge summaries, and experiment with data from the MIMIC II and III clinical datasets. We present the Hierarchical Attention-bidirectional Gated Recurrent Unit (HA-GRU), a hierarchical approach that tags a document by identifying the sentences relevant for each label. HA-GRU achieves state-of-the-art results. Furthermore, the learned sentence-level attention layer highlights the model's decision process, allows for easier error analysis, and suggests future directions for improvement.
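As a rough illustration of the hierarchical, per-label sentence attention idea, here is a minimal PyTorch sketch: a word-level bidirectional GRU encodes each sentence, a sentence-level bidirectional GRU contextualizes the sentence vectors, and each label attends over sentences so its prediction can be traced to specific evidence. The class name, dimensions, and attention formulation are assumptions, not the paper's exact architecture.

```python
import torch
import torch.nn as nn

class PerLabelSentenceAttention(nn.Module):
    """Minimal hierarchical multi-label tagger in the spirit of HA-GRU."""

    def __init__(self, vocab_size, num_labels, emb_dim=100, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim, padding_idx=0)
        self.word_gru = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.sent_gru = nn.GRU(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.label_queries = nn.Parameter(torch.randn(num_labels, 2 * hidden))
        self.out = nn.Linear(2 * hidden, 1)

    def forward(self, docs):
        # docs: (batch, n_sents, n_words) of token ids
        b, n_sents, n_words = docs.shape
        words = self.embed(docs.view(b * n_sents, n_words))
        _, h = self.word_gru(words)                        # (2, b*n_sents, hidden)
        sent_vecs = h.transpose(0, 1).reshape(b, n_sents, -1)
        sent_ctx, _ = self.sent_gru(sent_vecs)             # (b, n_sents, 2*hidden)
        # Per-label attention over sentences; the weights expose the evidence used.
        scores = torch.einsum("bsh,lh->bls", sent_ctx, self.label_queries)
        attn = scores.softmax(dim=-1)                      # (b, num_labels, n_sents)
        label_repr = torch.einsum("bls,bsh->blh", attn, sent_ctx)
        return self.out(label_repr).squeeze(-1), attn      # per-label logits + attention
```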
Topic Concentration in Query Focused Summarization Datasets
Baumel, Tal (Ben-Gurion University) | Cohen, Raphael (Ben-Gurion University) | Elhadad, Michael (Ben-Gurion University)
Query-Focused Summarization (QFS) summarizes a document cluster in response to a specific input query. QFS algorithms must combine query relevance assessment, central content identification, and redundancy avoidance. Frustratingly, state-of-the-art algorithms designed for QFS do not significantly improve upon generic summarization methods, which ignore query relevance, when evaluated on traditional QFS datasets. We hypothesize that this lack of success stems from the nature of the datasets. We define a task-based method to quantify topic concentration in datasets, i.e., the ratio of sentences within the dataset that are relevant to the query, and observe that the DUC 2005, 2006, and 2007 datasets suffer from very high topic concentration. We introduce TD-QFS, a new QFS dataset with controlled levels of topic concentration. We compare competitive baseline algorithms on TD-QFS and report strong improvement in ROUGE performance for algorithms that properly model query relevance as opposed to generic summarizers. We further present three new and simple QFS algorithms, RelSum, ThresholdSum, and TFIDF-KLSum, that outperform state-of-the-art QFS algorithms on the TD-QFS dataset by a large margin.
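To illustrate the concentration measure, here is a small sketch that computes the fraction of sentences relevant to a query. Note that the paper defines a task-based measure, whereas this sketch uses TF-IDF cosine similarity with an arbitrary threshold as a stand-in; the function name and threshold are assumptions.

```python
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def topic_concentration(sentences, query, threshold=0.1):
    """Estimate topic concentration of a document cluster for a given query.

    Concentration is the fraction of sentences deemed relevant to the query;
    relevance is approximated here by TF-IDF cosine similarity above a threshold.
    """
    vectorizer = TfidfVectorizer(stop_words="english")
    matrix = vectorizer.fit_transform(sentences + [query])
    sims = cosine_similarity(matrix[:-1], matrix[-1]).ravel()
    return float(np.mean(sims >= threshold))

# Example: a cluster dominated by query-relevant sentences scores close to 1.0.
sents = ["The flu vaccine reduces infection risk.",
         "Vaccination campaigns target the flu each winter.",
         "The stock market fell on Tuesday."]
print(topic_concentration(sents, "flu vaccine effectiveness"))
```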