Dagan, Ido
Bi-Fact: A Bidirectional Factorization-based Evaluation of Intent Extraction from UI Trajectories
Caduri, Sapir, Efros, Anatoly, Kahlon, Noam, Cohen, Danielle, Halpern, Yoni, Dagan, Ido
Evaluating intent extraction from GUIs demands accurate, fine-grained metrics. This paper introduces Bi-Fact, a novel method that decomposes intents into atomic facts and performs bidirectional comparisons to assess precision and recall. Experiments demonstrate Bi-Fact's superior correlation with human judgments compared to existing metrics, establishing a more robust evaluation framework for UI-driven intent understanding.
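As a rough illustration of the bidirectional, fact-level scoring idea (a minimal sketch, not the paper's implementation), the Python snippet below assumes two hypothetical helpers: extract_facts, standing in for the decomposition of an intent into atomic facts, and is_supported, standing in for the judgment of whether a fact follows from the other side's facts.

from typing import Callable, List


def bi_fact_score(
    predicted_intent: str,
    gold_intent: str,
    extract_facts: Callable[[str], List[str]],       # hypothetical: decompose an intent into atomic facts
    is_supported: Callable[[str, List[str]], bool],  # hypothetical: is this fact implied by the other side's facts?
) -> dict:
    """Bidirectional fact-level comparison: precision over predicted facts, recall over gold facts."""
    pred_facts = extract_facts(predicted_intent)
    gold_facts = extract_facts(gold_intent)
    precision = sum(is_supported(f, gold_facts) for f in pred_facts) / len(pred_facts) if pred_facts else 0.0
    recall = sum(is_supported(f, pred_facts) for f in gold_facts) / len(gold_facts) if gold_facts else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Toy stand-ins: split on "and", support = exact match (a real system would use an LLM or NLI model).
    def toy_extract(intent: str) -> List[str]:
        return [p.strip() for p in intent.lower().split(" and ") if p.strip()]

    def toy_support(fact: str, other_facts: List[str]) -> bool:
        return fact in other_facts

    print(bi_fact_score(
        "book a flight to Paris and choose a window seat",
        "book a flight to Paris and pay with miles",
        toy_extract,
        toy_support,
    ))

Keeping the two directions separate is what yields precision (over the predicted intent's facts) and recall (over the gold intent's facts), mirroring the bidirectional comparison described above.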
Beyond Pairwise: Global Zero-shot Temporal Graph Generation
Eirew, Alon, Bar, Kfir, Dagan, Ido
Temporal relation extraction (TRE) is a fundamental task in natural language processing (NLP) that involves identifying the temporal relationships between events in a document. Despite the advances in large language models (LLMs), their application to TRE remains limited. Most existing approaches rely on pairwise classification, in which event pairs are considered individually, leading to computational inefficiency and a lack of global consistency in the resulting temporal graph. In this work, we propose a novel zero-shot method for TRE that generates a document's complete temporal graph at once, then applies transitive constraints optimization to refine predictions and enforce temporal consistency across relations. Additionally, we introduce OmniTemp, a new dataset with complete annotations for all pairs of targeted events within a document. Through experiments and analyses, we demonstrate that our method significantly outperforms existing zero-shot approaches while achieving competitive performance with supervised models.
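The role of the transitive-constraint step can be illustrated with a toy consistency check (a simplified sketch; the method itself formulates this as an optimization over the full predicted graph). Here, the transitive closure of the pairs labeled "before" is computed, and any pair labeled "after" that the closure contradicts is flagged.

from itertools import product

# Hypothetical model output: one temporal label per targeted event pair.
predicted = {
    ("meeting", "report"): "before",
    ("report", "release"): "before",
    ("meeting", "release"): "after",  # contradicts the two edges above
}


def before_closure(pairs):
    """Transitive closure over the pairs labeled 'before' (Floyd-Warshall style)."""
    events = sorted({event for pair in pairs for event in pair})
    before = {(a, b) for (a, b), label in pairs.items() if label == "before"}
    changed = True
    while changed:
        changed = False
        for a, b, c in product(events, repeat=3):
            if (a, b) in before and (b, c) in before and (a, c) not in before:
                before.add((a, c))
                changed = True
    return before


closure = before_closure(predicted)
for (a, b), label in predicted.items():
    if label == "after" and (a, b) in closure:
        print(f"inconsistent: '{a}' predicted AFTER '{b}', but transitivity implies BEFORE")

A full method would repair such contradictions by jointly re-optimizing the labels rather than only flagging them.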
EventFull: Complete and Consistent Event Relation Annotation
Eirew, Alon, Nachshoni, Eviatar, Slobodkin, Aviv, Dagan, Ido
Identifying the semantic relations between events mentioned in a text, notably temporal, causal and coreference relations, has been a fundamental goal in NLP. Substantial efforts have been devoted to developing various datasets that capture some or all of these relations (O'Gorman et al., 2016; Hong et al., 2016; Wang et al., 2022). These datasets were then leveraged to develop and to evaluate corresponding models for detecting event-event relations (Hu et al., 2023; Guan et al., 2024). The output of such models has been utilized in a range of downstream applications, with recent examples including event forecasting (Ma et al., 2023), misinformation detection (Lei and Huang, 2023), and treatment timeline extraction (Yao et al., 2024), among others. MEANTIME (Minard et al., 2016) and EventStoryLine (Caselli and Vossen, 2017) restrict event pairs to a span of two consecutive sentences. This limitation inherently prevents testing and training models on longer-range relations. Other datasets, such as TimeBank (Pustejovsky et al., 2003b) and MAVEN-ERE (Wang et al., 2022), did not publish a systematic annotation execution protocol that guarantees actual complete annotation, and were subsequently criticized for being incomplete in their relation annotation (Pustejovsky and Stubbs, 2011; Rogers et al., 2024). Further, some researchers aimed to avoid the cost of manual annotation altogether and employed fully- or partly-automatic dataset creation methods (Mirza et al.,
QAPyramid: Fine-grained Evaluation of Content Selection for Text Summarization
Zhang, Shiyue, Wan, David, Cattan, Arie, Klein, Ayal, Dagan, Ido, Bansal, Mohit
How to properly conduct human evaluations for text summarization is a longstanding challenge. The Pyramid human evaluation protocol, which assesses content selection by breaking the reference summary into sub-units and verifying their presence in the system summary, has been widely adopted. However, it suffers from a lack of systematicity in the definition and granularity of the sub-units. We address these problems by proposing QAPyramid, which decomposes each reference summary into finer-grained question-answer (QA) pairs according to the QA-SRL framework. We collect QA-SRL annotations for reference summaries from CNN/DM and evaluate 10 summarization systems, resulting in 8.9K QA-level annotations. We show that, compared to Pyramid, QAPyramid provides more systematic and fine-grained content selection evaluation while maintaining high inter-annotator agreement without needing expert annotations. Furthermore, we propose metrics that automate the evaluation pipeline and achieve higher correlations with QAPyramid than other widely adopted metrics, allowing future work to accurately and efficiently benchmark summarization systems.
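A minimal sketch of the kind of content-selection score such a protocol produces (an assumed form, not the released evaluation code): each reference summary contributes a set of QA-SRL question-answer pairs, and a system summary is scored by the fraction of those pairs it covers, where is_present stands in for a human judgment or an automatic presence classifier.

from typing import Callable, List, Tuple

QAPair = Tuple[str, str]  # (question, answer), e.g. ("Who resigned?", "the CEO")


def qapyramid_score(
    system_summary: str,
    reference_qas: List[QAPair],
    is_present: Callable[[QAPair, str], bool],  # hypothetical presence judge (annotator or trained model)
) -> float:
    """Fraction of the reference summary's QA pairs whose content appears in the system summary."""
    if not reference_qas:
        return 0.0
    covered = sum(is_present(qa, system_summary) for qa in reference_qas)
    return covered / len(reference_qas)


# Toy usage with a naive string-match judge (a real judge would be a human annotator or learned model).
reference_qas = [("Who resigned?", "the CEO"), ("When did someone resign?", "on Tuesday")]


def naive_judge(qa: QAPair, summary: str) -> bool:
    return qa[1].lower() in summary.lower()


print(qapyramid_score("The CEO stepped down this week.", reference_qas, naive_judge))  # prints 0.5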
Localizing Factual Inconsistencies in Attributable Text Generation
Cattan, Arie, Roit, Paul, Zhang, Shiyue, Wan, David, Aharoni, Roee, Szpektor, Idan, Bansal, Mohit, Dagan, Ido
There has been an increasing interest in detecting hallucinations in model-generated texts, both manually and automatically, at varying levels of granularity. However, most existing methods fail to precisely pinpoint the errors. In this work, we introduce QASemConsistency, a new formalism for localizing factual inconsistencies in attributable text generation at a fine-grained level. Drawing inspiration from Neo-Davidsonian formal semantics, we propose decomposing the generated text into minimal predicate-argument level propositions, expressed as simple question-answer (QA) pairs, and assessing whether each individual QA pair is supported by a trusted reference text. As each QA pair corresponds to a single semantic relation between a predicate and an argument, QASemConsistency effectively localizes the unsupported information. We first demonstrate the effectiveness of the QASemConsistency methodology for human annotation, by collecting crowdsourced annotations of granular consistency errors, while achieving a substantial inter-annotator agreement ($\kappa > 0.7$). Then, we implement several methods for automatically detecting localized factual inconsistencies, with both supervised entailment models and open-source LLMs.
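A minimal sketch of the verification step under this formalism, assuming the generated text has already been decomposed into predicate-argument QA pairs; decompose_to_qas and entails are hypothetical stand-ins for a QASem-style decomposer and an NLI model (or an LLM judge).

from typing import Callable, List, Tuple

QAPair = Tuple[str, str]  # (question, answer), e.g. ("Who announced something?", "the regulator")


def unsupported_qas(
    generated_text: str,
    reference_text: str,
    decompose_to_qas: Callable[[str], List[QAPair]],  # hypothetical QASem-style decomposition into QA pairs
    entails: Callable[[str, str], bool],              # hypothetical NLI check: does the premise entail the hypothesis?
) -> List[QAPair]:
    """Return the QA pairs from the generated text that the reference text does not support."""
    flagged = []
    for question, answer in decompose_to_qas(generated_text):
        # Verbalize the QA pair as a simple standalone statement and test it against the reference.
        hypothesis = f"{question} {answer}"
        if not entails(reference_text, hypothesis):
            flagged.append((question, answer))
    return flagged

Because each flagged QA pair corresponds to a single predicate-argument relation, the output localizes the inconsistency rather than marking an entire sentence as unsupported.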
Explicating the Implicit: Argument Detection Beyond Sentence Boundaries
Roit, Paul, Slobodkin, Aviv, Hirsch, Eran, Cattan, Arie, Klein, Ayal, Pyatkin, Valentina, Dagan, Ido
Detecting semantic arguments of a predicate word has been conventionally modeled as a sentence-level task. The typical reader, however, perfectly interprets predicate-argument relations in a much wider context than just the sentence where the predicate was evoked. In this work, we reformulate the problem of argument detection through textual entailment to capture semantic relations across sentence boundaries. We propose a method that tests whether some semantic relation can be inferred from a full passage by first encoding it into a simple and standalone proposition and then testing for entailment against the passage. Our method does not require direct supervision, which is generally absent due to dataset scarcity, but instead builds on existing NLI and sentence-level SRL resources. Such a method can potentially explicate pragmatically understood relations into a set of explicit sentences. We demonstrate it on a recent document-level benchmark, outperforming some supervised methods and contemporary language models.
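A schematic sketch of the entailment-based test described above (hypothetical helpers, not the paper's code): a candidate predicate-argument relation is verbalized as a standalone proposition and checked against the full passage, rather than against the predicate's sentence alone.

from typing import Callable


def argument_holds(
    passage: str,
    role_question: str,       # e.g. "Who was arrested?"
    candidate_argument: str,  # e.g. "the suspect"
    entails: Callable[[str, str], bool],  # hypothetical wrapper around an off-the-shelf NLI model
) -> bool:
    """Test a candidate argument by entailment against the full passage, not a single sentence."""
    # Encode the relation as a simple standalone proposition, e.g. "the suspect was arrested."
    proposition = role_question.replace("Who", candidate_argument).replace("What", candidate_argument).rstrip("?") + "."
    return entails(passage, proposition)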
Attribute First, then Generate: Locally-attributable Grounded Text Generation
Slobodkin, Aviv, Hirsch, Eran, Cattan, Arie, Schuster, Tal, Dagan, Ido
Recent efforts to address hallucinations in Large Language Models (LLMs) have focused on attributed text generation, which supplements generated texts with citations of supporting sources for post-generation fact-checking and corrections. Yet, these citations often point to entire documents or paragraphs, burdening users with extensive verification work. In this paper, we introduce a locally-attributable text generation approach, prioritizing concise attributions. Our method, named "Attribute First, then Generate", breaks down the conventional end-to-end generation process into three intuitive steps: content selection, sentence planning, and sequential sentence generation. By initially identifying relevant source segments ("select first") and then conditioning the generation process on them ("then generate"), we ensure these segments also act as the output's fine-grained attributions ("select" becomes "attribute"). Tested on Multi-document Summarization and Long-form Question-answering, our method not only yields more concise citations than the baselines but also maintains - and in some cases enhances - both generation quality and attribution accuracy. Furthermore, it significantly reduces the time required for fact verification by human assessors.
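A high-level sketch of the three-step decomposition (hypothetical function signatures; the actual system implements each step with prompted or finetuned LLMs). The point is that every output sentence carries, as its attribution, exactly the source segments it was conditioned on.

from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AttributedSentence:
    text: str
    attributions: List[str]  # the source segments this sentence was conditioned on


def attribute_first_then_generate(
    source_docs: List[str],
    select_content: Callable[[List[str]], List[str]],        # step 1: pick salient source segments
    plan_sentences: Callable[[List[str]], List[List[str]]],  # step 2: group segments, one group per output sentence
    generate_sentence: Callable[[List[str], str], str],      # step 3: generate a sentence from its group + prior output
) -> List[AttributedSentence]:
    """Generate the output sentence by sentence, conditioning each one only on its planned segments."""
    segments = select_content(source_docs)
    plan = plan_sentences(segments)
    output: List[AttributedSentence] = []
    prefix = ""
    for group in plan:
        sentence = generate_sentence(group, prefix)
        output.append(AttributedSentence(text=sentence, attributions=group))
        prefix = (prefix + " " + sentence).strip()
    return output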
Identifying User Goals from UI Trajectories
Berkovitch, Omri, Caduri, Sapir, Kahlon, Noam, Efros, Anatoly, Caciularu, Avi, Dagan, Ido
Autonomous agents that interact with graphical user interfaces (GUIs) hold significant potential for enhancing user experiences. To further improve these experiences, agents need to be personalized and proactive. By effectively comprehending user intentions through their actions and interactions with GUIs, agents will be better positioned to achieve these goals. This paper introduces the task of goal identification from observed UI trajectories, aiming to infer the user's intended task based on their GUI interactions. We propose a novel evaluation metric to assess whether two task descriptions are paraphrases within a specific UI environment. Leveraging the inverse relation with the UI automation task, we utilized the Android-In-The-Wild and Mind2Web datasets for our experiments. Using our metric and these datasets, we conducted several experiments comparing the performance of humans and state-of-the-art models, specifically GPT-4 and Gemini-1.5 Pro. Our results show that Gemini performs better than GPT but still underperforms compared to humans, indicating significant room for improvement.
Efficient Data Generation for Source-grounded Information-seeking Dialogs: A Use Case for Meeting Transcripts
Golany, Lotem, Galgani, Filippo, Mamo, Maya, Parasol, Nimrod, Vandsburger, Omer, Bar, Nadav, Dagan, Ido
Automating data generation with Large Language Models (LLMs) has become increasingly popular. In this work, we investigate the feasibility and effectiveness of LLM-based data generation in the challenging setting of source-grounded information-seeking dialogs, with response attribution, over long documents. Our source texts consist of long and noisy meeting transcripts, adding to the task complexity. Since automating attribution remains difficult, we propose a semi-automatic approach: dialog queries and responses are generated with LLMs, followed by human verification and identification of attribution spans. Using this approach, we created MISeD (Meeting Information Seeking Dialogs), a dataset of information-seeking dialogs focused on meeting transcripts. Models finetuned with MISeD demonstrate superior performance compared to off-the-shelf models, even those of larger size. Finetuning on MISeD yields response generation quality comparable to finetuning on fully manual data, while improving attribution quality and reducing time and effort.
The Power of Summary-Source Alignments
Ernst, Ori, Shapira, Ori, Slobodkin, Aviv, Adar, Sharon, Bansal, Mohit, Goldberger, Jacob, Levy, Ran, Dagan, Ido
Multi-document summarization (MDS) is a challenging task, often decomposed into subtasks of salience and redundancy detection, followed by text generation. In this context, alignment of corresponding sentences between a reference summary and its source documents has been leveraged to generate training data for some of the component tasks. Yet, this enabling alignment step has usually been applied heuristically, at the sentence level, and only for a limited number of subtasks. In this paper, we propose extending the summary-source alignment framework by (1) applying it at the more fine-grained proposition span level, (2) annotating alignment manually in a multi-document setup, and (3) revealing the great potential of summary-source alignments to yield several datasets for at least six different tasks. Specifically, for each of the tasks, we release a manually annotated test set that was derived automatically from the alignment annotation. We also release development and training sets derived in the same way, but from automatically produced alignments. Using the datasets, each task is demonstrated with baseline models and corresponding evaluation metrics to spur future research on this broad challenge.
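As a small illustration of how proposition-level alignments can be recast as derived-task data (hypothetical data structures, not the released format), the sketch below builds a salience-detection dataset: a source proposition is labeled salient if it is aligned to some summary proposition.

from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Alignment:
    summary_span: str    # proposition span in the reference summary
    source_doc_id: int
    source_span: str     # aligned proposition span in a source document


def salience_examples(
    source_props: List[Tuple[int, str]],  # (doc_id, proposition span) for every source proposition
    alignments: List[Alignment],
) -> List[dict]:
    """Derive salience-detection data: a source proposition is salient if it aligns to the summary."""
    aligned = {(a.source_doc_id, a.source_span) for a in alignments}
    return [
        {"doc_id": doc_id, "proposition": span, "salient": (doc_id, span) in aligned}
        for doc_id, span in source_props
    ]


# Toy usage with made-up spans.
props = [
    (0, "the company cut 5% of staff"),
    (0, "its offices will stay open"),
    (1, "the layoffs were announced Monday"),
]
aligns = [Alignment("the firm laid off 5% of employees", 0, "the company cut 5% of staff")]
print(salience_examples(props, aligns))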