Martin, Alexander
Cross-Document Event-Keyed Summarization
Walden, William, Kuchmiichuk, Pavlo, Martin, Alexander, Jin, Chihsheng, Cao, Angela, Sun, Claire, Allen, Curisia, White, Aaron Steven
Event-keyed summarization (EKS) requires summarizing a specific event described in a document, given the document text and an event representation extracted from it. In this work, we extend EKS to the cross-document setting (CDEKS), in which summaries must synthesize information from accounts of the same event as given by multiple sources. We introduce SEAMUS (Summaries of Events Across Multiple Sources), a high-quality dataset for CDEKS based on an expert reannotation of the FAMUS dataset for cross-document argument extraction. We present a suite of baselines on SEAMUS -- covering both smaller fine-tuned models and zero- and few-shot prompted LLMs -- along with detailed ablations and a human evaluation study, showing SEAMUS to be a valuable benchmark for this new task.
MultiVENT 2.0: A Massive Multilingual Benchmark for Event-Centric Video Retrieval
Kriz, Reno, Sanders, Kate, Etter, David, Murray, Kenton, Carpenter, Cameron, Van Ochten, Kelly, Recknor, Hannah, Guallar-Blasco, Jimena, Martin, Alexander, Colaianni, Ronald, King, Nolan, Yang, Eugene, Van Durme, Benjamin
Efficiently retrieving and synthesizing information from large-scale multimodal collections has become a critical challenge. However, existing video retrieval datasets suffer from scope limitations, primarily focusing on matching descriptive but vague queries with small collections of professionally edited, English-centric videos. To address this gap, we introduce $\textbf{MultiVENT 2.0}$, a large-scale, multilingual event-centric video retrieval benchmark featuring a collection of more than 218,000 news videos and 3,906 queries targeting specific world events. These queries specifically target information found in the visual content, audio, embedded text, and text metadata of the videos, requiring systems to leverage all of these sources to succeed at the task. Preliminary results show that state-of-the-art vision-language models struggle significantly with this task, and while alternative approaches show promise, they remain insufficient to adequately address the problem. These findings underscore the need for more robust multimodal retrieval systems, as effective video retrieval is a crucial step towards multimodal content understanding and generation tasks.
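To make the retrieval setup concrete, below is a minimal, self-contained sketch of the kind of multi-channel fusion the task calls for: scoring a query against several evidence channels of a video (visual captions, ASR transcript, OCR text, metadata) and ranking by a fused score. The token-overlap scorer, channel names, and weights are illustrative assumptions, not the paper's baselines; real systems would use learned multimodal encoders.

```python
# Hedged sketch: late fusion of per-channel relevance scores for
# event-centric video retrieval. Channel names, weights, and the
# token-overlap scorer are illustrative placeholders only.
from collections import Counter

def overlap_score(query: str, text: str) -> float:
    """Fraction of query tokens that also appear in the channel text."""
    q, t = Counter(query.lower().split()), Counter(text.lower().split())
    return sum((q & t).values()) / max(1, len(query.split()))

def fused_score(query: str, channels: dict, weights: dict) -> float:
    """Weighted sum of per-channel scores; 'channels' maps name -> text."""
    return sum(w * overlap_score(query, channels.get(name, ""))
               for name, w in weights.items())

videos = {
    "vid_001": {"visual": "crowd gathers near a collapsed bridge",
                "asr": "officials confirm the bridge collapse injured twelve",
                "ocr": "BREAKING bridge collapse",
                "meta": "local news report"},
    "vid_002": {"visual": "press conference podium",
                "asr": "weather update for the weekend",
                "ocr": "",
                "meta": "daily briefing"},
}
weights = {"visual": 0.3, "asr": 0.3, "ocr": 0.2, "meta": 0.2}
query = "bridge collapse injuries"

ranking = sorted(videos, key=lambda v: fused_score(query, videos[v], weights),
                 reverse=True)
print(ranking)  # expect vid_001 ranked first for this toy query
```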
Grounding Partially-Defined Events in Multimodal Data
Sanders, Kate, Kriz, Reno, Etter, David, Recknor, Hannah, Martin, Alexander, Carpenter, Cameron, Lin, Jingyang, Van Durme, Benjamin
How are we able to learn about complex current events just from short snippets of video? While natural language enables straightforward ways to represent under-specified, partially observable events, visual data does not facilitate analogous methods and, consequently, introduces unique challenges in event understanding. With the growing prevalence of vision-capable AI agents, these systems must be able to model events from collections of unstructured video data. To tackle robust event modeling in multimodal settings, we introduce a multimodal formulation for partially-defined events and cast the extraction of these events as a three-stage span retrieval task. We propose a corresponding benchmark for this task, MultiVENT-G, which consists of 14.5 hours of densely annotated current event videos and 1,168 text documents, containing 22.8K labeled event-centric entities. We propose a collection of LLM-driven approaches to the task of multimodal event analysis and evaluate them on MultiVENT-G. Results illustrate the challenges that abstract event understanding poses and demonstrate promise in event-centric video-language systems.
Hi5: 2D Hand Pose Estimation with Zero Human Annotation
Hasan, Masum, Ozel, Cengiz, Long, Nina, Martin, Alexander, Potter, Samuel, Adnan, Tariq, Lee, Sangwu, Zadeh, Amir, Hoque, Ehsan
We propose a new large synthetic hand pose estimation dataset, Hi5, and a novel, inexpensive method for collecting high-quality synthetic data that requires no human annotation or validation. Leveraging recent advancements in computer graphics, high-fidelity 3D hand models with diverse genders and skin colors, and dynamic environments and camera movements, our data synthesis pipeline allows precise control over data diversity and representation, ensuring robust and fair model training. Using a single consumer PC, we generate a dataset of 583,000 images with accurate pose annotations that closely represents real-world variability. Pose estimation models trained with Hi5 perform competitively on real-hand benchmarks while surpassing models trained with real data when tested on occlusions and perturbations. Our experiments show promising results for synthetic data as a viable solution for data representation problems in real datasets. Overall, this paper provides a promising new approach to synthetic data creation and annotation that can reduce costs and increase the diversity and quality of data for hand pose estimation.
Event-Keyed Summarization
Gantt, William, Martin, Alexander, Kuchmiichuk, Pavlo, White, Aaron Steven
We introduce event-keyed summarization (EKS), a novel task that marries traditional summarization and document-level event extraction, with the goal of generating a contextualized summary for a specific event, given a document and an extracted event structure. We introduce a dataset for this task, MUCSUM, consisting of summaries of all events in the classic MUC-4 dataset, along with a set of baselines that comprises both pretrained LM standards from the summarization literature and larger frontier models. We show that ablations that reduce EKS to traditional summarization or structure-to-text yield inferior summaries of target events and that MUCSUM is a robust benchmark for this task. Lastly, we conduct a human evaluation of both reference and model summaries, and provide a detailed analysis of the results.
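To illustrate the task interface, the sketch below renders a document paired with a single extracted event structure into a summarization prompt. The field names, role labels, and prompt wording are hypothetical illustrations, not the MUCSUM schema or the prompts used in the paper.

```python
# Hedged sketch of an event-keyed summarization (EKS) instance and how it
# might be rendered as a prompt. All names below are hypothetical.
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class EventStructure:
    trigger: str                       # span evoking the event
    event_type: str                    # ontology label for the event
    arguments: dict = field(default_factory=dict)  # role -> list of spans

@dataclass
class EKSInstance:
    document: str                      # full document text
    event: EventStructure              # the event the summary must focus on

def build_prompt(instance: EKSInstance) -> str:
    """Render one document plus one event structure as a prompt asking a
    model to summarize only that event (illustrative wording)."""
    args = "; ".join(f"{role}: {', '.join(spans)}"
                     for role, spans in instance.event.arguments.items())
    return (
        "Summarize only the event below as it is described in the document.\n"
        f"Event type: {instance.event.event_type}\n"
        f"Trigger: {instance.event.trigger}\n"
        f"Arguments: {args}\n\n"
        f"Document:\n{instance.document}\n\nSummary:"
    )

if __name__ == "__main__":
    ev = EventStructure(trigger="bombing", event_type="Attack",
                        arguments={"Perpetrator": ["unidentified assailants"],
                                   "Target": ["the embassy"]})
    print(build_prompt(EKSInstance(document="(document text here)", event=ev)))
```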
FAMuS: Frames Across Multiple Sources
Vashishtha, Siddharth, Martin, Alexander, Gantt, William, Van Durme, Benjamin, White, Aaron Steven
Understanding event descriptions is a central aspect of language processing, but current approaches focus overwhelmingly on single sentences or documents. Aggregating information about an event \emph{across documents} can offer a much richer understanding. To this end, we present FAMuS, a new corpus of Wikipedia passages that \emph{report} on some event, paired with underlying, genre-diverse (non-Wikipedia) \emph{source} articles for the same event. Events and (cross-sentence) arguments in both report and source are annotated against FrameNet, providing broad coverage of different event types. We present results on two key event understanding tasks enabled by FAMuS: \emph{source validation} -- determining whether a document is a valid source for a target report event -- and \emph{cross-document argument extraction} -- full-document argument extraction for a target event from both its report and the correct source article. We release both FAMuS and our models to support further research.
MegaWika: Millions of reports and their sources across 50 diverse languages
Barham, Samuel, Weller, Orion, Yuan, Michelle, Murray, Kenton, Yarmohammadi, Mahsa, Jiang, Zhengping, Vashishtha, Siddharth, Martin, Alexander, Liu, Anqi, White, Aaron Steven, Boyd-Graber, Jordan, Van Durme, Benjamin
To foster the development of new models for collaborative AI-assisted report generation, we introduce MegaWika, consisting of 13 million Wikipedia articles in 50 diverse languages, along with their 71 million referenced source materials. We process this dataset for a myriad of applications, going beyond the initial Wikipedia citation extraction and web scraping of content, including translating non-English articles for cross-lingual applications and providing FrameNet parses for automated semantic analysis. MegaWika is the largest resource for sentence-level report generation and the only report generation dataset that is multilingual. We manually analyze the quality of this resource through a semantically stratified sample. Finally, we provide baseline results and trained models for crucial steps in automated report generation: cross-lingual question answering and citation retrieval.
An Online-Learning Approach to Inverse Optimization
Bärmann, Andreas, Martin, Alexander, Pokutta, Sebastian, Schneider, Oskar
Human decision-makers are very good at taking decisions under rather imprecise specification of the decision-making problem, both in terms of constraints as well as objective. One might argue that the human decision-maker can pretty reliably learn from observed previous decisions - a traditional learning-by-example setup. At the same time, when we try to turn these decision-making problems into actual optimization problems, we often run into all types of issues in terms of specifying the model. In an optimal world, we would be able to infer or learn the optimization problem from previously observed decisions taken by an expert. This problem naturally occurs in many settings where we do not have direct access to the decision-maker's preference or objective function but can observe his behaviour, and where the learner as well as the decision-maker have access to the same information. Natural examples are as diverse as making recommendations based on user history and strategic planning problems, where the agent's preferences are unknown but the system is observable. Other examples include knowledge transfer from a human planner into a decision support system: often human operators have arrived at finely-tuned "objective functions" through many years of experience, and in many cases it is desirable to replicate the decision-making process both for scaling up and also for potentially including it in large-scale scenario analysis and simulation to explore responses under varying conditions. Here we consider the learning of preferences or objectives from an expert by means of observing his actions.
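As a rough illustration of the general framework (not the paper's exact algorithm, guarantees, or experiments), the sketch below has a learner maintain an estimate of the expert's linear objective and, after observing each expert decision, take a projected subgradient step that makes the observed decision look better under the current estimate. The feasible sets, step-size schedule, and normalization are illustrative assumptions; SciPy's `linprog` is used only for convenience.

```python
# Hedged sketch: learning a hidden linear (maximization) objective from
# observed expert decisions via online projected subgradient steps.
# Problem data, step sizes, and normalization are illustrative only.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n = 5
c_true = rng.random(n)            # expert's hidden objective (unknown to learner)
c_hat = np.ones(n) / n            # learner's initial estimate
eta = 0.1                         # base step size

for t in range(1, 201):
    # Round-specific feasible set {x : A x <= b, 0 <= x <= 1}, visible to both parties.
    A = rng.random((3, n))
    b = rng.random(3) + 1.0
    bounds = [(0, 1)] * n

    # Expert acts optimally under the true objective (maximize => negate for linprog).
    x_exp = linprog(-c_true, A_ub=A, b_ub=b, bounds=bounds).x
    # Learner's decision under its current estimate.
    x_lrn = linprog(-c_hat, A_ub=A, b_ub=b, bounds=bounds).x

    # Subgradient step on the loss c_hat . (x_lrn - x_exp): nudge the estimate
    # so the expert's observed decision scores at least as well as the learner's.
    c_hat = c_hat + (eta / np.sqrt(t)) * (x_exp - x_lrn)
    c_hat = np.clip(c_hat, 0.0, None)
    c_hat = c_hat / c_hat.sum() if c_hat.sum() > 0 else np.ones(n) / n

print("estimated objective:", np.round(c_hat, 3))
print("true objective (normalized):", np.round(c_true / c_true.sum(), 3))
```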