AITopics | Anguilla

Collaborating Authors

Anguilla

WikiVideo: Article Generation from Multiple Videos

Martin, Alexander, Kriz, Reno, Walden, William Gantt, Sanders, Kate, Recknor, Hannah, Yang, Eugene, Ferraro, Francis, Van Durme, Benjamin

arXiv.org Artificial IntelligenceApr-1-2025

We present the challenging task of automatically creating a high-level Wikipedia-style article that aggregates information from multiple diverse videos about real-world events, such as natural disasters or political elections. Videos are intuitive sources for retrieval-augmented generation (RAG), but most contemporary RAG workflows focus heavily on text and existing methods for video-based summarization focus on low-level scene understanding rather than high-level event semantics. To close this gap, we introduce WikiVideo, a benchmark consisting of expert-written articles and densely annotated videos that provide evidence for articles' claims, facilitating the integration of video into RAG pipelines and enabling the creation of in-depth content that is grounded in multimodal sources. We further propose Collaborative Article Generation (CAG), a novel interactive method for article creation from multiple videos. CAG leverages an iterative interaction between an r1-style reasoning model and a VideoLLM to draw higher level inferences about the target event than is possible with VideoLLMs alone, which fixate on low-level visual features. We benchmark state-of-the-art VideoLLMs and CAG in both oracle retrieval and RAG settings and find that CAG consistently outperforms alternative methods, while suggesting intriguing avenues for future work.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2504.00939

Country:

Europe > France > Île-de-France > Paris > Paris (0.29)
North America > The Bahamas (0.14)
North America > United States > Georgia (0.14)
(43 more...)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (1.00)
Law Enforcement & Public Safety > Fire & Emergency Services (1.00)
Government > Voting & Elections (1.00)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Lossless Acceleration of Large Language Models with Hierarchical Drafting based on Temporal Locality in Speculative Decoding

Cho, Sukmin, Choi, Sangjin, Hwang, Taeho, Seo, Jeongyeon, Jeong, Soyeong, Lee, Huije, Song, Hoyun, Park, Jong C., Kwon, Youngjin

arXiv.org Artificial IntelligenceFeb-8-2025

Accelerating inference in Large Language Models (LLMs) is critical for real-time interactions, as they have been widely incorporated into real-world services. Speculative decoding, a fully algorithmic solution, has gained attention for improving inference speed by drafting and verifying tokens, thereby generating multiple tokens in a single forward pass. However, current drafting strategies usually require significant fine-tuning or have inconsistent performance across tasks. To address these challenges, we propose Hierarchy Drafting (HD), a novel lossless drafting approach that organizes various token sources into multiple databases in a hierarchical framework based on temporal locality. In the drafting step, HD sequentially accesses multiple databases to obtain draft tokens from the highest to the lowest locality, ensuring consistent acceleration across diverse tasks and minimizing drafting latency. Our experiments on Spec-Bench using LLMs with 7B and 13B parameters demonstrate that HD outperforms existing database drafting methods, achieving robust inference speedups across model sizes, tasks, and temperatures.

artificial intelligence, large language model, natural language, (19 more...)

arXiv.org Artificial Intelligence

2502.05609

Country:

Europe > Austria > Vienna (0.14)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.05)
North America > Cuba (0.04)
(19 more...)

Genre: Research Report > New Finding (0.68)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

WorldCuisines: A Massive-Scale Benchmark for Multilingual and Multicultural Visual Question Answering on Global Cuisines

Winata, Genta Indra, Hudi, Frederikus, Irawan, Patrick Amadeus, Anugraha, David, Putri, Rifki Afina, Wang, Yutong, Nohejl, Adam, Prathama, Ubaidillah Ariq, Ousidhoum, Nedjma, Amriani, Afifa, Rzayev, Anar, Das, Anirban, Pramodya, Ashmari, Adila, Aulia, Wilie, Bryan, Mawalim, Candy Olivia, Cheng, Ching Lam, Abolade, Daud, Chersoni, Emmanuele, Santus, Enrico, Ikhwantri, Fariz, Kuwanto, Garry, Zhao, Hanyang, Wibowo, Haryo Akbarianto, Lovenia, Holy, Cruz, Jan Christian Blaise, Putra, Jan Wira Gotama, Myung, Junho, Susanto, Lucky, Machin, Maria Angelica Riera, Zhukova, Marina, Anugraha, Michael, Adilazuarda, Muhammad Farid, Santosa, Natasha, Limkonchotiwat, Peerat, Dabre, Raj, Audino, Rio Alexander, Cahyawijaya, Samuel, Zhang, Shi-Xiong, Salim, Stephanie Yulia, Zhou, Yi, Gui, Yinxuan, Adelani, David Ifeoluwa, Lee, En-Shiun Annie, Okada, Shogo, Purwarianti, Ayu, Aji, Alham Fikri, Watanabe, Taro, Wijaya, Derry Tanti, Oh, Alice, Ngo, Chong-Wah

arXiv.org Artificial IntelligenceNov-28-2024

Vision Language Models (VLMs) often struggle with culture-specific knowledge, particularly in languages other than English and in underrepresented cultural contexts. To evaluate their understanding of such knowledge, we introduce WorldCuisines, a massive-scale benchmark for multilingual and multicultural, visually grounded language understanding. This benchmark includes a visual question answering (VQA) dataset with text-image pairs across 30 languages and dialects, spanning 9 language families and featuring over 1 million data points, making it the largest multicultural VQA benchmark to date. It includes tasks for identifying dish names and their origins. We provide evaluation datasets in two sizes (12k and 60k instances) alongside a training dataset (1 million instances). Our findings show that while VLMs perform better with correct location context, they struggle with adversarial contexts and predicting specific regional cuisines and languages. To support future research, we release a knowledge base with annotated food entries and images along with the VQA data.

age range, dataset, preprint arxiv, (16 more...)

arXiv.org Artificial Intelligence

2410.12705

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States (0.14)
Asia > Brunei (0.14)
(170 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.70)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.48)

Add feedback

Grounding Natural Language to SQL Translation with Data-Based Self-Explanations

Fan, Yuankai, Ren, Tonghui, Huang, Can, He, Zhenying, Wang, X. Sean

arXiv.org Artificial IntelligenceNov-5-2024

Natural Language Interfaces for Databases empower non-technical users to interact with data using natural language (NL). Advanced approaches, utilizing either neural sequence-to-sequence or more recent sophisticated large-scale language models, typically implement NL to SQL (NL2SQL) translation in an end-to-end fashion. However, like humans, these end-to-end translation models may not always generate the best SQL output on their first try. In this paper, we propose CycleSQL, an iterative framework designed for end-to-end translation models to autonomously generate the best output through self-evaluation. The main idea of CycleSQL is to introduce data-grounded NL explanations of query results as self-provided feedback, and use the feedback to validate the correctness of the translation iteratively, hence improving the overall translation accuracy. Extensive experiments, including quantitative and qualitative evaluations, are conducted to study CycleSQL by applying it to seven existing translation models on five widely used benchmarks. The results show that 1) the feedback loop introduced in CycleSQL can consistently improve the performance of existing models, and in particular, by applying CycleSQL to RESDSQL, obtains a translation accuracy of 82.0% (+2.6%) on the validation set, and 81.6% (+3.2%) on the test set of Spider benchmark; 2) the generated NL explanations can also provide insightful information for users, aiding in the comprehension of translation results and consequently enhancing the interpretability of NL2SQL translation.

large language model, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2411.02948

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.05)
North America > Aruba (0.04)
North America > Anguilla (0.04)
(13 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Air (1.00)
Aerospace & Defense > Aircraft (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval > Query Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

VLEIBot: A New 45-mg Swimming Microrobot Driven by a Bioinspired Anguilliform Propulsor

Blankenship, Elijah K., Trygstad, Conor K., Gonçalves, Francisco M. F. R., Pérez-Arancibia, Néstor O.

arXiv.org Artificial IntelligenceAug-24-2024

This paper presents the VLEIBot^* (Very Little Eel-Inspired roBot), a 45-mg/23-mm^3 microrobotic swimmer that is propelled by a bioinspired anguilliform propulsor. The propulsor is excited by a single 6-mg high-work-density (HWD) microactuator and undulates periodically due to wave propagation phenomena generated by fluid-structure interaction (FSI) during swimming. The microactuator is composed of a carbon-fiber beam, which functions as a leaf spring, and shape-memory alloy (SMA) wires, which deform cyclically when excited periodically using Joule heating. The VLEIBot can swim at speeds as high as 15.1mm * s^{-1} (0.33 Bl * s^{-1}}) when driven with a heuristically-optimized propulsor. To improve maneuverability, we evolved the VLEIBot design into the 90-mg/47-mm^3 VLEIBot^+, which is driven by two propulsors and fully controllable in the two-dimensional (2D) space. The VLEIBot^+ can swim at speeds as high as 16.1mm * s^{-1} (0.35 Bl * s^{-1}), when driven with heuristically-optimized propulsors, and achieves turning rates as high as 0.28 rad * s^{-1}, when tracking path references. The measured root-mean-square (RMS) values of the tracking errors are as low as 4 mm.

experiment, robot, vleibot, (14 more...)

arXiv.org Artificial Intelligence

2403.07073

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > Anguilla (0.04)
North America > United States > California > Los Angeles County > Pasadena (0.04)
(14 more...)

Genre: Research Report (0.64)

Industry: Materials (0.34)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

Morphing median fin enhances untethered bionic robotic tuna's linear acceleration and turning maneuverability

Huang, Hongbin, Lin, Zhonglu, Zheng, Wei, Zhang, Jinhu, Liu, Zhibin, Zhou, Wei, Zhang, Yu

arXiv.org Artificial IntelligenceJul-26-2024

Median fins of fish-like swimmers play a crucial role in linear acceleration and maneuvering processes. However, few research focused on untethered robotic fish experiments. Imitating the behaviour of real tuna, we developed a free-swimming bionic tuna with a foldable dorsal fin. The erection of dorsal fin, at proper conditions, can reduce head heave by 50%, enhance linear acceleration by 15.7%, increase turning angular velocity by 32.78%, and turning radius decreasing by 33.13%. Conversely, erecting the dorsal fin increases the wetted surface area, resulting in decreased maximum speed and efficiency during steady swimming phase. This finding partially explains why tuna erect their median fins during maneuvers or acceleration and fold them afterward to reduce drag. In addition, we verified that folding the median fins after acceleration does not significantly affect locomotion efficiency. This study supports the application of morphing median fins in undulating underwater robots and helps to further understand the impact of median fins on fish locomotion.

bionic tuna, fin, median fin, (13 more...)

arXiv.org Artificial Intelligence

2407.18843

Country:

Asia > China > Fujian Province > Xiamen (0.06)
North America > United States (0.04)
North America > Anguilla (0.04)
Europe > United Kingdom (0.04)

Genre: Research Report (1.00)

Industry: Energy (0.68)

Technology: Information Technology > Artificial Intelligence > Robots (1.00)

Add feedback

MIRAI: Evaluating LLM Agents for Event Forecasting

Ye, Chenchen, Hu, Ziniu, Deng, Yihe, Huang, Zijie, Ma, Mingyu Derek, Zhu, Yanqiao, Wang, Wei

arXiv.org Artificial IntelligenceJul-1-2024

Recent advancements in Large Language Models (LLMs) have empowered LLM agents to autonomously collect world information, over which to conduct reasoning to solve complex problems. Given this capability, increasing interests have been put into employing LLM agents for predicting international events, which can influence decision-making and shape policy development on an international scale. Despite such a growing interest, there is a lack of a rigorous benchmark of LLM agents' forecasting capability and reliability. To address this gap, we introduce MIRAI, a novel benchmark designed to systematically evaluate LLM agents as temporal forecasters in the context of international events. Our benchmark features an agentic environment with tools for accessing an extensive database of historical, structured events and textual news articles. We refine the GDELT event database with careful cleaning and parsing to curate a series of relational prediction tasks with varying forecasting horizons, assessing LLM agents' abilities from short-term to long-term forecasting. We further implement APIs to enable LLM agents to utilize different tools via a code-based interface. In summary, MIRAI comprehensively evaluates the agents' capabilities in three dimensions: 1) autonomously source and integrate critical information from large global databases; 2) write codes using domain-specific APIs and libraries for tool-use; and 3) jointly reason over historical knowledge from diverse formats and time to accurately predict future events. Through comprehensive benchmarking, we aim to establish a reliable framework for assessing the capabilities of LLM agents in forecasting international events, thereby contributing to the development of more accurate and trustworthy models for international relation analysis.

cameocode, isocode, relation, (15 more...)

arXiv.org Artificial Intelligence

2407.01231

Country:

Asia > North Korea (0.14)
Oceania > Australia > Australian Indian Ocean Territories > Territory of Cocos (Keeling) Islands (0.14)
North America > United States > California > Los Angeles County > Los Angeles (0.14)
(234 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Law (1.00)
Government > Foreign Policy (1.00)
Government > Military (0.93)
Information Technology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Fox News AI Newsletter: Caribbean nation capitalizes on AI boom

FOX NewsJun-26-2024, 15:56:48 GMT

Reproduction in nautical magazines, nautical guides or nautical websites is prohibited. 'TOTALLY INCIDENTAL': A small Caribbean nation is capitalizing on the artificial intelligence boom thanks in part to a coincidence that came about when internet domain codes for countries were awarded decades ago. AI OBSTACLES: Small businesses in the U.S. are making progress in catching up with implementing artificial intelligence to help their operations, even though nearly half are unsure of how to get started, according to new research. PROTECT YOUR DATA: Meta may have paused its plans to train artificial intelligence models for the lucky ones living in Europe, where laws protect people using Facebook and Instagram better than Americans. Here in the good ole USA, both Facebook and Instagram have already been combing through public posts from U.S. accounts to train and improve its AI capabilities, including its chatbot, since last year.

ai boom, caribbean nation capitalize, fox new ai newsletter, (5 more...)

FOX News

Country:

North America > United States (0.59)
North America > Anguilla (0.12)
Europe > United Kingdom (0.07)
Asia > Taiwan > Taiwan Province > Taipei (0.07)

Industry: Media > News (0.65)

Technology:

Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence (1.00)

Add feedback

Curating Grounded Synthetic Data with Global Perspectives for Equitable AI

Törnquist, Elin, Caulk, Robert Alexander

arXiv.org Artificial IntelligenceJun-18-2024

The development of robust AI models relies heavily on the quality and variety of training data available. In fields where data scarcity is prevalent, synthetic data generation offers a vital solution. In this paper, we introduce a novel approach to creating synthetic datasets, grounded in real-world diversity and enriched through strategic diversification. We synthesize data using a comprehensive collection of news articles spanning 12 languages and originating from 125 countries, to ensure a breadth of linguistic and cultural representations. Through enforced topic diversification, translation, and summarization, the resulting dataset accurately mirrors real-world complexities and addresses the issue of underrepresentation in traditional datasets. This methodology, applied initially to Named Entity Recognition (NER), serves as a model for numerous AI disciplines where data diversification is critical for generalizability. Preliminary results demonstrate substantial improvements in performance on traditional NER benchmarks, by up to 7.3%, highlighting the effectiveness of our synthetic data in mimicking the rich, varied nuances of global data sources. This paper outlines the strategies employed for synthesizing diverse datasets and provides such a curated dataset for NER.

arxiv preprint arxiv, dataset, entity type, (15 more...)

arXiv.org Artificial Intelligence

2406.10258

Country:

Europe > France (0.28)
South America > Venezuela (0.04)
South America > Uruguay (0.04)
(122 more...)

Genre: Research Report (1.00)

Industry:

Government (1.00)
Leisure & Entertainment > Sports > Football (0.68)
Energy (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

QFMTS: Generating Query-Focused Summaries over Multi-Table Inputs

Zhang, Weijia, Pal, Vaishali, Huang, Jia-Hong, Kanoulas, Evangelos, de Rijke, Maarten

arXiv.org Artificial IntelligenceMay-8-2024

Table summarization is a crucial task aimed at condensing information from tabular data into concise and comprehensible textual summaries. However, existing approaches often fall short of adequately meeting users' information and quality requirements and tend to overlook the complexities of real-world queries. In this paper, we propose a novel method to address these limitations by introducing query-focused multi-table summarization. Our approach, which comprises a table serialization module, a summarization controller, and a large language model (LLM), utilizes textual queries and multiple tables to generate query-dependent table summaries tailored to users' information needs. To facilitate research in this area, we present a comprehensive dataset specifically tailored for this task, consisting of 4909 query-summary pairs, each associated with multiple tables. Through extensive experiments using our curated dataset, we demonstrate the effectiveness of our proposed method compared to baseline approaches. Our findings offer insights into the challenges of complex table reasoning for precise summarization, contributing to the advancement of research in query-focused multi-table summarization.

dataset, evaluation, summarization, (16 more...)

arXiv.org Artificial Intelligence

2405.05109

Country:

Oceania > Samoa (0.05)
Oceania > American Samoa (0.05)
North America > Antigua and Barbuda (0.05)
(21 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education (1.00)
Leisure & Entertainment > Sports > Motorsports > NASCAR (0.68)
Automobiles & Trucks (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.71)

Add feedback