AITopics | mmr

Collaborating Authors

mmr

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions

Neural Information Processing SystemsJun-23-2026, 03:52:38 GMT

artificial intelligence, machine learning, natural language, (19 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Communications (0.68)
Information Technology > Artificial Intelligence > Vision (0.68)

Add feedback

When One Moment Isn't Enough: Multi-Moment Retrieval with Cross-Moment Interactions

Cao, Zhuo, Du, Heming, Zhang, Bingqing, Yu, Xin, Li, Xue, Wang, Sen

arXiv.org Artificial IntelligenceOct-21-2025

Existing Moment retrieval (MR) methods focus on Single-Moment Retrieval (SMR). However, one query can correspond to multiple relevant moments in real-world applications. This makes the existing datasets and methods insufficient for video temporal grounding. By revisiting the gap between current MR tasks and real-world applications, we introduce a high-quality datasets called QVHighlights Multi-Moment Dataset (QV-M$^2$), along with new evaluation metrics tailored for multi-moment retrieval (MMR). QV-M$^2$ consists of 2,212 annotations covering 6,384 video segments. Building on existing efforts in MMR, we propose a framework called FlashMMR. Specifically, we propose a Multi-moment Post-verification module to refine the moment boundaries. We introduce constrained temporal adjustment and subsequently leverage a verification module to re-evaluate the candidate segments. Through this sophisticated filtering pipeline, low-confidence proposals are pruned, and robust multi-moment alignment is achieved. We retrain and evaluate 6 existing MR methods on QV-M$^2$ and QVHighlights under both SMR and MMR settings. Results show that QV-M$^2$ serves as an effective benchmark for training and evaluating MMR models, while FlashMMR provides a strong baseline. Specifically, on QV-M$^2$, it achieves improvements over prior SOTA method by 3.00% on G-mAP, 2.70% on mAP@3+tgt, and 2.56% on mR@3. The proposed benchmark and method establish a foundation for advancing research in more realistic and challenging video temporal grounding scenarios. Code is released at https://github.com/Zhuo-Cao/QV-M2.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2510.17218

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.88)

Industry:

Information Technology (0.46)
Social Sector (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Artificial Intelligence > Vision (0.68)

Add feedback

Transformers Can Overcome the Curse of Dimensionality: A Theoretical Study from an Approximation Perspective

Jiao, Yuling, Lai, Yanming, Wang, Yang, Yan, Bokai

arXiv.org Artificial IntelligenceApr-21-2025

The Transformer model is widely used in various application areas of machine learning, such as natural language processing. This paper investigates the approximation of the Hölder continuous function class $\mathcal{H}_{Q}^β\left([0,1]^{d\times n},\mathbb{R}^{d\times n}\right)$ by Transformers and constructs several Transformers that can overcome the curse of dimensionality. These Transformers consist of one self-attention layer with one head and the softmax function as the activation function, along with several feedforward layers. For example, to achieve an approximation accuracy of $ε$, if the activation functions of the feedforward layers in the Transformer are ReLU and floor, only $\mathcal{O}\left(\log\frac{1}ε\right)$ layers of feedforward layers are needed, with widths of these layers not exceeding $\mathcal{O}\left(\frac{1}{ε^{2/β}}\log\frac{1}ε\right)$. If other activation functions are allowed in the feedforward layers, the width of the feedforward layers can be further reduced to a constant. These results demonstrate that Transformers have a strong expressive capability. The construction in this paper is based on the Kolmogorov-Arnold Representation Theorem and does not require the concept of contextual mapping, hence our proof is more intuitively clear compared to previous Transformer approximation works. Additionally, the translation technique proposed in this paper helps to apply the previous approximation results of feedforward neural networks to Transformer research.

artificial intelligence, machine learning, transformer, (15 more...)

arXiv.org Artificial Intelligence

2504.13558

Country: Asia > China (0.68)

Genre: Research Report > New Finding (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Diversity Enhances an LLM's Performance in RAG and Long-context Task

Wang, Zhchao, Bi, Bin, Luo, Yanqi, Asur, Sitaram, Cheng, Claire Na

arXiv.org Artificial IntelligenceFeb-13-2025

The rapid advancements in large language models (LLMs) have highlighted the challenge of context window limitations, primarily due to the quadratic time complexity of the self-attention mechanism ($O(N^2)$, where $N$ denotes the context window length). This constraint impacts tasks such as retrieval-augmented generation (RAG) in question answering (Q\&A) and long context summarization. A common approach involves selecting content with the highest similarity to the query; however, this often leads to redundancy and the exclusion of diverse yet relevant information. Building on principles from Maximal Marginal Relevance (MMR) and Farthest Point Sampling (FPS), we integrate diversity into the content selection process. Our findings reveal that incorporating diversity substantially increases the recall of selecting relevant sentences or chunks before LLM-based Q\&A and summarization. These results highlight the importance of maintaining diversity in future LLM applications to further improve summarization and Q\&A outcomes.

large language model, machine learning, natural language, (22 more...)

arXiv.org Artificial Intelligence

2502.09017

Country:

Asia > Middle East > Jordan (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(6 more...)

Genre: Research Report (0.70)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.91)

Add feedback

Motion Planning of Nonholonomic Cooperative Mobile Manipulators

Patra, Keshab, Sinha, Arpita, Guha, Anirban

arXiv.org Artificial IntelligenceFeb-8-2025

We propose a real-time implementable motion planning technique for cooperative object transportation by nonholonomic mobile manipulator robots (MMRs) in an environment with static and dynamic obstacles. The proposed motion planning technique works in two steps. A novel visibility vertices-based path planning algorithm computes a global piece-wise linear path between the start and the goal location in the presence of static obstacles offline. It defines the static obstacle free space around the path with a set of convex polygons for the online motion planner. We employ a Nonliner Model Predictive Control (NMPC) based online motion planning technique for nonholonomic MMRs that jointly plans for the mobile base and the manipulators arm. It efficiently utilizes the locomotion capability of the mobile base and the manipulation capability of the arm. The motion planner plans feasible motion for the MMRs and generates trajectory for object transportation considering the kinodynamic constraints and the static and dynamic obstacles. The efficiency of our approach is validated by numerical simulation and hardware experiments in varied environments.

artificial intelligence, manipulator, planning & scheduling, (19 more...)

arXiv.org Artificial Intelligence

2502.05462

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas (0.34)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback

Inducing Diversity in Differentiable Search Indexing

Phatak, Abhijeet, Sachdev, Jayant, Rosario, Sean D, Kirti, Swati, Tripathy, Chittaranjan

arXiv.org Artificial IntelligenceFeb-4-2025

Differentiable Search Indexing (DSI) is a recent paradigm for information retrieval which uses a transformer-based neural network architecture as the document index to simplify the retrieval process. A differentiable index has many advantages enabling modifications, updates or extensions to the index. In this work, we explore balancing relevance and novel information content (diversity) for training DSI systems inspired by Maximal Marginal Relevance (MMR), and show the benefits of our approach over the naive DSI training. We present quantitative and qualitative evaluations of relevance and diversity measures obtained using our method on NQ320K and MSMARCO datasets in comparison to naive DSI. With our approach, it is possible to achieve diversity without any significant impact to relevance. Since we induce diversity while training DSI, the trained model has learned to diversify while being relevant. This obviates the need for a post-processing step to induce diversity in the recall set as typically performed using MMR. Our approach will be useful for Information Retrieval problems where both relevance and diversity are important such as in sub-topic retrieval. Our work can also be easily be extended to the incremental DSI settings which would enable fast updates to the index while retrieving a diverse recall set.

information retrieval, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2502.02788

Country:

North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > Santa Clara County > Sunnyvale (0.04)
North America > United States > California > Sacramento County > Sacramento (0.04)
North America > United States > California > Los Angeles County > Los Angeles (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Kinodynamic Motion Planning for Collaborative Object Transportation by Multiple Mobile Manipulators

Patra, Keshab, Sinha, Arpita, Guha, Anirban

arXiv.org Artificial IntelligenceSep-23-2024

This work proposes a kinodynamic motion planning technique for collaborative object transportation by multiple mobile manipulators in dynamic environments. A global path planner computes a linear piecewise path from start to goal. A novel algorithm detects the narrow regions between the static obstacles and aids in defining the obstacle-free region to enhance the feasibility of the global path. We then formulate a local online motion planning technique for trajectory generation that minimizes the control efforts in a receding horizon manner. It plans the trajectory for finite time horizons, considering the kinodynamic constraints and the static and dynamic obstacles. The planning technique jointly plans for the mobile bases and the arms to utilize the locomotion capability of the mobile base and the manipulation capability of the arm efficiently. We use a convex cone approach to avoid self-collision of the formation by modifying the mobile manipulators admissible state without imposing additional constraints. Numerical simulations and hardware experiments showcase the efficiency of the proposed approach.

dynamic obstacle, manipulator, obstacle, (15 more...)

arXiv.org Artificial Intelligence

2409.1491

Country:

Asia > India > Maharashtra > Mumbai (0.04)
North America > United States > Iowa > Story County > Ames (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.50)

Industry: Transportation (0.47)

Technology:

Information Technology > Artificial Intelligence > Robots > Robot Planning & Action (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)

Add feedback

Unveiling Population Heterogeneity in Health Risks Posed by Environmental Hazards Using Regression-Guided Neural Network

Nam, Jong Woo, Choi, Eun Young, Ailshire, Jennifer A., Chiang, Yao-Yi

arXiv.org Artificial IntelligenceSep-20-2024

Environmental hazards place certain individuals at disproportionately higher risks. As these hazards increasingly endanger human health, precise identification of the most vulnerable population subgroups is critical for public health. Moderated multiple regression (MMR) offers a straightforward method for investigating this by adding interaction terms between the exposure to a hazard and other population characteristics to a linear regression model. However, when the vulnerabilities are hidden within a cross-section of many characteristics, MMR is often limited in its capabilities to find any meaningful discoveries. Here, we introduce a hybrid method, named regression-guided neural networks (ReGNN), which utilizes artificial neural networks (ANNs) to non-linearly combine predictors, generating a latent representation that interacts with a focal predictor (i.e. variable measuring exposure to an environmental hazard). We showcase the use of ReGNN for investigating the population heterogeneity in the health effects of exposure to air pollution (PM2.5) on cognitive functioning scores. We demonstrate that population heterogeneity that would otherwise be hidden using traditional MMR can be found using ReGNN by comparing its results to the fit results of the traditional MMR models. In essence, ReGNN is a novel tool that enhances traditional regression models by effectively summarizing and quantifying an individual's susceptibility to health risks.

coefficient, neural network, predictor, (13 more...)

arXiv.org Artificial Intelligence

2409.13205

Country:

North America > United States > California (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Minnesota (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.95)

Industry:

Health & Medicine > Consumer Health (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.90)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.80)

Add feedback

Better RAG using Relevant Information Gain

Pickett, Marc, Hartman, Jeremy, Bhowmick, Ayan Kumar, Alam, Raquib-ul, Vempaty, Aditya

arXiv.org Artificial IntelligenceJul-16-2024

A common way to extend the memory of large language models (LLMs) is by retrieval augmented generation (RAG), which inserts text retrieved from a larger memory into an LLM's context window. However, the context window is typically limited to several thousand tokens, which limits the number of retrieved passages that can inform a model's response. For this reason, it's important to avoid occupying context window space with redundant information by ensuring a degree of diversity among retrieved passages. At the same time, the information should also be relevant to the current task. Most prior methods that encourage diversity among retrieved results, such as Maximal Marginal Relevance (MMR), do so by incorporating an objective that explicitly trades off diversity and relevance. We propose a novel simple optimization metric based on relevant information gain, a probabilistic measure of the total information relevant to a query for a set of retrieved results. By optimizing this metric, diversity organically emerges from our system. When used as a drop-in replacement for the retrieval component of a RAG system, this method yields state-of-the-art performance on question answering tasks from the Retrieval Augmented Generation Benchmark (RGB), outperforming existing metrics that directly optimize for relevance and diversity.

dartboard, diversity, query, (16 more...)

arXiv.org Artificial Intelligence

2407.12101

Country:

Oceania > Australia (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.55)

Add feedback

VRSD: Rethinking Similarity and Diversity for Retrieval in Large Language Models

Gao, Hang, Zhang, Yongfeng

arXiv.org Artificial IntelligenceJul-5-2024

Vector retrieval algorithms are vital for semantic queries in the evolving landscape of Large Language Models (LLMs). Retrieving vectors that simultaneously meet criteria for both similarity and diversity significantly enhances the capabilities of LLM-based agents. Despite the widespread use of the Maximal Marginal Relevance (MMR) in retrieval scenarios with relevance and diversity requirements, fluctuations caused by variations in the parameter $ \lambda $ within the MMR complicate the determination of the optimization trajectory in vector spaces, thus obscuring the direction of enhancement. Moreover, there is a lack of a robust theoretical analysis for the constraints of similarity and diversity in retrieval processes. This paper introduces a novel approach to characterizing both constraints through the relationship between the sum vector and the query vector. The proximity of these vectors addresses the similarity constraint, while necessitating that individual vectors within the sum vector divergently align with the query vector to satisfy the diversity constraint. We also formulate a new combinatorial optimization challenge, taking a selection of $k$ vectors from a set of candidates such that their sum vector maximally aligns with the query vector, a problem we demonstrate to be NP-complete. This establishes the profound difficulty of pursuing similarity and diversity simultaneously in vector retrieval and lays a theoretical groundwork for further research. Additionally, we present the heuristic algorithm Vectors Retrieval with Similarity and Diversity (VRSD) which not only has a definitive optimization goal and eschews the need for preset parameters but also offers a modest reduction in time complexity compared to MMR. Empirical validation further confirm that VRSD significantly surpasses MMR across various datasets.

arXiv.org Artificial Intelligence

2407.04573

Country: North America > United States > New Jersey > Middlesex County > New Brunswick (0.04)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback