AITopics | Renkens, Vincent

Collaborating Authors

Renkens, Vincent

MARRS: Multimodal Reference Resolution System

Ates, Halim Cagri, Bhargava, Shruti, Li, Site, Lu, Jiarui, Maddula, Siddhardha, Moniz, Joel Ruben Antony, Nalamalapu, Anil Kumar, Nguyen, Roman Hoang, Ozyildirim, Melis, Patel, Alkesh, Piraviperumal, Dhivya, Renkens, Vincent, Samal, Ankit, Tran, Thy, Tseng, Bo-Hsiang, Yu, Hong, Zhang, Yuan, Zou, Rong

arXiv.org Artificial Intelligence, Nov-2-2023

Successfully handling context is essential for any dialog understanding task. This context may be conversational (relying on previous user queries or system responses), visual (relying on what the user sees, for example, on their screen), or background (based on signals such as a ringing alarm or playing music). In this work, we present an overview of MARRS, or Multimodal Reference Resolution System, an on-device framework within a Natural Language Understanding system, responsible for handling conversational, visual, and background context. In particular, we present different machine learning models to enable handling contextual queries; specifically, one to enable reference resolution, and one to handle context via query rewriting. We also describe how these models complement each other to form a unified, coherent, lightweight system that can understand context while preserving user privacy.

artificial intelligence, machine learning, natural language

doi: 10.18653/v1/2023.crac-main.7

2311.0165

Country: North America > United States (0.46)

Genre:

Overview (0.55)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (0.30)

Referring to Screen Texts with Voice Assistants

Bhargava, Shruti, Dhoot, Anand, Jonsson, Ing-Marie, Nguyen, Hoang Long, Patel, Alkesh, Yu, Hong, Renkens, Vincent

arXiv.org Artificial Intelligence, Jun-10-2023

Voice assistants help users make phone calls, send messages, create events, navigate, and do a lot more. However, assistants have limited capacity to understand their users' context. In this work, we aim to take a step in this direction. Our work dives into a new experience for users to refer to phone numbers, addresses, email addresses, URLs, and dates on their phone screens. Our focus lies in reference understanding, which becomes particularly interesting when multiple similar texts are present on screen, similar to visual grounding. We collect a dataset and propose a lightweight general-purpose model for this novel experience. Due to the high cost of consuming pixels directly, our system is designed to rely on the extracted text from the UI. Our model is modular, thus offering flexibility, improved interpretability, and efficient runtime memory utilization.

category, machine learning, natural language

2306.07298

Country: North America > United States > California (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology (0.46)
Media (0.36)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.94)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.73)