AITopics | Agarwal, Shubham

Collaborating Authors

Agarwal, Shubham

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

StarVector: Generating Scalable Vector Graphics Code from Images

Rodriguez, Juan A., Agarwal, Shubham, Laradji, Issam H., Rodriguez, Pau, Vazquez, David, Pal, Christopher, Pedersoli, Marco

arXiv.org Artificial IntelligenceDec-17-2023

Scalable Vector Graphics (SVGs) have become integral in modern image rendering applications due to their infinite scalability in resolution, versatile usability, and editing capabilities. SVGs are particularly popular in the fields of web development and graphic design. Existing approaches for SVG modeling using deep learning often struggle with generating complex SVGs and are restricted to simpler ones that require extensive processing and simplification. This paper introduces StarVector, a multimodal SVG generation model that effectively integrates Code Generation Large Language Models (CodeLLMs) and vision models. Our approach utilizes a CLIP image encoder to extract visual representations from pixel-based images, which are then transformed into visual tokens via an adapter module. These visual tokens are pre-pended to the SVG token embeddings, and the sequence is modeled by the StarCoder model using next-token prediction, effectively learning to align the visual and code tokens. This enables StarVector to generate unrestricted SVGs that accurately represent pixel images. To evaluate StarVector's performance, we present SVG-Bench, a comprehensive benchmark for evaluating SVG methods across multiple datasets and relevant metrics. Within this benchmark, we introduce novel datasets including SVG-Stack, a large-scale dataset of real-world SVG examples, and use it to pre-train StarVector as a large foundation model for SVGs. Our results demonstrate significant enhancements in visual quality and complexity handling over current methods, marking a notable advancement in SVG generation technology. Code and models: https://github.com/joanrod/star-vector

large language model, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2312.11556

Country: North America > Canada > Quebec (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Outage-Watch: Early Prediction of Outages using Extreme Event Regularizer

Agarwal, Shubham, Chakraborty, Sarthak, Garg, Shaddy, Bisht, Sumit, Jain, Chahat, Gonuguntla, Ashritha, Saini, Shiv

arXiv.org Artificial IntelligenceNov-10-2023

Cloud services are omnipresent and critical cloud service failure is a fact of life. In order to retain customers and prevent revenue loss, it is important to provide high reliability guarantees for these services. One way to do this is by predicting outages in advance, which can help in reducing the severity as well as time to recovery. It is difficult to forecast critical failures due to the rarity of these events. Moreover, critical failures are ill-defined in terms of observable data. Our proposed method, Outage-Watch, defines critical service outages as deteriorations in the Quality of Service (QoS) captured by a set of metrics. Outage-Watch detects such outages in advance by using current system state to predict whether the QoS metrics will cross a threshold and initiate an extreme event. A mixture of Gaussian is used to model the distribution of the QoS metrics for flexibility and an extreme event regularizer helps in improving learning in tail of the distribution. An outage is predicted if the probability of any one of the QoS metrics crossing threshold changes significantly. Our evaluation on a real-world SaaS company dataset shows that Outage-Watch significantly outperforms traditional methods with an average AUC of 0.98. Additionally, Outage-Watch detects all the outages exhibiting a change in service metrics and reduces the Mean Time To Detection (MTTD) of outages by up to 88% when deployed in an enterprise cloud-service system, demonstrating efficacy of our proposed method.

cloud computing, data mining, machine learning, (19 more...)

arXiv.org Artificial Intelligence

doi: 10.1145/3611643.3616316

2309.1734

Country: North America > United States > California > San Francisco County > San Francisco (0.16)

Genre: Research Report > New Finding (0.46)

Industry: Information Technology > Services (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Cloud Computing (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Add feedback

History for Visual Dialog: Do we really need it?

Agarwal, Shubham, Bui, Trung, Lee, Joon-Young, Konstas, Ioannis, Rieser, Verena

arXiv.org Artificial IntelligenceMay-8-2020

Visual Dialog involves "understanding" the dialog history (what has been discussed previously) and the current question (what is asked), in addition to grounding information in the image, to generate the correct response. In this paper, we show that co-attention models which explicitly encode dialog history outperform models that don't, achieving state-of-the-art performance (72 % NDCG on val set). However, we also expose shortcomings of the crowd-sourcing dataset collection procedure by showing that history is indeed only required for a small amount of the data and that the current evaluation metric encourages generic replies. To that end, we propose a challenging subset (VisDialConv) of the VisDial val set and provide a benchmark of 63% NDCG.

crowdsourcing, dialog, neural network, (22 more...)

arXiv.org Artificial Intelligence

2005.07493

Country:

North America > United States (0.14)
North America > Canada (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)
Information Technology > Communications > Social Media > Crowdsourcing (0.67)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.56)

Add feedback

Planning for Goal-Oriented Dialogue Systems

Muise, Christian, Chakraborti, Tathagata, Agarwal, Shubham, Bajgar, Ondrej, Chaudhary, Arunima, Lastras-Montano, Luis A., Ondrej, Josef, Vodolan, Miroslav, Wiecha, Charlie

arXiv.org Artificial IntelligenceOct-17-2019

Generating complex multi-turn goal-oriented dialogue agents is a difficult problem that has seen a considerable focus from many leaders in the tech industry, including IBM, Google, Amazon, and Microsoft. This is in large part due to the rapidly growing market demand for dialogue agents capable of goal-oriented behaviour. Due to the business process nature of these conversations, end-to-end machine learning systems are generally not a viable option, as the generated dialogue agents must be deployable and verifiable on behalf of the businesses authoring them. In this work, we propose a paradigm shift in the creation of goal-oriented complex dialogue systems that dramatically eliminates the need for a designer to manually specify a dialogue tree, which nearly all current systems have to resort to when the interaction pattern falls outside standard patterns such as slot filling. We propose a declarative representation of the dialogue agent to be processed by state-of-the-art planning technology. Our proposed approach covers all aspects of the process; from model solicitation to the execution of the generated plans/dialogue agents. Along the way, we introduce novel planning encodings for declarative dialogue synthesis, a variety of interfaces for working with the specification as a dialogue architect, and a robust executor for generalized contingent plans. We have created prototype implementations of all components, and in this paper, we further demonstrate the resulting system empirically.

planning & scheduling, specification, survey article, (20 more...)

arXiv.org Artificial Intelligence

1910.08137

Country:

Europe (0.46)
Asia > Middle East > Israel (0.14)
North America > United States > California (0.14)

Genre:

Research Report (0.63)
Overview (0.45)

Industry:

Information Technology (0.69)
Consumer Products & Services (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(2 more...)

Add feedback

Generating Dialogue Agents via Automated Planning

Botea, Adi, Muise, Christian, Agarwal, Shubham, Alkan, Oznur, Bajgar, Ondrej, Daly, Elizabeth, Kishimoto, Akihiro, Lastras, Luis, Marinescu, Radu, Ondrej, Josef, Pedemonte, Pablo, Vodolan, Miroslav

arXiv.org Artificial IntelligenceFeb-2-2019

Dialogue systems have many applications such as customer support or question answering. Typically they have been limited to shallow single turn interactions. However more advanced applications such as career coaching or planning a trip require a much more complex multi-turn dialogue. Current limitations of conversational systems have made it difficult to support applications that require personalization, customization and context dependent interactions. We tackle this challenging problem by using domain-independent AI planning to automatically create dialogue plans, customized to guide a dialogue towards achieving a given goal. The input includes a library of atomic dialogue actions, an initial state of the dialogue, and a goal. Dialogue plans are plugged into a dialogue system capable to orchestrate their execution. Use cases demonstrate the viability of the approach. Our work on dialogue planning has been integrated into a product, and it is in the process of being deployed into another.

artificial intelligence, dialogue, planning & scheduling, (20 more...)

arXiv.org Artificial Intelligence

1902.00771

Country: North America > United States (0.28)

Genre: Research Report (0.40)

Industry: Consumer Products & Services > Travel (0.35)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)

Add feedback

A Knowledge-Grounded Multimodal Search-Based Conversational Agent

Agarwal, Shubham, Dusek, Ondrej, Konstas, Ioannis, Rieser, Verena

arXiv.org Artificial IntelligenceOct-20-2018

Multimodal search-based dialogue is a challenging new task: It extends visually grounded question answering systems into multi-turn conversations with access to an external database. We address this new challenge by learning a neural response generation system from the recently released Multimodal Dialogue (MMD) dataset (Saha et al., 2017). We introduce a knowledge-grounded multimodal conversational model where an encoded knowledge base (KB) representation is appended to the decoder input. Our model substantially outperforms strong baselines in terms of text-based similarity measures (over 9 BLEU points, 3 of which are solely due to the use of additional information from the KB.

deep learning, neural network, proc, (22 more...)

arXiv.org Artificial Intelligence

1810.11954

Country: North America > Canada (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.69)
Information Technology > Artificial Intelligence > Representation & Reasoning > Search (0.60)

Add feedback