AITopics | Manocha, Dinesh

Plotting

Manocha, Dinesh

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

M-MELD: A Multilingual Multi-Party Dataset for Emotion Recognition in Conversations

Ghosh, Sreyan, Ramaneswaran, S, Tyagi, Utkarsh, Srivastava, Harshvardhan, Lepcha, Samden, Sakshi, S, Manocha, Dinesh

arXiv.org Artificial IntelligenceMar-31-2023

Expression of emotions is a crucial part of daily human communication. Emotion recognition in conversations (ERC) is an emerging field of study, where the primary task is to identify the emotion behind each utterance in a conversation. Though a lot of work has been done on ERC in the past, these works only focus on ERC in the English language, thereby ignoring any other languages. In this paper, we present Multilingual MELD (M-MELD), where we extend the Multimodal EmotionLines Dataset (MELD) \cite{poria2018meld} to 4 other languages beyond English, namely Greek, Polish, French, and Spanish. Beyond just establishing strong baselines for all of these 4 languages, we also propose a novel architecture, DiscLSTM, that uses both sequential and conversational discourse context in a conversational dialogue for ERC. Our proposed approach is computationally efficient, can transfer across languages using just a cross-lingual encoder, and achieves better performance than most uni-modal text approaches in the literature on both MELD and M-MELD. We make our data and code publicly on GitHub.

machine learning, natural language, utterance, (17 more...)

arXiv.org Artificial Intelligence

2203.16799

Country:

Asia > India (0.29)
North America > United States > Maryland (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Emotion (0.93)
Information Technology > Artificial Intelligence > Natural Language > Discourse & Dialogue (0.68)

Add feedback

VERN: Vegetation-aware Robot Navigation in Dense Unstructured Outdoor Environments

Sathyamoorthy, Adarsh Jagan, Weerakoon, Kasun, Guan, Tianrui, Russell, Mason, Conover, Damon, Pusey, Jason, Manocha, Dinesh

arXiv.org Artificial IntelligenceMar-25-2023

We propose a novel method for autonomous legged robot navigation in densely vegetated environments with a variety of pliable/traversable and non-pliable/untraversable vegetation. We present a novel few-shot learning classifier that can be trained on a few hundred RGB images to differentiate flora that can be navigated through, from the ones that must be circumvented. Using the vegetation classification and 2D lidar scans, our method constructs a vegetation-aware traversability cost map that accurately represents the pliable and non-pliable obstacles with lower, and higher traversability costs, respectively. Our cost map construction accounts for misclassifications of the vegetation and further lowers the risk of collisions, freezing and entrapment in vegetation during navigation. Furthermore, we propose holonomic recovery behaviors for the robot for scenarios where it freezes, or gets physically entrapped in dense, pliable vegetation. We demonstrate our method on a Boston Dynamics Spot robot in real-world unstructured environments with sparse and dense tall grass, bushes, trees, etc. We observe an increase of 25-90% in success rates, 10-90% decrease in freezing rate, and up to 65% decrease in the false positive rate compared to existing methods.

artificial intelligence, machine learning, robot, (18 more...)

arXiv.org Artificial Intelligence

2303.14502

Country: North America > United States > Maryland (0.14)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Robots > Locomotion (0.88)

Add feedback

Towards Improved Room Impulse Response Estimation for Speech Recognition

Ratnarajah, Anton, Ananthabhotla, Ishwarya, Ithapu, Vamsi Krishna, Hoffmann, Pablo, Manocha, Dinesh, Calamia, Paul

arXiv.org Artificial IntelligenceMar-19-2023

We propose a novel approach for blind room impulse response (RIR) estimation systems in the context of a downstream application scenario, far-field automatic speech recognition (ASR). We first draw the connection between improved RIR estimation and improved ASR performance, as a means of evaluating neural RIR estimators. We then propose a generative adversarial network (GAN) based architecture that encodes RIR features from reverberant speech and constructs an RIR from the encoded features, and uses a novel energy decay relief loss to optimize for capturing energy-based properties of the input reverberant speech. We show that our model outperforms the state-of-the-art baselines on acoustic benchmarks (by 17\% on the energy decay relief and 22\% on an early-reflection energy metric), as well as in an ASR evaluation task (by 6.9\% in word error rate).

improved room impulse response estimation, machine learning, speech recognition, (1 more...)

arXiv.org Artificial Intelligence

2211.04473

Genre: Research Report (0.69)

Technology: Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)

Add feedback

DS-MPEPC: Safe and Deadlock-Avoiding Robot Navigation in Cluttered Dynamic Scenes

Arul, Senthil Hariharan, Park, Jong Jin, Manocha, Dinesh

arXiv.org Artificial IntelligenceMar-17-2023

We present an algorithm for safe robot navigation in complex dynamic environments using a variant of model predictive equilibrium point control. We use an optimization formulation to navigate robots gracefully in dynamic environments by optimizing over a trajectory cost function at each timestep. We present a novel trajectory cost formulation that significantly reduces the conservative and deadlock behaviors and generates smooth trajectories. In particular, we propose a new collision probability function that effectively captures the risk associated with a given configuration and the time to avoid collisions based on the velocity direction. Moreover, we propose a terminal state cost based on the expected time-to-goal and time-to-collision values that helps in avoiding trajectories that could result in deadlock. We evaluate our cost formulation in multiple simulated and real-world scenarios, including narrow corridors with dynamic obstacles, and observe significantly improved navigation behavior and reduced deadlocks as compared to prior methods.

artificial intelligence, robot, trajectory, (14 more...)

arXiv.org Artificial Intelligence

2303.10133

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.95)

Add feedback

Real-Time Decentralized Navigation of Nonholonomic Agents Using Shifted Yielding Areas

He, Liang, Pan, Zherong, Manocha, Dinesh

arXiv.org Artificial IntelligenceMar-16-2023

We present a lightweight, decentralized algorithm for navigating multiple nonholonomic agents through challenging environments with narrow passages. Our key idea is to allow agents to yield to each other in large open areas instead of narrow passages, to increase the success rate of conventional decentralized algorithms. At pre-processing time, our method computes a medial axis for the freespace. A reference trajectory is then computed and projected onto the medial axis for each agent. During run time, when an agent senses other agents moving in the opposite direction, our algorithm uses the medial axis to estimate a Point of Impact (POI) as well as the available area around the POI. If the area around the POI is not large enough for yielding behaviors to be successful, we shift the POI to nearby large areas by modulating the agent's reference trajectory and traveling speed. We evaluate our method on a row of 4 environments with up to 15 robots, and we find our method incurs a marginal computational overhead of 10-30 ms on average, achieving real-time performance. Afterward, our planned reference trajectories can be tracked using local navigation algorithms to achieve up to a $100\%$ higher success rate over local navigation algorithms alone.

agent, artificial intelligence, real time system, (16 more...)

arXiv.org Artificial Intelligence

2303.09139

Country:

North America > United States > Maryland (0.14)
Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (0.82)

Industry: Transportation (0.69)

Technology:

Information Technology > Architecture > Real Time Systems (0.90)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.68)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.47)

Add feedback

AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning

Wang, Xijun, Xian, Ruiqi, Guan, Tianrui, de Melo, Celso M., Nogar, Stephen M., Bera, Aniket, Manocha, Dinesh

arXiv.org Artificial IntelligenceMar-2-2023

We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also present an efficient temporal reasoning algorithm to capture the action information along the spatial and temporal domains within a controllable computational cost. Our approach has been implemented and evaluated both on the desktop with high-end GPUs and on the low power Robotics RB5 Platform for robots and drones. In practice, we achieve 6.1-7.4% improvement over SOTA in Top-1 accuracy on the RoCoG-v2 dataset, 8.3-10.4% improvement on the UAV-Human dataset and 3.2% improvement on the Drone Action dataset.

machine learning, recognition, temporal reasoning, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1109/ICRA48891.2023.10160564

2303.01589

Country: North America > United States > Maryland > Prince George's County > College Park (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology (0.69)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Temporal Reasoning (0.63)

Add feedback

RTAW: An Attention Inspired Reinforcement Learning Method for Multi-Robot Task Allocation in Warehouse Environments

Agrawal, Aakriti, Bedi, Amrit Singh, Manocha, Dinesh

arXiv.org Artificial IntelligenceFeb-27-2023

We present a novel reinforcement learning based algorithm for multi-robot task allocation problem in warehouse environments. We formulate it as a Markov Decision Process and solve via a novel deep multi-agent reinforcement learning method (called RTAW) with attention inspired policy architecture. Hence, our proposed policy network uses global embeddings that are independent of the number of robots/tasks. We utilize proximal policy optimization algorithm for training and use a carefully designed reward to obtain a converged policy. The converged policy ensures cooperation among different robots to minimize total travel delay (TTD) which ultimately improves the makespan for a sufficiently large task-list. In our extensive experiments, we compare the performance of our RTAW algorithm to state of the art methods such as myopic pickup distance minimization (greedy) and regret based baselines on different navigation schemes. We show an improvement of upto 14% (25-1000 seconds) in TTD on scenarios with hundreds or thousands of tasks for different challenging warehouse layouts and task generation schemes. We also demonstrate the scalability of our approach by showing performance with up to $1000$ robots in simulations.

artificial intelligence, machine learning, reinforcement learning, (16 more...)

arXiv.org Artificial Intelligence

2209.05738

Country: North America > United States (0.14)

Genre: Research Report (0.84)

Industry: Transportation > Ground > Road (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Beyond Exponentially Fast Mixing in Average-Reward Reinforcement Learning via Multi-Level Monte Carlo Actor-Critic

Suttle, Wesley A., Bedi, Amrit Singh, Patel, Bhrij, Sadler, Brian M., Koppel, Alec, Manocha, Dinesh

arXiv.org Artificial IntelligenceFeb-1-2023

Many existing reinforcement learning (RL) methods employ stochastic gradient iteration on the back end, whose stability hinges upon a hypothesis that the data-generating process mixes exponentially fast with a rate parameter that appears in the step-size selection. Unfortunately, this assumption is violated for large state spaces or settings with sparse rewards, and the mixing time is unknown, making the step size inoperable. In this work, we propose an RL methodology attuned to the mixing time by employing a multi-level Monte Carlo estimator for the critic, the actor, and the average reward embedded within an actor-critic (AC) algorithm. This method, which we call \textbf{M}ulti-level \textbf{A}ctor-\textbf{C}ritic (MAC), is developed especially for infinite-horizon average-reward settings and neither relies on oracle knowledge of the mixing time in its parameter selection nor assumes its exponential decay; it, therefore, is readily applicable to applications with slower mixing times. Nonetheless, it achieves a convergence rate comparable to the state-of-the-art AC algorithms. We experimentally show that these alleviated restrictions on the technical conditions required for stability translate to superior performance in practice for RL problems with sparse rewards.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv.org Artificial Intelligence

2301.12083

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.34)

Add feedback

FedBC: Calibrating Global and Local Models via Federated Learning Beyond Consensus

Bedi, Amrit Singh, Fan, Chen, Koppel, Alec, Sahu, Anit Kumar, Sadler, Brian M., Huang, Furong, Manocha, Dinesh

arXiv.org Artificial IntelligenceFeb-1-2023

In this work, we quantitatively calibrate the performance of global and local models in federated learning through a multi-criterion optimization-based framework, which we cast as a constrained program. The objective of a device is its local objective, which it seeks to minimize while satisfying nonlinear constraints that quantify the proximity between the local and the global model. By considering the Lagrangian relaxation of this problem, we develop a novel primal-dual method called Federated Learning Beyond Consensus (\texttt{FedBC}). Theoretically, we establish that \texttt{FedBC} converges to a first-order stationary point at rates that matches the state of the art, up to an additional error term that depends on a tolerance parameter introduced to scalarize the multi-criterion formulation. Finally, we demonstrate that \texttt{FedBC} balances the global and local model test accuracy metrics across a suite of datasets (Synthetic, MNIST, CIFAR-10, Shakespeare), achieving competitive performance with state-of-the-art.

artificial intelligence, fedbc, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2206.10815

Country: North America > United States > Maryland > Prince George's County (0.14)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Add feedback

Synthetic Wave-Geometric Impulse Responses for Improved Speech Dereverberation

Aralikatti, Rohith, Tang, Zhenyu, Manocha, Dinesh

arXiv.org Artificial IntelligenceDec-10-2022

We present a novel approach to improve the performance of learning-based speech dereverberation using accurate synthetic datasets. Our approach is designed to recover the reverb-free signal from a reverberant speech signal. We show that accurately simulating the low-frequency components of Room Impulse Responses (RIRs) is important to achieving good dereverberation. We use the GWA dataset that consists of synthetic RIRs generated in a hybrid fashion: an accurate wave-based solver is used to simulate the lower frequencies and geometric ray tracing methods simulate the higher frequencies. We demonstrate that speech dereverberation models trained on hybrid synthetic RIRs outperform models trained on RIRs generated by prior geometric ray tracing methods on four real-world RIR datasets.

artificial intelligence, machine learning, rir, (17 more...)

arXiv.org Artificial Intelligence

2212.0536

Country: North America > United States > Maryland (0.14)

Genre: Research Report (0.84)

Industry: Energy > Oil & Gas (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Speech > Speech Recognition (0.70)

Add feedback