
Collaborating Authors

 Bera, Aniket


SG-LSTM: Social Group LSTM for Robot Navigation Through Dense Crowds

arXiv.org Artificial Intelligence

With the increasing availability and affordability of personal robots, they will no longer be confined to large corporate warehouses or factories but will instead be expected to operate in less controlled environments alongside larger groups of people. In addition to ensuring safety and efficiency, it is crucial to minimize any negative psychological impact robots may have on humans and to follow unwritten social norms in these situations. Our research aims to develop a model that can predict the movements of pedestrians and perceptually identified social groups in crowded environments. We introduce a new Social Group Long Short-term Memory (SG-LSTM) model that models human groups and interactions in dense environments using a socially-aware LSTM to produce more accurate trajectory predictions. Our approach enables navigation algorithms to compute collision-free paths faster and more accurately in crowded environments. We also release a large video dataset with labeled pedestrian groups for the broader social navigation community. We show comparisons across different metrics on several datasets (ETH, Hotel, MOT15) and prediction approaches (LIN, LSTM, O-LSTM, S-LSTM), as well as runtime performance.
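
To illustrate the general social-pooling idea behind this family of models, the sketch below shows a single prediction step in which each pedestrian's LSTM state is combined with the averaged hidden states of its group members. This is a minimal PyTorch illustration with assumed layer sizes and a hypothetical SocialGroupLSTMStep module; it is not the authors' exact SG-LSTM architecture.

# Minimal sketch of a socially-aware LSTM step with group pooling
# (assumption: illustrative only, not the paper's SG-LSTM).
import torch
import torch.nn as nn

class SocialGroupLSTMStep(nn.Module):
    def __init__(self, hidden=64):
        super().__init__()
        self.embed = nn.Linear(2, hidden)       # embed (x, y) position
        self.pool = nn.Linear(hidden, hidden)   # embed pooled group context
        self.cell = nn.LSTMCell(2 * hidden, hidden)
        self.out = nn.Linear(hidden, 2)         # predict next (x, y)

    def forward(self, pos, h, c, group_mask):
        # pos: (N, 2) pedestrian positions; h, c: (N, hidden) LSTM states
        # group_mask: (N, N) boolean, True where pedestrian j is in i's group
        e = torch.relu(self.embed(pos))
        weights = group_mask.float()
        weights = weights / weights.sum(dim=1, keepdim=True).clamp(min=1.0)
        context = torch.relu(self.pool(weights @ h))   # mean over group members
        h, c = self.cell(torch.cat([e, context], dim=1), (h, c))
        return self.out(h), h, c

In practice, the group mask would come from whatever group-detection step the tracker provides, and the step would be unrolled over the observed trajectory before predicting future positions.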


ARC: Alignment-based Redirection Controller for Redirected Walking in Complex Environments

arXiv.org Artificial Intelligence

We present a novel redirected walking controller based on alignment that allows the user to explore large and complex virtual environments, while minimizing the number of collisions with obstacles in the physical environment. Our alignment-based redirection controller, ARC, steers the user such that their proximity to obstacles in the physical environment matches the proximity to obstacles in the virtual environment as closely as possible. To quantify a controller's performance in complex environments, we introduce a new metric, Complexity Ratio (CR), to measure the relative environment complexity and characterize the difference in navigational complexity between the physical and virtual environments. Through extensive simulation-based experiments, we show that ARC significantly outperforms current state-of-the-art controllers in its ability to steer the user on a collision-free path. We also show through quantitative and qualitative measures of performance that our controller is robust in complex environments with many obstacles. Our method is applicable to arbitrary environments and operates without any user input or parameter tweaking, aside from the layout of the environments. We have implemented our algorithm on the Oculus Quest head-mounted display and evaluated its performance in environments with varying complexity. Our project website is available at https://gamma.umd.edu/arc/.
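
As a rough illustration of the alignment idea, the sketch below derives a redirection gain from the mismatch between the user's distance to the nearest physical obstacle and their distance to the nearest virtual obstacle. The function names, gain cap, and clipping are assumptions for illustration, not ARC's actual gain schedule.

# Minimal 2D sketch of alignment-based steering (assumption: simplified
# proximity matching, not the exact ARC controller).
import numpy as np

def nearest_obstacle_distance(pos, obstacles):
    # obstacles: (M, 2) array of obstacle points; pos: (2,) user position
    return np.min(np.linalg.norm(obstacles - pos, axis=1))

def redirection_gain(phys_pos, virt_pos, phys_obs, virt_obs, max_gain=0.15):
    # Positive misalignment: the user is closer to physical obstacles than to
    # virtual ones, so inject rotation to steer them toward open physical space.
    d_phys = nearest_obstacle_distance(phys_pos, phys_obs)
    d_virt = nearest_obstacle_distance(virt_pos, virt_obs)
    misalignment = d_virt - d_phys
    # Scale the injected rotation gain by how misaligned the two spaces are.
    return np.clip(misalignment, -1.0, 1.0) * max_gain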


SeekNet: Improved Human Instance Segmentation and Tracking via Reinforcement Learning Based Optimized Robot Relocation

arXiv.org Artificial Intelligence

Amodal recognition is the ability of a system to detect occluded objects. Most state-of-the-art visual recognition systems lack this ability. A few studies have achieved amodal recognition through passive prediction or embodied recognition approaches; however, these approaches face challenges in real-world applications, such as dynamic obstacles. We propose SeekNet, an improved optimization method for amodal recognition through embodied visual recognition. We also implement SeekNet for social robots, which have many interactions with pedestrians in crowds. We demonstrate the benefits of our algorithm for occluded human detection and tracking over other baselines. Additionally, we set up a multi-robot environment with SeekNet to identify and track visual markers of airborne disease in crowded areas. We conduct our experiments in a simulated indoor environment and show that our method enhances the overall accuracy of the amodal recognition task and achieves the largest improvement in detection accuracy over time in comparison to the baseline approaches.


EWareNet: Emotion Aware Human Intent Prediction and Adaptive Spatial Profile Fusion for Social Robot Navigation

arXiv.org Artificial Intelligence

We present EWareNet, a novel intent- and affect-aware algorithm for social robot navigation among pedestrians. Our approach predicts trajectory-based pedestrian intent from gait sequences, which is then used for intent-guided navigation that takes social and proxemic constraints into account. We propose a transformer-based model that works on commodity RGB-D cameras mounted on a moving robot. Our intent prediction routine is integrated into a mapless navigation scheme and makes no assumptions about the environment or pedestrian motion. Our navigation scheme consists of a novel obstacle profile representation that is dynamically adjusted based on pedestrian pose, intent, and affect. The navigation scheme is based on a reinforcement learning algorithm that takes pedestrian intent and the robot's impact on pedestrian intent into consideration, in addition to the environmental configuration. We outperform current state-of-the-art algorithms for intent prediction from 3D gaits.
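
The sketch below illustrates one plausible way an obstacle (proxemic) profile can be adjusted by pedestrian pose, speed, and affect: an anisotropic Gaussian cost whose extent in front of the pedestrian grows with predicted speed and arousal. The parameter values and functional form are assumptions for illustration, not the paper's actual profile representation.

# Minimal sketch of an adaptive pedestrian spatial profile (assumption:
# a simple anisotropic Gaussian cost, not EWareNet's learned profile).
import numpy as np

def pedestrian_cost(query, ped_pos, heading, speed, arousal):
    # query, ped_pos: (2,) positions; heading: unit (2,) direction of motion
    # Stretch the comfort zone ahead of the pedestrian when they move fast
    # or appear agitated (high arousal in [0, 1]).
    sigma_front = 0.5 + 0.8 * speed + 0.4 * arousal
    sigma_side = 0.4 + 0.2 * arousal
    d = query - ped_pos
    along = d @ heading
    side = d @ np.array([-heading[1], heading[0]])
    along = along / (sigma_front if along > 0 else 0.5)  # smaller zone behind
    return np.exp(-0.5 * (along**2 + (side / sigma_side)**2))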


AZTR: Aerial Video Action Recognition with Auto Zoom and Temporal Reasoning

arXiv.org Artificial Intelligence

We propose a novel approach for aerial video action recognition. Our method is designed for videos captured using UAVs and can run on edge or mobile devices. We present a learning-based approach that uses customized auto zoom to automatically identify the human target and scale it appropriately. This makes it easier to extract the key features and reduces the computational overhead. We also present an efficient temporal reasoning algorithm to capture the action information along the spatial and temporal domains within a controllable computational cost. Our approach has been implemented and evaluated both on desktops with high-end GPUs and on the low-power Robotics RB5 Platform for robots and drones. In practice, we achieve a 6.1-7.4% improvement over SOTA in Top-1 accuracy on the RoCoG-v2 dataset, an 8.3-10.4% improvement on the UAV-Human dataset, and a 3.2% improvement on the Drone Action dataset.
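
A minimal sketch of the auto-zoom step, assuming a person bounding box is already available from a detector: the crop is expanded by a margin, squared, and rescaled to the recognizer's input resolution. The margin, output size, and nearest-neighbour resize are illustrative choices, not the learned zoom policy described in the paper.

# Minimal sketch of cropping and rescaling around a detected person
# (assumption: illustrative, dependency-free resize).
import numpy as np

def auto_zoom(frame, box, out_size=224, margin=0.2):
    # frame: (H, W, 3) image array; box: (x1, y1, x2, y2) person detection
    h, w = frame.shape[:2]
    x1, y1, x2, y2 = box
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    side = max(x2 - x1, y2 - y1) * (1 + margin)
    x1 = int(max(cx - side / 2, 0)); x2 = int(min(cx + side / 2, w))
    y1 = int(max(cy - side / 2, 0)); y2 = int(min(cy + side / 2, h))
    crop = frame[y1:y2, x1:x2]
    # nearest-neighbour resize to keep the sketch dependency-free
    ys = np.linspace(0, crop.shape[0] - 1, out_size).astype(int)
    xs = np.linspace(0, crop.shape[1] - 1, out_size).astype(int)
    return crop[ys][:, xs]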


PaCMO: Partner Dependent Human Motion Generation in Dyadic Human Activity using Neural Operators

arXiv.org Artificial Intelligence

We address the problem of generating 3D human motions in dyadic activities. In contrast to concurrent work, which mainly focuses on generating the motion of a single actor from a textual description, we generate the motion of one of the actors from the motion of the other participating actor. This is a particularly challenging, under-explored problem that requires learning intricate relationships between the motions of two actors participating in an action, as well as identifying the action from the motion of one actor. To address this, we propose the Partner Conditioned Motion Operator (PaCMO), a neural operator-based generative model that learns the distribution of human motion conditioned on the partner's motion in function spaces through adversarial training. Our model can handle long unlabeled action sequences at arbitrary time resolution. We also introduce the "Functional Frechet Inception Distance" ($F^2ID$) metric for capturing the similarity between real and generated data in function spaces. We test PaCMO on the NTU RGB+D and DuetDance datasets, and our model produces realistic results, as evidenced by the $F^2ID$ score and a user study.
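
For intuition, the sketch below computes a Frechet-style distance between two sets of embedded motion sequences using the standard FID formula (mean and covariance matching). How $F^2ID$ embeds functions and which feature extractor it uses are specific to the paper, so treat this as an assumed, simplified analogue.

# Minimal sketch of a Frechet distance between real and generated embeddings
# (assumption: the standard FID formula, not the exact F^2ID construction).
import numpy as np
from scipy.linalg import sqrtm

def frechet_distance(feats_real, feats_gen):
    # feats_*: (N, D) arrays of embedded motion sequences
    mu_r, mu_g = feats_real.mean(0), feats_gen.mean(0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_g = np.cov(feats_gen, rowvar=False)
    covmean = sqrtm(cov_r @ cov_g)
    if np.iscomplexobj(covmean):
        covmean = covmean.real          # drop numerical imaginary residue
    return float(np.sum((mu_r - mu_g) ** 2)
                 + np.trace(cov_r + cov_g - 2 * covmean))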


Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

arXiv.org Artificial Intelligence

We present Affect2MM, a learning method for time-series emotion prediction for multimedia content. Our goal is to automatically capture the varying emotions depicted by characters in real-life, human-centric situations and behaviors. We use ideas from emotion causation theories to computationally model and determine the emotional state evoked in movie clips. Affect2MM explicitly models temporal causality using attention-based methods and Granger causality. We use a variety of components, such as the facial features of the actors involved, scene understanding, visual aesthetics, action/situation descriptions, and the movie script, to obtain an affect-rich representation for understanding and perceiving the scene. We use an LSTM-based learning model for emotion perception. To evaluate our method, we analyze and compare our performance on three datasets, SENDv1, MovieGraphs, and LIRIS-ACCEDE, and observe an average 10-15% improvement over SOTA methods on all three datasets.
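
As a small, self-contained illustration of Granger causality between two per-clip feature streams, the sketch below runs the standard test from statsmodels on synthetic data; Affect2MM's attention-based causality modeling is not shown here, and the signal names are hypothetical.

# Minimal sketch of a Granger-causality check between two feature streams
# (assumption: illustrative synthetic signals, standard statsmodels test).
import numpy as np
from statsmodels.tsa.stattools import grangercausalitytests

rng = np.random.default_rng(0)
scene = rng.normal(size=200)                               # e.g. scene-feature signal
emotion = np.roll(scene, 2) + 0.1 * rng.normal(size=200)   # lagged emotional response
# Column order is [effect, candidate cause]: the test asks whether the second
# column (scene) Granger-causes the first (emotion) up to the given lag.
results = grangercausalitytests(np.column_stack([emotion, scene]), maxlag=4)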


Can a Robot Trust You? A DRL-Based Approach to Trust-Driven Human-Guided Navigation

arXiv.org Artificial Intelligence

Humans are known to construct cognitive maps of their everyday surroundings using a variety of perceptual inputs. When a human is asked for directions to a particular location, their wayfinding ability is challenged as they convert this cognitive map into directional instructions. Owing to spatial anxiety, the language used in spoken instructions can be vague and often unclear. To account for this unreliability in navigational guidance, we propose a novel Deep Reinforcement Learning (DRL) based trust-driven robot navigation algorithm that learns humans' trustworthiness in order to perform a language-guided navigation task. Our approach seeks to answer the question of whether a robot can trust a human's navigational guidance. To this end, we train a policy that learns to navigate towards a goal location using only trustworthy human guidance, driven by its own robot trust metric. We quantify various affective features from the language-based instructions and incorporate them into the policy's observation space in the form of a human trust metric. We combine both trust metrics in a cognitive reasoning scheme that decides when, and when not, to trust the given guidance. Our results show that the learned policy can navigate the environment in an optimal, time-efficient manner, as opposed to an explorative approach that performs the same task. We showcase the efficacy of our approach both in simulation and in a real-world environment.
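
The sketch below shows one simple way a scalar trust estimate could be updated from guidance outcomes and language vagueness and then appended to the navigation observation. The feature names, update rule, and learning rate are assumptions for illustration and do not reproduce the paper's cognitive reasoning scheme.

# Minimal sketch of maintaining a trust scalar and exposing it to the policy
# (assumption: hypothetical running-average update, not the paper's method).
import numpy as np

def update_trust(trust, guidance_was_correct, vagueness, lr=0.1):
    # Reward accurate guidance, penalize vague or misleading instructions.
    target = 1.0 if guidance_was_correct else 0.0
    target *= (1.0 - vagueness)        # vagueness in [0, 1] from language cues
    return (1 - lr) * trust + lr * target

def build_observation(lidar_scan, goal_vec, trust):
    # The trust scalar rides alongside the usual navigation observation.
    return np.concatenate([lidar_scan, goal_vec, [trust]])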


FVA: Modeling Perceived Friendliness of Virtual Agents Using Movement Characteristics

arXiv.org Artificial Intelligence

We present a new approach for improving the friendliness and warmth of a virtual agent in an AR environment by generating appropriate movement characteristics. Our algorithm is based on a novel data-driven friendliness model that is computed using a user study and psychological characteristics. We use our model to control the movements corresponding to the gaits, gestures, and gazing of friendly virtual agents (FVAs) as they interact with the user's avatar and other agents in the environment. We have integrated FVAs into an AR environment using a Microsoft HoloLens. Our algorithm can generate plausible movements at interactive rates to increase social presence. We also investigate user perception in an AR setting and observe that an FVA yields a statistically significant improvement in perceived friendliness and social presence compared to an agent without the friendliness modeling. We observe an increase of 5.71% in the mean responses to a friendliness measure and an improvement of 4.03% in the mean responses to a social presence measure.


MixedPeds: Pedestrian Detection in Unannotated Videos Using Synthetically Generated Human-Agents for Training

AAAI Conferences

We present a new method for training pedestrian detectors on an unannotated set of images. We produce a mixed reality dataset that is composed of real-world background images and synthetically generated static human-agents. Our approach is general, robust, and makes few assumptions about the unannotated dataset. We automatically extract from the dataset: i) the vanishing point, to calibrate the virtual camera, and ii) the pedestrians' scales, to generate a Spawn Probability Map, a novel concept that guides our algorithm to place the pedestrians at appropriate locations. After placing synthetic human-agents in the unannotated images, we use these augmented images, together with the annotations generated alongside the synthetic agents, to train a pedestrian detector. We conducted our experiments using Faster R-CNN, comparing detection results on the unannotated dataset between the detector trained with our approach and detectors trained on other manually labeled datasets. We show that our approach improves average precision by 5-13% over these detectors.
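
To make the Spawn Probability Map concrete, the sketch below builds a coarse ground-plane histogram from plausible foot positions, adds a small floor probability so unseen cells remain possible, and samples spawn locations from it. The cell size, floor constant, and the use of raw foot points (rather than the paper's vanishing-point and scale estimates) are assumptions for illustration.

# Minimal sketch of a spawn-probability grid and sampling from it
# (assumption: simplified histogram-based stand-in for the paper's map).
import numpy as np

def build_spawn_map(foot_points, img_shape, cell=16):
    # foot_points: (N, 2) pixel (x, y) locations where pedestrians could stand
    h, w = img_shape[0] // cell, img_shape[1] // cell
    grid = np.zeros((h, w))
    for x, y in foot_points:
        grid[min(int(y) // cell, h - 1), min(int(x) // cell, w - 1)] += 1
    grid = grid + 1e-3             # small floor so unseen cells stay possible
    return grid / grid.sum()

def sample_spawn(grid, cell=16, rng=None):
    rng = rng if rng is not None else np.random.default_rng()
    idx = rng.choice(grid.size, p=grid.ravel())
    gy, gx = divmod(idx, grid.shape[1])
    return gx * cell + cell // 2, gy * cell + cell // 2   # pixel (x, y) placement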