AITopics | Dong, Haiwei

Collaborating Authors

Dong, Haiwei

MetaDecorator: Generating Immersive Virtual Tours through Multimodality

Xie, Shuang, Liu, Yang, Lee, Jeannie S. A., Dong, Haiwei

arXiv.org Artificial Intelligence, Jan-27-2025

MetaDecorator is a framework that empowers users to personalize virtual spaces. By leveraging text-driven prompts and image synthesis techniques, MetaDecorator adorns static panoramas captured by 360° imaging devices, transforming them into uniquely styled and visually appealing environments. This significantly enhances the realism and engagement of virtual tours compared to traditional offerings. Beyond the core framework, we also discuss the integration of Large Language Models (LLMs) and haptics in the VR application to provide a more immersive experience. The framework, shown in Figure 1, consists of two main stages.

large language model, machine learning, natural language, (18 more...)

arXiv: 2501.16164

Country: North America > Canada > Ontario > National Capital Region > Ottawa (0.15)

Genre: Research Report (0.40)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.56)

Leveraging LLMs to Create a Haptic Devices' Recommendation System

Liu, Yang, Dong, Haiwei, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, Jan-21-2025

Haptic technology has seen significant growth, yet a lack of awareness of existing haptic device design knowledge hinders development. This paper addresses these limitations by leveraging advancements in Large Language Models (LLMs) to develop a haptic agent, focusing specifically on the recommendation of Grounded Force Feedback (GFF) devices. Our approach involves automating the creation of a structured haptic device database using information from research papers and product specifications. This database enables the recommendation of relevant GFF devices based on user queries. To ensure precise and contextually relevant recommendations, the system employs a dynamic retrieval method that combines both conditional and semantic searches. Benchmarked against the established UEQ and existing haptic device search tools, the proposed haptic recommendation agent ranks in the top 10% across all UEQ categories, with mean differences favoring the agent in nearly all subscales, and shows no significant performance bias across different user groups, indicating superior usability and user satisfaction.
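
The combination of conditional and semantic search mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the device records, the field names (`dof`, `max_force_n`), the `recommend` function, and the toy 3-dimensional vectors standing in for text embeddings are all hypothetical.

```python
import math

# Hypothetical toy database: names, specs, and vectors are invented for illustration.
DEVICES = [
    {"name": "device-a", "dof": 3, "max_force_n": 3.3, "vec": [0.9, 0.1, 0.0]},
    {"name": "device-b", "dof": 6, "max_force_n": 8.0, "vec": [0.2, 0.8, 0.1]},
    {"name": "device-c", "dof": 6, "max_force_n": 12.0, "vec": [0.1, 0.2, 0.9]},
]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def recommend(query_vec, min_dof=0, min_force_n=0.0, top_k=2):
    """Filter by hard conditions first, then rank the survivors by similarity."""
    # Conditional search: keep only devices that satisfy the structured constraints.
    candidates = [
        d for d in DEVICES
        if d["dof"] >= min_dof and d["max_force_n"] >= min_force_n
    ]
    # Semantic search: order the remaining devices by similarity to the query vector.
    ranked = sorted(candidates, key=lambda d: cosine(query_vec, d["vec"]), reverse=True)
    return [d["name"] for d in ranked[:top_k]]
```

Filtering before ranking keeps hard requirements (for example, a minimum number of degrees of freedom) from being overridden by a high similarity score; a real system would obtain the query vector from an embedding model rather than by hand.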

haptic device, large language model, natural language, (19 more...)

arXiv: 2501.12573

Country: North America > Canada > Ontario (0.14)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > New Finding (0.69)
Research Report > Experimental Study (0.47)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

MADRL-Based Rate Adaptation for 360° Video Streaming with Multi-Viewpoint Prediction

Wang, Haopeng, Long, Zijian, Dong, Haiwei, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, May-17-2024

Over the last few years, 360° video traffic on the network has grown significantly. A key challenge of 360° video playback is ensuring a high quality of experience (QoE) with limited network bandwidth. Currently, most studies focus on tile-based adaptive bitrate (ABR) streaming based on single viewport prediction to reduce bandwidth consumption. However, the performance of models for single-viewpoint prediction is severely limited by the inherent uncertainty in head movement, so such models cope poorly with sudden user movements. This paper first presents a multimodal spatial-temporal attention transformer to generate multiple viewpoint trajectories with their probabilities given a historical trajectory. The proposed method models viewpoint prediction as a classification problem and uses attention mechanisms to capture the spatial and temporal characteristics of input video frames and viewpoint trajectories for multi-viewpoint prediction. After that, a multi-agent deep reinforcement learning (MADRL)-based ABR algorithm utilizing multi-viewpoint prediction for 360° video streaming is proposed for maximizing different QoE objectives under various network conditions. We formulate the ABR problem as a decentralized partially observable Markov decision process (Dec-POMDP) problem and present a MAPPO algorithm based on the centralized training and decentralized execution (CTDE) framework to solve the problem. The experimental results show that our proposed method improves the defined QoE metric by up to 85.5% compared to existing ABR methods.

artificial intelligence, machine learning, trajectory, (16 more...)

doi: 10.1109/JIOT.2024.3398548

arXiv: 2405.07759

Country:

Asia (1.00)
North America > Canada > Ontario > National Capital Region > Ottawa (0.28)
North America > Canada > Ontario > Toronto (0.28)

Genre: Research Report > New Finding (0.66)

Industry:

Leisure & Entertainment (0.93)
Information Technology > Security & Privacy (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.93)

Bringing Robots Home: The Rise of AI Robots in Consumer Electronics

Dong, Haiwei, Liu, Yang, Chu, Ted, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, Mar-21-2024

On March 18, 2024, NVIDIA unveiled Project GR00T, a general-purpose multimodal generative AI model designed specifically for training humanoid robots. Preceding this event, Tesla's unveiling of the Optimus Gen 2 humanoid robot on December 12, 2023, underscored the profound impact robotics is poised to have on reshaping various facets of our daily lives. While robots have long dominated industrial settings, their presence within our homes is a burgeoning phenomenon. This can be attributed, in part, to the complexities of domestic environments and the challenges of creating robots that can seamlessly integrate into our daily routines.

artificial intelligence, machine learning, robot, (17 more...)

doi: 10.1109/MCE.2024.3381573

arXiv: 2403.14449

Country: Asia > Japan (0.15)

Genre: Research Report (0.40)

Industry:

Banking & Finance (0.95)
Information Technology > Security & Privacy (0.70)

Technology:

Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.50)

Human-Centric Resource Allocation for the Metaverse With Multiaccess Edge Computing

Long, Zijian, Dong, Haiwei, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, Dec-23-2023

Multi-access edge computing (MEC) is a promising solution to the computation-intensive, low-latency rendering tasks of the metaverse. However, how to optimally allocate limited communication and computation resources at the edge to a large number of users in the metaverse is quite challenging. In this paper, we propose an adaptive edge resource allocation method based on multi-agent soft actor-critic with graph convolutional networks (SAC-GCN). Specifically, SAC-GCN models the multi-user metaverse environment as a graph where each agent is denoted by a node. Each agent learns the interplay between agents by graph convolutional networks with self-attention mechanism to further determine the resource usage for one user in the metaverse. The effectiveness of SAC-GCN is demonstrated through the analysis of user experience, balance of resource allocation, and resource utilization rate by taking a virtual city park metaverse as an example. Experimental results indicate that SAC-GCN outperforms other resource allocation methods in improving overall user experience, balancing resource allocation, and increasing resource utilization rate by at least 27%, 11%, and 8%, respectively.

cloud computing, machine learning, reinforcement learning, (17 more...)

doi: 10.1109/JIOT.2023.3283335

arXiv: 2312.15313

Country:

Asia (0.93)
North America > Canada > Ontario > National Capital Region > Ottawa (0.28)
North America > Canada > Ontario > Toronto (0.28)

Genre: Research Report > Promising Solution (0.34)

Industry:

Telecommunications (0.69)
Education (0.68)
Information Technology > Services (0.46)

Technology:

Information Technology > Human Computer Interaction (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Cloud Computing (1.00)
(2 more...)

A Deep Reinforcement Learning Framework for Optimizing Congestion Control in Data Centers

Ketabi, Shiva, Chen, Hongkai, Dong, Haiwei, Ganjali, Yashar

arXiv.org Artificial Intelligence, Jan-29-2023

Various congestion control protocols have been designed to achieve high performance in different network environments. Modern online learning solutions that delegate the congestion control actions to a machine cannot properly converge in the stringent time scales of data centers. We leverage multiagent reinforcement learning to design a system for dynamic tuning of congestion control parameters at end-hosts in a data center. The system includes agents at the end-hosts to monitor and report the network and traffic states, and agents to run the reinforcement learning algorithm given the states. Based on the state of the environment, the system generates congestion control parameters that optimize network performance metrics such as throughput and latency. As a case study, we examine BBR, an example of a prominent recently-developed congestion control protocol. Our experiments demonstrate that the proposed system has the potential to mitigate the problems of static parameters.

artificial intelligence, machine learning, reinforcement learning, (15 more...)

arXiv: 2301.12558

Country: North America > Canada > Ontario > Toronto (0.28)

Genre: Research Report (0.50)

Industry:

Information Technology > Services (0.83)
Education (0.75)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Development of an automatic 3D human head scanning-printing system

Zhang, Longyu, Han, Bote, Dong, Haiwei, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, Dec-25-2022

In anthropological studies, researchers have been investigating the relationship between facial shape variations and neurological and psychiatric disorders. For example, Hennesy et al. used 3D head models acquired from laser scanners to identify schizophrenia from facial dysmorphic features [3]. A fast algorithm for 3D face reconstruction with uncalibrated photometric stereo technology was also proposed by Qi et al. [4]. Human avatar animation has also become popular with the development of 3D graphics and gaming. Lee and Magnenat-Thalman introduced a method to reconstruct 3D facial models for animation from two orthogonal images (frontal and profile view) or from range data [5]. Additionally, Kan and Ferko adopted this same principle to build an automatic system that uses facial feature matching of two images and a parametrized head model to create 3D head models as avatars in 3D games [6]. An important part of a 3D human model is the head model, which can be used to establish standards for the design of products that fit onto the face or head, such as respiratory masks, glasses, helmets or other head-mounted devices [7]. An interesting initiative was the Size-China project [8,9]. To find the proper fit for Asians, who have different head shapes compared with Westerners, in facial/head products such as helmets, face masks, and caps, and to derive standards with an anthropometric database, Ball et al. created an Asian anthropometric database built from 3D scans of 2000 Asian people using a stationary head and face color 3D scanner by Cyberware

artificial intelligence, image understanding, machine learning, (18 more...)

doi: 10.1007/s11042-016-3949-2

arXiv: 2212.14773

Country:

North America > Canada (0.28)
Asia > China (0.24)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.54)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Robots > Humanoid Robots (0.63)
Information Technology > Artificial Intelligence > Vision > Image Understanding (0.48)

Development of a Self-Calibrated Motion Capture System by Nonlinear Trilateration of Multiple Kinects v2

Yang, Bowen, Dong, Haiwei, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, Dec-25-2022

In this paper, a Kinect-based distributed and real-time motion capture system is developed. A trigonometric method is applied to calculate the relative position of Kinect v2 sensors with a calibration wand and register the sensors' positions automatically. By combining results from multiple sensors with a nonlinear least-squares method, the accuracy of the motion capture is optimized. Moreover, to exclude inaccurate results from sensors, computational geometry is applied in the occlusion approach, which detects occluded joint data. The synchronization approach is based on NTP, which dynamically synchronizes the clocks of a server and its clients, ensuring that the proposed system is a real-time system. Experiments for validating the proposed system are conducted from the perspective of calibration, occlusion, accuracy, and efficiency. Furthermore, to demonstrate the practical performance of our system, previously developed motion capture systems (the linear trilateration approach and the geometric trilateration approach) are compared against the benchmark OptiTrack system, showing that the accuracy of our proposed system is 38.3% and 24.1% better than the two aforementioned trilateration systems, respectively.
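
As a generic illustration of nonlinear least-squares trilateration (not the authors' system or calibration procedure), the sketch below estimates a 3D point from its distances to known anchor positions using Gauss-Newton iterations. The function names `solve3` and `trilaterate`, the anchor layout, and the fixed iteration count are assumptions made for this example.

```python
import math


def solve3(a, b):
    """Solve the 3x3 linear system a @ x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (m[i][3] - s) / m[i][i]
    return x


def trilaterate(anchors, distances, guess, iterations=20):
    """Minimize sum((|p - anchor_i| - d_i)^2) over p with Gauss-Newton steps."""
    p = list(guess)
    for _ in range(iterations):
        jtj = [[0.0] * 3 for _ in range(3)]  # J^T J (normal-equation matrix)
        jtr = [0.0] * 3                      # J^T r (gradient direction)
        for anchor, d in zip(anchors, distances):
            diff = [p[k] - anchor[k] for k in range(3)]
            rng = math.sqrt(sum(v * v for v in diff)) or 1e-12  # avoid division by zero
            residual = rng - d
            grad = [v / rng for v in diff]  # derivative of the range with respect to p
            for i in range(3):
                jtr[i] += grad[i] * residual
                for j in range(3):
                    jtj[i][j] += grad[i] * grad[j]
        step = solve3(jtj, jtr)
        p = [p[k] - step[k] for k in range(3)]
    return p
```

The sketch assumes at least four non-coplanar anchors and a reasonable initial guess; with coplanar anchors or a poor guess the normal equations can become singular, which a production solver would handle with damping (for example, Levenberg-Marquardt).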

artificial intelligence, sensor, video understanding, (15 more...)

doi: 10.1109/JSEN.2017.2671420

arXiv: 2212.14771

Country: North America > Canada > Ontario (0.28)

Genre: Research Report (0.50)

Industry:

Health & Medicine (1.00)
Leisure & Entertainment > Games (0.46)
Media > Film (0.46)

Technology: Information Technology > Artificial Intelligence > Vision > Video Understanding (1.00)

Sitting Posture Recognition Using a Spiking Neural Network

Wang, Jianquan, Hafidh, Basim, Dong, Haiwei, Saddik, Abdulmotaleb El

arXiv.org Artificial Intelligence, Dec-25-2022

To increase the quality of citizens' lives, we designed [...] They use a recurrent network structure so that the intermediate [...] That is, the membrane potential of the neurons is related to the quantity, frequency and interval of the input spikes and not [...] In this paper, a spiking neural network is constructed in the form of a liquid state machine. The purpose of this work was to design, implement, and validate a sensing chair system for computer-human interactions with the spiking neural network. [...] improve the quality of life of citizens in smart cities. One of the visions towards smart cities is digital twins [1], which are a replica of any living or nonliving entity. [...] Burden of Disease (GBD) study [2], increasingly more people are suffering from lower back pain among other conditions due to inappropriate sitting behaviors. To improve quality of life, it is essential to design personalized sensing systems [...]

artificial intelligence, machine learning, neuron, (20 more...)

doi: 10.1109/JSEN.2020.3016611

arXiv: 2212.12908

Country:

Asia (1.00)
North America > Canada > Ontario (0.68)

Genre: Research Report (0.83)

Industry:

Health & Medicine > Therapeutic Area > Musculoskeletal (0.88)
Health & Medicine > Therapeutic Area > Neurology (0.86)

Technology:

Information Technology > Internet of Things (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
