Human-centered trust framework: An HCI perspective

arXiv.org Artificial Intelligence

The rationale of this work is based on the current user trust discourse of Artificial Intelligence (AI). We aim to produce novel HCI approaches that use trust as a facilitator for the uptake (or appropriation) of current technologies. We propose a framework (HCTFrame) to guide non-experts to unlock the full potential of user trust in AI design. Results derived from a data triangulation of findings from three literature reviews demystify some misconceptions of user trust in computer science and AI discourse, and three case studies are conducted to assess the effectiveness of a psychometric scale in mapping potential users' trust breakdowns and concerns. This work primarily contributes to the fight against the tendency to design technical-centered vulnerable interactions, which can eventually lead to additional real and perceived breaches of trust. The proposed framework can be used to guide system designers on how to map and define user trust and the socioethical and organisational needs and characteristics of AI system design. It can also guide AI system designers on how to develop a prototype and operationalise a solution that meets user trust requirements. The article ends by providing some user research tools that can be employed to measure users' trust intentions and behaviours towards a proposed solution.


Analyzing Deep Learning Representations of Point Clouds for Real-Time In-Vehicle LiDAR Perception

arXiv.org Artificial Intelligence

LiDAR sensors are an integral part of modern autonomous vehicles as they provide an accurate, high-resolution 3D representation of the vehicle's surroundings. However, it is computationally difficult to make use of the ever-increasing amounts of data from multiple high-resolution LiDAR sensors. As frame-rates, point cloud sizes and sensor resolutions increase, real-time processing of these point clouds must still extract semantics from this increasingly precise picture of the vehicle's environment. One deciding factor of the run-time performance and accuracy of deep neural networks operating on these point clouds is the underlying data representation and the way it is computed. In this work, we examine the relationship between the computational representations used in neural networks and their performance characteristics. To this end, we propose a novel computational taxonomy of LiDAR point cloud representations used in modern deep neural networks for 3D point cloud processing. Using this taxonomy, we perform a structured analysis of different families of approaches. Thereby, we uncover common advantages and limitations in terms of computational efficiency, memory requirements, and representational capacity as measured by semantic segmentation performance. Finally, we provide some insights and guidance for future developments in neural point cloud processing methods.


Interactive Fashion Content Generation Using LLMs and Latent Diffusion Models

arXiv.org Artificial Intelligence

Fashionable image generation aims to synthesize images of the diverse fashions prevalent around the globe. It helps fashion designers with real-time visualization by giving them a basic customized view of how a specific design preference would look in real life and what further improvements could raise customer satisfaction. Moreover, users can interact with the system on their own and generate fashionable images from a few simple prompts. Recently, diffusion models have gained popularity as generative models owing to their flexibility and their ability to generate realistic images from Gaussian noise. Latent diffusion models are a type of generative model that uses diffusion processes to model the generation of complex data, such as images, audio, or text. They are called "latent" because they learn a hidden representation, or latent variable, of the data that captures its underlying structure. We propose a method that exploits the equivalence between diffusion models and energy-based models (EBMs) and suggests ways to compose multiple probability distributions. We describe a pipeline showing how our method can be used specifically for new fashionable outfit generation and virtual try-on using LLM-guided text-to-image generation. Our results indicate that using an LLM to refine the prompts to the latent diffusion model assists in generating globally creative and culturally diversified fashion styles and in reducing bias.
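
For readers unfamiliar with the composition idea mentioned above, the following is the standard energy-based identity for combining distributions, given as general background and not as the authors' exact formulation. The symbols $p_i$, $E_i$, and $p_{\text{prod}}$ are our own notation and do not come from the abstract. Multiplying densities corresponds to adding energies, and therefore to adding score functions, which is what a diffusion sampler needs:

```latex
p_i(x) \propto \exp\bigl(-E_i(x)\bigr), \qquad
p_{\text{prod}}(x) \propto \prod_{i=1}^{n} p_i(x)
  \propto \exp\Bigl(-\sum_{i=1}^{n} E_i(x)\Bigr), \qquad
\nabla_x \log p_{\text{prod}}(x) = \sum_{i=1}^{n} \nabla_x \log p_i(x).
```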


Integrating Generative Artificial Intelligence in Intelligent Vehicle Systems

arXiv.org Artificial Intelligence

This paper aims to serve as a comprehensive guide for researchers and practitioners, offering insights into the current state, potential applications, and future research directions for generative artificial intelligence and foundation models within the context of intelligent vehicles. As the automotive industry progressively integrates AI, generative artificial intelligence technologies hold the potential to revolutionize user interactions, delivering more immersive, intuitive, and personalised in-car experiences. We provide an overview of current applications of generative artificial intelligence in the automotive domain, emphasizing speech, audio, vision, and multimodal interactions. We subsequently outline critical future research areas, including domain adaptability, alignment, and multimodal integration, among others, and address the challenges and risks associated with ethics. By fostering collaboration and addressing these research areas, generative artificial intelligence can unlock its full potential, transforming the driving experience and shaping the future of intelligent vehicles.


Identification of the Factors Affecting the Reduction of Energy Consumption and Cost in Buildings Using Data Mining Techniques

arXiv.org Artificial Intelligence

Optimizing energy consumption and coordinating utility systems have long been concerns of the building industry. Buildings are among the largest energy consumers in the world, making their energy efficiency crucial for preventing waste and reducing costs. Additionally, buildings generate substantial amounts of raw data, which can be used to understand energy consumption patterns and assist in developing optimization strategies. Using a real-world dataset, this research aims to identify the factors that influence building cost reduction and energy consumption. To achieve this, we utilize three regression models (Lasso Regression, Decision Tree, and Random Forest) to predict primary fuel usage, electrical energy consumption, and cost savings in buildings. An analysis of the factors influencing energy consumption and cost reduction is conducted, and the decision tree algorithm is optimized using metaheuristics. By employing metaheuristic techniques, we fine-tune the decision tree algorithm's parameters and improve its accuracy. Finally, we review the most practical features of potential and nonpotential buildings that can reduce primary fuel usage, electrical energy consumption, and costs.


HiPool: Modeling Long Documents Using Graph Neural Networks

arXiv.org Artificial Intelligence

Encoding long sequences in Natural Language Processing (NLP) is a challenging problem. Though recent pretrained language models achieve satisfactory performance in many NLP tasks, they are still restricted by a pre-defined maximum length, making them difficult to extend to longer sequences. Some recent works therefore utilize hierarchies to model long sequences. However, most of them apply sequential models for the upper hierarchies and thus suffer from long-dependency issues. In this paper, we alleviate these issues through a graph-based method. We first chunk the sequence with a fixed length to model the sentence-level information. We then leverage graphs to model intra- and cross-sentence correlations with a new attention mechanism. Additionally, due to the limited number of standard benchmarks for long document classification (LDC), we propose a new challenging benchmark, totaling six datasets with up to 53k samples and an average length of up to 4034 tokens. Evaluation shows our model surpasses competitive baselines by 2.6% in F1 score, and by 4.8% on the longest sequence dataset. Our method is shown to outperform hierarchical sequential models in both performance and scalability, especially for longer sequences.
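
The fixed-length chunking step described above can be sketched as follows. This is a minimal illustration under our own assumptions, not the authors' implementation: the function name `chunk_tokens`, the whitespace tokenization, and the choice to keep a shorter final chunk are all hypothetical.

```python
from typing import List


def chunk_tokens(tokens: List[str], chunk_len: int) -> List[List[str]]:
    """Split a token sequence into consecutive fixed-length chunks.

    Hypothetical helper: the final chunk is kept even when it is shorter
    than chunk_len (the abstract does not say how the remainder is handled).
    """
    if chunk_len <= 0:
        raise ValueError("chunk_len must be positive")
    return [tokens[i:i + chunk_len] for i in range(0, len(tokens), chunk_len)]


if __name__ == "__main__":
    # Whitespace splitting stands in for a real subword tokenizer.
    doc = "a long document is split into short pieces".split()
    for chunk in chunk_tokens(doc, 3):
        print(chunk)
```

Each chunk would then be encoded separately and treated as one low-level unit, with a higher-level model (a graph, in the paper's case) relating the chunks to one another.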


Hyper-automation-The next peripheral for automation in IT industries

arXiv.org Artificial Intelligence

The extension of legacy business process automation beyond the bounds of specific processes is known as hyperautomation. Hyperautomation provides automation for nearly any repetitive action performed by business users by combining AI tools with RPA. It automates complex IT business processes that a company's top brains might not be able to complete. This is an end-to-end automation of a standard business process deployment. It enables automation to perform task digitalization by combining a brain computer interface (BCI) with AI and RPA automation tools. BCI, in conjunction with automation tools, will advance the detection and generation of automation processes to the next level. It allows enterprises to combine business intelligence systems, address complex requirements, and enhance human expertise and automation experience. Hyperautomation and its importance in today's environment are briefly discussed in this paper. The article then goes on to discuss how BCI and sensors might aid hyperautomation. The specific sectors of solicitations were examined using a variety of flexible technologies associated with this concept, as well as dedicated workflow techniques, which are also diagrammatically illustrated. Hyperautomation is being utilized to dramatically improve the efficiency, accuracy, and human enhancement of automated tasks. It incorporates a number of automated tools in its discovery, implementation, and automation phases. As a result, it is well-suited to integrating cutting-edge technologies and experimenting with new methods of working. Keywords: Hyperautomation, Brain computer Interface (BCI), Technology, Used case, Sensors, Industries.


Quantum Operation of Affective Artificial Intelligence

arXiv.org Artificial Intelligence

The review analyzes the fundamental principles on which Artificial Intelligence should be based in order to imitate the realistic process of decision making by humans experiencing emotions. Two approaches are compared, one based on quantum theory and the other employing classical terms. The two approaches have a number of similarities, both being principally probabilistic. The analogies between quantum measurements under intrinsic noise and affective decision making are elucidated. It is shown that cognitive processes have many features that are formally similar to quantum measurements. This, however, in no way means that, to imitate human decision making, Affective Artificial Intelligence necessarily has to rely on the functioning of quantum systems. Appreciating the common features between quantum measurements and decision making helps in formulating an axiomatic approach employing only classical notions. Artificial Intelligence, following this approach, operates similarly to humans, by taking into account the utility of the considered alternatives as well as their emotional attractiveness. Affective Artificial Intelligence, whose operation takes account of the cognition-emotion duality, avoids numerous behavioural paradoxes of traditional decision making. A society of intelligent agents, interacting through the repeated multistep exchange of information, forms a network accomplishing dynamic decision making. The considered intelligent networks can characterize the operation of either a human society of affective decision makers, or the brain composed of neurons, or a typical probabilistic network of an artificial intelligence.


Make Prompt-based Black-Box Tuning Colorful: Boosting Model Generalization from Three Orthogonal Perspectives

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown increasing power on various natural language processing (NLP) tasks. However, tuning these models for downstream tasks usually incurs exorbitant costs or is unavailable due to commercial considerations. Recently, black-box tuning has been proposed to address this problem by optimizing task-specific prompts without accessing the gradients and hidden representations. However, most existing works have yet to fully exploit the potential of gradient-free optimization in the few-shot learning scenario. In this paper, we describe BBT-RGB, a suite of straightforward and complementary techniques for enhancing the efficiency and performance of black-box optimization. Specifically, our method includes three plug-and-play components: (1) a two-stage derivative-free optimization strategy that facilitates fast convergence and mitigates overfitting; (2) automatic verbalizer construction with its novel usage under few-shot settings; (3) a better prompt initialization policy based on instruction search and auto-selected demonstration. Extensive experiments across various tasks on natural language understanding and inference demonstrate the effectiveness of our method. Our code is publicly available at https://github.com/QiushiSun/BBT-RGB.
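
To make "optimizing prompts without accessing the gradients" concrete, here is a minimal sketch of derivative-free optimization over a continuous prompt vector. It is not the BBT-RGB method: a simple (1+1) evolution strategy is swapped in for illustration, and `one_plus_one_es`, `toy_black_box_loss`, and the target vector are hypothetical stand-ins for a real model that returns only a loss value.

```python
import random
from typing import Callable, List, Tuple


def one_plus_one_es(
    loss: Callable[[List[float]], float],
    dim: int,
    steps: int = 500,
    sigma: float = 0.5,
    seed: int = 0,
) -> Tuple[List[float], float]:
    """Minimise `loss` using only its function values (no gradients)."""
    rng = random.Random(seed)
    best = [0.0] * dim
    best_loss = loss(best)
    for _ in range(steps):
        # Propose a Gaussian perturbation of the current best vector.
        candidate = [x + rng.gauss(0.0, sigma) for x in best]
        cand_loss = loss(candidate)
        if cand_loss <= best_loss:
            # Keep the improvement and widen the search a little.
            best, best_loss = candidate, cand_loss
            sigma *= 1.1
        else:
            # Shrink the step size after a failed proposal.
            sigma *= 0.97
    return best, best_loss


def toy_black_box_loss(prompt: List[float]) -> float:
    """Hypothetical stand-in for querying a model API with a soft prompt."""
    target = [1.0, -2.0, 0.5]  # assumed optimum, for illustration only
    return sum((p - t) ** 2 for p, t in zip(prompt, target))


if __name__ == "__main__":
    vec, val = one_plus_one_es(toy_black_box_loss, dim=3)
    print(vec, val)
```

The optimizer only ever calls `loss`, so it works equally well when the model sits behind an inference API; the cost is that many more queries are needed than with gradient-based tuning.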


Delay-Adapted Policy Optimization and Improved Regret for Adversarial MDP with Delayed Bandit Feedback

arXiv.org Artificial Intelligence

Policy Optimization (PO) is one of the most popular methods in Reinforcement Learning (RL). Thus, theoretical guarantees for PO algorithms have become especially important to the RL community. In this paper, we study PO in adversarial MDPs with a challenge that arises in almost every real-world application: delayed bandit feedback. We give the first near-optimal regret bounds for PO in tabular MDPs, bounds that may even surpass the state of the art (which uses less efficient methods). Our novel Delay-Adapted PO (DAPO) is easy to implement and to generalize, allowing us to extend our algorithm to: (i) infinite state spaces under the assumption of a linear Q-function, proving the first regret bounds for delayed feedback with function approximation; (ii) deep RL, demonstrating its effectiveness in experiments on MuJoCo domains.