Overview
Concept2Box: Joint Geometric Embeddings for Learning Two-View Knowledge Graphs
Huang, Zijie, Wang, Daheng, Huang, Binxuan, Zhang, Chenwei, Shang, Jingbo, Liang, Yan, Wang, Zhengyang, Li, Xian, Faloutsos, Christos, Sun, Yizhou, Wang, Wei
Knowledge graph embeddings (KGE) have been extensively studied to embed large-scale relational data for many real-world applications. Existing methods have long ignored the fact that many KGs contain two fundamentally different views: high-level ontology-view concepts and fine-grained instance-view entities. They usually embed all nodes as vectors in one latent space. However, a single geometric representation fails to capture the structural differences between the two views and lacks probabilistic semantics for concepts' granularity. We propose Concept2Box, a novel approach that jointly embeds the two views of a KG using dual geometric representations. We model concepts with box embeddings, which learn the hierarchy structure and complex relations among them, such as overlap and disjointness. Box volumes can be interpreted as concepts' granularity. In contrast to concepts, we model entities as vectors. To bridge the gap between concept box embeddings and entity vector embeddings, we propose a novel vector-to-box distance metric and learn both embeddings jointly. Experiments on both the public DBpedia KG and a newly created industrial KG show the effectiveness of Concept2Box.
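To make the idea of a vector-to-box distance concrete, here is a minimal sketch in plain Python. The abstract does not spell out the metric, so the form below (an outside term plus a down-weighted inside term, in the spirit of earlier box-embedding work) is an assumption, as are the names vector_to_box_distance, box_volume, and inside_weight.

```python
def vector_to_box_distance(point, box_min, box_max, inside_weight=0.2):
    """Hypothetical distance from an entity vector to an axis-aligned concept box.

    This is an assumed form, not the metric defined in the paper.
    """
    outside = 0.0
    inside = 0.0
    for x, lo, hi in zip(point, box_min, box_max):
        # Outside term: how far the point lies beyond the box faces (0 if inside).
        outside += max(x - hi, 0.0) + max(lo - x, 0.0)
        # Inside term: distance from the box center to the point clipped into the box.
        clipped = min(hi, max(lo, x))
        inside += abs((lo + hi) / 2.0 - clipped)
    return outside + inside_weight * inside


def box_volume(box_min, box_max):
    """Volume of the box; the abstract reads this as the concept's granularity."""
    volume = 1.0
    for lo, hi in zip(box_min, box_max):
        volume *= max(hi - lo, 0.0)
    return volume


# An entity at the center of a unit box is at distance zero from that concept.
center_distance = vector_to_box_distance([0.5, 0.5], [0.0, 0.0], [1.0, 1.0])
# An entity outside the box is penalized mainly by the outside term.
far_distance = vector_to_box_distance([2.0, 0.5], [0.0, 0.0], [1.0, 1.0])
```

Under this assumed form, a point inside the box still pays a small cost for sitting away from the center, which keeps the distance informative for entities that already belong to the concept.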
Knowledge Graph for NLG in the context of conversational agents
Ghanem, Hussam, Atmani, Massinissa, Cruz, Christophe
The use of knowledge graphs (KGs) enhances the accuracy and comprehensiveness of the responses provided by a conversational agent. Generating answers during conversations amounts to generating text from these KGs, which is still regarded as a challenging task and has gained significant attention in recent years. In this document, we provide a review of different architectures used for knowledge graph-to-text generation, including Graph Neural Networks, the Graph Transformer, and linearization with seq2seq models. We discuss the advantages and limitations of each architecture and conclude that the choice of architecture will depend on the specific requirements of the task at hand. We also highlight the importance of considering constraints such as execution time and model validity, particularly in the context of conversational agents. Based on these constraints and the availability of labeled data for the domains of DAVI, we choose to use seq2seq Transformer-based models (PLMs) for the knowledge graph-to-text generation task. In future work, we aim to refine benchmark datasets for KG-to-text generation with PLMs and to explore the emotional and multilingual dimensions. Overall, this review provides insights into the different approaches for knowledge graph-to-text generation and outlines future directions for research in this area.
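As an illustration of the linearization approach mentioned above, the following sketch flattens a small graph into a single string that a seq2seq model could take as input. The bracketed markers [H], [R], [T] and the function name linearize_triples are hypothetical conventions chosen for this example; the abstract does not specify a format, and the sample triples are illustrative only.

```python
def linearize_triples(triples):
    """Flatten (head, relation, tail) triples into one tagged input string.

    The [H]/[R]/[T] markers are an assumed convention, not taken from the paper.
    """
    parts = []
    for head, relation, tail in triples:
        parts.append(f"[H] {head} [R] {relation} [T] {tail}")
    return " ".join(parts)


# Illustrative sample graph with two facts about the same entity.
sample_graph = [
    ("Alan Turing", "field", "computer science"),
    ("Alan Turing", "birthPlace", "London"),
]
model_input = linearize_triples(sample_graph)
print(model_input)
```

The resulting string discards explicit graph structure, which is the trade-off of linearization: it fits pretrained text-to-text models directly, but the model has to recover the structure from token order and markers.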
A Review of Driver Gaze Estimation and Application in Gaze Behavior Understanding
Sharma, Pavan Kumar, Chakraborty, Pranamesh
Driver gaze plays an important role in different gaze-based applications such as driver attentiveness detection, visual distraction detection, gaze behavior understanding, and building driver assistance systems. The main objective of this study is to provide a comprehensive summary of driver gaze fundamentals, methods to estimate driver gaze, and its applications in real-world driving scenarios. We first discuss the fundamentals related to driver gaze, involving head-mounted and remote-setup-based gaze estimation and the terminologies used for each of these data collection methods. Next, we list the existing benchmark driver gaze datasets, highlighting the collection methodology and the equipment used for such data collection. This is followed by a discussion of the algorithms used for driver gaze estimation, which primarily involve traditional machine learning and deep learning based techniques. The estimated driver gaze is then used for understanding gaze behavior while maneuvering through intersections, on-ramps, and off-ramps, while changing lanes, and for determining the effect of roadside advertising structures. Finally, we discuss the limitations of the existing literature, the challenges, and the future scope of driver gaze estimation and gaze-based applications.
GenRec: Large Language Model for Generative Recommendation
Ji, Jianchao, Li, Zelong, Xu, Shuyuan, Hua, Wenyue, Ge, Yingqiang, Tan, Juntao, Zhang, Yongfeng
In recent years, large language models (LLMs) have emerged as powerful tools for diverse natural language processing tasks. However, their potential for recommender systems under the generative recommendation paradigm remains relatively unexplored. This paper presents an innovative approach to recommendation systems using LLMs based on text data. We present a novel LLM for generative recommendation (GenRec) that utilizes the expressive power of LLMs to directly generate the target item to recommend, rather than calculating a ranking score for each candidate item one by one as in traditional discriminative recommendation. GenRec uses the LLM's understanding ability to interpret context, learn user preferences, and generate relevant recommendations. Our proposed approach leverages the vast knowledge encoded in large language models to accomplish recommendation tasks. We first formulate specialized prompts to enhance the ability of the LLM to comprehend recommendation tasks. Subsequently, we use these prompts to fine-tune the LLaMA backbone LLM on a dataset of user-item interactions, represented by textual data, to capture user preferences and item characteristics. Our research underscores the potential of LLM-based generative recommendation in revolutionizing the domain of recommendation systems and offers a foundational framework for future explorations in this field. We conduct extensive experiments on benchmark datasets, and the experiments show that GenRec achieves significantly better results on large datasets.
Microelectronic Morphogenesis: Progress towards Artificial Organisms
McCaskill, John S., Karnaushenko, Daniil, Zhu, Minshen, Schmidt, Oliver G.
Microelectronic morphogenesis is the creation and maintenance of complex functional structures by microelectronic information within shape-changing materials. Only recently has in-built information technology begun to be used to reshape materials and their functions in three dimensions to form smart microdevices and microrobots. Electronic information that controls morphology is inheritable like its biological counterpart, genetic information, and is set to open new vistas of technology leading to artificial organisms when coupled with modular design and self-assembly that can make reversible microscopic electrical connections. Three core capabilities of cells in organisms, self-maintenance (homeostatic metabolism utilizing free energy), self-containment (distinguishing self from non-self), and self-reproduction (cell division with inherited properties), once well out of reach for technology, are now within the grasp of information-directed materials. Construction-aware electronics can be used to proof-read and initiate game-changing error correction in microelectronic self-assembly. Furthermore, non-contact communication and electronically supported learning enable one to implement guided self-assembly and enhance functionality. This article reviews the fundamental breakthroughs that have opened the way to this prospect, analyzes the extent to which and the way in which the core properties of life can be addressed, and discusses the potential and indeed necessity of such technology for sustainable high technology in society.
Learning Generic Solutions for Multiphase Transport in Porous Media via the Flux Functions Operator
Diab, Waleed, Chaabi, Omar, Alkobaisi, Shayma, Awotunde, Abeeb, Kobaisi, Mohammed Al
Traditional numerical schemes for simulating fluid flow and transport in porous media can be computationally expensive. Advances in machine learning for scientific computing have the potential to help reduce simulation time in many scientific and engineering fields. DeepONet has recently emerged as a powerful tool for accelerating the solution of partial differential equations (PDEs) by learning operators (mappings between function spaces) of PDEs. In this work, we learn the mapping between the space of flux functions of the Buckley-Leverett PDE and the space of solutions (saturations). We use Physics-Informed DeepONets (PI-DeepONets) to achieve this mapping without any paired input-output observations, except for a set of given initial or boundary conditions, thereby eliminating the expensive data generation process. By leveraging the underlying physical laws via soft penalty constraints during model training, in a manner similar to Physics-Informed Neural Networks (PINNs), and a unique deep neural network architecture, the proposed PI-DeepONet model can predict the solution accurately given any type of flux function (concave, convex, or non-convex) while achieving up to four orders of magnitude improvements in speed over traditional numerical solvers. Moreover, the trained PI-DeepONet model demonstrates excellent generalization qualities, rendering it a promising tool for accelerating the solution of transport problems in porous media.
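The soft penalty mentioned above is built from the residual of the conservation law dS/dt + d f(S)/dx = 0. The sketch below shows one commonly used non-convex (S-shaped) fractional flow function and a residual estimate. It is a sketch under stated assumptions: the specific flux form, the mobility_ratio parameter, and the function names are not taken from the abstract, and central finite differences stand in for the automatic differentiation a physics-informed network would normally use.

```python
def fractional_flow(saturation, mobility_ratio=1.0):
    """Assumed non-convex flux: f(S) = S^2 / (S^2 + (1 - S)^2 / M)."""
    s_squared = saturation ** 2
    return s_squared / (s_squared + (1.0 - saturation) ** 2 / mobility_ratio)


def pde_residual(solution, flux, x, t, step=1e-4):
    """Central-difference estimate of dS/dt + d f(S)/dx at the point (x, t).

    `solution` is any callable S(x, t); a physics-informed loss would
    penalize the square of this residual at sampled collocation points.
    """
    ds_dt = (solution(x, t + step) - solution(x, t - step)) / (2.0 * step)
    df_dx = (
        flux(solution(x + step, t)) - flux(solution(x - step, t))
    ) / (2.0 * step)
    return ds_dt + df_dx


# Sanity check with a linear flux f(S) = S: any profile moving at unit speed,
# S(x, t) = g(x - t), satisfies the equation, so the residual should vanish.
travelling_profile = lambda x, t: (x - t) ** 2
linear_flux_residual = pde_residual(travelling_profile, lambda s: s, x=0.3, t=0.1)
```

Swapping the flux argument is all that changes between the concave, convex, and non-convex cases, which is why treating the flux function itself as the operator input is a natural framing.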
A Survey on Graph Classification and Link Prediction based on GNN
Liu, Xingyu, Chen, Juan, Wen, Quan
Traditional convolutional neural networks are limited to handling Euclidean space data, overlooking the vast realm of real-life scenarios represented as graph data, including transportation networks, social networks, and reference networks. The pivotal step in transferring convolutional neural networks to graph data analysis and processing lies in the construction of graph convolutional operators and graph pooling operators. This comprehensive review article delves into the world of graph convolutional neural networks. Subsequently, it elucidates the graph neural network models based on attention mechanisms and autoencoders, summarizing their application in node classification, graph classification, and link prediction along with the associated datasets. Deep learning is characterized by the stacking of multiple layers of neural networks, which results in better representation learning ability.
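To give a concrete picture of a graph convolutional operator and a graph pooling operator, here is a small dependency-free sketch. It uses the widely known symmetric-normalized propagation rule ReLU(D^-1/2 (A + I) D^-1/2 X W) and a mean readout as one example of each; the survey covers many variants, and the function names below are chosen for illustration only.

```python
import math


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def gcn_layer(adjacency, features, weights):
    """One graph convolution: ReLU(D^-1/2 (A + I) D^-1/2 X W)."""
    n = len(adjacency)
    # Add self-loops so each node keeps its own features.
    a_hat = [
        [adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
        for i in range(n)
    ]
    inv_sqrt_degree = [1.0 / math.sqrt(sum(row)) for row in a_hat]
    normalized = [
        [inv_sqrt_degree[i] * a_hat[i][j] * inv_sqrt_degree[j] for j in range(n)]
        for i in range(n)
    ]
    transformed = matmul(matmul(normalized, features), weights)
    return [[max(value, 0.0) for value in row] for row in transformed]


def mean_pool(node_features):
    """Simple graph-level readout: average each feature over all nodes."""
    n = len(node_features)
    return [sum(column) / n for column in zip(*node_features)]


# Two connected nodes with one feature each and an identity weight matrix.
hidden = gcn_layer([[0.0, 1.0], [1.0, 0.0]], [[1.0], [3.0]], [[1.0]])
graph_embedding = mean_pool(hidden)
```

Node-level outputs such as hidden would feed node classification or link prediction, while a pooled vector such as graph_embedding would feed graph classification.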
A Comprehensive Survey of Artificial Intelligence Techniques for Talent Analytics
Qin, Chuan, Zhang, Le, Zha, Rui, Shen, Dazhong, Zhang, Qi, Sun, Ying, Zhu, Chen, Zhu, Hengshu, Xiong, Hui
In today's competitive and fast-evolving business environment, it is a critical time for organizations to rethink how to make talent-related decisions in a quantitative manner. Indeed, the recent development of Big Data and Artificial Intelligence (AI) techniques has revolutionized human resource management. The availability of large-scale talent and management-related data provides unparalleled opportunities for business leaders to comprehend organizational behaviors and gain tangible knowledge from a data science perspective, which in turn delivers intelligence for real-time decision-making and effective talent management at work for their organizations. In the last decade, talent analytics has emerged as a promising field in applied data science for human resource management, garnering significant attention from AI communities and inspiring numerous research efforts. To this end, we present an up-to-date and comprehensive survey on AI technologies used for talent analytics in the field of human resource management. Specifically, we first provide the background knowledge of talent analytics and categorize various pertinent data. Subsequently, we offer a comprehensive taxonomy of relevant research efforts, categorized based on three distinct application-driven scenarios: talent management, organization management, and labor market analysis. In conclusion, we summarize the open challenges and potential prospects for future research directions in the domain of AI-driven talent analytics.
Multimodal Sentiment Analysis: A Survey
Lai, Songning, Hu, Xifeng, Xu, Haoxuan, Ren, Zhaoxia, Liu, Zhi
Multimodal sentiment analysis has become an important research area in the field of artificial intelligence. With the latest advances in deep learning, this technology has reached new heights. It has great potential for both application and research, making it a popular research topic. This review provides an overview of the definition, background, and development of multimodal sentiment analysis. It also covers recent datasets and advanced models, emphasizing the challenges and future prospects of this technology. Finally, it looks ahead to future research directions. It should be noted that this review provides constructive suggestions for promising research directions and for building better-performing multimodal sentiment analysis models, which can help researchers in this field.
Learning Mixtures of Gaussians Using the DDPM Objective
Shah, Kulin, Chen, Sitan, Klivans, Adam
Recent works have shown that diffusion models can learn essentially any distribution provided one can perform score estimation. Yet it remains poorly understood under what settings score estimation is possible, let alone when practical gradient-based algorithms for this task can provably succeed. In this work, we give the first provably efficient results along these lines for one of the most fundamental distribution families, Gaussian mixture models. We prove that gradient descent on the denoising diffusion probabilistic model (DDPM) objective can efficiently recover the ground truth parameters of the mixture model in the following two settings: 1) We show gradient descent with random initialization learns mixtures of two spherical Gaussians in $d$ dimensions with $1/\text{poly}(d)$-separated centers. 2) We show gradient descent with a warm start learns mixtures of $K$ spherical Gaussians with $\Omega(\sqrt{\log(\min(K,d))})$-separated centers. A key ingredient in our proofs is a new connection between score-based methods and two other approaches to distribution learning, the EM algorithm and spectral methods.
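For intuition about what score estimation means in the two-component case, the sketch below works out the exact score for a simplified setting. The assumptions go beyond the abstract: equal mixing weights, identity covariance, and centers at +mu and -mu. Under those assumptions the score of the mixture is tanh(mu . x) mu - x, which is the kind of function a denoising objective would be trying to fit. The function names are illustrative.

```python
import math


def log_density(x, mu):
    """Log-density of 0.5 * N(mu, I) + 0.5 * N(-mu, I) (assumed symmetric mixture)."""
    dimension = len(x)

    def log_gaussian(center):
        squared = sum((xi - ci) ** 2 for xi, ci in zip(x, center))
        return -0.5 * squared - 0.5 * dimension * math.log(2.0 * math.pi)

    a = log_gaussian(mu)
    b = log_gaussian([-m for m in mu])
    top = max(a, b)  # log-sum-exp trick for numerical stability
    return top + math.log(0.5 * math.exp(a - top) + 0.5 * math.exp(b - top))


def score(x, mu):
    """Closed-form gradient of log_density with respect to x."""
    gate = math.tanh(sum(xi * mi for xi, mi in zip(x, mu)))
    return [gate * mi - xi for xi, mi in zip(x, mu)]


def numerical_score(x, mu, step=1e-5):
    """Central-difference check of the closed form."""
    gradient = []
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += step
        backward[i] -= step
        gradient.append(
            (log_density(forward, mu) - log_density(backward, mu)) / (2.0 * step)
        )
    return gradient


point, center = [0.3, -0.2], [1.0, 0.5]
closed_form = score(point, center)
finite_difference = numerical_score(point, center)
```

Comparing closed_form with finite_difference is a quick way to confirm the formula; recovering the centers then amounts to finding the parameter whose score function matches the data.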