Goto

Collaborating Authors

 cwl






Applications of Multimodal Learning part2(Artificial Intelligence)

#artificialintelligence

Abstract: We are perceiving and communicating with the world in a multisensory manner, where different information sources are sophisticatedly processed and interpreted by separate parts of the human brain to constitute a complex, yet harmonious and unified sensing system. To endow the machines with true intelligence, the multimodal machine learning that incorporates data from various modalities has become an increasingly popular research area with emerging technical advances in recent years. In this paper, we present a survey on multimodal machine learning from a novel perspective considering not only the purely technical aspects but also the nature of different data modalities. We analyze the commonness and uniqueness of each data format ranging from vision, audio, text and others, and then present the technical development categorized by the combination of Vision X, where the vision data play a fundamental role in most multimodal learning works. We investigate the existing literature on multimodal learning from both the representation learning and downstream application levels, and provide an additional comparison in the light of their technical connections with the data nature, e.g., the semantic consistency between image objects and textual descriptions, or the rhythm correspondence between video dance moves and musical beats.


Identification of Cognitive Workload during Surgical Tasks with Multimodal Deep Learning

Jin, Kaizhe, Rubio-Solis, Adrian, Naik, Ravi, Onyeogulu, Tochukwu, Islam, Amirul, Khan, Salman, Teeti, Izzeddin, Kinross, James, Leff, Daniel R, Cuzzolin, Fabio, Mylonas, George

arXiv.org Artificial Intelligence

The operating room (OR) is a dynamic and complex environment consisting of a multidisciplinary team working together in a high take environment to provide safe and efficient patient care. Additionally, surgeons are frequently exposed to multiple psycho-organisational stressors that may cause negative repercussions on their immediate technical performance and long-term health. Many factors can therefore contribute to increasing the Cognitive Workload (CWL) such as temporal pressures, unfamiliar anatomy or distractions in the OR. In this paper, a cascade of two machine learning approaches is suggested for the multimodal recognition of CWL in four different surgical task conditions. Firstly, a model based on the concept of transfer learning is used to identify if a surgeon is experiencing any CWL. Secondly, a Convolutional Neural Network (CNN) uses this information to identify different degrees of CWL associated to each surgical task. The suggested multimodal approach considers adjacent signals from electroencephalogram (EEG), functional near-infrared spectroscopy (fNIRS) and eye pupil diameter. The concatenation of signals allows complex correlations in terms of time (temporal) and channel location (spatial). Data collection was performed by a Multi-sensing AI Environment for Surgical Task & Role Optimisation platform (MAESTRO) developed at the Hamlyn Centre, Imperial College London. To compare the performance of the proposed methodology, a number of state-of-art machine learning techniques have been implemented. The tests show that the proposed model has a precision of 93%.


Weisfeiler and Lehman Go Cellular: CW Networks

Bodnar, Cristian, Frasca, Fabrizio, Otter, Nina, Wang, Yu Guang, Liò, Pietro, Montúfar, Guido, Bronstein, Michael

arXiv.org Machine Learning

Graph Neural Networks (GNNs) are limited in their expressive power, struggle with long-range interactions and lack a principled way to model higher-order structures. These problems can be attributed to the strong coupling between the computational graph and the input graph structure. The recently proposed Message Passing Simplicial Networks naturally decouple these elements by performing message passing on the clique complex of the graph. Nevertheless, these models are severely constrained by the rigid combinatorial structure of Simplicial Complexes (SCs). In this work, we extend recent theoretical results on SCs to regular Cell Complexes, topological objects that flexibly subsume SCs and graphs. We show that this generalisation provides a powerful set of graph ``lifting'' transformations, each leading to a unique hierarchical message passing procedure. The resulting methods, which we collectively call CW Networks (CWNs), are strictly more powerful than the WL test and, in certain cases, not less powerful than the 3-WL test. In particular, we demonstrate the effectiveness of one such scheme, based on rings, when applied to molecular graph problems. The proposed architecture benefits from provably larger expressivity than commonly used GNNs, principled modelling of higher-order signals and from compressing the distances between nodes. We demonstrate that our model achieves state-of-the-art results on a variety of molecular datasets.


The Smoothed Satisfaction of Voting Axioms

Xia, Lirong

arXiv.org Artificial Intelligence

We initiate the work towards a comprehensive picture of the smoothed satisfaction of voting axioms, to provide a finer and more realistic foundation for comparing voting rules. We adopt the smoothed social choice framework, where an adversary chooses arbitrarily correlated "ground truth" preferences for the agents, on top of which random noises are added. We focus on characterizing the smoothed satisfaction of two well-studied voting axioms: Condorcet criterion and participation. We prove that for any fixed number of alternatives, when the number of voters $n$ is sufficiently large, the smoothed satisfaction of the Condorcet criterion under a wide range of voting rules is $1$, $1-\exp(-\Theta(n))$, $\Theta(n^{-0.5})$, $ \exp(-\Theta(n))$, or being $\Theta(1)$ and $1-\Theta(1)$ at the same time; and the smoothed satisfaction of participation is $1-\Theta(n^{-0.5})$. Our results address open questions by Berg and Lepelley in 1994 for these rules, and also confirm the following high-level message: the Condorcet criterion is a bigger concern than participation under realistic models.