 lta


Bidirectional Contrastive Split Learning for Visual Question Answering

arXiv.org Artificial Intelligence

Visual Question Answering (VQA) based on multi-modal data facilitates real-life applications such as home robots and medical diagnoses. One significant challenge is to devise a robust decentralized learning framework for various client models where centralized data collection is avoided because of confidentiality concerns. This work aims to tackle privacy-preserving VQA by decoupling a multi-modal model into representation modules and a contrastive module, and by leveraging inter-module gradient sharing and inter-client weight sharing. To this end, we propose Bidirectional Contrastive Split Learning (BiCSL) to train a global multi-modal model on the entire data distribution of decentralized clients. We employ a contrastive loss that enables more efficient self-supervised learning of decentralized modules. Comprehensive experiments are conducted on the VQA-v2 dataset based on five SOTA VQA models, demonstrating the effectiveness of the proposed method. Furthermore, we inspect BiCSL's robustness against a dual-key backdoor attack on VQA. BiCSL shows much better robustness to the multi-modal adversarial attack than the centralized learning method, making it a promising approach to decentralized multi-modal learning.


Learning to Approximate: Auto Direction Vector Set Generation for Hypervolume Contribution Approximation

arXiv.org Artificial Intelligence

Hypervolume contribution is an important concept in evolutionary multi-objective optimization (EMO). It is used in hypervolume-based EMO algorithms and hypervolume subset selection algorithms. Its main drawback is that it is computationally expensive in high-dimensional spaces, which limits its applicability to many-objective optimization. Recently, an R2 indicator variant (i.e., the $R_2^{\text{HVC}}$ indicator) was proposed to approximate the hypervolume contribution. The $R_2^{\text{HVC}}$ indicator uses line segments along a number of direction vectors for hypervolume contribution approximation. It has been shown that different direction vector sets lead to different approximation quality. In this paper, we propose Learning to Approximate (LtA), a direction vector set generation method for the $R_2^{\text{HVC}}$ indicator. The direction vector set is automatically learned from training data. The learned direction vector set can then be used in the $R_2^{\text{HVC}}$ indicator to improve its approximation quality. The usefulness of the proposed LtA method is examined by comparing it with other commonly used direction vector set generation methods for the $R_2^{\text{HVC}}$ indicator. Experimental results suggest the superiority of LtA over the other methods for generating high-quality direction vector sets.
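For readers unfamiliar with the quantity being approximated, the following is a minimal sketch of the exact hypervolume contribution in two objectives, assuming minimization and a user-chosen reference point. It is not the $R_2^{\text{HVC}}$ indicator and not the LtA method; the function names `hypervolume_2d` and `hv_contribution` are hypothetical and chosen only for illustration. The contribution of a point is taken here as the hypervolume of the full set minus the hypervolume of the set without that point.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def hypervolume_2d(points: List[Point], ref: Point) -> float:
    """Area dominated by `points` and bounded by `ref` (both objectives minimized)."""
    # Only points that are strictly better than the reference point enclose area.
    pts = sorted(p for p in points if p[0] < ref[0] and p[1] < ref[1])
    area = 0.0
    best_f2 = ref[1]  # lowest second-objective value seen so far in the sweep
    for f1, f2 in pts:
        if f2 < best_f2:  # the point adds a new horizontal strip of area
            area += (ref[0] - f1) * (best_f2 - f2)
            best_f2 = f2
    return area


def hv_contribution(point: Point, points: List[Point], ref: Point) -> float:
    """Hypervolume lost when `point` is removed from `points`."""
    others = [p for p in points if p != point]
    return hypervolume_2d(points, ref) - hypervolume_2d(others, ref)


# Illustrative data (assumed, not from the paper).
front = [(1.0, 3.0), (2.0, 2.0), (3.0, 1.0)]
reference = (4.0, 4.0)
total = hypervolume_2d(front, reference)
middle = hv_contribution((2.0, 2.0), front, reference)
```

In two objectives this sweep is cheap; the cost the abstract refers to arises as the number of objectives grows, which is the motivation for approximations based on direction vectors.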


National Day Rally 2017: 'Smart' lamp posts to become key nodes for surveillance and data collection

#artificialintelligence

SINGAPORE - Plans are underway to turn every lamp post into a smart lamp post that can carry and transmit information gathered from surveillance cameras and sensors around the country. The network of interconnected lamp posts could form the spine of the Smart Nation Sensor Platform (SNSP), which aims to use artificial intelligence (AI) technologies to analyse, for instance, video footage collected by government agencies. These could be used to detect anomalies and predict situations such as potentially unruly crowds and traffic congestion. "We are making every lamp post a smart lamp post to mount different types of sensors," Prime Minister Lee Hsien Loong said in his National Day Rally speech on Sunday (Aug 20) when he spoke about making Singapore a Smart Nation. The AI-based video analytics system is slated for a trial in Orchard Road and selected housing estates from October (2017).


The Logic of Typical and Atypical Instances (LTA)

AAAI Conferences

The distinction between typical and atypical instances in natural categorization was introduced by E. Rosch and has been studied in cognitive psychology and AI. Many knowledge representation systems are expressed using fuzzy concepts, but a degree of membership raises problems for natural categorization (especially for classification problems in anthropology, ethnology, archeology, and linguistics, but also in ontologies): atypical instances of a concept cannot be adequately captured as differing degrees of distance from a prototype. Other formal approaches, such as paraconsistent or non-monotonic logics, often conceptualize atypical objects as exceptions. An alternative has already been developed with the logic of determination of objects (LDO). In this paper, we present the logic of typical and atypical instances (LTA), which gives a direct logical account of the typicality and atypicality associated with a concept in a more common way than in LDO, using only classes and no determination operators. We introduce a distinction between a predicative property and a concept, the latter defined by its intension and its essence, which is a part of the intension. A typical instance of a concept inherits all properties of the intension; an atypical instance inherits only the properties of the essence, but it is a full member of the category associated with the concept and not a member with a weak degree of membership. In natural categorization, there are often instances (the exceptions) that do not inherit some properties of the essence; they cannot be considered atypical instances and belong to the boundary of the category.
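The three-way distinction described in this abstract can be illustrated with plain set inclusion. The sketch below is only a reading of the abstract, not the paper's formal system: the function `classify_instance`, the labels it returns, and the bird example with its property names are all hypothetical.

```python
from typing import Set


def classify_instance(properties: Set[str], essence: Set[str], intension: Set[str]) -> str:
    """Classify an instance relative to a concept given as (essence, intension)."""
    # The abstract states that the essence is a part of the intension.
    if not essence <= intension:
        raise ValueError("essence must be a subset of intension")
    if intension <= properties:
        return "typical"    # inherits every property of the intension
    if essence <= properties:
        return "atypical"   # inherits the essence only, still a full member
    return "exception"      # misses part of the essence: boundary of the category


# Hypothetical concept and instances, for illustration only.
bird_essence = {"has_feathers", "lays_eggs"}
bird_intension = bird_essence | {"flies"}

sparrow = {"has_feathers", "lays_eggs", "flies", "sings"}
penguin = {"has_feathers", "lays_eggs", "swims"}
plucked_bird = {"lays_eggs"}

labels = [
    classify_instance(sparrow, bird_essence, bird_intension),
    classify_instance(penguin, bird_essence, bird_intension),
    classify_instance(plucked_bird, bird_essence, bird_intension),
]
```

Note that membership here is crisp: the atypical instance is either in the category or not, which mirrors the abstract's contrast with fuzzy degrees of membership.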


A Probabilistic Logic Programming Event Calculus

arXiv.org Artificial Intelligence

We present a system for recognising human activity given a symbolic representation of video content. The input of our system is a set of time-stamped short-term activities (STA) detected on video frames. The output is a set of recognised long-term activities (LTA), which are pre-defined temporal combinations of STA. The constraints on the STA that, if satisfied, lead to the recognition of an LTA have been expressed using a dialect of the Event Calculus. In order to handle the uncertainty that naturally occurs in human activity recognition, we adapted this dialect to a state-of-the-art probabilistic logic programming framework. We present a detailed evaluation and comparison of the crisp and probabilistic approaches through experimentation on a benchmark dataset of human surveillance videos.
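A rough sketch of the crisp case may help make the STA-to-LTA step concrete. It assumes the usual Event Calculus reading in which an activity holds at time T if it was initiated at some earlier time and not terminated in between. The LTA rule used here (two people are "moving" together once both walk, until either becomes inactive), the activity names, and the helper functions are hypothetical and are not the authors' rule set; the probabilistic extension is not shown.

```python
from typing import Dict, List, Tuple

STA = Tuple[int, str, str]  # (time-point, person id, short-term activity)


def rule_time_points(stas: List[STA], a: str, b: str) -> Tuple[List[int], List[int]]:
    """Apply a hypothetical LTA rule and return (initiation times, termination times)."""
    by_time: Dict[int, Dict[str, str]] = {}
    for t, person, activity in stas:
        by_time.setdefault(t, {})[person] = activity
    initiated, terminated = [], []
    for t, acts in sorted(by_time.items()):
        if acts.get(a) == "walking" and acts.get(b) == "walking":
            initiated.append(t)   # both walk: the LTA is initiated
        if "inactive" in (acts.get(a), acts.get(b)):
            terminated.append(t)  # either stops: the LTA is terminated
    return initiated, terminated


def holds_at(t: int, initiated: List[int], terminated: List[int]) -> bool:
    """True if the LTA was initiated before t and not terminated since."""
    starts = [ts for ts in initiated if ts < t]
    if not starts:
        return False
    last_start = max(starts)
    return not any(last_start <= te < t for te in terminated)


# Hypothetical input stream of short-term activities.
stream: List[STA] = [
    (1, "id1", "walking"), (1, "id2", "walking"),
    (2, "id1", "walking"), (2, "id2", "walking"),
    (3, "id1", "inactive"), (3, "id2", "walking"),
]
init_times, term_times = rule_time_points(stream, "id1", "id2")
recognised = [t for t in range(1, 6) if holds_at(t, init_times, term_times)]
```

In a probabilistic setting each STA would carry a probability and the recognised LTA would carry one as well; how those probabilities are combined depends on the framework and is left out of this sketch.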