
Collaborating Authors

Chen, Xiaotang


Split Semantic Detection in Sandplay Images

arXiv.org Artificial Intelligence

A sandplay image, an important carrier in psychoanalysis, is a visual scene constructed by a client who selects and places sand objects (e.g., sand, a river, human figures, animals, vegetation, buildings). As a projection of the client's inner world, it contains high-level semantic information reflecting the client's subjective psychological state, in contrast to common natural images, which contain only objective, basic semantics (e.g., object names, attributes, and bounding boxes). In this work, we take "split", a typical psychological semantic related to many emotional and personality problems, as our research goal and propose an automatic detection model that can replace the time-consuming and expensive manual analysis process. To achieve this, we design a distribution-map generation method that projects the semantic judgment problem into a visual problem, and a feature dimensionality reduction and extraction algorithm that provides a good representation of split semantics. In addition, we build a sandplay dataset by collecting one sample from each client and inviting five therapists to label each sample, which incurs a high data cost. Experimental results demonstrate the effectiveness of our proposed method.
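To make the pipeline concrete, here is a minimal sketch of the two components the abstract names: projecting object placements into a 2D distribution map, then reducing the map to a compact feature for split detection. The grid size, the PCA/SVM choices, and all function names are assumptions for illustration; the paper's actual algorithms are not specified in the abstract.

```python
# Hypothetical sketch: rasterize sand-object placements into a distribution
# map, then learn a binary "split" detector over reduced features.
import numpy as np
from sklearn.decomposition import PCA
from sklearn.svm import SVC

GRID = 32  # assumed resolution of the distribution map

def distribution_map(boxes, img_w, img_h, grid=GRID):
    """Project detected sand-object boxes (x1, y1, x2, y2) onto a grid,
    accumulating how densely each cell of the tray is occupied."""
    m = np.zeros((grid, grid), dtype=np.float32)
    for x1, y1, x2, y2 in boxes:
        c1, r1 = int(x1 / img_w * grid), int(y1 / img_h * grid)
        c2, r2 = int(np.ceil(x2 / img_w * grid)), int(np.ceil(y2 / img_h * grid))
        m[r1:r2, c1:c2] += 1.0
    return m / max(m.max(), 1e-6)  # normalize so maps are comparable

def train_split_detector(maps, labels, n_components=16):
    """Hypothetical training flow: maps -> PCA features -> binary classifier."""
    X = np.stack([m.ravel() for m in maps])
    pca = PCA(n_components=n_components).fit(X)
    clf = SVC(kernel="rbf").fit(pca.transform(X), labels)
    return pca, clf
```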


See Your Heart: Psychological states Interpretation through Visual Creations

arXiv.org Artificial Intelligence

In psychoanalysis, there is significant demand for generating interpretations of a person's psychological state from their visual creations. The two main tasks of existing studies in computer vision, sentiment/emotion classification and affective captioning, can hardly satisfy the requirements of psychological interpretation. To meet the demands of psychoanalysis, we introduce a challenging task, the Visual Emotion Interpretation Task (VEIT). VEIT requires AI to generate reasonable interpretations of a creator's psychological state from their visual creations. To support the task, we present a multimodal dataset termed SpyIn (Sandplay Interpretation Dataset), which is grounded in psychological theory and professionally annotated. Dataset analysis illustrates that SpyIn is not only able to support VEIT but is also more challenging than other captioning datasets. Building on SpyIn, we conduct experiments with several image captioning methods and propose a visual-semantic combined model that obtains state-of-the-art results on SpyIn. The results indicate that VEIT is a more challenging task requiring scene-graph information and psychological knowledge. Our work also shows promise for AI to analyze and explain the inner world of humans through their visual creations.
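As a rough illustration of what a "visual-semantic combined" captioner could look like, the PyTorch sketch below fuses pooled image features with a pooled scene-graph embedding to initialize a caption decoder. The module names, dimensions, and concatenation-based fusion are assumptions for illustration, not the paper's actual design.

```python
# Hypothetical visual-semantic captioner: fuse image and scene-graph
# features into the decoder's initial state, then decode token logits.
import torch
import torch.nn as nn

class VisualSemanticCaptioner(nn.Module):
    def __init__(self, vocab_size, vis_dim=2048, sem_dim=300, hid=512):
        super().__init__()
        self.fuse = nn.Linear(vis_dim + sem_dim, hid)  # joint visual-semantic state
        self.embed = nn.Embedding(vocab_size, hid)
        self.rnn = nn.LSTM(hid, hid, batch_first=True)
        self.out = nn.Linear(hid, vocab_size)

    def forward(self, vis_feat, sem_feat, tokens):
        # vis_feat: (B, vis_dim) pooled image features
        # sem_feat: (B, sem_dim) pooled scene-graph / semantic embedding
        h0 = torch.tanh(self.fuse(torch.cat([vis_feat, sem_feat], dim=-1)))
        h0 = h0.unsqueeze(0)          # (1, B, hid) initial hidden state
        c0 = torch.zeros_like(h0)
        x = self.embed(tokens)        # (B, T, hid) caption token embeddings
        y, _ = self.rnn(x, (h0, c0))
        return self.out(y)            # (B, T, vocab) next-token logits
```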


A Multi-Task Deep Network for Person Re-Identification

AAAI Conferences

Person re-identification (ReID) focuses on identifying people across different scenes in video surveillance and is usually formulated as either a binary classification task or a ranking task in current person ReID approaches. In this paper, we take both tasks into account and propose a multi-task deep network (MTDnet) that exploits their respective advantages and jointly optimizes the two tasks for person ReID. To the best of our knowledge, we are the first to integrate both tasks in one network to solve person ReID. We show that our proposed architecture significantly boosts performance. Furthermore, deep architectures generally require a sufficiently large dataset for training, which is usually not available in person ReID. To cope with this situation, we further extend MTDnet and propose a cross-domain architecture that is capable of using an auxiliary set to assist training on small target sets. In our experiments, our approach outperforms most existing person ReID algorithms on representative datasets including CUHK03, CUHK01, VIPeR, iLIDS, and PRID2011, which clearly demonstrates the effectiveness of the proposed approach.
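The sketch below illustrates the kind of joint objective the abstract describes: a shared embedding trained with both a same/different classification head and a ranking (triplet) loss. The backbone, margin, and equal loss weighting are assumptions; MTDnet's actual architecture is not given in the abstract.

```python
# Hypothetical multi-task ReID objective: shared features feed both a
# pairwise classification head and a triplet ranking loss.
import torch
import torch.nn as nn

class MultiTaskReID(nn.Module):
    def __init__(self, backbone, feat_dim=512):
        super().__init__()
        self.backbone = backbone                    # any CNN mapping image -> feat_dim
        self.pair_cls = nn.Linear(feat_dim * 2, 2)  # same/different person head
        self.rank_loss = nn.TripletMarginLoss(margin=0.3)  # assumed margin
        self.cls_loss = nn.CrossEntropyLoss()

    def forward(self, anchor, pos, neg):
        fa, fp, fn_ = map(self.backbone, (anchor, pos, neg))
        # Classification task: is (anchor, pos) the same person? (anchor, neg)?
        logits = self.pair_cls(torch.cat([torch.cat([fa, fp], 1),
                                          torch.cat([fa, fn_], 1)], 0))
        labels = torch.cat([torch.ones(len(fa)),
                            torch.zeros(len(fa))]).long().to(logits.device)
        # Jointly optimize both tasks with an (assumed) equal weighting.
        return self.cls_loss(logits, labels) + self.rank_loss(fa, fp, fn_)
```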