vat
Autonomous Alignment with Human Value on Altruism through Considerate Self-imagination and Theory of Mind
Tong, Haibo, Lu, Enmeng, Sun, Yinqian, Han, Zhengqiang, Liu, Chao, Zhao, Feifei, Zeng, Yi
One of the most important aspects of aligning with human values is the necessity for agents to autonomously make altruistic, safe, and ethical decisions, considering and caring for human well-being. Current AI extremely pursues absolute superiority in certain tasks, remaining indifferent to the surrounding environment and other agents, which has led to numerous safety risks. Altruistic behavior in human society originates from humans' capacity for empathizing others, known as Theory of Mind (ToM), combined with predictive imaginative interactions before taking action to produce thoughtful and altruistic behaviors. Inspired by this, we are committed to endow agents with considerate self-imagination and ToM capabilities, driving them through implicit intrinsic motivations to autonomously align with human altruistic values. By integrating ToM within the imaginative space, agents keep an eye on the well-being of other agents in real time, proactively anticipate potential risks to themselves and others, and make thoughtful altruistic decisions that balance negative effects on the environment. The ancient Chinese story of Sima Guang Smashes the Vat illustrates the moral behavior of the young Sima Guang smashed a vat to save a child who had accidentally fallen into it, which is an excellent reference scenario for this paper. We design an experimental scenario similar to Sima Guang Smashes the Vat and its variants with different complexities, which reflects the trade-offs and comprehensive considerations between self-goals, altruistic rescue, and avoiding negative side effects. Comparative experimental results indicate that agents are capable of prioritizing altruistic rescue while minimizing irreversible damage to the environment and making more altruistic and thoughtful decisions. This work provides a preliminary exploration of agents' autonomous alignment with human altruistic values, laying the foundation for the subsequent realization of moral and ethical AI.
NAMER: Non-Autoregressive Modeling for Handwritten Mathematical Expression Recognition
Liu, Chenyu, Pan, Jia, Hu, Jinshui, Yin, Baocai, Yin, Bing, Chen, Mingjun, Liu, Cong, Du, Jun, Liu, Qingfeng
Recently, Handwritten Mathematical Expression Recognition (HMER) has gained considerable attention in pattern recognition for its diverse applications in document understanding. Current methods typically approach HMER as an image-to-sequence generation task within an autoregressive (AR) encoder-decoder framework. However, these approaches suffer from several drawbacks: 1) a lack of overall language context, limiting information utilization beyond the current decoding step; 2) error accumulation during AR decoding; and 3) slow decoding speed. To tackle these problems, this paper makes the first attempt to build a novel bottom-up Non-AutoRegressive Modeling approach for HMER, called NAMER. NAMER comprises a Visual Aware Tokenizer (VAT) and a Parallel Graph Decoder (PGD). Initially, the VAT tokenizes visible symbols and local relations at a coarse level. Subsequently, the PGD refines all tokens and establishes connectivities in parallel, leveraging comprehensive visual and linguistic contexts. Experiments on CROHME 2014/2016/2019 and HME100K datasets demonstrate that NAMER not only outperforms the current state-of-the-art (SOTA) methods on ExpRate by 1.93%/2.35%/1.49%/0.62%, but also achieves significant speedups of 13.7x and 6.7x faster in decoding time and overall FPS, proving the effectiveness and efficiency of NAMER.
- Asia > China > Anhui Province > Hefei (0.04)
- Europe > Switzerland > Vaud > Lausanne (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Improving Performance of Semi-Supervised Learning by Adversarial Attacks
Yang, Dongyoon, Kim, Kunwoong, Kim, Yongdai
Semi-supervised learning (SSL) algorithm is a setup built upon a realistic assumption that access to a large amount of labeled data is tough. In this study, we present a generalized framework, named SCAR, standing for Selecting Clean samples with Adversarial Robustness, for improving the performance of recent SSL algorithms. By adversarially attacking pre-trained models with semi-supervision, our framework shows substantial advances in classifying images. We introduce how adversarial attacks successfully select high-confident unlabeled data to be labeled with current predictions. On CIFAR10, three recent SSL algorithms with SCAR result in significantly improved image classification.
- Information Technology > Security & Privacy (0.64)
- Government > Military (0.64)
Brain in a Vat: On Missing Pieces Towards Artificial General Intelligence in Large Language Models
Ma, Yuxi, Zhang, Chi, Zhu, Song-Chun
In this perspective paper, we first comprehensively review existing evaluations of Large Language Models (LLMs) using both standardized tests and ability-oriented benchmarks. We pinpoint several problems with current evaluation methods that tend to overstate the capabilities of LLMs. We then articulate what artificial general intelligence should encompass beyond the capabilities of LLMs. We propose four characteristics of generally intelligent agents: 1) they can perform unlimited tasks; 2) they can generate new tasks within a context; 3) they operate based on a value system that underpins task generation; and 4) they have a world model reflecting reality, which shapes their interaction with the world. Building on this viewpoint, we highlight the missing pieces in artificial general intelligence, that is, the unity of knowing and acting. We argue that active engagement with objects in the real world delivers more robust signals for forming conceptual representations. Additionally, knowledge acquisition isn't solely reliant on passive input but requires repeated trials and errors. We conclude by outlining promising future research directions in the field of artificial general intelligence.
- North America > United States (0.28)
- Asia > India (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Cognitive Science (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.98)
Regularization with Latent Space Virtual Adversarial Training
Osada, Genki, Ahsan, Budrul, Bora, Revoti Prasad, Nishide, Takashi
Virtual Adversarial Training (VAT) has shown impressive results among recently developed regularization methods called consistency regularization. VAT utilizes adversarial samples, generated by injecting perturbation in the input space, for training and thereby enhances the generalization ability of a classifier. However, such adversarial samples can be generated only within a very small area around the input data point, which limits the adversarial effectiveness of such samples. To address this problem we propose LVAT (Latent space VAT), which injects perturbation in the latent space instead of the input space. LVAT can generate adversarial samples flexibly, resulting in more adverse effects and thus more effective regularization. The latent space is built by a generative model, and in this paper, we examine two different type of models: variational auto-encoder and normalizing flow, specifically Glow. We evaluated the performance of our method in both supervised and semi-supervised learning scenarios for an image classification task using SVHN and CIFAR-10 datasets. In our evaluation, we found that our method outperforms VAT and other state-of-the-art methods.
Submit your presentation proposal - RIL - Finnish Association of Civil Engineers
We accept proposals only via our submission system ExOrdo. And if your sweetheart theme isn't listed, dare yet to send your proposal under the category "Other". Don't let the limits of our imagination limit yourself! If your proposal gets accepted, we will ask you to submit your presentation material to be shared on the event platform (shareable version). Please note that we also intend to record all the sessions and recordings will be shared on the event platform.
Vats
In many robot motion planning problems such as manipulation planning for a personal robot in a kitchen or an industrial manipulator in a warehouse, all motion planning queries are in an environment that is largely static. Consequently, one should be able to improve the performance of a planning algorithm by training on this static environment ahead of operation time. In this work, we propose a method to improve the performance of heuristic search-based motion planners in such environments. The first, learning, phase of our proposed method analyzes search performance on multiple planning episodes to infer local minima zones, that is, regions where the existing heuristic(s) are weakly correlated with the true cost-to-go. Then, in the planning phase of the method, the learnt local minima are used to modify the original search graph in a way that improves search performance. We prove that our method preserves guarantees on completeness and bounded suboptimality with respect to the original search graph. Experimentally, we observe significant improvements in success rate and planning time for challenging 11 degree-of-freedom mobile manipulation problems.
Virtual Adversarial Training for Semi-supervised Breast Mass Classification
Chen, Xuxin, Wang, Ximin, Zhang, Ke, Fung, Kar-Ming, Thai, Theresa C., Moore, Kathleen, Mannel, Robert S., Liu, Hong, Zheng, Bin, Qiu, Yuchen
This study aims to develop a novel computer-aided diagnosis (CAD) scheme for mammographic breast mass classification using semi-supervised learning. Although supervised deep learning has achieved huge success across various medical image analysis tasks, its success relies on large amounts of high-quality annotations, which can be challenging to acquire in practice. To overcome this limitation, we propose employing a semi-supervised method, i.e., virtual adversarial training (VAT), to leverage and learn useful information underlying in unlabeled data for better classification of breast masses. Accordingly, our VAT-based models have two types of losses, namely supervised and virtual adversarial losses. The former loss acts as in supervised classification, while the latter loss aims at enhancing model robustness against virtual adversarial perturbation, thus improving model generalizability. To evaluate the performance of our VAT-based CAD scheme, we retrospectively assembled a total of 1024 breast mass images, with equal number of benign and malignant masses. A large CNN and a small CNN were used in this investigation, and both were trained with and without the adversarial loss. When the labeled ratios were 40% and 80%, VAT-based CNNs delivered the highest classification accuracy of 0.740 and 0.760, respectively. The experimental results suggest that the VAT-based CAD scheme can effectively utilize meaningful knowledge from unlabeled data to better classify mammographic breast mass images.
- North America > United States > Oklahoma > Cleveland County > Norman (0.14)
- North America > United States > Oklahoma > Oklahoma County > Oklahoma City (0.05)
- Asia > China > Shanghai > Shanghai (0.04)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
- Health & Medicine > Therapeutic Area > Oncology > Breast Cancer (0.31)
TransZero++: Cross Attribute-Guided Transformer for Zero-Shot Learning
Chen, Shiming, Hong, Ziming, Xie, Guo-Sen, Zhao, Jian, Li, Hao, You, Xinge, Yan, Shuicheng, Shao, Ling
Zero-shot learning (ZSL) tackles the novel class recognition problem by transferring semantic knowledge from seen classes to unseen ones. Existing attention-based models have struggled to learn inferior region features in a single image by solely using unidirectional attention, which ignore the transferability and discriminative attribute localization of visual features. In this paper, we propose a cross attribute-guided Transformer network, termed TransZero++, to refine visual features and learn accurate attribute localization for semantic-augmented visual embedding representations in ZSL. TransZero++ consists of an attribute$\rightarrow$visual Transformer sub-net (AVT) and a visual$\rightarrow$attribute Transformer sub-net (VAT). Specifically, AVT first takes a feature augmentation encoder to alleviate the cross-dataset problem, and improves the transferability of visual features by reducing the entangled relative geometry relationships among region features. Then, an attribute$\rightarrow$visual decoder is employed to localize the image regions most relevant to each attribute in a given image for attribute-based visual feature representations. Analogously, VAT uses the similar feature augmentation encoder to refine the visual features, which are further applied in visual$\rightarrow$attribute decoder to learn visual-based attribute features. By further introducing semantical collaborative losses, the two attribute-guided transformers teach each other to learn semantic-augmented visual embeddings via semantical collaborative learning. Extensive experiments show that TransZero++ achieves the new state-of-the-art results on three challenging ZSL benchmarks. The codes are available at: \url{https://github.com/shiming-chen/TransZero_pp}.
- Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.14)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
- Asia > China > Beijing > Beijing (0.04)
- (4 more...)