 bort


Enhancing Boundary Segmentation for Topological Accuracy with Skeleton-based Methods

arXiv.org Artificial Intelligence

Topological consistency plays a crucial role in boundary segmentation for reticular images, such as cell membrane segmentation in neuron electron microscopy images, grain boundary segmentation in material microscopy images, and road segmentation in aerial images. In these fields, topological changes in segmentation results can seriously affect downstream tasks, with an impact that can even exceed that of misalignment of the boundary itself. To enhance the topological accuracy of segmentation results, we propose the Skea-Topo Aware loss, a novel loss function that takes into account the shape of each object and the topological significance of the pixels. It consists of two components. First, a skeleton-aware weighted loss improves segmentation accuracy by better modeling the object geometry with skeletons. Second, a boundary rectified term identifies and emphasizes topologically critical pixels in the prediction errors using both foreground and background skeletons in the ground truth and predictions. Experiments show that our method improves topological consistency by up to 7 points in VI compared to 13 state-of-the-art methods, based on objective and subjective assessments across three different boundary segmentation datasets. The code is available at https://github.com/clovermini/Skea_topo.
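
To make the reported metric concrete, here is a minimal sketch of how VI could be computed between a ground-truth and a predicted label map. It assumes VI stands for variation of information, the clustering-comparison measure often used for boundary segmentation; the function name `variation_of_information` and the toy label maps are illustrative and are not taken from the Skea_topo repository.

```python
import numpy as np


def variation_of_information(seg_a, seg_b):
    """Variation of information (in nats) between two label maps.

    Illustrative helper, not the authors' evaluation code.
    VI = H(A|B) + H(B|A) = 2*H(A,B) - H(A) - H(B); lower is better.
    """
    a = np.asarray(seg_a).ravel()
    b = np.asarray(seg_b).ravel()
    if a.shape != b.shape:
        raise ValueError("label maps must have the same number of pixels")

    # Map arbitrary label values to contiguous indices 0..k-1.
    _, a_idx = np.unique(a, return_inverse=True)
    _, b_idx = np.unique(b, return_inverse=True)

    # Joint histogram of (label in A, label in B) over all pixels.
    contingency = np.zeros((a_idx.max() + 1, b_idx.max() + 1))
    np.add.at(contingency, (a_idx, b_idx), 1)

    p_ab = contingency / a.size
    p_a = p_ab.sum(axis=1)
    p_b = p_ab.sum(axis=0)

    def entropy(p):
        p = p[p > 0]  # skip empty cells so log is defined
        return float(-np.sum(p * np.log(p)))

    return 2 * entropy(p_ab.ravel()) - entropy(p_a) - entropy(p_b)


# Toy example: the prediction splits one ground-truth region in two.
ground_truth = np.array([[1, 1, 1, 1]])
prediction = np.array([[1, 1, 2, 2]])
print(variation_of_information(ground_truth, prediction))
```

Because VI is computed from region labels rather than from boundary pixels, a single misplaced boundary pixel that merges or splits regions changes the score substantially, which is why the metric is sensitive to topological errors.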


Bort: Towards Explainable Neural Networks with Bounded Orthogonal Constraint

arXiv.org Artificial Intelligence

Deep learning has revolutionized human society, yet the black-box nature of deep neural networks hinders their further application in reliability-critical industries. In an attempt to unpack them, many works observe or alter internal variables to improve the comprehensibility and invertibility of black-box models. However, existing methods rely on intuitive assumptions and lack mathematical guarantees. To bridge this gap, we introduce Bort, an optimizer that improves model explainability by imposing boundedness and orthogonality constraints on model parameters, derived from sufficient conditions for model comprehensibility and invertibility. We perform reconstruction and backtracking on the model representations optimized by Bort and observe a clear improvement in model explainability. Based on Bort, we are able to synthesize explainable adversarial samples without additional parameters or training. Surprisingly, we find that Bort consistently improves the classification accuracy of various architectures, including ResNet and DeiT, on MNIST, CIFAR-10, and ImageNet. Code: https://github.com/zbr17/Bort.
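
As a rough illustration of what boundedness and orthogonality constraints on a weight matrix can look like, the sketch below combines a soft orthogonality penalty with clipping to a fixed range. This is a generic construction, not the Bort update rule from the paper or repository; the names `orthogonality_penalty`, `project_bounded`, and `constrained_step`, and all hyperparameter values, are assumptions made for the example.

```python
import numpy as np


def orthogonality_penalty(w):
    """Squared Frobenius norm of (W^T W - I); zero when columns are orthonormal."""
    gram = w.T @ w
    return float(np.sum((gram - np.eye(w.shape[1])) ** 2))


def project_bounded(w, bound=1.0):
    """Keep every weight inside [-bound, bound] (a simple boundedness constraint)."""
    return np.clip(w, -bound, bound)


def constrained_step(w, grad, lr=0.1, ortho_coef=0.1, bound=1.0):
    """One gradient step on the task loss plus the orthogonality penalty,
    followed by projection onto the bounded set.

    Hypothetical sketch: the gradient of ||W^T W - I||_F^2 w.r.t. W
    is 4 * W @ (W^T W - I).
    """
    ortho_grad = 4.0 * w @ (w.T @ w - np.eye(w.shape[1]))
    w_new = w - lr * (grad + ortho_coef * ortho_grad)
    return project_bounded(w_new, bound)


# With a zero task gradient, the step only pulls W toward orthogonality.
w = np.array([[1.0, 0.5],
              [0.0, 1.0]])
before = orthogonality_penalty(w)
after = orthogonality_penalty(constrained_step(w, np.zeros_like(w)))
print(before, after)
```

Folding the constraints into the update step, rather than adding layers or parameters, matches the abstract's description of Bort as an optimizer, although the actual method may enforce them differently.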


A version of the BERT language model that's 20 times as fast

#artificialintelligence

In natural-language understanding (NLU), the Transformer-based BERT language model is king. Its high performance on multiple tasks has strongly influenced contemporary NLU research. On the other hand, it is a relatively big and slow model, which makes it unsuitable for some applications. Multiple efforts have been made to compress the BERT architecture, but the choice of architectural parameters (the number of layers, the number of processing nodes per layer, and so on) has been somewhat arbitrary, and the resulting models are rarely much better than the original at optimizing the balance between the model's size, speed, and error rate. A few weeks ago, we released part of the code for Bort, a highly optimized language model (LM) extracted from the BERT architecture through a combination of two rigorous algorithmic techniques especially designed for neural-network compression.


Amazon's BERT Optimal Subset: 7.9x Faster & 6.3x Smaller Than BERT

#artificialintelligence

The transformer-based architecture BERT has in recent years demonstrated the efficacy of large-scale pretrained models for tackling natural language processing (NLP) tasks such as machine translation and question answering. BERT's large size and complex pretraining process, however, raise usability concerns for many researchers. In a new paper, a pair of Amazon Alexa researchers extract an optimal subset of architectural parameters for the BERT architecture by applying recent breakthroughs in algorithms for neural architecture search. The proposed optimal subset, "Bort," is just 5.5 percent of the effective size of the original BERT-large architecture (not counting the embedding layer), and 16 percent of its net size. Many attempts have been made to extract a simpler sub-architecture of BERT that maintains performance similar to its predecessor while simplifying the pretraining process and shortening inference time. Yet such sub-architectures still fall short of the original implementation in accuracy, the researchers say, and the choice of architectural parameters in these works often appears to be arbitrary.
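
The percentages above can be converted into "times smaller" factors with simple arithmetic, which also shows how the headline's 6.3x figure relates to the 16 percent net size. The helper name `reduction_factor` is illustrative; the inputs are the two percentages quoted in the article.

```python
def reduction_factor(percent_of_original: float) -> float:
    """Convert 'X percent of the original size' into an 'N times smaller' factor."""
    if percent_of_original <= 0:
        raise ValueError("percentage must be positive")
    return 100.0 / percent_of_original


effective = reduction_factor(5.5)   # effective size, excluding the embedding layer
net = reduction_factor(16.0)        # net size, including the embedding layer

print(f"effective size: {effective:.1f}x smaller")  # effective size: 18.2x smaller
print(f"net size: {net:.2f}x smaller")              # net size: 6.25x smaller
```

The net-size factor of 6.25 is consistent with the headline's rounded 6.3x, given that the 16 percent figure is itself rounded.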


This New BERT Is Way Faster & Smaller Than The Original

#artificialintelligence

Recently, researchers at Amazon introduced an optimal subset of the popular BERT architecture, extracted using neural architecture search. This smaller version of BERT is known as Bort and can be pre-trained in 288 GPU hours, which is 1.2% of the time required to pre-train the highest-performing BERT parametric architectural variant, RoBERTa-large. Since its inception, BERT has achieved groundbreaking results on several tasks in natural language processing (NLP) and natural language understanding (NLU). It has made a resounding impact in the area of language modelling as well. However, BERT's usability has repeatedly been called into question because of its large size, slow inference time, and complex pre-training process, among other concerns.