 ccnet


CCNETS: A Novel Brain-Inspired Approach for Enhanced Pattern Recognition in Imbalanced Datasets

arXiv.org Artificial Intelligence

This study introduces CCNETS (Causal Learning with Causal Cooperative Nets), a novel generative model-based classifier designed to tackle the challenge of generating data for imbalanced datasets in pattern recognition. CCNETS is crafted to emulate brain-like information processing and comprises three main components: Explainer, Producer, and Reasoner. Each component is designed to mimic a specific brain function, which aids in generating high-quality datasets and enhancing classification performance. CCNETS's effectiveness is demonstrated on a "fraud dataset" in which normal transactions significantly outnumber fraudulent ones (99.83% vs. 0.17%). Traditional methods often struggle with such imbalances, leading to skewed performance metrics. CCNETS achieved an F1-score of 0.7992, outperforming traditional models such as Autoencoders and Multi-layer Perceptrons (MLP) in the same context, which indicates that it distinguishes normal from fraudulent patterns more accurately. The structure of CCNETS enhances the coherence between generative and classification models, helping to overcome the limitations of pattern recognition that relies solely on generative models. The study emphasizes CCNETS's potential in diverse applications, especially where quality data generation and pattern recognition are key, and presents it as a brain-inspired way to address the current challenges of imbalanced datasets in machine learning.
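To illustrate why the abstract reports an F1-score rather than accuracy, the following minimal sketch scores a trivial classifier that labels every transaction as normal. The sample count of 10,000 is a hypothetical figure chosen only to reproduce the 99.83% vs. 0.17% ratio quoted above; it is not taken from the study.

```python
def f1_score(tp, fp, fn):
    """F1 from confusion-matrix counts; returns 0.0 when undefined."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical dataset size (assumption) with the quoted class ratio.
n_total = 10_000
n_fraud = 17                    # 0.17% of 10,000
n_normal = n_total - n_fraud    # 99.83% of 10,000

# A trivial classifier that predicts "normal" for every transaction:
# it finds no fraud (tp = 0) and misses all of it (fn = n_fraud).
tp, fp, fn, tn = 0, 0, n_fraud, n_normal

accuracy = (tp + tn) / n_total
f1 = f1_score(tp, fp, fn)

print(f"accuracy = {accuracy:.4f}, F1 = {f1:.4f}")
```

The trivial classifier looks almost perfect by accuracy yet has an F1 of zero on the fraud class, which is the kind of skew in performance metrics that the abstract refers to.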


Latent Graph Attention for Enhanced Spatial Context

arXiv.org Artificial Intelligence

Global context in images is valuable in image-to-image translation problems. Conventional attention-based and graph-based models capture the global context to a large extent; however, they are computationally expensive. Moreover, existing approaches are limited to learning only the pairwise semantic relation between any two points on the image. In this paper, we present Latent Graph Attention (LGA), a computationally inexpensive (linear in the number of nodes), stable, and modular framework for incorporating global context into existing architectures. It especially empowers small-scale architectures to perform closer to large architectures, making lightweight architectures more useful for edge devices with lower compute power and lower energy needs. LGA propagates information spatially using a network of locally connected graphs, thereby facilitating the construction of a semantically coherent relation between any two spatially distant points that also takes into account the influence of the intermediate pixels. Moreover, the depth of the graph network can be used to adapt the extent of contextual spread to the target dataset, which gives explicit control over the added computational cost. To enhance the learning mechanism of LGA, we also introduce a novel contrastive loss term that helps the LGA module couple well with the original architecture at the expense of minimal additional computational load. We show that incorporating LGA improves performance on three challenging applications: transparent object segmentation, image restoration for dehazing, and optical flow estimation.
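The relationship between graph depth and contextual spread can be illustrated with a toy example. The sketch below is not the LGA module: it assumes a 1-D chain of nodes and plain neighbor averaging in place of learned attention, and the function name `propagate` is hypothetical. It only shows that stacking locally connected layers lets information travel one hop further per layer, at a cost linear in the number of nodes per layer.

```python
def propagate(values, depth):
    """Apply `depth` rounds of neighbor averaging on a 1-D chain graph.

    Each node is connected only to itself and its immediate neighbors,
    so one round costs O(N) for N nodes.
    """
    n = len(values)
    for _ in range(depth):
        new_values = []
        for i in range(n):
            neighborhood = [values[j] for j in (i - 1, i, i + 1) if 0 <= j < n]
            new_values.append(sum(neighborhood) / len(neighborhood))
        values = new_values
    return values


# A single active node at position 0; every other node starts at zero.
signal = [1.0] + [0.0] * 7

out = propagate(signal, depth=3)
reached = [i for i, v in enumerate(out) if v > 0]
print(reached)  # nodes influenced by node 0 after three layers
```

With a depth of 3, node 0 influences nodes 0 through 3 and nothing beyond. In this toy setting, depth is the knob that trades contextual spread against added computation.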


The Web Can Be Your Oyster for Improving Large Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) encode a large amount of world knowledge. However, because this knowledge is frozen at training time, the models are static and limited by the training data available at that time. To further improve the capacity of LLMs for knowledge-intensive tasks, we consider augmenting LLMs with the large-scale web using a search engine. Unlike previous augmentation sources (e.g., a Wikipedia data dump), the web provides broader, more comprehensive, and constantly updated information. In this paper, we present UNIWEB, a web-augmented LLM trained over 16 knowledge-intensive tasks in a unified text-to-text format. Rather than simply using the contents retrieved from the web, our approach makes two major improvements. First, we propose an adaptive search-engine-assisted learning method that self-evaluates the confidence level of the LLM's predictions and adaptively determines when to refer to the web for more data, which avoids useless or noisy augmentation from the web. Second, we design a pretraining task, continual knowledge learning, based on salient span prediction, to reduce the discrepancy between the encoded and retrieved knowledge. Experiments on a wide range of knowledge-intensive tasks show that our model significantly outperforms previous retrieval-augmented methods.
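The adaptive retrieval idea described above, consulting the web only when the model is unsure, might be sketched as follows. Everything here is an assumption for illustration: the function names, the `(answer, confidence)` return convention, the fixed threshold of 0.5, and the toy model and search stand-ins are hypothetical and do not reflect UNIWEB's actual interface or confidence estimator.

```python
def answer_with_adaptive_search(question, model, search, threshold=0.5):
    """Query the web only when the model's self-assessed confidence is low.

    `model(question, context)` is assumed to return (answer, confidence);
    `search(question)` is assumed to return retrieved text.
    Returns (answer, used_search).
    """
    answer, confidence = model(question, context=None)
    if confidence >= threshold:
        return answer, False          # confident: skip retrieval entirely
    context = search(question)        # unsure: augment with web results
    answer, _ = model(question, context=context)
    return answer, True


# Toy stand-ins (hypothetical) so the sketch runs end to end.
def toy_model(question, context=None):
    if context is not None:
        return f"answer using: {context}", 0.9
    if "France" in question:
        return "Paris", 0.95          # knowledge already encoded in the model
    return "unsure", 0.2              # knowledge the model lacks


def toy_search(question):
    return f"web results for '{question}'"


print(answer_with_adaptive_search("What is the capital of France?", toy_model, toy_search))
print(answer_with_adaptive_search("Who won yesterday's match?", toy_model, toy_search))
```

Gating retrieval on confidence keeps retrieved text out of cases the model already handles, which is the stated motivation for avoiding useless or noisy augmentation.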


Review -- CCNet: Criss-Cross Attention for Semantic Segmentation

#artificialintelligence

In the TPAMI version, besides the cross-entropy segmentation loss l_seg, there is also a category consistent loss that drives the RCCA module to learn category-consistent features directly. Let C be the set of classes, N_c the number of valid elements belonging to category c, h_i the feature vector at spatial position i, and μ_c the mean feature of category c ∈ C (the cluster center). To reduce the computation load, a convolutional layer with 1×1 filters is first applied to the output of the RCCA module for dimension reduction, and then these three losses are applied on the feature map with fewer channels.
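As a small illustration of the quantities defined above, the sketch below computes the cluster center μ_c for each category as the mean of the feature vectors h_i over the N_c valid positions labeled c. It covers only the cluster centers, not the category consistent loss itself. The function name `class_means`, the flat list layout, and the use of -1 as an "invalid" label are assumptions made for this example.

```python
def class_means(features, labels, ignore_label=-1):
    """Return {c: mu_c}, the mean feature vector of each category c.

    features: list of feature vectors h_i, one per spatial position i
    labels:   list of category ids, one per spatial position
    Positions labeled `ignore_label` (an assumed convention) are not valid
    elements and are excluded from N_c.
    """
    sums, counts = {}, {}
    for h, c in zip(features, labels):
        if c == ignore_label:
            continue
        if c not in sums:
            sums[c] = [0.0] * len(h)
            counts[c] = 0
        sums[c] = [s + x for s, x in zip(sums[c], h)]
        counts[c] += 1                       # running N_c
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


# Four spatial positions with 2-channel features; the last one is invalid.
features = [[1.0, 2.0], [3.0, 4.0], [10.0, 0.0], [5.0, 5.0]]
labels = [0, 0, 1, -1]

centers = class_means(features, labels)
print(centers)
```

Here category 0 has N_0 = 2 and category 1 has N_1 = 1; the invalid position contributes to neither center.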


Determining Intoxication With Machine Learning Analysis of Eyes

#artificialintelligence

Researchers from Germany and Chile have developed a new machine learning framework capable of evaluating whether a person is intoxicated, based on near-infrared images of their eyes. The research is aimed at the development of 'fitness for duty' real-time systems capable of assessing the readiness of an individual to perform critical tasks such as driving or operating machinery. It uses a novel, scratch-trained object detector that can individuate a subject's eye components from a single image and evaluate them against a database that includes intoxicated and non-intoxicated eye images. Initially, the system captures and individuates an image of each eye with the You Only Look Once (YOLO) object detection framework. After this, two optimized networks are used to break down the eye images into semantic regions: the Criss Cross attention network (CCNet), released in 2020 by the Huazhong University of Science and Technology, and the DenseNet10 segmentation algorithm, also developed by several of the new paper's researchers in Chile.