 classification scheme


Transfer Learning with Active Sampling for Rapid Training and Calibration in BCI-P300 Across Health States and Multi-centre Data

Flores, Christian, Contreras, Marcelo, Macedo, Ichiro, Andreu-Perez, Javier

arXiv.org Artificial Intelligence

Machine learning and deep learning advancements have boosted Brain-Computer Interface (BCI) performance, but their wide-scale applicability is limited by factors such as individual health, hardware variations, and cultural differences affecting neural data. Studies often focus on single-site experiments in uniform settings, leading to high performance that may not translate well to real-world diversity. Deep learning models aim to enhance BCI classification accuracy, and transfer learning has been suggested to adapt models to individual neural patterns using a base model trained on others' data. This approach promises better generalizability and reduced overfitting, yet challenges remain in handling diverse and imbalanced datasets from different equipment, subjects, multiple centres in different countries, and both healthy and patient populations for effective model transfer and tuning. In a setting characterized by maximal heterogeneity, we propose P300 wave detection in BCIs employing a convolutional neural network fitted with adaptive transfer learning based on Poisson Disk Sampling (PDS), called Active Sampling (AS), which flexibly adjusts the transition from source data to the target domain. For subject-adaptive training with 40% adaptive fine-tuning, the averaged classification accuracy improved by 5.36% and the standard deviation was reduced by 12.22% across two distinct, internationally replicated datasets. The approach performed better in classification accuracy, computational time, and training efficiency, mainly due to the proposed Active Sampling (AS) method for transfer learning.
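The abstract does not spell out how the sampling step works, so the following is only a minimal sketch of the general idea behind Poisson disk sampling: choose a subset of samples such that no two chosen samples lie closer than a minimum distance. The function name `poisson_disk_subset`, the `radius` value, and the random 2-D feature vectors are all hypothetical, and this greedy dart-throwing variant is not claimed to be the authors' Active Sampling procedure.

```python
import math
import random


def poisson_disk_subset(points, radius, seed=0):
    """Greedy dart throwing: visit points in random order and keep a point
    only if it is at least `radius` away from every point already kept.
    Returns the sorted indices of the kept points."""
    rng = random.Random(seed)
    order = list(range(len(points)))
    rng.shuffle(order)
    kept = []
    for i in order:
        if all(math.dist(points[i], points[j]) >= radius for j in kept):
            kept.append(i)
    return sorted(kept)


# Hypothetical 2-D feature vectors standing in for source-domain samples.
data_rng = random.Random(1)
features = [(data_rng.random(), data_rng.random()) for _ in range(200)]

selected = poisson_disk_subset(features, radius=0.15)
print(f"kept {len(selected)} of {len(features)} samples")
```

The kept subset covers the feature space evenly: every discarded sample lies within `radius` of a kept one, which is why this kind of sampling is attractive for picking a small, diverse set of examples for fine-tuning.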


Class-specific feature selection for classification explainability

Aguilar-Ruiz, Jesus S.

arXiv.org Artificial Intelligence

Feature selection techniques aim at finding a relevant subset of features that performs as well as or better than the original set of features at explaining the behavior of data. Typically, features are extracted from feature ranking or subset selection techniques, and the performance is measured by classification or regression tasks. However, while selected features may not have equal importance for the task, they do have equal importance for each class. This work first introduces a comprehensive review of the class-specific concept, with a focus on feature selection and classification. The fundamental idea of the class-specific concept resides in the understanding that the significance of each feature can vary from one class to another. This contrasts with the traditional class-independent approach, which evaluates the importance of attributes collectively for all classes. For example, in tumor prediction scenarios, each type of tumor may be associated with a distinct subset of relevant features. These features possess significant discriminatory power, enabling the differentiation of one tumor type from others. This class-specific perspective offers a more effective approach to classification tasks by recognizing and leveraging the unique characteristics of each class. Secondly, classification schemes from one-versus-all and one-versus-each strategies are described, and a novel deep one-versus-each strategy is introduced, which offers advantages from the point of view of explainability (feature selection) and decomposability (classification). Thirdly, a novel class-specific relevance matrix is presented, from which some more sophisticated classification schemes can be derived, such as the three-layer class-specific scheme. The potential for further advancements is wide and will open new horizons for exploring novel research directions in multiclass hyperdimensional contexts.
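To make the class-specific idea concrete, here is a small sketch that scores each feature separately for each class, using a one-versus-rest standardized mean difference. This score, the function name `class_specific_scores`, and the toy data are assumptions chosen for illustration; the abstract does not define the paper's relevance matrix, and this is not claimed to reproduce it.

```python
from statistics import mean, pstdev


def class_specific_scores(X, y):
    """Return {class: [score per feature]}, where each score is the absolute
    difference between the feature mean inside the class and outside it,
    divided by the feature's overall standard deviation.
    Illustrative only; assumes at least two classes are present."""
    n_features = len(X[0])
    scores = {}
    for c in sorted(set(y)):
        row = []
        for f in range(n_features):
            inside = [x[f] for x, label in zip(X, y) if label == c]
            outside = [x[f] for x, label in zip(X, y) if label != c]
            spread = pstdev([x[f] for x in X]) or 1.0  # guard constant features
            row.append(abs(mean(inside) - mean(outside)) / spread)
        scores[c] = row
    return scores


# Toy data: feature 0 is elevated in class "A", feature 1 in class "B",
# and feature 2 carries no class information.
X = [
    (5.0, 0.1, 1.0), (5.2, 0.0, 2.0),  # class A
    (0.1, 4.9, 1.0), (0.0, 5.1, 2.0),  # class B
    (0.2, 0.1, 1.0), (0.1, 0.0, 2.0),  # class C
]
y = ["A", "A", "B", "B", "C", "C"]

scores = class_specific_scores(X, y)
top_feature = {c: max(range(3), key=lambda f: s[f]) for c, s in scores.items()}
print(top_feature)
```

In this toy setting the most relevant feature differs between classes "A" and "B", which a single class-independent ranking would hide.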


Predicting building types and functions at transnational scale

Fill, Jonas, Eichelbeck, Michael, Ebner, Michael

arXiv.org Artificial Intelligence

Building-specific knowledge such as building type and function information is important for numerous energy applications. However, comprehensive datasets containing this information for individual households are missing in many regions of Europe. For the first time, we investigate whether it is feasible to predict building types and functional classes at a European scale based on only open GIS datasets available across countries. We train a graph neural network (GNN) classifier on a large-scale graph dataset consisting of OpenStreetMap (OSM) buildings across the EU, Norway, Switzerland, and the UK. To efficiently perform training using the large-scale graph, we utilize localized subgraphs. A graph transformer model achieves a high Cohen's kappa coefficient of 0.754 when classifying buildings into 9 classes, and a very high Cohen's kappa coefficient of 0.844 when classifying buildings into the residential and non-residential classes. The experimental results imply three core novel contributions to the literature. Firstly, we show that building classification across multiple countries is possible using a multi-source dataset consisting of information about 2D building shape, land use, degree of urbanization, and countries as input, and OSM tags as ground truth. Secondly, our results indicate that GNN models that consider contextual information about building neighborhoods improve predictive performance compared to models that only consider individual buildings and ignore the neighborhood. Thirdly, we show that training with GNNs on localized subgraphs instead of standard GNNs improves performance for the task of building classification.
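For readers unfamiliar with the metric reported above, Cohen's kappa compares the observed agreement p_o between predictions and ground truth with the agreement p_e expected by chance: kappa = (p_o - p_e) / (1 - p_e). The sketch below computes it from scratch; the labels are made-up toy values, not data from the paper.

```python
from collections import Counter


def cohens_kappa(y_true, y_pred):
    """Cohen's kappa for two equally long label sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label sequences must be non-empty and equally long")
    n = len(y_true)
    # Observed agreement: fraction of positions where the labels match.
    p_o = sum(t == p for t, p in zip(y_true, y_pred)) / n
    # Chance agreement: product of the marginal label frequencies per class.
    true_counts = Counter(y_true)
    pred_counts = Counter(y_pred)
    p_e = sum(true_counts[c] * pred_counts[c] for c in true_counts) / (n * n)
    if p_e == 1.0:
        return float("nan")  # undefined when both sequences are one constant label
    return (p_o - p_e) / (1 - p_e)


# Toy residential ("res") versus non-residential ("non") labels.
y_true = ["res", "res", "res", "non", "non", "non", "res", "non"]
y_pred = ["res", "res", "non", "non", "non", "res", "res", "non"]

kappa = cohens_kappa(y_true, y_pred)
print(kappa)
```

Here 6 of 8 labels agree (p_o = 0.75) while chance agreement is p_e = 0.5, giving kappa = 0.5. Unlike plain accuracy, the score stays near zero for a classifier that only matches the class frequencies.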


Rumour Evaluation with Very Large Language Models

Shehata, Dahlia, Cohen, Robin, Clarke, Charles

arXiv.org Artificial Intelligence

Conversational prompt-engineering-based large language models (LLMs) have enabled targeted control over the output creation, enhancing versatility, adaptability and ad hoc retrieval. From another perspective, digital misinformation has reached alarming levels. The anonymity, availability and reach of social media offer fertile ground for rumours to propagate. This work proposes to leverage the advancement of prompting-dependent LLMs to combat misinformation by extending the research efforts of the RumourEval task on its Twitter dataset. To this end, we employ two prompting-based LLM variants (GPT-3.5-turbo and GPT-4) to extend the two RumourEval subtasks: (1) veracity prediction, and (2) stance classification. For veracity prediction, three classification schemes are tested per GPT variant. Each scheme is tested in zero-, one- and few-shot settings. Our best results outperform the previous ones by a substantial margin. For stance classification, prompting-based approaches show comparable performance to prior results, with no improvement over finetuning methods. The rumour stance subtask is also extended beyond the original setting to allow multiclass classification. All of the generated predictions for both subtasks are accompanied by confidence scores indicating their degree of trustworthiness according to the LLM, and by post-hoc justifications for explainability and interpretability purposes. Our primary aim is AI for social good.


Company classification using zero-shot learning

Rizinski, Maryan, Jankov, Andrej, Sankaradas, Vignesh, Pinsky, Eugene, Miskovski, Igor, Trajanov, Dimitar

arXiv.org Artificial Intelligence

In recent years, natural language processing (NLP) has become increasingly important in a variety of business applications, including sentiment analysis, text classification, and named entity recognition. In this paper, we propose an approach for company classification using NLP and zero-shot learning. Our method utilizes pre-trained transformer models to extract features from company descriptions, and then applies zero-shot learning to classify companies into relevant categories without the need for specific training data for each category. We evaluate our approach on a dataset obtained through the Wharton Research Data Services (WRDS), which comprises textual descriptions of publicly traded companies. We demonstrate that the approach can streamline the process of company classification, thereby reducing the time and resources required in traditional approaches such as the Global Industry Classification Standard (GICS). The results show that this method has potential for automation of company classification, making it a promising avenue for future research in this area.


No Strong Feelings One Way or Another: Re-operationalizing Neutrality in Natural Language Inference

Nighojkar, Animesh, Laverghetta, Antonio Jr., Licato, John

arXiv.org Artificial Intelligence

Natural Language Inference (NLI) has been a cornerstone task in evaluating language models' inferential reasoning capabilities. However, the standard three-way classification scheme used in NLI has well-known shortcomings in evaluating models' ability to capture the nuances of natural human reasoning. In this paper, we argue that the operationalization of the neutral label in current NLI datasets has low validity, is interpreted inconsistently, and that at least one important sense of neutrality is often ignored. We uncover the detrimental impact of these shortcomings, which in some cases leads to annotation datasets that actually decrease performance on downstream tasks. We compare approaches of handling annotator disagreement and identify flaws in a recent NLI dataset that designs an annotator study based on a problematic operationalization. Our findings highlight the need for a more refined evaluation framework for NLI, and we hope to spark further discussion and action in the NLP community.


Automating privacy decisions -- where to draw the line?

Morel, Victor, Fischer-Hübner, Simone

arXiv.org Artificial Intelligence

Users are often overwhelmed by privacy decisions to manage their personal data, which can happen on the web, in mobile, and in IoT environments. These decisions can take various forms, such as decisions for setting privacy permissions or privacy preferences, decisions responding to consent requests, or decisions to intervene and "reject" processing of one's personal data, and each can have different legal impacts. In all cases and for all types of decisions, scholars and industry have been proposing tools to better automate the process of privacy decisions at different levels, in order to enhance usability. We provide in this paper an overview of the main challenges raised by the automation of privacy decisions, together with a classification scheme of the existing and envisioned work and proposals addressing automation of privacy decisions.


Revisiting Wright: Improving supervised classification of rat ultrasonic vocalisations using synthetic training data

Scott, K. Jack, Speers, Lucinda J., Bilkey, David K.

arXiv.org Artificial Intelligence

Rodents communicate through ultrasonic vocalizations (USVs). These calls are of interest because they provide insight into the development and function of vocal communication, and may prove to be useful as a biomarker for dysfunction in models of neurodevelopmental disorders. Rodent USVs can be categorised into different components, and while manual classification is time consuming, advances in neural computing have allowed for fast and accurate identification and classification. Here, we adapt a convolutional neural network (CNN), VocalMat, created for analysing mouse USVs, for use with rats. We codify a modified schema, adapted from that previously proposed by Wright et al. (2010), for classification, and compare the performance of our adaptation of VocalMat with a benchmark CNN, DeepSqueak. Additionally, we test the effect of inserting synthetic USVs into the training data of our classification network in order to reduce the workload involved in generating a training set. Our results show that the modified VocalMat outperformed the benchmark software on measures of both call identification and classification. Additionally, we found that the augmentation of training data with synthetic images resulted in a marked improvement in the accuracy of VocalMat when it was subsequently used to analyse novel data. The resulting accuracy on the modified Wright categorizations was sufficiently high to allow for the application of this software in rat USV classification in laboratory conditions. Our findings also show that inserting synthetic USV calls into the training set leads to improvements in accuracy with little extra time-cost.


Solution for the EPO CodeFest on Green Plastics: Hierarchical multi-label classification of patents relating to green plastics using deep learning

Qiao, Tingting, Perez, Gonzalo Moro

arXiv.org Artificial Intelligence

This work aims at hierarchical multi-label patent classification for patents disclosing technologies related to green plastics. This is an emerging field for which there is currently no classification scheme, and hence, no labeled data is available, making this task particularly challenging. We first propose a classification scheme for this technology and a way to train a machine learning model to classify patents into the proposed classification scheme. To achieve this, we come up with a strategy to automatically assign labels to patents in order to create a labeled training dataset that can be used to learn a classification model in a supervised learning setting. Using said training dataset, we come up with two classification models, a SciBERT Neural Network (SBNN) model and a SciBERT Hierarchical Neural Network (SBHNN) model. Both models use a BERT model as a feature extractor and on top of it, a neural network as a classifier. We carry out extensive experiments and report commonly used evaluation metrics for this challenging classification problem. The experimental results verify the validity of our approach and show that our model sets a very strong benchmark for this problem. We also interpret our models by visualizing the word importance given by the trained model, which indicates the model is capable of extracting high-level semantic information from input documents. Finally, we highlight how our solution fulfills the evaluation criteria for the EPO CodeFest and we also outline possible directions for future work. Our code has been made available at https://github.com/epo/CF22-Green-Hands


Towards holistic scene understanding: Semantic segmentation and beyond

Meletis, Panagiotis

arXiv.org Artificial Intelligence

This dissertation addresses visual scene understanding and enhances segmentation performance and generalization, training efficiency of networks, and holistic understanding. First, we investigate semantic segmentation in the context of street scenes and train semantic segmentation networks on combinations of various datasets. In Chapter 2 we design a framework of hierarchical classifiers over a single convolutional backbone, and train it end-to-end on a combination of pixel-labeled datasets, improving generalizability and the number of recognizable semantic concepts. Chapter 3 focuses on enriching semantic segmentation with weak supervision and proposes a weakly-supervised algorithm for training with bounding box-level and image-level supervision instead of only with per-pixel supervision. The memory and computational load challenges that arise from simultaneous training on multiple datasets are addressed in Chapter 4. We propose two methodologies for selecting informative and diverse samples from datasets with weak supervision to reduce our networks' ecological footprint without sacrificing performance. Motivated by memory and computation efficiency requirements, in Chapter 5, we rethink simultaneous training on heterogeneous datasets and propose a universal semantic segmentation framework. This framework achieves consistent increases in performance metrics and semantic knowledgeability by exploiting various scene understanding datasets. Chapter 6 introduces the novel task of part-aware panoptic segmentation, which extends our reasoning towards holistic scene understanding. This task combines scene and parts-level semantics with instance-level object detection. In conclusion, our contributions span convolutional network architectures, weakly-supervised learning, and part and panoptic segmentation, paving the way towards a holistic, rich, and sustainable visual scene understanding.