Goto

Collaborating Authors

 Bucharest


Residual Information in Deep Speaker Embedding Architectures

arXiv.org Artificial Intelligence

Speaker embeddings represent a means to extract representative vectorial representations from a speech signal such that the representation pertains to the speaker identity alone. The embeddings are commonly used to classify and discriminate between different speakers. However, there is no objective measure to evaluate the ability of a speaker embedding to disentangle the speaker identity from the other speech characteristics. This means that the embeddings are far from ideal, highly dependent on the training corpus and still include a degree of residual information pertaining to factors such as linguistic content, recording conditions or speaking style of the utterance. This paper introduces an analysis over six sets of speaker embeddings extracted with some of the most recent and high-performing DNN architectures, and in particular, the degree to which they are able to truly disentangle the speaker identity from the speech signal. To correctly evaluate the architectures, a large multi-speaker parallel speech dataset is used. The dataset includes 46 speakers uttering the same set of prompts, recorded in either a professional studio or their home environments. The analysis looks into the intra- and inter-speaker similarity measures computed over the different embedding sets, as well as if simple classification and regression methods are able to extract several residual information factors from the speaker embeddings. The results show that the discriminative power of the analyzed embeddings is very high, yet across all the analyzed architectures, residual information is still present in the representations in the form of a high correlation to the recording conditions, linguistic contents and utterance duration.


It's Just a Matter of Time: Detecting Depression with Time-Enriched Multimodal Transformers

arXiv.org Artificial Intelligence

Depression detection from user-generated content on the internet has been a long-lasting topic of interest in the research community, providing valuable screening tools for psychologists. The ubiquitous use of social media platforms lays out the perfect avenue for exploring mental health manifestations in posts and interactions with other users. Current methods for depression detection from social media mainly focus on text processing, and only a few also utilize images posted by users. In this work, we propose a flexible time-enriched multimodal transformer architecture for detecting depression from social media posts, using pretrained models for extracting image and text embeddings. Our model operates directly at the user-level, and we enrich it with the relative time between posts by using time2vec positional embeddings. Moreover, we propose another model variant, which can operate on randomly sampled and unordered sets of posts to be more robust to dataset noise. We show that our method, using EmoBERTa and CLIP embeddings, surpasses other methods on two multimodal datasets, obtaining state-of-the-art results of 0.931 F1 score on a popular multimodal Twitter dataset, and 0.902 F1 score on the only multimodal Reddit dataset.


DANES: Deep Neural Network Ensemble Architecture for Social and Textual Context-aware Fake News Detection

arXiv.org Artificial Intelligence

The growing popularity of social media platforms has simplified the creation and distribution of news articles but also creates a conduit for spreading fake news. In consequence, the need arises for effective context-aware fake news detection mechanisms, where the contextual information can be built either from the textual content of posts or from available social data (e.g., information about the users, reactions to posts, or the social network). In this paper, we propose DANES, a Deep Neural Network Ensemble Architecture for Social and Textual Context-aware Fake News Detection. DANES comprises a Text Branch for a textual content-based context and a Social Branch for the social context. These two branches are used to create a novel Network Embedding. Preliminary ablation results on 3 real-world datasets, i.e., BuzzFace, Twitter15, and Twitter16, are promising, with an accuracy that outperforms state-of-the-art solutions when employing both social and textual content features.


EDSA-Ensemble: an Event Detection Sentiment Analysis Ensemble Architecture

arXiv.org Artificial Intelligence

As social media platforms grow more and more each day, it also increases the need to analyze and understand certain aspects, such as the impact of important or spiking topics over the network[49]. Event Detection techniques are used to automatically identify important or spiking topics by analysing social media data. In this paper, we use the angle of the positive emotion generated by these topics for the users and the magnitude, both reach and time span, in order to better understand what is happening on social media platforms, mainly Twitter. Sentiment Analysis is a field in Natural Language Processing that analyzes user opinions and emotions from written language [38, 66], while Event Detection deals with analyzing information diffusion in graph networks [24]. Although there is a large volume of work done on Event Detection using social media data and on Sentiment Analysis of this type of content, in the current literature, there is a shortcoming of the approaches that combine the two domains. There are multiple communities that are involved in mining, gathering, and giving some meaning to the vast amount of content generated daily by the users of those platforms, namely the Network Analysis and Natural Language Processing communities. The two communities are using different types of approaches since they have different purposes: For the Network Analysis community, the main purpose is developing methods to deal with the spread and mitigation of harmful content using Event Detection. Event Detection is used to detect the impact and spread of topics on Social Networks using multiple types of approaches such as sliding windows, topic detection, etc.


ContCommRTD: A Distributed Content-based Misinformation-aware Community Detection System for Real-Time Disaster Reporting

arXiv.org Artificial Intelligence

Real-time social media data can provide useful information on evolving hazards. Alongside traditional methods of disaster detection, the integration of social media data can considerably enhance disaster management. In this paper, we investigate the problem of detecting geolocation-content communities on Twitter and propose a novel distributed system that provides in near real-time information on hazard-related events and their evolution. We show that content-based community analysis leads to better and faster dissemination of reports on hazards. Our distributed disaster reporting system analyzes the social relationship among worldwide geolocated tweets, and applies topic modeling to group tweets by topics. Considering for each tweet the following information: user, timestamp, geolocation, retweets, and replies, we create a publisher-subscriber distribution model for topics. We use content similarity and the proximity of nodes to create a new model for geolocation-content based communities. Users can subscribe to different topics in specific geographical areas or worldwide and receive real-time reports regarding these topics. As misinformation can lead to increase damage if propagated in hazards related tweets, we propose a new deep learning model to detect fake news. The misinformed tweets are then removed from display. We also show empirically the scalability capabilities of the proposed system.


Same or Different? Diff-Vectors for Authorship Analysis

arXiv.org Artificial Intelligence

We investigate the effects on authorship identification tasks of a fundamental shift in how to conceive the vectorial representations of documents that are given as input to a supervised learner. In ``classic'' authorship analysis a feature vector represents a document, the value of a feature represents (an increasing function of) the relative frequency of the feature in the document, and the class label represents the author of the document. We instead investigate the situation in which a feature vector represents an unordered pair of documents, the value of a feature represents the absolute difference in the relative frequencies (or increasing functions thereof) of the feature in the two documents, and the class label indicates whether the two documents are from the same author or not. This latter (learner-independent) type of representation has been occasionally used before, but has never been studied systematically. We argue that it is advantageous, and that in some cases (e.g., authorship verification) it provides a much larger quantity of information to the training process than the standard representation. The experiments that we carry out on several publicly available datasets (among which one that we here make available for the first time) show that feature vectors representing pairs of documents (that we here call Diff-Vectors) bring about systematic improvements in the effectiveness of authorship identification tasks, and especially so when training data are scarce (as it is often the case in real-life authorship identification scenarios). Our experiments tackle same-author verification, authorship verification, and closed-set authorship attribution; while DVs are naturally geared for solving the 1st, we also provide two novel methods for solving the 2nd and 3rd that use a solver for the 1st as a building block.


A comparison of several AI techniques for authorship attribution on Romanian texts

arXiv.org Artificial Intelligence

Determining the author of a text is a difficult task. Here we compare multiple AI techniques for classifying literary texts written by multiple authors by taking into account a limited number of speech parts (prepositions, adverbs, and conjunctions). We also introduce a new dataset composed of texts written in the Romanian language on which we have run the algorithms. The compared methods are Artificial Neural Networks, Support Vector Machines, Multi Expression Programming, Decision Trees with C5.0, and k-Nearest Neighbour. Numerical experiments show, first of all, that the problem is difficult, but some algorithms are able to generate decent errors on the test set.


Threats, Vulnerabilities, and Controls of Machine Learning Based Systems: A Survey and Taxonomy

arXiv.org Artificial Intelligence

In this article, we propose the Artificial Intelligence Security Taxonomy to systematize the knowledge of threats, vulnerabilities, and security controls of machine-learning-based (ML-based) systems. We first classify the damage caused by attacks against ML-based systems, define ML-specific security, and discuss its characteristics. Next, we enumerate all relevant assets and stakeholders and provide a general taxonomy for ML-specific threats. Then, we collect a wide range of security controls against ML-specific threats through an extensive review of recent literature. Finally, we classify the vulnerabilities and controls of an ML-based system in terms of each vulnerable asset in the system's entire lifecycle.


A review of clustering models in educational data science towards fairness-aware learning

arXiv.org Artificial Intelligence

Ensuring fairness is essential for every education system. Machine learning is increasingly supporting the education system and educational data science (EDS) domain, from decision support to educational activities and learning analytics. However, the machine learning-based decisions can be biased because the algorithms may generate the results based on students' protected attributes such as race or gender. Clustering is an important machine learning technique to explore student data in order to support the decision-maker, as well as support educational activities, such as group assignments. Therefore, ensuring high-quality clustering models along with satisfying fairness constraints are important requirements. This chapter comprehensively surveys clustering models and their fairness in EDS. We especially focus on investigating the fair clustering models applied in educational activities. It is believed that these models are practical tools for analyzing students' data and ensuring fairness in EDS.


Do Performance Aspirations Matter for Guiding Software Configuration Tuning?

arXiv.org Artificial Intelligence

Configurable software systems can be tuned for better performance. Leveraging on some Pareto optimizers, recent work has shifted from tuning for a single, time-related performance objective to two intrinsically different objectives that assess distinct performance aspects of the system, each with varying aspirations. Before we design better optimizers, a crucial engineering decision to make therein is how to handle the performance requirements with clear aspirations in the tuning process. For this, the community takes two alternative optimization models: either quantifying and incorporating the aspirations into the search objectives that guide the tuning, or not considering the aspirations during the search but purely using them in the later decision-making process only. However, despite being a crucial decision that determines how an optimizer can be designed and tailored, there is a rather limited understanding of which optimization model should be chosen under what particular circumstance, and why. In this paper, we seek to close this gap. Firstly, we do that through a review of over 426 papers in the literature and 14 real-world requirements datasets. Drawing on these, we then conduct a comprehensive empirical study that covers 15 combinations of the state-of-the-art performance requirement patterns, four types of aspiration space, three Pareto optimizers, and eight real-world systems/environments, leading to 1,296 cases of investigation. We found that (1) the realism of aspirations is the key factor that determines whether they should be used to guide the tuning; (2) the given patterns and the position of the realistic aspirations in the objective landscape are less important for the choice, but they do matter to the extents of improvement; (3) the available tuning budget can also influence the choice for unrealistic aspirations but it is insignificant under realistic ones.