 Bucharest


Rethinking the Authorship Verification Experimental Setups

arXiv.org Artificial Intelligence

One of the main drivers of the recent advances in authorship verification is the PAN large-scale authorship dataset. Although the dataset has generated significant progress in the field, inconsistent performance differences between the closed and open test sets have been reported. To this end, we improve the experimental setup by proposing five new public splits over the PAN dataset, specifically designed to isolate and identify biases related to the text topic and to the author's writing style. We evaluate several BERT-like baselines on these splits, showing that such models are competitive with state-of-the-art authorship verification methods. Furthermore, using explainable AI, we find that these baselines are biased towards named entities. We show that models trained without the named entities obtain better results and generalize better when tested on DarkReddit, our new dataset for authorship verification.


VeriDark: A Large-Scale Benchmark for Authorship Verification on the Dark Web

arXiv.org Artificial Intelligence

The Dark Web represents a hotbed for illicit activity, where users communicate on different market forums in order to exchange goods and services. Law enforcement agencies benefit from forensic tools that perform authorship analysis, in order to identify and profile users based on their textual content. However, authorship analysis has been traditionally studied using corpora featuring literary texts such as fragments from novels or fan fiction, which may not be suitable in a cybercrime context. Moreover, the few works that employ authorship analysis tools for cybercrime prevention usually rely on ad-hoc experimental setups and datasets. To address these issues, we release VeriDark: a benchmark comprising three large-scale authorship verification datasets and one authorship identification dataset obtained from user activity from either Dark Web-related Reddit communities or popular illicit Dark Web market forums. We evaluate competitive NLP baselines on the three datasets and perform an analysis of the predictions to better understand the limitations of such approaches. We make the datasets and baselines publicly available at https://github.com/bit-ml/VeriDark.


Large scale traffic forecasting with gradient boosting, Traffic4cast 2022 challenge

arXiv.org Artificial Intelligence

Accurate traffic forecasting is of the utmost importance for optimal travel planning and for efficient city mobility. IARAI (The Institute of Advanced Research in Artificial Intelligence) organizes Traffic4cast, a yearly traffic prediction competition based on real-life data [https://www.iarai.ac.at/traffic4cast/], aiming to leverage artificial intelligence advances for producing accurate traffic estimates. We present our solution to the IARAI Traffic4cast 2022 competition, in which the goal is to develop algorithms for predicting road graph edge congestion classes and supersegment-level travel times. In contrast to previous years, this year's competition focuses on modelling graph edge-level behaviour rather than coarser, aggregated grid-based traffic movies. For this reason, we leverage a method familiar from tabular data modelling: gradient-boosted decision tree ensembles. We reduce the dimensionality of the input data representing traffic counters with the help of the classic PCA method and feed it as input to a LightGBM model. This simple, fast, and scalable technique allowed us to win second place in the core competition. The source code and references to trained model files and submissions are available at https://github.com/skandium/t4c22.
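The dimensionality-reduction step mentioned in this abstract can be illustrated with a small standard-library sketch. It is not the authors' code: the function name `first_principal_component`, the power-iteration method, and the toy inputs are assumptions made for illustration. It computes only the leading principal component, and the gradient-boosted model that would consume the reduced features is omitted.

```python
import math


def first_principal_component(rows, iterations=100):
    """Return (direction, scores) for the leading principal component.

    rows: list of equally long feature vectors (lists of floats).
    Assumes at least two rows and some non-zero variance.
    """
    n, d = len(rows), len(rows[0])

    # Centre every feature column on its mean.
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]

    # Sample covariance matrix (d x d).
    cov = [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]

    # Power iteration: repeatedly apply the covariance matrix and renormalise.
    # The start vector is arbitrary; it must not be orthogonal to the answer.
    v = [1.0 / (j + 1) for j in range(d)]
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]

    # Project each centred row onto the component to get one reduced feature.
    scores = [sum(c[j] * v[j] for j in range(d)) for c in centered]
    return v, scores
```

In a pipeline like the one described, reduced features of this kind would then be passed to a gradient-boosted tree model; in practice a library PCA implementation with several components would replace this single-component sketch.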


Deep Learning-Based Anomaly Detection in Synthetic Aperture Radar Imaging

arXiv.org Machine Learning

In this paper, we propose to investigate unsupervised anomaly detection in Synthetic Aperture Radar (SAR) images. Our approach considers anomalies as abnormal patterns that deviate from their surroundings but without any prior knowledge of their characteristics. In the literature, most model-based algorithms face three main issues. First, the speckle noise corrupts the image and potentially leads to numerous false detections. Second, statistical approaches may exhibit deficiencies in modeling spatial correlation in SAR images. Finally, neural networks based on supervised learning approaches are not recommended due to the lack of annotated SAR data, notably for the class of abnormal patterns. Our proposed method aims to address these issues through a self-supervised algorithm. The speckle is first removed through the deep learning SAR2SAR algorithm. Then, an adversarial autoencoder is trained to reconstruct an anomaly-free SAR image. Finally, a change detection processing step is applied between the input and the output to detect anomalies. Experiments are performed to show the advantages of our method compared to the conventional Reed-Xiaoli algorithm, highlighting the importance of an efficient despeckling pre-processing step.
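The final step of the pipeline in this abstract, comparing the input with its anomaly-free reconstruction, can be illustrated with a minimal sketch. This is not the authors' change detector: the function name `residual_anomaly_map`, the absolute-difference residual, and the median-plus-MAD threshold with factor `k` are all assumptions chosen for illustration, and the despeckling and autoencoder stages are taken as given.

```python
from statistics import median


def residual_anomaly_map(image, reconstruction, k=3.0):
    """Flag pixels whose reconstruction error is unusually large.

    image, reconstruction: equally sized 2-D lists of floats.
    k: hypothetical sensitivity factor (threshold = median + k * MAD).
    """
    # Per-pixel absolute difference between input and reconstruction.
    residual = [
        [abs(a - b) for a, b in zip(row_in, row_out)]
        for row_in, row_out in zip(image, reconstruction)
    ]

    # Robust threshold from the median and the median absolute deviation.
    flat = [v for row in residual for v in row]
    med = median(flat)
    mad = median(abs(v - med) for v in flat)
    threshold = med + k * mad

    # Boolean mask: True where the residual exceeds the threshold.
    return [[v > threshold for v in row] for row in residual]
```

A pixel that the reconstruction fails to reproduce yields a large residual and is flagged, while well-reconstructed background stays below the threshold. A robust statistic is used here only so that a few anomalous pixels do not inflate the threshold themselves.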


Deep Crowd Anomaly Detection: State-of-the-Art, Challenges, and Future Research Directions

arXiv.org Artificial Intelligence

Crowd anomaly detection is one of the most popular topics in computer vision in the context of smart cities. A plethora of deep learning methods have been proposed that generally outperform other machine learning solutions. Our review primarily discusses algorithms that were published in mainstream conferences and journals between 2020 and 2022. We present datasets that are typically used for benchmarking, produce a taxonomy of the developed algorithms, and discuss and compare their performances. Our main finding is that the heterogeneities of pre-trained convolutional models have a negligible impact on crowd video anomaly detection performance. We conclude our discussion with fruitful directions for future research.


Multilingual Multimodal Learning with Machine Translated Text

arXiv.org Artificial Intelligence

Most vision-and-language pretraining research focuses on English tasks. However, the creation of multilingual multimodal evaluation datasets (e.g. Multi30K, xGQA, XVNLI, and MaRVL) poses a new challenge in finding high-quality training data that is both multilingual and multimodal. In this paper, we investigate whether machine translating English multimodal data can be an effective proxy for the lack of readily available multilingual data. We call this framework TD-MML: Translated Data for Multilingual Multimodal Learning, and it can be applied to any multimodal dataset and model. We apply it to both pretraining and fine-tuning data with a state-of-the-art model. In order to prevent models from learning from low-quality translated text, we propose two metrics for automatically removing such translations from the resulting datasets. In experiments on five tasks across 20 languages in the IGLUE benchmark, we show that translated data can provide a useful signal for multilingual multimodal learning, both at pretraining and fine-tuning.


Humans.ai wows thousands of people with its synthetic AI Guide in Bucharest

#artificialintelligence

Humans.ai attended the sixth edition of Spotlight, one of the most anticipated outdoor visual art festivals in Bucharest, where it stole the show with its "BRING IT TO LIFE" installation, set up on the historical Revolution Square. To make its first appearance at the Spotlight festival a memorable one, Humans.ai ... More precisely, a video mapping installation on a large-scale sculpture depicting the iconic Humans Head, the protagonist in the company's NFT collection that symbolizes the symbiosis between humans and artificial intelligence. Tens of thousands of tourists and Bucharest residents were in awe of the light show and the vibrant energy emanated by the video projection made by the Humans.ai ... The real pièce de résistance, though, was DIANA, our synthetic avatar capable of speaking every language of the European Union.


Step out of KG: Knowledge Graph Completion via Knowledgeable Retrieval and Reading Comprehension

arXiv.org Artificial Intelligence

Knowledge graphs, as the cornerstone of many AI applications, usually face serious incompleteness problems. In recent years, there have been many efforts to study automatic knowledge graph completion (KGC), most of which use existing knowledge to infer new knowledge. However, in our experiments, we find that not all relations can be obtained by inference, which constrains the performance of existing models. To alleviate this problem, we propose a new model based on information retrieval and reading comprehension, namely IR4KGC. Specifically, we pre-train a knowledge-based information retrieval module that can retrieve documents related to the triples to be completed. Then, the retrieved documents are handed over to the reading comprehension module to generate the predicted answers. In experiments, we find that our model handles relations that cannot be inferred from existing knowledge well, and achieves good results on KGC datasets.


Multimodal Multi-Head Convolutional Attention with Various Kernel Sizes for Medical Image Super-Resolution

arXiv.org Artificial Intelligence

Super-resolving medical images can help physicians provide more accurate diagnoses. In many situations, computed tomography (CT) or magnetic resonance imaging (MRI) techniques capture several scans (modes) during a single investigation, which can jointly be used (in a multimodal fashion) to further boost the quality of super-resolution results. To this end, we propose a novel multimodal multi-head convolutional attention module to super-resolve CT and MRI scans. Our attention module uses the convolution operation to perform joint spatial-channel attention on multiple concatenated input tensors, where the kernel (receptive field) size controls the reduction rate of the spatial attention, and the number of convolutional filters controls the reduction rate of the channel attention. We introduce multiple attention heads, each head having a distinct receptive field size corresponding to a particular reduction rate for the spatial attention. We integrate our multimodal multi-head convolutional attention (MMHCA) into two deep neural architectures for super-resolution and conduct experiments on three data sets. Our empirical results show the superiority of our attention module over the state-of-the-art attention mechanisms used in super-resolution. Moreover, we conduct an ablation study to assess the impact of the components involved in our attention module, e.g. the number of inputs or the number of heads. Our code is freely available at https://github.com/lilygeorgescu/MHCA.


YFACC: A Yorùbá speech-image dataset for cross-lingual keyword localisation through visual grounding

arXiv.org Artificial Intelligence

Visually grounded speech (VGS) models are trained on images paired with unlabelled spoken captions. Such models could be used to build speech systems in settings where it is impossible to get labelled data, e.g. for documenting unwritten languages. However, most VGS studies are in English or other high-resource languages. This paper attempts to address this shortcoming. We collect and release a new single-speaker dataset of audio captions for 6k Flickr images in Yorùbá, a real low-resource language spoken in Nigeria. We train an attention-based VGS model where images are automatically tagged with English visual labels and paired with Yorùbá utterances. This enables cross-lingual keyword localisation: a written English query is detected and located in Yorùbá speech. To quantify the effect of the smaller dataset, we compare to English systems trained on similar and more data. We hope that this new dataset will stimulate research in the use of VGS models for real low-resource languages.