 Bucharest


Visual Question Answering: A Survey on Techniques and Common Trends in Recent Literature

arXiv.org Artificial Intelligence

Visual Question Answering (VQA) is a multi-disciplinary artificial intelligence research problem that has attracted the attention of researchers from computer vision, natural language processing, knowledge representation, and other machine learning communities. The VQA task is to generate a natural language answer to a natural language question asked about an image. In recent years, as the field has flourished, datasets, metrics, and models have been proposed, and the scope of research has expanded. Although artificial intelligence has solved several different problems, such as image classification and natural language processing (NLP), it is hard to model a problem that needs different types of data. For instance, combining computer vision with NLP to retrieve information about an image from a question has challenged researchers for several years.


Adverbs, Surprisingly

arXiv.org Artificial Intelligence

This paper begins with the premise that adverbs are neglected in computational linguistics. This view derives from two analyses: a literature review and a novel adverb dataset to probe a state-of-the-art language model, thereby uncovering systematic gaps in accounts for adverb meaning. We suggest that using Frame Semantics for characterizing word meaning, as in FrameNet, provides a promising approach to adverb analysis, given its ability to describe ambiguity, semantic roles, and null instantiation.


HaVQA: A Dataset for Visual Question Answering and Multimodal Research in Hausa Language

arXiv.org Artificial Intelligence

This paper presents HaVQA, the first multimodal dataset for visual question-answering (VQA) tasks in the Hausa language. The dataset was created by manually translating 6,022 English question-answer pairs, which are associated with 1,555 unique images from the Visual Genome dataset. As a result, the dataset provides 12,044 gold standard English-Hausa parallel sentences that were translated in a fashion that guarantees their semantic match with the corresponding visual information. We conducted several baseline experiments on the dataset, including visual question answering, visual question elicitation, text-only and multimodal machine translation.


A Distributed Automatic Domain-Specific Multi-Word Term Recognition Architecture using Spark Ecosystem

arXiv.org Artificial Intelligence

Automatic Term Recognition (ATR) is used to extract the domain-specific terms that make up the terminology of a domain. A term can be defined as a linguistic structure or a concept, and it is composed of one or more words with a specific meaning in a domain. With the exponential growth of technical and scientific articles, new domain-specific terms appear daily as named entities (e.g., Apache Spark), idioms (e.g., Big Data), multi-word expressions (e.g., recurrent neural networks), or through semantic change and shifts (e.g., local neighborhood). Methods that can automatically recognize and extract these domain-specific terms are useful for both scientists and professionals to improve existing systems (e.g., WordNet [4], OntoLex-FRaC [3]) that deal with linguistics, terminology, and machine-readable technologies. ATR methods [6, 7, 5, 8, 9, 11] consist of two main phases. The first phase extracts a list of candidate terms that scoring metrics later rank by their importance to a given domain. To extract this list, words are tagged with their part of speech (PoS), and candidate multi-word terms are extracted using language-dependent linguistic filters [10]. The second phase is specific to each method and involves computing a score of domain relevance from different term statistics, e.g., frequency, context, and number of similar terms. Users can process large volumes of textual data when employing ATR methods. The extraction and recognition of domain-specific terms can be improved by developing the methods on top of distributed ecosystems such as Apache Hadoop and Apache Spark.
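The two phases described above can be illustrated with a minimal single-machine sketch. It is not the architecture from the paper: the hand-tagged input, the tag names, the adjective/noun filter, and the function names are all assumptions made for illustration, and raw frequency stands in for the method-specific relevance score.

```python
from collections import Counter

# Hypothetical hand-tagged input; a real pipeline would run a PoS tagger first.
TAGGED = [
    ("recurrent", "ADJ"), ("neural", "ADJ"), ("networks", "NOUN"),
    ("process", "VERB"), ("text", "NOUN"), (".", "PUNCT"),
    ("recurrent", "ADJ"), ("neural", "ADJ"), ("networks", "NOUN"),
    ("use", "VERB"), ("hidden", "ADJ"), ("states", "NOUN"), (".", "PUNCT"),
]


def extract_candidates(tagged, min_len=2, max_len=4):
    """Phase 1: collect n-grams that pass a simple linguistic filter.

    Assumed filter: every token is an adjective or a noun, and the
    last token is a noun.
    """
    candidates = []
    for start in range(len(tagged)):
        for length in range(min_len, max_len + 1):
            window = tagged[start:start + length]
            if len(window) < length:
                break  # ran past the end of the input
            tags = [tag for _, tag in window]
            if all(t in ("ADJ", "NOUN") for t in tags) and tags[-1] == "NOUN":
                candidates.append(" ".join(tok.lower() for tok, _ in window))
    return candidates


def score_by_frequency(candidates):
    """Phase 2 (stand-in): rank candidates by raw frequency."""
    return Counter(candidates)


scores = score_by_frequency(extract_candidates(TAGGED))
for term, freq in scores.most_common():
    print(term, freq)
```

In a distributed setting, the candidate extraction would typically run per document and the counting would become an aggregation step, but that mapping is not shown here.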


Romanian Multiword Expression Detection Using Multilingual Adversarial Training and Lateral Inhibition

arXiv.org Artificial Intelligence

Multiword expressions are a key ingredient for developing large-scale and linguistically sound natural language processing technology. This paper describes our improvements in automatically identifying Romanian multiword expressions on the corpus released for the PARSEME v1.2 shared task. Our approach assumes a multilingual perspective based on the recently introduced lateral inhibition layer and adversarial training to boost the performance of the employed multilingual language models. With the help of these two methods, we improve the F1-score of XLM-RoBERTa by approximately 2.7% on unseen multiword expressions, the main task of the PARSEME 1.2 edition. In addition, our results can be considered SOTA performance, as they outperform the previous results on Romanian obtained by the participants in this competition.


Big Data and Large Numbers. Interpreting Zipf's Law

arXiv.org Artificial Intelligence

It turns out that some empirical facts in Big Data are the effects of properties of large numbers. Zipf's law 'noise' is an example of such an artefact. We expose several properties of power law distributions and of similar distributions that occur when the population is finite and the ranks and counts of elements in the population are natural numbers. We are particularly concerned with the low-rank end of the graph of the law, the potential of noise in the law, and the approximation of the number of types of objects at various ranks. Approximations instead of exact solutions are the center of attention. Consequences for the interpretation of Zipf's law are discussed.
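The effect of integer counts can be seen in a small sketch. It assumes an idealized law in which the item at rank r occurs floor(C / r) times, which is an illustrative model and not necessarily the one analyzed in the paper; the function names are hypothetical. Under this assumption, the number of types with count k is exactly floor(C / k) - floor(C / (k + 1)), which is close to C / (k (k + 1)).

```python
def zipf_integer_counts(top_count, n_types):
    """Counts under an idealized Zipf law, forced to be natural numbers."""
    return [top_count // rank for rank in range(1, n_types + 1)]


def types_with_count(counts, k):
    """Number of distinct types that occur exactly k times."""
    return sum(1 for c in counts if c == k)


def approx_types_with_count(top_count, k):
    """Continuous approximation C / (k * (k + 1)) of the same quantity."""
    return top_count / (k * (k + 1))


counts = zipf_integer_counts(top_count=1000, n_types=1000)
for k in (1, 2, 3):
    print(k, types_with_count(counts, k), round(approx_types_with_count(1000, k), 1))
```

Because many ranks share the same small count, the rank-frequency plot turns into a staircase at that end, which can look like noise around the straight line of the ideal law.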


SemEval-2023 Task 12: Sentiment Analysis for African Languages (AfriSenti-SemEval)

arXiv.org Artificial Intelligence

We present the first Africentric SemEval shared task, Sentiment Analysis for African Languages (AfriSenti-SemEval). The dataset is available at https://github.com/afrisenti-semeval/afrisent-semeval-2023. AfriSenti-SemEval is a sentiment classification challenge in 14 African languages: Amharic, Algerian Arabic, Hausa, Igbo, Kinyarwanda, Moroccan Arabic, Mozambican Portuguese, Nigerian Pidgin, Oromo, Swahili, Tigrinya, Twi, Xitsonga, and Yorùbá (Muhammad et al., 2023), using data labeled with 3 sentiment classes. We present three subtasks: (1) Task A: monolingual classification, which received 44 submissions; (2) Task B: multilingual classification, which received 32 submissions; and (3) Task C: zero-shot classification, which received 34 submissions. The best performance for tasks A and B was achieved by the NLNDE team, with 71.31 and 75.06 weighted F1, respectively. UCAS-IIE-NLP achieved the best average score for task C, with 58.15 weighted F1. We describe the approaches adopted by the top 10 systems.
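For readers unfamiliar with the metric, weighted F1 averages the per-class F1 scores, weighting each class by its share of the gold labels. The sketch below is a plain-Python illustration under that standard definition; the function name and the toy labels are assumptions, and this is not the official scoring script of the task.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == label and p == label)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != label and p == label)
        fn = n - tp
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (n / total) * f1
    return score


# Toy example with three sentiment classes (labels are illustrative).
gold = ["pos", "pos", "neg", "neu"]
pred = ["pos", "neg", "neg", "neu"]
result = weighted_f1(gold, pred)
print(result)
```

The function returns a value between 0 and 1; scores such as 71.31 correspond to the same quantity expressed as a percentage.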


HausaNLP at SemEval-2023 Task 12: Leveraging African Low Resource TweetData for Sentiment Analysis

arXiv.org Artificial Intelligence

We present the findings of SemEval-2023 Task 12, a shared task on sentiment analysis for low-resource African languages using a Twitter dataset. The task featured three subtasks: subtask A is monolingual sentiment classification with 12 tracks, one per language; subtask B is multilingual sentiment classification using the tracks in subtask A; and subtask C is zero-shot sentiment classification. We present the results and findings of subtasks A, B, and C. We also release the code on GitHub. Our goal is to leverage low-resource tweet data using pre-trained Afro-xlmr-large, AfriBERTa-Large, Bert-base-arabic-camelbert-da-sentiment (Arabic-camelbert), Multilingual-BERT (mBERT), and BERT models for sentiment analysis of 14 African languages. The datasets for these subtasks consist of gold-standard, multi-class labeled Twitter data in these languages. Our results demonstrate that the Afro-xlmr-large model performed better than the other models on most of the language datasets. Similarly, the Nigerian languages Hausa, Igbo, and Yoruba achieved better performance than the other languages, which can be attributed to the higher volume of data available for them.


Identifying Appropriate Intellectual Property Protection Mechanisms for Machine Learning Models: A Systematization of Watermarking, Fingerprinting, Model Access, and Attacks

arXiv.org Artificial Intelligence

The commercial use of Machine Learning (ML) is spreading; at the same time, ML models are becoming more complex and more expensive to train, which makes Intellectual Property Protection (IPP) of trained models a pressing issue. Unlike other domains that can build on a solid understanding of the threats, attacks, and defenses available to protect their IP, the ML-related research in this regard is still very fragmented. This is also due to the lack of a unified view and of a common taxonomy of these aspects. In this paper, we systematize our findings on IPP in ML, focusing on the threats and attacks identified and the defenses proposed at the time of writing. We develop a comprehensive threat model for IP in ML, categorizing attacks and defenses within a unified and consolidated taxonomy, thus bridging research from both the ML and security communities.


Interpolation property of shallow neural networks

arXiv.org Artificial Intelligence

We study the geometry of global minima of the loss landscape of overparametrized neural networks. In light of the interpolation threshold outlined in [1], one of the important issues in neural networks is to have guarantees that interpolation is indeed achieved. We tackle this problem for the case of shallow neural networks and show that this holds true in general as long as the activation is not a polynomial of low degree. Standard optimization problems are posed for a convex loss function, in which case there is only a global minimum. Another class of optimization problems concerns nonconvex loss functions that have a discrete set of global minima. Recently there has been interesting progress aimed at understanding the locus of the global minima for overparametrized neural networks ([3], [6]) when the activation function is continuous. In this paper, we generalize these results in Section 2 to a larger class of activation functions. More precisely, we prove that in the overparametrized regime, we can interpolate any data set consisting of d points with a shallow neural network having at least d neurons on the hidden layer and with an activation function which is locally integrable and not almost everywhere a polynomial of degree at most d − 2. In addition, if the activation function is also smooth, the locus of global minima of the loss landscape of an over-parametrized neural network is a submanifold of R
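A one-dimensional special case makes the interpolation claim concrete. The sketch below is not the paper's proof: it assumes scalar inputs with distinct values, uses ReLU (which is locally integrable and not a polynomial), and picks the hidden-layer thresholds by hand so that the d-by-d feature matrix is lower triangular and the d output weights follow by forward substitution. All function names are hypothetical.

```python
def relu(z):
    return z if z > 0.0 else 0.0


def fit_shallow_relu(xs, ys):
    """Fit f(x) = sum_j a_j * relu(x - t_j) with d hidden neurons to d points.

    Assumes the xs are distinct scalars. Returns (thresholds, weights).
    """
    d = len(xs)
    order = sorted(range(d), key=lambda i: xs[i])
    xs_sorted = [xs[i] for i in order]
    ys_sorted = [ys[i] for i in order]
    # First threshold lies below all points; the others are midpoints
    # between consecutive points, so neuron j is inactive on points 0..j-1.
    thresholds = [xs_sorted[0] - 1.0] + [
        0.5 * (xs_sorted[j - 1] + xs_sorted[j]) for j in range(1, d)
    ]
    # A[i][j] = relu(x_i - t_j) is lower triangular with a positive
    # diagonal, so forward substitution gives the output weights.
    weights = []
    for i in range(d):
        partial = sum(
            weights[j] * relu(xs_sorted[i] - thresholds[j]) for j in range(i)
        )
        diag = relu(xs_sorted[i] - thresholds[i])
        weights.append((ys_sorted[i] - partial) / diag)
    return thresholds, weights


def predict(x, thresholds, weights):
    return sum(a * relu(x - t) for a, t in zip(weights, thresholds))


xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, -2.0, 0.5, 4.0]
thresholds, weights = fit_shallow_relu(xs, ys)
errors = [abs(predict(x, thresholds, weights) - y) for x, y in zip(xs, ys)]
print(max(errors))
```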