Collaborating Authors

 silva


GeeSanBhava: Sentiment Tagged Sinhala Music Video Comment Data Set

De Mel, Yomal, de Silva, Nisansa

arXiv.org Artificial Intelligence

This study introduces GeeSanBhava, a high-quality data set of Sinhala song comments extracted from YouTube and manually tagged using Russell's Valence-Arousal model by three independent human annotators. The human annotators achieve a substantial inter-annotator agreement (Fleiss' kappa = 84.96%). The analysis revealed distinct emotional profiles for different songs, highlighting the importance of comment-based emotion mapping. The study also addressed the challenges of comparing comment-based and song-based emotions, mitigating biases inherent in user-generated content. A number of machine learning and deep learning models were pre-trained on a related large data set of Sinhala News comments in order to report the zero-shot result of our Sinhala YouTube comment data set. An optimized Multi-Layer Perceptron model, after extensive hyperparameter tuning, achieved a ROC-AUC score of 0.887. The model is a three-layer MLP with a configuration of 256, 128, and 64 neurons. This research contributes a valuable annotated dataset and provides insights for future work in Sinhala Natural Language Processing and music emotion recognition.
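For readers unfamiliar with the agreement statistic quoted above, the following is a minimal sketch of how Fleiss' kappa is computed for a fixed number of raters per item. The function name `fleiss_kappa`, the label strings, and the toy ratings are illustrative assumptions; they are not taken from the GeeSanBhava data set or its code.

```python
from collections import Counter


def fleiss_kappa(ratings, categories):
    """Fleiss' kappa for items that were each labelled by the same number of raters.

    ratings:    list of per-item label lists, e.g. [["a", "a", "b"], ...]
    categories: all labels that raters could assign
    """
    n_items = len(ratings)
    n_raters = len(ratings[0])

    # counts[i][c] = number of raters who assigned category c to item i
    counts = [Counter(item) for item in ratings]

    # Observed agreement per item, then averaged over items
    p_items = [
        (sum(c[cat] ** 2 for cat in categories) - n_raters)
        / (n_raters * (n_raters - 1))
        for c in counts
    ]
    p_bar = sum(p_items) / n_items

    # Chance agreement from the overall proportion of each category
    p_cats = [
        sum(c[cat] for c in counts) / (n_items * n_raters) for cat in categories
    ]
    p_e = sum(p * p for p in p_cats)

    # Undefined (division by zero) if every rating is the same single category
    return (p_bar - p_e) / (1 - p_e)


# Hypothetical toy data: four comments, three annotators, two labels
toy = [["a", "a", "a"], ["b", "b", "b"], ["a", "a", "b"], ["b", "b", "a"]]
kappa = fleiss_kappa(toy, ["a", "b"])
```

A value of 1 means perfect agreement and 0 means agreement no better than chance; the abstract reports the statistic as a percentage.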


Aspect-Based Sentiment Analysis Techniques: A Comparative Study

Jayakody, Dineth, Isuranda, Koshila, Malkith, A V A, de Silva, Nisansa, Ponnamperuma, Sachintha Rajith, Sandamali, G G N, Sudheera, K L K

arXiv.org Artificial Intelligence

Since the dawn of the digitalisation era, customer feedback and online reviews are unequivocally major sources of insights for businesses. Consequently, conducting comparative analyses of such sources has become the de facto modus operandi of any business that wishes to give itself a competitive edge over its peers and improve customer loyalty. Sentiment analysis is one such method instrumental in gauging public interest, exposing market trends, and analysing competitors. While traditional sentiment analysis focuses on overall sentiment, as the needs advance with time, it has become important to explore public opinions and sentiments on various specific subjects, products and services mentioned in the reviews at a more fine-grained level. To this end, Aspect-based Sentiment Analysis (ABSA), supported by advances in Artificial Intelligence (AI) techniques which have contributed to a paradigm shift from simple word-level analysis to tone and context-aware analyses, focuses on identifying specific aspects within the text and determining the sentiment associated with each aspect. In this study, we compare several deep-NN methods for ABSA on two benchmark datasets (Restaurant14 and Laptop-14) and find that FAST LSA obtains the best overall results of 87.6% and 82.6% accuracy but does not pass LSA+DeBERTa which reports 90.33% and 86.21% accuracy respectively.


Shoulders of Giants: A Look at the Degree and Utility of Openness in NLP Research

Ranathunga, Surangika, de Silva, Nisansa, Jayakody, Dilith, Fernando, Aloka

arXiv.org Artificial Intelligence

We analysed a sample of NLP research papers archived in ACL Anthology as an attempt to quantify the degree of openness and the benefit of such an open culture in the NLP community. We observe that papers published in different NLP venues show different patterns related to artefact reuse. We also note that more than 30% of the papers we analysed do not release their artefacts publicly, despite promising to do so. Further, we observe a wide language-wise disparity in publicly available NLP-related artefacts.


US's Blinken begins four-nation Africa tour amid Sahel worries

Al Jazeera

United States Secretary of State Antony Blinken on Monday said the US is committed to deeper relations with Africa despite global crises as he opened a four-country tour of the continent. Blinken is touring four democracies on the Atlantic Coast – Cape Verde, Ivory Coast, Nigeria and Angola – as security deteriorates in the Sahel and doubts grow about a key US base in neighbouring coup-hit Niger. US President Joe Biden welcomed leaders from Africa in 2022 in a show of newfound attention to the continent. But he did not visit Africa last year as promised. Blinken nonetheless quoted Biden as he vowed, "We are all in when it comes to Africa."


Sinhala-English Word Embedding Alignment: Introducing Datasets and Benchmark for a Low Resource Language

Wickramasinghe, Kasun, de Silva, Nisansa

arXiv.org Artificial Intelligence

Since their inception, embeddings have become a primary ingredient in many flavours of Natural Language Processing (NLP) tasks, supplanting earlier types of representation. Even though multilingual embeddings have been used for the increasing number of multilingual tasks, due to the scarcity of parallel training data, low-resource languages such as Sinhala tend to focus more on monolingual embeddings. Then, when it comes to the aforementioned multilingual tasks, it is challenging to utilize these monolingual embeddings given that even if the embedding spaces have a similar geometric arrangement due to an identical training process, the embeddings of the languages considered are not aligned. This is solved by the embedding alignment task. Even in this, high-resource language pairs are in the limelight while low-resource languages such as Sinhala, which is in dire need of help, seem to have fallen by the wayside. In this paper, we try to align Sinhala and English word embedding spaces based on available alignment techniques and introduce a benchmark for Sinhala language embedding alignment. In addition to that, to facilitate the supervised alignment, as an intermediate task, we also introduce Sinhala-English alignment datasets. These datasets serve as our anchor datasets for supervised word embedding alignment. Even though we do not obtain results comparable to the high-resource languages such as French, German, or Chinese, we believe our work lays the groundwork for more specialized alignment between English and Sinhala embeddings.
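The abstract does not say which alignment techniques were used, so the following is only a toy illustration of the general idea of supervised alignment from anchor pairs. It fits a single rotation that maps source vectors onto their paired target vectors in the least-squares sense, which is the two-dimensional, rotation-only case of orthogonal Procrustes. The function names and the toy vectors are assumptions for illustration; real word embeddings have hundreds of dimensions and the general problem is usually solved with a singular value decomposition.

```python
import math


def fit_rotation(src, tgt):
    """Angle of the rotation that best maps 2-D src vectors onto tgt vectors.

    Minimises the summed squared distance between rotate(src[i]) and tgt[i]
    over all anchor pairs (rotation-only Procrustes in two dimensions).
    """
    dot = sum(sx * tx + sy * ty for (sx, sy), (tx, ty) in zip(src, tgt))
    cross = sum(sx * ty - sy * tx for (sx, sy), (tx, ty) in zip(src, tgt))
    return math.atan2(cross, dot)


def rotate(vec, theta):
    """Rotate a 2-D vector counter-clockwise by theta radians."""
    x, y = vec
    c, s = math.cos(theta), math.sin(theta)
    return (c * x - s * y, s * x + c * y)


# Hypothetical anchor pairs: the "target" space is the source space rotated by 0.7 rad
source_vectors = [(1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
target_vectors = [rotate(v, 0.7) for v in source_vectors]

theta = fit_rotation(source_vectors, target_vectors)
aligned = [rotate(v, theta) for v in source_vectors]
```

In this picture, the anchor datasets the abstract mentions play the role of the paired `source_vectors` and `target_vectors`: word pairs known to be translations of each other, from which the mapping is estimated.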


Verifiable Learning for Robust Tree Ensembles

Calzavara, Stefano, Cazzaro, Lorenzo, Pibiri, Giulio Ermanno, Prezza, Nicola

arXiv.org Machine Learning

Verifying the robustness of machine learning models against evasion attacks at test time is an important research problem. Unfortunately, prior work established that this problem is NP-hard for decision tree ensembles, hence bound to be intractable for specific inputs. In this paper, we identify a restricted class of decision tree ensembles, called large-spread ensembles, which admit a security verification algorithm running in polynomial time. We then propose a new approach called verifiable learning, which advocates the training of such restricted model classes which are amenable to efficient verification. We show the benefits of this idea by designing a new training algorithm that automatically learns a large-spread decision tree ensemble from labelled data, thus enabling its security verification in polynomial time. Experimental results on public datasets confirm that large-spread ensembles trained using our algorithm can be verified in a matter of seconds, using standard commercial hardware. Moreover, large-spread ensembles are more robust than traditional ensembles against evasion attacks, at the cost of an acceptable loss of accuracy in the non-adversarial setting.


RobôCIn Small Size League Extended Team Description Paper for RoboCup 2023

de Oliveira, Aline Lima, Gomes, Cauê Addae da Silva, da Silva, Cecília Virginia Santos, Alves, Charles Matheus de Sousa, de Souza, Danilo Andrade Martins, Xavier, Driele Pires Ferreira Araújo, da Silva, Edgleyson Pereira, Martins, Felipe Bezerra, Santos, Lucas Henrique Cavalcanti, Maciel, Lucas Dias, Santos, Matheus Paixão Gumercindo dos, Vasconcelos, Matheus Lafayette, Andrade, Matheus Vinícius Teotonio do Nascimento, de Melo, João Guilherme Oliveira Carvalho, de Moura, João Pedro Souza Pereira, da Silva, José Ronald, Cruz, José Victor Silva, de Morais, Pedro Henrique Santana, de Oliveira, Pedro Paulo Salman, Rodrigues, Riei Joaquim Matos, Fernandes, Roberto Costa, Morais, Ryan Vinicius Santos, Teobaldo, Tamara Mayara Ramos, Silva, Washington Igor dos Santos, Barros, Edna Natividade Silva

arXiv.org Artificial Intelligence

RobôCIn has participated in RoboCup Small Size League since 2019, won its first world title in 2022 (Division B), and is currently a three-times Latin-American champion. This paper presents our improvements to defend the Small Size League (SSL) division B title in RoboCup 2023 in Bordeaux, France. This paper aims to share some of the academic research that our team developed over the past year. Our team has successfully published 2 articles related to SSL at two high-impact conferences: the 25th RoboCup International Symposium and the 19th IEEE Latin American Robotics Symposium (LARS 2022). Over the last year, we have been continuously migrating from our past codebase to Unification. We will describe the new architecture implemented and some points of software and AI refactoring. In addition, we discuss the process of integrating machined components into the mechanical system, our development for participating in the vision blackout challenge last year and what we are preparing for this year.


Sinhala Sentence Embedding: A Two-Tiered Structure for Low-Resource Languages

Weeraprameshwara, Gihan, Jayawickrama, Vihanga, de Silva, Nisansa, Wijeratne, Yudhanjaya

arXiv.org Artificial Intelligence

In the process of numerically modeling natural languages, developing language embeddings is a vital step. However, it is challenging to develop functional embeddings for resource-poor languages such as Sinhala, for which sufficiently large corpora, effective language parsers, and any other required resources are difficult to find. In such conditions, the exploitation of existing models to come up with an efficacious embedding methodology to numerically represent text could be quite fruitful. This paper explores the effectiveness of several one-tiered and two-tiered embedding architectures in representing Sinhala text in the sentiment analysis domain. With our findings, the two-tiered embedding architecture where the lower-tier consists of a word embedding and the upper-tier consists of a sentence embedding has been proven to perform better than one-tier word embeddings, by achieving a maximum F1 score of 88.04% in contrast to the 83.76% achieved by word embedding models. Furthermore, embeddings in the hyperbolic space are also developed and compared with Euclidean embeddings in terms of performance. A sentiment data set consisting of Facebook posts and associated reactions has been used for this research. To effectively compare the performance of different embedding systems, the same deep neural network structure has been trained on sentiment data with each of the embedding systems used to encode the text associated.


How artificial intelligence helps 2 environmental scientists unlock the natural world's mysteries > News > USC Dornsife

#artificialintelligence

Machine learning is a very specific form of artificial intelligence. Through algorithms designed to learn from experience, machine learning -- also known as ML -- adapts and grows in efficiency over time as more data is added. The ML-driven program "learns" from its mistakes, and in doing so can reduce the time it takes to analyze mountains of data from years to minutes. Melissa Guzman and Sam Silva are using machine learning to find insights into patterns underlying the natural world. Two recently hired faculty members, Melissa Guzman, Gabilan Assistant Professor of Biological Sciences, and Sam Silva, assistant professor of Earth sciences, both at the USC Dornsife College of Letters, Arts and Sciences, are already garnering attention for their usage of machine learning to find insights into the seemingly unknowable -- the patterns underlying the natural world.


Watching this AI-assisted art video is like tripping on acid in the Matrix

#artificialintelligence

Jason Silva, futurist and host of National Geographic's "Brain Games," recently published a mind-bending YouTube video combining the technological prowess of AI with the artistic creativity of someone who believes in the power of psychoactive experiences. It's called "Dreaming while awake: a journey into ourselves." The description on Silva's YouTube channel describes the video as: The first art piece of the singularity: born from a human-AI collaboration by Jason Silva, Hueman Instrument and digital intelligence. Personally, I'd describe it as a surrealistic experience that seems equal parts Ted Talk and Burning Man. And, I'd add, it makes me want to eat a bunch of psychedelic mushrooms and think about the future.