 Bucharest


Intelligent Product 3.0: Decentralised AI Agents and Web3 Intelligence Standards

Wong, Alex C. Y., McFarlane, Duncan, Ellarby, C., Lee, M., Kuok, M.

arXiv.org Artificial Intelligence

The "Intelligent Product" was first introduced as a way to embed intelligence within everyday objects, enabling them to assess and influence their own destiny (Wong et al., 2002). The concept built on the technologies and infrastructure being developed at the Auto-ID Center (Sarma et al., 2000), notably the Electronic Product Code (EPC) for Radio Frequency Identification (RFID), along with related standards for storing and communicating product data. However, this work predated blockchain, and both the Internet of Things (IoT), a term also coined at the Auto-ID Center by Kevin Ashton (Ashton, 2009), and the Internet itself were still in their infancy as communication platforms. Embedded AI, primarily implemented through software agents, remained largely a research tool at the time. As a result, truly autonomous and fully intelligent products were not attainable until recent innovations in blockchain, Web3, and artificial intelligence. This paper revisits the original vision and specification of the Intelligent Product, charts its refinement over the years, and demonstrates how these emerging capabilities have paved the way for Intelligent Product 3.0.


RipVIS: Rip Currents Video Instance Segmentation Benchmark for Beach Monitoring and Safety

Dumitriu, Andrei, Tatui, Florin, Miron, Florin, Ralhan, Aakash, Ionescu, Radu Tudor, Timofte, Radu

arXiv.org Artificial Intelligence

Rip currents are strong, localized and narrow currents of water that flow outwards into the sea, causing numerous beach-related injuries and fatalities worldwide. Accurate identification of rip currents remains challenging due to their amorphous nature and the lack of annotated data, which often requires expert knowledge. To address these issues, we present RipVIS, a large-scale video instance segmentation benchmark explicitly designed for rip current segmentation. RipVIS is an order of magnitude larger than previous datasets, featuring $184$ videos ($212,328$ frames), of which $150$ videos ($163,528$ frames) are with rip currents, collected from various sources, including drones, mobile phones, and fixed beach cameras. Our dataset encompasses diverse visual contexts, such as wave-breaking patterns, sediment flows, and water color variations, across multiple global locations, including USA, Mexico, Costa Rica, Portugal, Italy, Greece, Romania, Sri Lanka, Australia and New Zealand. Most videos are annotated at $5$ FPS to ensure accuracy in dynamic scenarios, supplemented by an additional $34$ videos ($48,800$ frames) without rip currents. We conduct comprehensive experiments with Mask R-CNN, Cascade Mask R-CNN, SparseInst and YOLO11, fine-tuning these models for the task of rip current segmentation. Results are reported in terms of multiple metrics, with a particular focus on the $F_2$ score to prioritize recall and reduce false negatives. To enhance segmentation performance, we introduce a novel post-processing step based on Temporal Confidence Aggregation (TCA). RipVIS aims to set a new standard for rip current segmentation, contributing towards safer beach environments. We offer a benchmark website to share data, models, and results with the research community, encouraging ongoing collaboration and future contributions, at https://ripvis.ai.
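As a reading aid for the $F_2$ score mentioned in the abstract above, the following is a minimal sketch of the standard $F_\beta$ formula, $F_\beta = (1+\beta^2) \cdot P \cdot R / (\beta^2 \cdot P + R)$, computed from true positives, false positives and false negatives. It is not taken from the RipVIS code; the function name `f_beta` and the example counts are hypothetical.

```python
def f_beta(tp: int, fp: int, fn: int, beta: float = 2.0) -> float:
    """Standard F-beta score from raw counts; beta=2 weights recall above precision.

    Hypothetical helper for illustration only, not the RipVIS evaluation code.
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = beta ** 2 * precision + recall
    if denom == 0.0:
        return 0.0  # no true positives at all
    return (1 + beta ** 2) * precision * recall / denom


# Two hypothetical detectors with the same number of true positives:
many_false_alarms = f_beta(tp=8, fp=8, fn=2)  # precision 0.5, recall 0.8
many_misses = f_beta(tp=8, fp=2, fn=8)        # precision 0.8, recall 0.5
print(many_false_alarms > many_misses)        # F2 favours the higher-recall detector
```

With $\beta = 2$, a detector that misses rip currents is penalized more heavily than one that raises extra false alarms, which matches the stated aim of reducing false negatives.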


Datasets for Depression Modeling in Social Media: An Overview

Bucur, Ana-Maria, Moldovan, Andreea-Codrina, Parvatikar, Krutika, Zampieri, Marcos, KhudaBukhsh, Ashiqur R., Dinu, Liviu P.

arXiv.org Artificial Intelligence

Depression is the most common mental health disorder, and its prevalence increased during the COVID-19 pandemic. It is one of the most extensively researched psychological conditions, and recent research has increasingly focused on leveraging social media data to enhance traditional methods of depression screening. This paper addresses the growing interest in interdisciplinary research on depression, and aims to support early-career researchers by providing a comprehensive and up-to-date list of datasets for analyzing and predicting depression through social media data. We present an overview of datasets published between 2019 and 2024. We also make the comprehensive list of datasets available online as a continuously updated resource, with the hope that it will facilitate further interdisciplinary research into the linguistic expressions of depression on social media.


An analysis of higher-order kinematics formalisms for an innovative surgical parallel robot

Vaida, Calin, Birlescu, Iosif, Gherman, Bogdan, Condurache, Daniel, Chablat, Damien, Pisla, Doina

arXiv.org Artificial Intelligence

The paper presents a novel modular hybrid parallel robot for pancreatic surgery and its higher-order kinematics derived based on various formalisms. The classical vector, homogeneous transformation matrices and dual quaternion approaches are studied for the kinematic functions using both classical differentiation and multidual algebra. The algorithms for inverse kinematics for all three studied formalisms are presented for both differentiation and multidual algebra approaches. Furthermore, these algorithms are compared based on numerical stability, execution times and number and type of mathematical functions and operators contained in each algorithm. A statistical analysis shows that there is significant improvement in execution time for the algorithms implemented using multidual algebra, while the numerical stability is appropriate for all algorithms derived based on differentiation and multidual algebra. While the implementation of the kinematic algorithms using multidual algebra shows positive results when benchmarked on a standard PC, further work is required to evaluate the multidual algorithms on hardware/software used for the modular parallel robot command and control.


A Retrieval-Based Approach to Medical Procedure Matching in Romanian

Niculae, Andrei, Cosma, Adrian, Radoi, Emilian

arXiv.org Artificial Intelligence

Accurately mapping medical procedure names from healthcare providers to standardized terminology used by insurance companies is a crucial yet complex task. Inconsistencies in naming conventions lead to misclassified procedures, causing administrative inefficiencies and insurance claim problems in private healthcare settings. Many companies still use human resources for manual mapping, while there is a clear opportunity for automation. This paper proposes a retrieval-based architecture leveraging sentence embeddings for medical name matching in the Romanian healthcare system. This challenge is significantly more difficult in underrepresented languages such as Romanian, where existing pretrained language models lack domain-specific adaptation to medical text. We evaluate multiple embedding models, including Romanian, multilingual, and medical-domain-specific representations, to identify the most effective solution for this task. Our findings contribute to the broader field of medical NLP for low-resource languages such as Romanian.
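To make the idea of retrieval over sentence embeddings concrete, here is a minimal sketch that ranks standardized procedure names by cosine similarity to a query embedding. It is an illustration under stated assumptions, not the authors' architecture: the three-dimensional vectors stand in for the output of an embedding model, and the procedure names, `cosine` and `retrieve` are all hypothetical.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_vec: list[float], catalog: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the k catalog names whose embeddings are closest to the query."""
    scored = [(cosine(query_vec, vec), name) for name, vec in catalog.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:k]]


# Hypothetical standardized names with made-up toy embeddings.
catalog = {
    "abdominal ultrasound": [0.9, 0.1, 0.0],
    "chest X-ray": [0.1, 0.9, 0.1],
    "complete blood count": [0.0, 0.1, 0.9],
}
# Made-up embedding of a provider's free-text procedure name.
query = [0.8, 0.2, 0.1]
print(retrieve(query, catalog, k=2))
```

In a real system the vectors would come from one of the evaluated embedding models, and the nearest standardized name would be proposed as the match for the provider's procedure name.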


Detecting and Mitigating DDoS Attacks with AI: A Survey

Apostu, Alexandru, Gheorghe, Silviu, Hîji, Andrei, Cleju, Nicolae, Pătraşcu, Andrei, Rusu, Cristian, Ionescu, Radu, Irofti, Paul

arXiv.org Artificial Intelligence

Distributed Denial of Service (DDoS) attacks represent an active cybersecurity research problem. Recent research has shifted from static rule-based defenses towards AI-based detection and mitigation. This comprehensive survey covers several key topics. First and foremost, state-of-the-art AI detection methods are discussed. An in-depth taxonomy based on manual expert hierarchies and an AI-generated dendrogram are provided, thus settling DDoS categorization ambiguities. An important discussion on available datasets follows, covering data format options and their role in training AI detection methods, together with adversarial training and example augmentation. Beyond detection, AI-based mitigation techniques are surveyed as well. Finally, multiple open research directions are proposed.


Entity-aware Cross-lingual Claim Detection for Automated Fact-checking

Panchendrarajan, Rrubaa, Zubiaga, Arkaitz

arXiv.org Artificial Intelligence

Identifying claims requiring verification is a critical task in automated fact-checking, especially given the proliferation of misinformation on social media platforms. Despite significant progress in the task, there remain open challenges such as dealing with multilingual and multimodal data prevalent in online discourse. Addressing the multilingual challenge, recent efforts have focused on fine-tuning pre-trained multilingual language models. While these models can handle multiple languages, their ability to effectively transfer cross-lingual knowledge for detecting claims spreading on social media remains under-explored. In this paper, we introduce EX-Claim, an entity-aware cross-lingual claim detection model that generalizes well to handle claims written in any language. The model leverages entity information derived from named entity recognition and entity linking techniques to improve language-level performance for languages both seen and unseen during training. Extensive experiments conducted on three datasets from different social media platforms demonstrate that our proposed model significantly outperforms the baselines across 27 languages and achieves the highest rate of knowledge transfer, even with limited training data.


ExDDV: A New Dataset for Explainable Deepfake Detection in Video

Hondru, Vlad, Hogea, Eduard, Onchis, Darian, Ionescu, Radu Tudor

arXiv.org Artificial Intelligence

The ever-growing realism and quality of generated videos make it increasingly hard for humans to spot deepfake content, so they need to rely more and more on automatic deepfake detectors. However, deepfake detectors are also prone to errors, and their decisions are not explainable, leaving humans vulnerable to deepfake-based fraud and misinformation. To this end, we introduce ExDDV, the first dataset and benchmark for Explainable Deepfake Detection in Video. ExDDV comprises around 5.4K real and deepfake videos that are manually annotated with text descriptions (to explain the artifacts) and clicks (to point out the artifacts). We evaluate a number of vision-language models on ExDDV, performing experiments with various fine-tuning and in-context learning strategies. Our results show that text and click supervision are both required to develop robust explainable models for deepfake videos, which are able to localize and describe the observed artifacts. Our novel dataset and code to reproduce the results are available at https://github.com/vladhondru25/ExDDV.


Sustainable Greenhouse Microclimate Modeling: A Comparative Analysis of Recurrent and Graph Neural Networks

Seri, Emiliano, Petitta, Marcello, Cornaro, Cristina

arXiv.org Artificial Intelligence

The integration of photovoltaic (PV) systems into greenhouses not only optimizes land use but also enhances sustainable agricultural practices by enabling dual benefits of food production and renewable energy generation. However, accurate prediction of internal environmental conditions is crucial to ensure optimal crop growth while maximizing energy production. This study introduces a novel application of Spatio-Temporal Graph Neural Networks (STGNNs) to greenhouse microclimate modeling, comparing their performance with traditional Recurrent Neural Networks (RNNs). While RNNs excel at temporal pattern recognition, they cannot explicitly model the directional relationships between environmental variables. Our STGNN approach addresses this limitation by representing these relationships as directed graphs, enabling the model to capture both environmental dependencies and their directionality. Using high-frequency data collected at 15-minute intervals from a greenhouse in Volos, Greece, we demonstrate that RNNs achieve exceptional accuracy in winter conditions ($R^2 = 0.985$) but show limitations during summer cooling system operation. Though STGNNs currently show lower performance (winter $R^2 = 0.947$), their architecture offers greater potential for integrating additional variables such as PV generation and crop growth indicators.


UniBERTs: Adversarial Training for Language-Universal Representations

Avram, Andrei-Marius, Lupaşcu, Marian, Cercel, Dumitru-Clementin, Mironică, Ionuţ, Trăuşan-Matu, Ştefan

arXiv.org Artificial Intelligence

This paper presents UniBERT, a compact multilingual language model that leverages an innovative training framework integrating three components: masked language modeling, adversarial training, and knowledge distillation. Pre-trained on a meticulously curated Wikipedia corpus spanning 107 languages, UniBERT is designed to reduce the computational demands of large-scale models while maintaining competitive performance across various natural language processing tasks. Comprehensive evaluations on four tasks -- named entity recognition, natural language inference, question answering, and semantic textual similarity -- demonstrate that our multilingual training strategy enhanced by an adversarial objective significantly improves cross-lingual generalization. Specifically, UniBERT models show an average relative improvement of 7.72% over traditional baselines, which achieved an average relative improvement of only 1.17%, with statistical analysis confirming the significance of these gains (p-value = 0.0181). This work highlights the benefits of combining adversarial training and knowledge distillation to build scalable and robust language models, thereby advancing the field of multilingual and cross-lingual natural language processing.