AITopics | Bucharest

Bucharest

Big5PersonalityEssays: Introducing a Novel Synthetic Generated Dataset Consisting of Short State-of-Consciousness Essays Annotated Based on the Five Factor Model of Personality

arXiv.org Artificial Intelligence, May-22-2024

Psychology, with a focus on psychometry, relies heavily on statistical models of analysis to build a cohesive understanding of individuals' personalities and preferences. One of the fields that has shown clear evolution over time is the psychology of personality. Statistically, personality can be modeled using the Five Factor Model (FFM), the most scientifically validated personality model to date. It consists of five personality traits, each usually divided into 6 facets. These traits can be memorized using the acronym OCEAN: openness to experience (O), conscientiousness (C), extraversion (E), agreeableness (A) and neuroticism (N). Evidence suggests that the traits are not correlated with one another. However, these 5 personality traits can be mapped onto two metatraits: plasticity and stability.

dataset, factor model, personality trait, (11 more...)

arXiv.org Artificial Intelligence

2407.17586

Country: Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)

Genre: Research Report (0.50)

Industry:

Education (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Mental Health (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.70)

A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus

Poesina, Eduard, Caragea, Cornelia, Ionescu, Radu Tudor

arXiv.org Artificial Intelligence, May-22-2024

Natural language inference (NLI), the task of recognizing the entailment relationship in sentence pairs, is an actively studied topic serving as a proxy for natural language understanding. Despite the relevance of the task in building conversational agents and improving text classification, machine translation and other NLP tasks, to the best of our knowledge, there is no publicly available NLI corpus for the Romanian language. To this end, we introduce the first Romanian NLI corpus (RoNLI) comprising 58K training sentence pairs, which are obtained via distant supervision, and 6K validation and test sentence pairs, which are manually annotated with the correct labels. We conduct experiments with multiple machine learning methods based on distant learning, ranging from shallow models based on word embeddings to transformer-based neural networks, to establish a set of competitive baselines. Furthermore, we improve on the best model by employing a new curriculum learning strategy based on data cartography. Our dataset and code to reproduce the baselines are available at https://github.com/Eduard6421/RONLI.

computational linguistic, proceedings, sentence pair, (12 more...)

arXiv.org Artificial Intelligence

2405.11877

Country:

Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(9 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)

Goal-conditioned reinforcement learning for ultrasound navigation guidance

Amadou, Abdoul Aziz, Singh, Vivek, Ghesu, Florin C., Kim, Young-Ho, Stanciulescu, Laura, Sai, Harshitha P., Sharma, Puneet, Young, Alistair, Rajani, Ronak, Rhode, Kawal

arXiv.org Artificial Intelligence, May-22-2024

Transesophageal echocardiography (TEE) plays a pivotal role in cardiology for diagnostic and interventional procedures. However, using it effectively requires extensive training due to the intricate nature of image acquisition and interpretation. To enhance the efficiency of novice sonographers and reduce variability in scan acquisitions, we propose a novel ultrasound (US) navigation assistance method based on contrastive learning as goal-conditioned reinforcement learning (GCRL). We augment the previous framework using a novel contrastive patient batching method (CPB) and a data-augmented contrastive loss, both of which we demonstrate are essential to ensure generalization to anatomical variations across patients. The proposed framework enables navigation to both standard diagnostic as well as intricate interventional views with a single model. Our method was developed with a large dataset of 789 patients and obtained an average error of 6.56 mm in position and 9.36 degrees in angle on a testing dataset of 140 patients, which is competitive or superior to models trained on individual views. Furthermore, we quantitatively validate our method's ability to navigate to interventional views such as the Left Atrial Appendage (LAA) view used in LAA closure. Our approach holds promise in providing valuable guidance during transesophageal ultrasound examinations, contributing to the advancement of skill acquisition for cardiac ultrasound practitioners.

goal-conditioned reinforcement, navigation, reinforcement, (16 more...)

arXiv.org Artificial Intelligence

2405.01409

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Asia > India > Karnataka > Bengaluru (0.04)
North America > United States > New Jersey > Mercer County > Princeton (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.90)

A Survey of Artificial Intelligence in Gait-Based Neurodegenerative Disease Diagnosis

Rao, Haocong, Zeng, Minlin, Zhao, Xuejiao, Miao, Chunyan

arXiv.org Artificial Intelligence, May-21-2024

Recent years have witnessed an increasing global population affected by neurodegenerative diseases (NDs), which traditionally require extensive healthcare resources and human effort for medical diagnosis and monitoring. As a crucial disease-related motor symptom, human gait can be exploited to characterize different NDs. The current advances in artificial intelligence (AI) models enable automatic gait analysis for NDs identification and classification, opening a new avenue to facilitate faster and more cost-effective diagnosis of NDs. In this paper, we provide a comprehensive survey on recent progress of machine learning and deep learning based AI techniques applied to diagnosis of five typical NDs through gait. We provide an overview of the process of AI-assisted NDs diagnosis, and present a systematic taxonomy of existing gait data and AI models. Through an extensive review and analysis of 164 studies, we identify and discuss the challenges, potential solutions, and future directions in this field. Finally, we envision the prospective utilization of 3D skeleton data for human gait representation and the development of more efficient AI models for NDs diagnosis. We provide a public resource repository to track and facilitate developments in this emerging field: https://github.com/Kali-Hac/AI4NDD-Survey.

dual-modal attention-enhanced deep learning network, markerless parkinsonian gait pattern quantification, spatial-temporal graph convolutional neural network, (17 more...)

arXiv.org Artificial Intelligence

2405.13082

Country:

Asia > India (0.05)
South America > Colombia (0.04)
Europe > Italy (0.04)
(55 more...)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology > Parkinson's Disease (1.00)
Health & Medicine > Therapeutic Area > Musculoskeletal (1.00)
Health & Medicine > Therapeutic Area > Neurology > Amyotrophic Lateral Sclerosis (ALS) (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.92)

Special Characters Attack: Toward Scalable Training Data Extraction From Large Language Models

Bai, Yang, Pei, Ge, Gu, Jindong, Yang, Yong, Ma, Xingjun

arXiv.org Artificial Intelligence, May-20-2024

Large language models (LLMs) have achieved remarkable performance on a wide range of tasks. However, recent studies have shown that LLMs can memorize training data and that simple repeated tokens can trick the model into leaking the data. In this paper, we take a step further and show that certain special characters or their combinations with English letters are stronger memory triggers, leading to more severe data leakage. The intuition is that, since LLMs are trained with massive data that contains a substantial amount of special characters (e.g. structural symbols {, } of JSON files, and @, # in emails and online posts), the model may memorize the co-occurrence between these special characters and the raw texts. This motivates us to propose a simple but effective Special Characters Attack (SCA) to induce training data leakage. Our experiments verify the high effectiveness of SCA against state-of-the-art LLMs: they can leak diverse training data, such as code corpus, web pages, and personally identifiable information, and sometimes generate non-stop outputs as a byproduct. We further show that the composition of the training data corpus can be revealed by inspecting the leaked data -- one crucial piece of information for pre-training high-performance LLMs. Our work can help understand the sensitivity of LLMs to special characters and identify potential areas for improvement.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2405.0599

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Oceania > Australia (0.04)
(34 more...)

Genre: Research Report (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government > Voting & Elections (1.00)
(4 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Equilibria in multiagent online problems with predictions

Istrate, Gabriel, Bonchiş, Cosmin, Bogdan, Victor

arXiv.org Artificial Intelligence, May-20-2024

We study the power of (competitive) algorithms with predictions in a multiagent setting. To this end, we introduce a multiagent version of the ski-rental problem. In this problem, agents can collaborate by pooling resources to get a group license for some asset. If the license price is not met, agents have to rent the asset individually for the day at a unit price. Otherwise, the license becomes available forever to everyone at no extra cost. Our main contribution is a best-response analysis of a single-agent competitive algorithm that assumes perfect knowledge of other agents' actions (but no knowledge of its own renting time). We then analyze the setting when agents have a predictor for their own active time, yielding a tradeoff between robustness and consistency. We investigate the effect of using such a predictor in an equilibrium, as well as the new equilibria formed in this way.
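For readers unfamiliar with the underlying problem, the following is a minimal sketch of the classical single-agent ski-rental setting, not the multiagent version studied in the paper. It assumes a unit rental price per day (as in the abstract) and an integer purchase price; the function names `break_even_cost` and `optimal_cost` are hypothetical and chosen for illustration only.

```python
def break_even_cost(buy_price: int, active_days: int) -> int:
    """Cost of the classical break-even rule.

    The agent rents at unit price for the first buy_price - 1 days
    and buys on day buy_price if it is still active.
    """
    if active_days < buy_price:
        # The agent stopped before the purchase day, so it only paid rent.
        return active_days
    # Rent for buy_price - 1 days, then pay the full purchase price.
    return (buy_price - 1) + buy_price


def optimal_cost(buy_price: int, active_days: int) -> int:
    """Offline optimum: knowing active_days in advance, pick the cheaper option."""
    return min(active_days, buy_price)


# Worst-case ratio between the online rule and the offline optimum
# for an illustrative purchase price of 10.
B = 10
worst_ratio = max(
    break_even_cost(B, d) / optimal_cost(B, d) for d in range(1, 100)
)
```

With these assumptions the worst case occurs when the agent stays active at least until the purchase day, where the rule pays 2B - 1 against an optimum of B, so the ratio stays below 2.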

agent, algorithm, prediction, (17 more...)

arXiv.org Artificial Intelligence

2405.11873

Country:

Europe > Romania > Vest Development Region > Timiș County > Timișoara (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
(2 more...)

Genre: Research Report (0.81)

Industry: Leisure & Entertainment > Games (0.68)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Designing NLP Systems That Adapt to Diverse Worldviews

Creanga, Claudiu, Dinu, Liviu P.

arXiv.org Artificial Intelligence, May-18-2024

Natural Language Inference (NLI) is foundational for evaluating language understanding in AI. However, progress has plateaued, with models failing on ambiguous examples and exhibiting poor generalization. We argue that this stems from disregarding the subjective nature of meaning, which is intrinsically tied to an individual's weltanschauung (which roughly translates to worldview). Existing NLP datasets often obscure this by aggregating labels or filtering out disagreement. We propose a perspectivist approach: building datasets that capture annotator demographics, values, and justifications for their labels. Such datasets would explicitly model diverse worldviews. Our initial experiments with a subset of the SBIC dataset demonstrate that even limited annotator metadata can improve model performance.

annotator, dataset, worldview, (14 more...)

arXiv.org Artificial Intelligence

2405.11197

Country:

North America > United States > Washington > King County > Seattle (0.04)
North America > Canada (0.04)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)
(2 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

Transformer based neural networks for emotion recognition in conversations

Creanga, Claudiu, Dinu, Liviu P.

arXiv.org Artificial Intelligence, May-18-2024

This paper outlines the approach of the ISDS-NLP team in the SemEval 2024 Task 10: Emotion Discovery and Reasoning its Flip in Conversation (EDiReF). For Subtask 1 we obtained a weighted F1 score of 0.43 and placed 12th on the leaderboard. We investigate two distinct approaches: Masked Language Modeling (MLM) and Causal Language Modeling (CLM). For MLM, we employ pre-trained BERT-like models in a multilingual setting, fine-tuning them with a classifier to predict emotions. Experiments with varying input lengths, classifier architectures, and fine-tuning strategies demonstrate the effectiveness of this approach. Additionally, we utilize Mistral 7B Instruct V0.2, a state-of-the-art model, applying zero-shot and few-shot prompting techniques. Our findings indicate that while Mistral shows promise, MLMs currently outperform it in sentence-level emotion classification.
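To make the reported metric concrete, here is a minimal sketch of the standard support-weighted F1 score: the per-class F1 averaged with weights proportional to each class's frequency in the gold labels. The abstract does not describe the task's official scorer, so this is an assumption about the usual definition; the function name `weighted_f1` and the toy labels are illustrative only.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of gold examples per class
    total = len(y_true)
    score = 0.0
    for label, n in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = n - tp  # gold examples of this class that were missed
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        # Weight each class by its share of the gold labels.
        score += (n / total) * f1
    return score


# Toy example with hypothetical emotion labels.
gold = ["joy", "joy", "anger", "neutral"]
pred = ["joy", "anger", "anger", "neutral"]
toy_score = weighted_f1(gold, pred)
```

Because the weights follow class frequency, frequent classes dominate the score, which matters for emotion datasets where a neutral class is often the majority.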

classifier, emotion, subtask 1, (15 more...)

arXiv.org Artificial Intelligence

2405.11222

Country: Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.65)

Automated Text Identification Using CNN and Training Dynamics

Creanga, Claudiu, Dinu, Liviu Petrisor

arXiv.org Artificial Intelligence, May-18-2024

We used Data Maps to model and characterize the AuTexTification dataset. This provides insights about the behaviour of individual samples during training across epochs (training dynamics). We characterized the samples across 3 dimensions: confidence, variability and correctness. This shows the presence of 3 regions: easy-to-learn, ambiguous and hard-to-learn examples. We used a classic CNN architecture and found that training the model only on a subset of ambiguous examples improves the model's out-of-distribution generalization.
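The three dimensions named above can be sketched as follows, assuming the usual Data Maps definitions: confidence is the mean probability assigned to the gold label across epochs, variability is the standard deviation of that probability, and correctness is the fraction of epochs in which the sample was classified correctly. The function names and the region thresholds (`conf_hi`, `conf_lo`, `var_hi`) are hypothetical illustrations, not values taken from the paper.

```python
from statistics import mean, pstdev


def data_map_stats(gold_probs, correct_flags):
    """Training-dynamics statistics for one sample.

    gold_probs: probability of the gold label at each epoch.
    correct_flags: whether the prediction was correct at each epoch.
    """
    confidence = mean(gold_probs)
    variability = pstdev(gold_probs)  # population std across epochs
    correctness = mean(1.0 if c else 0.0 for c in correct_flags)
    return confidence, variability, correctness


def region(confidence, variability, conf_hi=0.75, conf_lo=0.25, var_hi=0.2):
    """Assign a coarse region; the thresholds are illustrative assumptions."""
    if variability >= var_hi:
        return "ambiguous"
    if confidence >= conf_hi:
        return "easy-to-learn"
    if confidence <= conf_lo:
        return "hard-to-learn"
    return "unassigned"


# Toy per-epoch probabilities for three hypothetical samples.
easy = data_map_stats([0.90, 0.95, 0.99], [True, True, True])
ambiguous = data_map_stats([0.2, 0.8, 0.3, 0.9], [False, True, False, True])
hard = data_map_stats([0.05, 0.10, 0.08], [False, False, False])
```

Selecting a training subset then amounts to filtering samples whose region is "ambiguous", for example by ranking on variability and keeping the top fraction.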

ambiguous example, dataset, prediction, (13 more...)

arXiv.org Artificial Intelligence

2405.11212

Country:

Europe > Spain > Andalusia > Jaén Province > Jaén (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)

Genre: Research Report (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

OpenLLM-Ro -- Technical Report on Open-source Romanian LLMs

Masala, Mihai, Ilie-Ablachim, Denis C., Corlatescu, Dragos, Zavelca, Miruna, Leordeanu, Marius, Velicu, Horia, Popescu, Marius, Dascalu, Mihai, Rebedea, Traian

arXiv.org Artificial Intelligence, May-17-2024

In recent years, Large Language Models (LLMs) have achieved almost human-like performance on various tasks. While some LLMs have been trained on multilingual data, most of the training data is in English. Hence, their performance in English greatly exceeds their performance in other languages. This document presents our approach to training and evaluating the first foundational and chat LLM specialized for Romanian.

huggingface, mistral-7b-v0, preprint arxiv, (15 more...)

arXiv.org Artificial Intelligence

2405.07703

Country:

Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
Europe > France > Provence-Alpes-Côte d'Azur > Bouches-du-Rhône > Marseille (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
