AITopics

Neural networks are traditionally trained under the assumption that data come from a stationary distribution. However, settings which violate this assumption are becoming more popular; examples include supervised learning under distributional shifts, reinforcement learning, continual learning and non-stationary contextual bandits. In this work we introduce a novel learning approach that automatically models and adapts to non-stationarity, via an Ornstein-Uhlenbeck process with an adaptive drift parameter. The adaptive drift tends to draw the parameters towards the initialisation distribution, so the approach can be understood as a form of soft parameter reset. We show empirically that our approach performs well in non-stationary supervised and off-policy reinforcement learning settings.
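As a rough illustration of the "soft parameter reset" idea, the sketch below applies one discretised Ornstein-Uhlenbeck step that pulls parameters toward an initialisation distribution N(mu0, sigma0^2). The function name `ou_soft_reset`, the fixed drift value `gamma`, and the specific discretisation are assumptions made for this example; the abstract does not give the update rule, and the adaptive estimation of the drift is not reproduced here.

```python
import numpy as np

def ou_soft_reset(theta, mu0, sigma0, gamma, rng):
    """One discretised OU step toward the initialisation distribution.

    gamma = 1 leaves theta unchanged; gamma = 0 redraws theta from
    N(mu0, sigma0^2), i.e. a full reset. Values in between interpolate.
    (Hypothetical helper; gamma is fixed here, not adapted.)
    """
    noise = rng.standard_normal(theta.shape)
    return gamma * theta + (1.0 - gamma) * mu0 + np.sqrt(1.0 - gamma**2) * sigma0 * noise

rng = np.random.default_rng(0)
theta = np.array([2.0, -3.0, 0.5])   # current parameters (toy values)
mu0 = np.zeros(3)                    # mean of the initialisation distribution

kept = ou_soft_reset(theta, mu0, sigma0=0.1, gamma=1.0, rng=rng)   # no reset
soft = ou_soft_reset(theta, mu0, sigma0=0.1, gamma=0.9, rng=rng)   # partial pull toward mu0
hard = ou_soft_reset(theta, mu0, sigma0=0.1, gamma=0.0, rng=rng)   # full redraw
```

With `gamma` close to 1 the parameters barely move, which is why a drift that adapts to the observed non-stationarity can act as a reset only when one is needed.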

artificial intelligence, bayesian inference, machine learning, (17 more...)

2411.04034

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

A Bayesian Mixture Model of Temporal Point Processes with Determinantal Point Process Prior

Dong, Yiwei, Ye, Shaoxin, Cao, Yuwen, Han, Qiyu, Xu, Hongteng, Yang, Hanfang

Asynchronous event sequence clustering aims to group similar event sequences in an unsupervised manner. Mixture models of temporal point processes have been proposed to solve this problem, but they often suffer from overfitting, leading to excessive cluster generation with a lack of diversity. To overcome these limitations, we propose a Bayesian mixture model of Temporal Point Processes with Determinantal Point Process prior (TP$^2$DP$^2$) and accordingly an efficient posterior inference algorithm based on conditional Gibbs sampling. Our work provides a flexible learning framework for event sequence clustering, enabling automatic identification of the potential number of clusters and accurate grouping of sequences with similar features. It is applicable to a wide range of parametric temporal point processes, including neural network-based models. Experimental results on both synthetic and real-world data suggest that our framework could produce moderately fewer yet more diverse mixture components, and achieve outstanding results across multiple evaluation metrics.

event sequence, point process, tp 2, (15 more...)

2411.04397

Country:

Europe > Austria > Vienna (0.14)
Asia > China (0.05)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

TrajGPT: Controlled Synthetic Trajectory Generation Using a Multitask Transformer-Based Spatiotemporal Model

Hsu, Shang-Ling, Tung, Emmanuel, Krumm, John, Shahabi, Cyrus, Shafique, Khurram

Human mobility modeling from GPS trajectories and synthetic trajectory generation are crucial for various applications, such as urban planning, disaster management and epidemiology. Both of these tasks often require filling gaps in a partially specified sequence of visits, a new problem that we call "controlled" synthetic trajectory generation. Existing methods for next-location prediction or synthetic trajectory generation cannot solve this problem as they lack the mechanisms needed to constrain the generated sequences of visits. Moreover, existing approaches (1) frequently treat space and time as independent factors, an assumption that fails to hold in real-world scenarios, and (2) suffer from challenges in accuracy of temporal prediction as they fail to deal with mixed distributions and the inter-relationships of different modes with latent variables (e.g., day-of-the-week). These limitations become even more pronounced when the task involves filling gaps within sequences instead of solely predicting the next visit. We introduce TrajGPT, a transformer-based, multi-task, joint spatiotemporal generative model to address these issues. Taking inspiration from large language models, TrajGPT poses the problem of controlled trajectory generation as that of text infilling in natural language. TrajGPT integrates the spatial and temporal models in a transformer architecture through a Bayesian probability model that ensures that the gaps in a visit sequence are filled in a spatiotemporally consistent manner. Our experiments on public and private datasets demonstrate that TrajGPT not only excels in controlled synthetic visit generation but also outperforms competing models in next-location prediction tasks: in relative terms, TrajGPT achieves a 26-fold improvement in temporal accuracy while retaining more than 98% of spatial accuracy on average.

prediction, sequence, trajgpt, (13 more...)

doi: 10.1145/3678717.3691303

2411.04381

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.28)
North America > United States > Georgia > Fulton County > Atlanta (0.05)
North America > United States > Virginia > Loudoun County > Ashburn (0.04)
(4 more...)

Genre: Research Report > Promising Solution (0.46)

Industry: Information Technology (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Malinga, Melusi, Lupanda, Isaac, Nkongolo, Mike Wa, van Deventer, Phil

A Multilingual Sentiment Lexicon for Low-Resource Language Translation using Large Languages Models and Explainable AI

South Africa and the Democratic Republic of Congo (DRC) present a complex linguistic landscape with languages such as Zulu, Sepedi, Afrikaans, French, English, and Tshiluba (Ciluba), which creates unique challenges for AI-driven translation and sentiment analysis systems due to a lack of accurately labeled data. This study seeks to address these challenges by developing a multilingual lexicon designed for French and Tshiluba, now expanded to include translations in English, Afrikaans, Sepedi, and Zulu. The lexicon enhances cultural relevance in sentiment classification by integrating language-specific sentiment scores. A comprehensive testing corpus is created to support translation and sentiment analysis tasks, with machine learning models such as Random Forest, Support Vector Machine (SVM), Decision Trees, and Gaussian Naive Bayes (GNB) trained to predict sentiment across low resource languages (LRLs). Among them, the Random Forest model performed particularly well, capturing sentiment polarity and handling language-specific nuances effectively. Furthermore, Bidirectional Encoder Representations from Transformers (BERT), a Large Language Model (LLM), is applied to predict context-based sentiment with high accuracy, achieving 99% accuracy and 98% precision, outperforming other models. The BERT predictions were clarified using Explainable AI (XAI), improving transparency and fostering confidence in sentiment classification. Overall, findings demonstrate that the proposed lexicon and machine learning models significantly enhance translation and sentiment analysis for LRLs in South Africa and the DRC, laying a foundation for future AI models that support underrepresented languages, with applications across education, governance, and business in multilingual contexts.

sentiment, sentiment analysis, sentiment score, (16 more...)

2411.04316

Country:

Africa > Democratic Republic of the Congo (0.54)
Africa > South Africa > Gauteng > Pretoria (0.04)
Europe > Switzerland (0.04)
Asia > Indonesia > Bali (0.04)

Genre: Research Report > New Finding (0.65)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
(5 more...)

Martinez, Rolando Gonzales

Bayesian algorithmic perfumery: A Hierarchical Relevance Vector Machine for the Estimation of Personalized Fragrance Preferences based on Three Sensory Layers and Jungian Personality Archetypes

This study explores a Bayesian algorithmic approach to personalized fragrance recommendation by integrating hierarchical Relevance Vector Machines (RVM) and Jungian personality archetypes. The paper proposes a structured model that links individual scent preferences for top, middle, and base notes to personality traits derived from Jungian archetypes, such as the Hero, Caregiver, and Explorer, among others. The algorithm utilizes Bayesian updating to dynamically refine predictions as users interact with each fragrance note. This iterative process allows for the personalization of fragrance experiences based on prior data and personality assessments, leading to adaptive and interpretable recommendations. By combining psychological theory with Bayesian machine learning, this approach addresses the complexity of modeling individual preferences while capturing user-specific and population-level trends. The study highlights the potential of hierarchical Bayesian frameworks in creating customized olfactory experiences, informed by psychological and demographic factors, contributing to advancements in personalized product design and machine learning applications in sensory-based industries.

archetype, fragrance preference, probability, (15 more...)

2411.03965

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Bayesian Approach to Data Point Selection

Xu, Xinnuo, Kim, Minyoung, Lee, Royson, Martinez, Brais, Hospedales, Timothy

Data point selection (DPS) is becoming a critical topic in deep learning due to the ease of acquiring uncurated training data compared to the difficulty of obtaining curated or processed data. Existing approaches to DPS are predominantly based on a bi-level optimisation (BLO) formulation, which is demanding in terms of memory and computation, and exhibits some theoretical defects regarding minibatches. Thus, we propose a novel Bayesian approach to DPS. We view the DPS problem as posterior inference in a novel Bayesian model where the posterior distributions of the instance-wise weights and the main neural network parameters are inferred under a reasonable prior and likelihood model. We employ stochastic gradient Langevin MCMC sampling to learn the main network and instance-wise weights jointly, ensuring convergence even with minibatches. Our update equation is comparable to the widely used SGD and much more efficient than existing BLO-based methods. Through controlled experiments in both the vision and language domains, we present the proof-of-concept. Additionally, we demonstrate that our method scales effectively to large language models and facilitates automated per-task optimization for instruction fine-tuning datasets.
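For readers unfamiliar with stochastic gradient Langevin dynamics (SGLD), the sketch below shows the generic update on a toy problem: sampling the posterior of a Gaussian mean from minibatches. It is not the paper's joint model over instance-wise weights and network parameters; the prior, the step size, and the helper name `sgld_step` are assumptions chosen for illustration.

```python
import numpy as np

def sgld_step(theta, grad_log_post, step_size, rng):
    """Generic SGLD update: half a gradient step plus N(0, step_size) noise."""
    noise = rng.standard_normal(np.shape(theta))
    return theta + 0.5 * step_size * grad_log_post + np.sqrt(step_size) * noise

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=1.0, size=1000)   # toy observations, unit variance
N, batch_size, step_size = len(data), 50, 1e-4

theta = 0.0
samples = []
for t in range(5000):
    batch = rng.choice(data, size=batch_size, replace=False)
    grad_prior = -theta / 10.0**2                       # assumed N(0, 10^2) prior on the mean
    grad_lik = (N / batch_size) * np.sum(batch - theta) # minibatch gradient rescaled to full data
    theta = sgld_step(theta, grad_prior + grad_lik, step_size, rng)
    if t >= 1000:                                       # discard burn-in iterations
        samples.append(theta)

posterior_mean_estimate = float(np.mean(samples))
```

The update has the same cost profile as an SGD step (one minibatch gradient plus Gaussian noise), which is the sense in which such samplers are comparable to SGD.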

experiment, international conference, scenario, (15 more...)

2411.03768

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

A Personal data Value at Risk Approach

Enriquez, Luis

What if the main data protection vulnerability is risk management? Data Protection merges three disciplines: data protection law, information security, and risk management. Nonetheless, very little research has been done in the field of data protection risk management, where subjectivity and superficiality are the dominant state of the art. Since the GDPR tells you what to do, but not how to do it, the solution for approaching GDPR compliance is still a gray zone, where the trend is using the rule of thumb. Considering that the most important goal of risk management is to reduce uncertainty in order to take informed decisions, risk management for the protection of the rights and freedoms of the data subjects cannot be disconnected from the impact materialization that data controllers and processors need to assess. This paper proposes a quantitative approach to data protection risk-based compliance from a data controller's perspective, with the aim of proposing a mindset change, where data protection impact assessments can be improved by using data protection analytics, quantitative risk analysis, and calibrating expert opinions.

administrative fine, data breach, risk management, (9 more...)

2411.03217

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.05)
Europe > France > Hauts-de-France > Nord > Lille (0.05)
Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
(7 more...)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.68)
(2 more...)

LeCoz, Adrien, Herbin, Stéphane, Adjed, Faouzi

Confidence Calibration of Classifiers with Many Classes

For classification models based on neural networks, the maximum predicted class probability is often used as a confidence score. This score rarely predicts the probability of a correct prediction well, so it requires a post-processing calibration step. However, many confidence calibration methods fail for problems with many classes. To address this issue, we transform the problem of calibrating a multiclass classifier into calibrating a single surrogate binary classifier. This approach allows for more efficient use of standard calibration methods. We evaluate our approach on numerous neural networks used for image or text classification and show that it significantly enhances existing calibration methods.
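To make the notion of a miscalibrated confidence score concrete, the sketch below computes a top-label expected calibration error: the maximum class probability is treated as a score for the binary event "the prediction is correct", and the gap between mean confidence and accuracy is averaged over confidence bins. This only illustrates the quantity being calibrated; it does not reproduce the paper's surrogate binary classifier, and the function name `top_label_ece` and the toy probabilities are made up for the example.

```python
import numpy as np

def top_label_ece(probs, labels, n_bins=10):
    """Expected calibration error of the max-probability confidence score."""
    confidences = probs.max(axis=1)
    correct = (probs.argmax(axis=1) == labels).astype(float)  # binary target
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_bin = (confidences > lo) & (confidences <= hi)
        if in_bin.any():
            gap = abs(correct[in_bin].mean() - confidences[in_bin].mean())
            ece += in_bin.mean() * gap   # weight by the fraction of samples in the bin
    return ece

# Toy predictions over three classes (hypothetical values).
probs = np.array([
    [0.95, 0.03, 0.02],
    [0.65, 0.25, 0.10],
    [0.15, 0.75, 0.10],
    [0.45, 0.35, 0.20],
])
labels = np.array([0, 1, 1, 2])
ece = top_label_ece(probs, labels)
```

Because only the top-label confidence and a correct/incorrect indicator enter the computation, the number of classes does not affect the size of the calibration problem.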

calibration, model uncal, probability, (16 more...)

2411.02988

Country:

Europe > France (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
(2 more...)

Genre: Research Report > Experimental Study (0.92)

Industry: Energy (0.42)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)

Słupiński, Mikołaj, Lipiński, Piotr

Bayesian Inference in Recurrent Explicit Duration Switching Linear Dynamical Systems

arXiv.org Machine Learning, Nov-6-2024

In this paper, we propose a novel model called Recurrent Explicit Duration Switching Linear Dynamical Systems (REDSLDS) that incorporates recurrent explicit duration variables into the rSLDS model. We also propose an inference and learning scheme that involves the use of Pólya-gamma augmentation. We demonstrate the improved segmentation capabilities of our model on three benchmark datasets, including two quantitative datasets and one qualitative dataset.

dataset, redsld, rsld, (13 more...)

2411.0428

Country:

North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(5 more...)

Genre:

Research Report > New Finding (0.46)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)

Beręsewicz, Maciej, Wydmuch, Marek, Cherniaiev, Herman, Pater, Robert

Multilingual hierarchical classification of job advertisements for job vacancy statistics

arXiv.org Machine Learning, Nov-6-2024

The goal of this paper is to develop a multilingual classifier and conditional probability estimator of occupation codes for online job advertisements in accordance with the International Standard Classification of Occupations (ISCO) extended with the Polish Classification of Occupations and Specializations (KZiS), which is analogous to the European Classification of Occupations. In this paper, we utilise a range of data sources, including a novel one, namely the Central Job Offers Database, which is a register of all vacancies submitted to Public Employment Offices. Their staff members code the vacancies according to the ISCO and KZiS. A hierarchical multi-class classifier has been developed based on the transformer architecture. The classifier begins by encoding the jobs found in advertisements to the widest 1-digit occupational group, and then narrows the assignment to a 6-digit occupation code. We show that incorporation of the hierarchical structure of occupations improves prediction accuracy by 1-2 percentage points, particularly for the hand-coded online job advertisements. Finally, bilingual (Polish and English) and multilingual (24 languages) models are developed based on data translated using closed and open-source software. The open-source software is provided for the benefit of the official statistics community, with a particular focus on international comparability.
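The coarse-to-fine assignment described above can be read as a chain of conditional probabilities: the probability of a 6-digit code is the probability of its 1-digit group times the probability of the code given that group. The sketch below shows only that arithmetic. The codes, the probabilities, and the helper `joint_code_probs` are hypothetical placeholders, the intermediate digit levels are collapsed into a single step for brevity, and nothing here reflects the authors' transformer model.

```python
def joint_code_probs(group_probs, code_probs_given_group):
    """Combine P(group) and P(code | group) into P(code) for every 6-digit code."""
    joint = {}
    for group, p_group in group_probs.items():
        for code, p_code in code_probs_given_group[group].items():
            if not code.startswith(group):
                raise ValueError(f"code {code} does not belong to group {group}")
            joint[code] = p_group * p_code
    return joint

# Hypothetical outputs of a coarse (1-digit) and a fine (6-digit) classifier.
group_probs = {"2": 0.7, "3": 0.3}
code_probs_given_group = {
    "2": {"251401": 0.6, "251202": 0.4},
    "3": {"351203": 1.0},
}

joint = joint_code_probs(group_probs, code_probs_given_group)
best_code = max(joint, key=joint.get)   # most probable 6-digit code overall
```

Constraining each fine-grained code to its parent group is one simple way to use the hierarchy; the joint probabilities still sum to one across all codes.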

advertisement, classification, dataset, (16 more...)

2411.03779

Country:

Europe > United Kingdom (0.28)
Europe > Poland > Greater Poland Province > Poznań (0.04)
Europe > Poland > Masovia Province > Warsaw (0.04)
(7 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (0.92)

Industry:

Marketing (1.00)
Education (0.92)
Government > Regional Government > Europe Government (0.46)

Technology:

Information Technology > Software (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)