AITopics

2502.20099

Country:

Asia > Japan > Honshū > Tōhoku > Iwate Prefecture > Morioka (0.04)
Asia > Japan > Honshū > Chūbu > Ishikawa Prefecture > Kanazawa (0.04)
North America > United States > California > Alameda County > Berkeley (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)
Information Technology > Artificial Intelligence > Natural Language (0.67)

arXiv.org Artificial Intelligence, Feb-27-2025

Text classification using machine learning methods

Oancea, Bogdan

In this paper we present the results of an experiment that uses machine learning methods to obtain models for the automatic classification of products. To apply automatic classification methods, we transformed the product names from a text representation to numeric vectors, a process called word embedding. We used several embedding methods: Count Vectorization, TF-IDF, Word2Vec, FASTTEXT, and GloVe. With the product names in the form of numeric vectors, we proceeded with a set of machine learning methods for automatic classification: Logistic Regression, Multinomial Naive Bayes, kNN, Artificial Neural Networks, Support Vector Machines, and Decision Trees with several variants. The results show an impressive accuracy of the classification process for Support Vector Machines, Logistic Regression, and Random Forests. Regarding the word embedding methods, the best results were obtained with the FASTTEXT technique.
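As a rough illustration of the pipeline this abstract describes, the following minimal sketch pairs count vectorization with a Multinomial Naive Bayes classifier in plain Python. It is not the author's code: the product names, category labels, function names, and the smoothing value `alpha` are all hypothetical, chosen only to keep the example self-contained.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: product names and their categories.
NAMES = ["whole milk 1l", "skimmed milk 500ml", "white bread loaf", "rye bread sliced"]
LABELS = ["dairy", "dairy", "bakery", "bakery"]


def tokenize(name):
    """Lowercase and split on whitespace (the simplest possible tokenizer)."""
    return name.lower().split()


def train_multinomial_nb(names, labels, alpha=1.0):
    """Count-vectorize the names and collect per-class token counts."""
    vocab = set()
    class_docs = Counter()                # number of names per class
    class_tokens = defaultdict(Counter)   # token counts per class
    for name, label in zip(names, labels):
        tokens = tokenize(name)
        vocab.update(tokens)
        class_docs[label] += 1
        class_tokens[label].update(tokens)
    return {
        "vocab": vocab,
        "class_docs": class_docs,
        "class_tokens": class_tokens,
        "alpha": alpha,
        "n_docs": len(names),
    }


def predict(model, name):
    """Return the class with the highest smoothed log-posterior."""
    # Tokens outside the training vocabulary are ignored.
    counts = Counter(t for t in tokenize(name) if t in model["vocab"])
    vocab_size = len(model["vocab"])
    best_label, best_score = None, -math.inf
    for label, n_class in model["class_docs"].items():
        token_counts = model["class_tokens"][label]
        total = sum(token_counts.values())
        score = math.log(n_class / model["n_docs"])  # log prior
        for token, count in counts.items():
            # Laplace-smoothed log-likelihood of the token under this class.
            likelihood = (token_counts[token] + model["alpha"]) / (
                total + model["alpha"] * vocab_size
            )
            score += count * math.log(likelihood)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


model = train_multinomial_nb(NAMES, LABELS)
print(predict(model, "milk 2l"))
```

Swapping the raw counts for TF-IDF weights or dense embeddings changes only the vectorization step; the classifier interface stays the same.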

classification, product name, representation, (15 more...)

2502.19801

Country:

North America > United States > New York > New York County > New York City (0.05)
Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
Asia > India (0.04)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.96)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.50)

Lu, Zhixin, Kuśmierz, Łukasz, Mihalas, Stefan

A Fokker-Planck-Based Loss Function that Bridges Dynamics with Density Estimation

arXiv.org Artificial Intelligence, Feb-27-2025

We have derived a novel loss function from the Fokker-Planck equation that links dynamical system models with their probability density functions, demonstrating its utility in model identification and density estimation. In the first application, we show that this loss function can enable the extraction of dynamical parameters from non-temporal datasets, including timestamp-free measurements from steady non-equilibrium systems such as noisy Lorenz systems and gene regulatory networks. In the second application, when coupled with a density estimator, this loss facilitates density estimation when the dynamic equations are known. For density estimation, we propose a density estimator that integrates a Gaussian Mixture Model with a normalizing flow model. It simultaneously estimates normalized density, energy, and score functions from both empirical data and dynamics. It is compatible with a variety of data-based training methodologies, including maximum likelihood and score matching. It features a latent space akin to a modern Hopfield network, where the inherent Hopfield energy effectively assigns low densities to sparsely populated data regions, addressing common challenges in neural density estimators. Additionally, this Hopfield-like energy enables direct and rapid data manipulation through the Concave-Convex Procedure (CCCP) rule, facilitating tasks such as denoising and clustering. Our work demonstrates a principled framework for leveraging the complex interdependencies between dynamics and density estimation, as illustrated through synthetic examples that clarify the underlying theoretical intuitions.
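The idea of tying dynamics to a density through the Fokker-Planck equation can be sketched in one dimension. The example below is an assumption-laden toy, not the loss derived in the paper: it takes an Ornstein-Uhlenbeck process dx = -θx dt + sqrt(2D) dW, whose stationary density is Gaussian with variance D/θ, and penalizes the squared residual of the stationary Fokker-Planck equation, -d/dx[f p] + D d²p/dx² = 0, using finite differences. All function names and parameter values are hypothetical.

```python
import math


def ou_stationary_density(x, theta, D):
    """Stationary density of dx = -theta*x dt + sqrt(2D) dW: N(0, D/theta)."""
    var = D / theta
    return math.exp(-x * x / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def fp_residual(density, drift, D, x, h=1e-3):
    """Stationary 1-D Fokker-Planck residual at x via central differences."""
    flux = lambda y: drift(y) * density(y)
    d_flux = (flux(x + h) - flux(x - h)) / (2.0 * h)
    d2_density = (density(x + h) - 2.0 * density(x) + density(x - h)) / (h * h)
    return -d_flux + D * d2_density


def fp_loss(density, drift, D, points):
    """Mean squared Fokker-Planck residual over the evaluation points."""
    return sum(fp_residual(density, drift, D, x) ** 2 for x in points) / len(points)


def estimate_theta(candidates, density, D, points):
    """Pick the drift parameter whose dynamics best match the given density."""
    return min(
        candidates,
        key=lambda th: fp_loss(density, lambda x: -th * x, D, points),
    )


# Timestamp-free setting: only the density is known, not the trajectories.
D = 0.5
points = [i / 10.0 for i in range(-20, 21)]
true_density = lambda x: ou_stationary_density(x, 1.0, D)
print(estimate_theta([0.5, 1.0, 1.5, 2.0], true_density, D, points))
```

The grid search stands in for gradient-based training; the point is only that the residual vanishes when the drift and the density are mutually consistent.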

density estimator, density function, loss function, (12 more...)

2502.1769

Country: North America > United States > California > Santa Clara County > Stanford (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.34)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Ghosh, Sulagna, Ignatiadis, Nikolaos, Koehler, Frederic, Lee, Amber

Stein's unbiased risk estimate and Hyvärinen's score matching

arXiv.org Machine Learning, Feb-27-2025

We study two G-modeling strategies for estimating the signal distribution (the empirical Bayesian's prior) from observations corrupted with normal noise. First, we choose the signal distribution by minimizing Stein's unbiased risk estimate (SURE) of the implied Eddington/Tweedie Bayes denoiser, an approach motivated by optimal empirical Bayesian shrinkage estimation of the signals. Second, we select the signal distribution by minimizing Hyvärinen's score matching objective for the implied score (derivative of log-marginal density), targeting minimal Fisher divergence between estimated and true marginal densities. While these strategies appear distinct, they are known to be mathematically equivalent. We provide a unified analysis of SURE and score matching under both well-specified signal distribution classes and misspecification. In the classical well-specified setting with homoscedastic noise and compactly supported signal distribution, we establish nearly parametric rates of convergence of the empirical Bayes regret and the Fisher divergence. In a commonly studied misspecified model, we establish fast rates of convergence to the oracle denoiser and corresponding oracle inequalities. Our empirical results demonstrate competitiveness with nonparametric maximum likelihood in well-specified settings, while showing superior performance under misspecification, particularly in settings involving heteroscedasticity and side information.
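The equivalence mentioned in the abstract can be seen in a short one-dimensional calculation. This is a sketch under simplifying assumptions (scalar observation, known homoscedastic noise level σ), with notation chosen here rather than taken from the paper: s(z) is the score implied by a candidate signal distribution, and the denoiser is written in Tweedie form.

```latex
% One-dimensional sketch: Z = \mu + \sigma\varepsilon, \varepsilon \sim N(0, 1).
% s(z) is the model score, i.e. the derivative of the log-marginal density.
\begin{align*}
\hat\mu(z) &= z + \sigma^2 s(z)
  && \text{(Eddington/Tweedie denoiser)} \\
\mathrm{SURE}(z) &= \sigma^2 + \bigl(\hat\mu(z) - z\bigr)^2
  + 2\sigma^2 \bigl(\hat\mu'(z) - 1\bigr)
  && \text{(Stein's unbiased risk estimate)} \\
&= \sigma^2 + \sigma^4 s(z)^2 + 2\sigma^4 s'(z) \\
&= \sigma^2 + 2\sigma^4 \Bigl[\tfrac{1}{2} s(z)^2 + s'(z)\Bigr]
\end{align*}
```

The bracketed term is the per-observation Hyvärinen score matching objective, so for fixed σ, minimizing the average SURE over candidate signal distributions selects the same minimizer as minimizing the empirical score matching loss.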

argument, estimation, inequality, (16 more...)

2502.20123

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California (0.04)

Genre: Research Report > New Finding (0.87)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.45)

arXiv.org Machine Learning, Feb-27-2025

Bayesian Computation in Deep Learning

Chen, Wenlong, Li, Bolian, Zhang, Ruqi, Li, Yingzhen

Bayesian computation has achieved profound success in many modeling tasks with statistical tools such as generalized linear models (Dobson and Barnett, 2018; Nelder and Wedderburn, 1972). Yet these traditional tools fail to produce satisfactory predictions for high-dimensional and highly complex data such as images, speech and videos. Deep Learning (LeCun et al., 2015a) provides an attractive solution. As of late 2023, deep neural networks achieve accurate predictions for image classification (Dehghani et al., 2023), segmentation (Kirillov et al., 2023) and speech recognition tasks (Zhang et al., 2023). Meanwhile, they have also demonstrated an astonishing capability for generating photo-realistic and/or artistic images (Rombach et al., 2022), music (Agostinelli et al., 2023) and videos (Liang et al., 2022). Deep neural networks have now become a standard modeling tool for many applications in AI and related fields, and the success of deep learning so far is based on training deterministic deep neural networks on big data. So one might ask: is there a place for Bayesian computation in modern deep learning?

inference, international conference, neural network, (13 more...)

2502.183

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Middle East > Jordan (0.04)
North America > United States > New York (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Instructional Material (0.92)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

arXiv.org Machine Learning, Feb-26-2025

Sparklen: A Statistical Learning Toolkit for High-Dimensional Hawkes Processes in Python

Lacoste, Romain Edmond

This paper introduces the Python package Sparklen (see Lacoste (2025)), which implements a complete set of statistical learning methods for exponential Hawkes processes with an emphasis on the high-dimensional setting. Hawkes processes, introduced in Hawkes (1971), form a specific but rather versatile class of point processes. Such processes model time series in which the occurrence of one event temporarily increases the probability of other events occurring. This intrinsic ability to take into account self-exciting effects makes them particularly interesting for real data modeling. Historically applied in seismology (see Ogata (1988)), they have since been used in a wide variety of other fields, including neuroscience in Reynaud-Bouret, Rivoirard, and Tuleau-Malot (2013), finance in Bacry, Mastromatteo, and Muzy (2015), and ecology in Denis, Dion-Blanc, Lacoste, Sansonnet, and Bas (2024). The multidimensional version, known as the Multivariate Hawkes Process (MHP), additionally captures interactions among the univariate processes within a network. This generalization enables the modeling of more intricate dynamics, significantly expanding the range of potential applications. For example, MHP has been applied to model action potentials within neural networks in Bonnet, Dion-Blanc, Gindraud, and Lemler (2022), and for trend detection in social networks in Pinto, Chahed, and Altman (2015).
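To make the self-exciting mechanism concrete, here is a minimal sketch of the conditional intensity of a multivariate Hawkes process with exponential kernels. It does not use or imitate the Sparklen API. The function name and the parametrization, with intensity λ_i(t) = μ_i + Σ_j Σ_{t_k^j < t} α_ij β exp(-β (t - t_k^j)) and a single shared decay β, are assumptions made for illustration.

```python
import math


def mhp_intensity(t, events, mu, alpha, beta):
    """Conditional intensity of each component at time t.

    events[j] -- list of past event times of component j
    mu[i]     -- baseline rate of component i
    alpha[i][j] -- strength with which events of j excite component i
    beta      -- shared exponential decay rate (assumed common to all pairs)
    """
    d = len(mu)
    intensities = []
    for i in range(d):
        # Each strictly earlier event adds a jump that decays exponentially.
        excitation = sum(
            alpha[i][j] * beta * math.exp(-beta * (t - t_k))
            for j in range(d)
            for t_k in events[j]
            if t_k < t
        )
        intensities.append(mu[i] + excitation)
    return intensities


# Hypothetical two-dimensional example: component 0 excites itself and component 1.
mu = [0.2, 0.1]
alpha = [[0.5, 0.0], [0.3, 0.0]]
beta = 2.0
events = [[1.0], []]

print(mhp_intensity(0.5, events, mu, alpha, beta))  # before any event: baselines only
print(mhp_intensity(1.5, events, mu, alpha, beta))  # shortly after the event at t = 1.0
```

Before the first event the intensities equal the baselines; after it, both components are temporarily elevated and relax back at rate β.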

artificial intelligence, hawkes process, machine learning, (19 more...)

2502.18979

Country: Europe > France (0.14)

Genre:

Workflow (0.67)
Overview (0.67)
Research Report (0.63)

Industry:

Information Technology (0.48)
Health & Medicine > Therapeutic Area > Neurology (0.34)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

arXiv.org Artificial Intelligence, Feb-26-2025

Truth in Text: A Meta-Analysis of ML-Based Cyber Information Influence Detection Approaches

Pittman, Jason M.

Cyber information influence, or disinformation in general terms, is widely regarded as one of the biggest threats to social progress and government stability. From US presidential elections to European Union referendums and down to regional news reporting of wildfires, lies and post-truths have normalized radical decision-making. Accordingly, there has been an explosion in research seeking to detect disinformation in online media. The frontier of disinformation detection research is leveraging a variety of ML techniques, including traditional ML algorithms such as Support Vector Machines, Random Forest, and Naïve Bayes. Other research has applied deep learning models including Convolutional Neural Networks, Long Short-Term Memory networks, and transformer-based architectures. Despite the overall success of such techniques, the literature demonstrates inconsistencies when viewed holistically, which limits our understanding of their true effectiveness. Accordingly, this work employed a two-stage meta-analysis to (a) demonstrate an overall meta statistic for ML model effectiveness in detecting disinformation and (b) investigate the same by subgroups of ML model types. The study found that the majority of the 81 ML detection techniques sampled have greater than 80% accuracy, with a mean sample effectiveness of 79.18% accuracy. Meanwhile, subgroups demonstrated no statistically significant difference between approaches but revealed high within-group variance. Based on the results, this work recommends future work in replication and development of detection methods operating at the ML model level.

artificial intelligence, detection, machine learning, (19 more...)

2503.22686

Country:

South America > Chile (0.04)
North America > United States > Maryland (0.04)
North America > Canada > British Columbia > Vancouver (0.04)
(4 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.66)

Industry:

Media > News (1.00)
Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Kabaha, Anan, Drachsler-Cohen, Dana

Guarding the Privacy of Label-Only Access to Neural Network Classifiers via iDP Verification

arXiv.org Artificial Intelligence, Feb-26-2025

Neural networks are susceptible to privacy attacks that can extract private information about the training set. To cope, several training algorithms guarantee differential privacy (DP) by adding noise to their computation. However, DP requires adding noise that accounts for every possible training set. This leads to a significant decrease in the network's accuracy. Individual DP (iDP) restricts DP to a given training set. We observe that some inputs deterministically satisfy iDP without any noise. By identifying them, we can provide iDP label-only access to the network with a minor decrease in its accuracy. However, identifying the inputs that satisfy iDP without any noise is highly challenging. Our key idea is to compute the iDP deterministic bound (iDP-DB), which overapproximates the set of inputs that do not satisfy iDP, and add noise only to their predicted labels. To compute the tightest iDP-DB, which enables guarding the label-only access with minimal accuracy decrease, we propose LUCID, which leverages several formal verification techniques. First, it encodes the problem as a mixed-integer linear program, defined over a network and over every network trained identically but without a unique data point. Second, it abstracts a set of networks using a hyper-network. Third, it eliminates the overapproximation error via a novel branch-and-bound technique. Fourth, it bounds the differences of matching neurons in the network and the hyper-network and employs linear relaxation if they are small. We show that LUCID can provide classifiers with a perfect individuals' privacy guarantee (0-iDP), which is infeasible for DP training algorithms, with an accuracy decrease of 1.4%. For more relaxed ε-iDP guarantees, LUCID has an accuracy decrease of 1.2%. In contrast, existing DP training algorithms reduce the accuracy by 12.7%.

classifier, dataset, lucid, (15 more...)

2502.16519

Country:

Asia > Middle East > Israel > Haifa District > Haifa (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > Taiwan (0.04)

Genre: Research Report (0.82)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.41)

Veness, Joel, Hutter, Marcus, Gyorgy, Andras, Grau-Moya, Jordi

Partition Tree Weighting for Non-Stationary Stochastic Bandits

arXiv.org Artificial Intelligence, Feb-26-2025

In contrast to popular decision-making frameworks such as reinforcement learning, which are built upon decision-theoretic notions such as Maximum Expected Utility, we instead construct an agent by trying to minimise the expected number of bits needed to losslessly describe general agent-environment interactions. The appeal of this approach is that if we can construct a good universal coding scheme for arbitrary agent interactions, one could simply sample from this coding distribution to generate a control policy. However, when considering general agents, whose goal is to work well across multiple environments, this question turns out to be surprisingly subtle. Naive approaches which do not discriminate between actions and observations fail, and are subject to the self-delusion problem [Ortega et al., 2021]. In this work, we adopt a universal source coding perspective on this question, and showcase its efficacy by applying it to the challenging non-stationary stochastic bandit problem. In the passive case, namely, sequential prediction of observations under the logarithmic loss, there is a well-developed universal source coding literature for dealing with non-stationary sources under various types of non-stationarity. The most influential idea for modelling piecewise stationary sources is the transition diagram technique of Willems [1996]. This technique performs Bayesian model averaging over all possible partitions of a sequence of data, cleverly exploiting dynamic programming to yield an algorithm which has both quadratic time complexity and provable regret guarantees.

algorithm, corpusid, semanticscholar, (16 more...)

2502.19325

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.87)

arXiv.org Machine Learning, Feb-26-2025

Mixture models for data with unknown distributions

Newman, M. E. J.

We describe and analyze a broad class of mixture models for real-valued multivariate data in which the probability density of observations within each component of the model is represented as an arbitrary combination of basis functions. Fits to these models give us a way to cluster data with distributions of unknown form, including strongly non-Gaussian or multimodal distributions, and return both a division of the data and an estimate of the distributions, effectively performing clustering and density estimation within each cluster at the same time. We describe two fitting methods, one using an expectation-maximization (EM) algorithm and the other a Bayesian non-parametric method using a collapsed Gibbs sampler. The former is numerically efficient, but gives only point estimates of the probability densities. The latter is more computationally demanding but returns a full Bayesian posterior and also an estimate of the number of components. We demonstrate our methods with a selection of illustrative applications and give code implementing both algorithms.
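The EM fitting loop mentioned in the abstract follows the usual alternation between soft assignments and parameter updates. The sketch below shows that alternation for a one-dimensional mixture with Gaussian components, which is a deliberate simplification: the paper represents each component density as a combination of basis functions, and this example swaps in Gaussians only to keep the code short. The data, starting values, and function names are hypothetical.

```python
import math


def normal_pdf(x, mean, var):
    """Density of N(mean, var) at x."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def em_gaussian_mixture(data, means, variances, weights, n_iter=50):
    """Fit a 1-D Gaussian mixture by EM; returns updated parameters and responsibilities."""
    means, variances, weights = list(means), list(variances), list(weights)
    k = len(means)
    resp = []
    for _ in range(n_iter):
        # E-step: posterior probability of each component for each point.
        resp = []
        for x in data:
            joint = [weights[c] * normal_pdf(x, means[c], variances[c]) for c in range(k)]
            total = sum(joint)
            resp.append([j / total for j in joint])
        # M-step: re-estimate weights, means, and variances from soft counts.
        for c in range(k):
            n_c = sum(r[c] for r in resp)
            weights[c] = n_c / len(data)
            means[c] = sum(r[c] * x for r, x in zip(resp, data)) / n_c
            spread = sum(r[c] * (x - means[c]) ** 2 for r, x in zip(resp, data)) / n_c
            variances[c] = max(spread, 1e-6)  # floor to avoid a degenerate component
    return means, variances, weights, resp


# Hypothetical data: two well-separated groups around -2 and +2.
data = [-2.1, -1.9, -2.0, -1.8, -2.2, 2.0, 1.9, 2.1, 2.2, 1.8]
means, variances, weights, resp = em_gaussian_mixture(
    data, means=[-1.0, 1.0], variances=[1.0, 1.0], weights=[0.5, 0.5]
)
print(means, weights)
```

As the abstract notes for the EM variant, this yields point estimates only; the responsibilities in `resp` give the soft division of the data into clusters.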

algorithm, basis function, mixture model, (16 more...)

2502.19605

Country:

North America > United States > Michigan > Washtenaw County > Ann Arbor (0.14)
South America > Paraguay > Asunción > Asunción (0.04)
North America > United States > Rhode Island > Providence County > Providence (0.04)
(5 more...)

Genre: Research Report > New Finding (0.45)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (0.68)
Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.93)