AITopics

2502.17077

Country:

Europe > Spain > Castilla-La Mancha (0.04)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
North America > United States (0.04)

Genre: Research Report > Promising Solution (0.34)

Industry: Leisure & Entertainment (0.67)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)

Bento, M. P., Câmara, H. B., Seabra, J. F.

Unraveling particle dark matter with Physics-Informed Neural Networks

We parametrically solve the Boltzmann equations governing freeze-in dark matter (DM) in alternative cosmologies with Physics-Informed Neural Networks (PINNs), a mesh-free method. Through inverse PINNs, using a single DM experimental point, the observed relic density, we determine the physical attributes of the theory, namely power-law cosmologies, inspired by braneworld scenarios, and particle interaction cross sections. The expansion of the Universe in such alternative cosmologies has been parameterized through a switch-like function reproducing the Hubble law at later times. Without loss of generality, we model this transition more realistically with a smooth function. We predict a distinct pair-wise relationship between power-law exponent and particle interactions: for a given cosmology with negative (positive) exponent, smaller (larger) cross sections are required to reproduce the data. Lastly, via Bayesian methods, we quantify the epistemic uncertainty of theoretical parameters found in inverse problems.

arxiv, cosmology, pinn

2502.17597

Country:

Europe > Portugal > Lisbon > Lisbon (0.04)
Europe > Portugal > Braga > Braga (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.34)

Sentiment analysis of texts from social networks based on machine learning methods for monitoring public sentiment

Nurlanuly, Arsen Tolebay

A sentiment analysis system powered by machine learning was created in this study to improve real-time social network public opinion monitoring. For sophisticated sentiment identification, the suggested approach combines cutting-edge transformer-based architectures (DistilBERT, RoBERTa) with traditional machine learning models (Logistic Regression, SVM, Naive Bayes). The system achieved an accuracy of up to 80-85% using transformer models in real-world scenarios after being tested using both deep learning techniques and standard machine learning processes on annotated social media datasets. According to experimental results, deep learning models perform noticeably better than lexicon-based and conventional rule-based classifiers, lowering misclassification rates and enhancing the ability to recognize nuances like sarcasm. According to feature importance analysis, context tokens, sentiment-bearing keywords, and part-of-speech structure are essential for precise categorization. The findings confirm that AI-driven sentiment frameworks can provide a more adaptive and efficient approach to modern sentiment challenges. Despite the system's impressive performance, issues with computing overhead, data quality, and domain-specific terminology still exist. In order to monitor opinions on a broad scale, future research will investigate improving computing performance, extending coverage to various languages, and integrating real-time streaming APIs. The results demonstrate that governments, corporations, and social researchers looking for more in-depth understanding of public mood on digital platforms can find a reliable and adaptable answer in AI-powered sentiment analysis.
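As a rough illustration of the traditional baselines the abstract mentions, here is a minimal multinomial Naive Bayes sentiment classifier in plain Python. It is a sketch under stated assumptions, not the study's system: the function names `train_nb` and `predict_nb`, the whitespace tokenizer, the add-one smoothing, and the four toy training sentences are all hypothetical choices made for this example.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Count labels and per-label word frequencies from (text, label) pairs."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in docs:
        label_counts[label] += 1
        for tok in text.lower().split():  # naive whitespace tokenizer (assumption)
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab

def predict_nb(model, text):
    """Return the label with the highest log-posterior under add-one smoothing."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for tok in text.lower().split():
            if tok in vocab:  # ignore words never seen in training
                count = word_counts[label][tok]
                score += math.log((count + 1) / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy data, for illustration only.
toy_docs = [
    ("great service love it", "pos"),
    ("love this phone", "pos"),
    ("terrible service hate it", "neg"),
    ("hate the delay", "neg"),
]
model = train_nb(toy_docs)
label_a = predict_nb(model, "love the service")
label_b = predict_nb(model, "hate it")
```

A bag-of-words model like this ignores word order, which is one reason such baselines tend to struggle with nuances such as sarcasm.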

logistic regression, sentiment, sentiment analysis

2502.17143

Country:

Asia > Kazakhstan > Akmola Region > Astana (0.05)
North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Information Technology > Services (0.72)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.37)

Improved Diffusion-based Generative Model with Better Adversarial Robustness

Wang, Zekun, Yi, Mingyang, Xue, Shuchen, Li, Zhenguo, Liu, Ming, Qin, Bing, Ma, Zhi-Ming

Diffusion Probabilistic Models (DPMs) have achieved significant success in generative tasks. However, their training and sampling processes suffer from the issue of distribution mismatch. During the denoising process, the input data distributions differ between the training and inference stages, potentially leading to inaccurate data generation. To obviate this, we analyze the training objective of DPMs and theoretically demonstrate that this mismatch can be alleviated through Distributionally Robust Optimization (DRO), which is equivalent to performing robustness-driven Adversarial Training (AT) on DPMs. Furthermore, for the recently proposed Consistency Model (CM), which distills the inference process of the DPM, we prove that its training objective also encounters the mismatch issue. Fortunately, this issue can be mitigated by AT as well. Based on these insights, we propose to conduct efficient AT on both DPM and CM. Finally, extensive empirical studies validate the effectiveness of AT in diffusion-based models. The code is available at https://github.com/kugwzk/AT_Diff.

conference paper, diffusion model, international conference

2502.17099

Country:

Europe > Switzerland > Zürich > Zürich (0.14)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > Canada > British Columbia > Vancouver (0.04)

Genre:

Research Report (1.00)
Instructional Material (0.68)

Industry: Health & Medicine (0.67)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Bengio, Yoshua, Cohen, Michael, Fornasiere, Damiano, Ghosn, Joumana, Greiner, Pietro, MacDermott, Matt, Mindermann, Sören, Oberman, Adam, Richardson, Jesse, Richardson, Oliver, Rondeau, Marc-Antoine, St-Charles, Pierre-Luc, Williams-King, David

Superintelligent Agents Pose Catastrophic Risks: Can Scientist AI Offer a Safer Path?

The leading AI companies are increasingly focused on building generalist AI agents -- systems that can autonomously plan, act, and pursue goals across almost all tasks that humans can perform. Despite how useful these systems might be, unchecked AI agency poses significant risks to public safety and security, ranging from misuse by malicious actors to a potentially irreversible loss of human control. We discuss how these risks arise from current AI training methods. Indeed, various scenarios and experiments have demonstrated the possibility of AI agents engaging in deception or pursuing goals that were not specified by human operators and that conflict with human interests, such as self-preservation. Following the precautionary principle, we see a strong need for safer, yet still useful, alternatives to the current agency-driven trajectory. Accordingly, we propose as a core building block for further advances the development of a non-agentic AI system that is trustworthy and safe by design, which we call Scientist AI. This system is designed to explain the world from observations, as opposed to taking actions in it to imitate or please humans. It comprises a world model that generates theories to explain data and a question-answering inference machine. Both components operate with an explicit notion of uncertainty to mitigate the risks of overconfident predictions. In light of these considerations, a Scientist AI could be used to assist human researchers in accelerating scientific progress, including in AI safety. In particular, our system can be employed as a guardrail against AI agents that might be created despite the risks involved. Ultimately, focusing on non-agentic AI may enable the benefits of AI innovation while avoiding the risks associated with the current trajectory. We hope these arguments will motivate researchers, developers, and policymakers to favor this safer path.

agent, probability, scientist ai

2502.15657

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
North America > United States > New York (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Leisure & Entertainment > Games (1.00)
Law (1.00)
Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)

Vassaux, Louis, Massoulié, Laurent

The feasibility of multi-graph alignment: a Bayesian approach

We establish thresholds for the feasibility of random multi-graph alignment in two models. In the Gaussian model, we demonstrate an "all-or-nothing" phenomenon: above a critical threshold, exact alignment is achievable with high probability, while below it, even partial alignment is statistically impossible. In the sparse Erdős-Rényi model, we rigorously identify a threshold below which no meaningful partial alignment is possible and conjecture that above this threshold, partial alignment can be achieved. To prove these results, we develop a general Bayesian estimation framework over metric spaces, which provides insight into a broader class of high-dimensional statistical problems.

alignment, graph, graph alignment

2502.17142

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.14)
Europe > France > Île-de-France > Paris > Paris (0.04)
Asia > China (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Towards Auto-Regressive Next-Token Prediction: In-Context Learning Emerges from Generalization

Gong, Zixuan, Hu, Xiaolin, Tang, Huayi, Liu, Yong

Large language models (LLMs) have demonstrated remarkable in-context learning (ICL) abilities. However, existing theoretical analysis of ICL primarily exhibits two limitations: (a) Limited i.i.d. Setting. Most studies focus on supervised function learning tasks where prompts are constructed with i.i.d. input-label pairs. This i.i.d. assumption diverges significantly from real language learning scenarios where prompt tokens are interdependent. (b) Lack of Emergence Explanation. Most literature answers what ICL does from an implicit optimization perspective but falls short in elucidating how ICL emerges and the impact of the pre-training phase on ICL. In our paper, to extend (a), we adopt a more practical paradigm, auto-regressive next-token prediction (AR-NTP), which closely aligns with the actual training of language models. Specifically, within AR-NTP, we emphasize prompt token-dependency, which involves predicting each subsequent token based on the preceding sequence. To address (b), we formalize a systematic pre-training and ICL framework, highlighting the layer-wise structure of sequences and topics, alongside a two-level expectation. In conclusion, we present data-dependent, topic-dependent and optimization-dependent PAC-Bayesian generalization bounds for pre-trained LLMs, investigating how ICL emerges from the generalization of sequences and topics. Our theory is supported by experiments on numerical linear dynamic systems, synthetic GINC and real-world language datasets.

generalization, pre, sequence

2502.17024

Country:

Asia > China > Beijing > Beijing (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report > New Finding (0.45)

Industry: Education (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.92)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

Functional Bayesian Additive Regression Trees with Shape Constraints

Cao, Jiahao, He, Shiyuan, Zhang, Bohai

Motivated by the great success of Bayesian additive regression trees (BART) on regression, we propose a nonparametric Bayesian approach for the function-on-scalar regression problem, termed Functional BART (FBART). Utilizing a spline-based function representation and a tree-based domain partition model, FBART offers great flexibility in characterizing the complex and heterogeneous relationship between the response curve and scalar covariates. We devise a tailored Bayesian backfitting algorithm for estimating the parameters in the FBART model. Furthermore, we introduce an FBART model with shape constraints on the response curve, enhancing estimation and prediction performance when prior shape information of response curves is available. By incorporating a shape-constrained prior, we ensure that the posterior samples of the response curve satisfy the required shape constraints (e.g., monotonicity and/or convexity). Our proposed FBART model and its shape-constrained version are new advances in BART models for functional data. Under certain regularity conditions, we derive the posterior convergence results for both FBART and its shape-constrained version. Finally, the superiority of the proposed methods over other competitive counterparts is validated through simulation experiments under various settings and analyses of two real datasets.

regression, s-fbart, shape constraint

2502.16888

Country:

Asia > China > Beijing > Beijing (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Guangdong Province > Zhuhai (0.04)

Genre: Research Report (0.64)

Industry:

Energy > Energy Storage (0.67)
Electrical Industrial Apparatus (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)

Terenin, Alexander, Negrea, Jeffrey

An Adversarial Analysis of Thompson Sampling for Full-information Online Learning: from Finite to Infinite Action Spaces

We develop an analysis of Thompson sampling for online learning under full feedback - also known as prediction with expert advice - where the learner's prior is defined over the space of an adversary's future actions, rather than the space of experts. We show regret decomposes into regret the learner expected a priori, plus a prior-robustness-type term we call excess regret. In the classical finite-expert setting, this recovers optimal rates. As an initial step towards practical online learning in settings with a potentially-uncountably-infinite number of experts, we show that Thompson sampling with a certain Gaussian process prior widely-used in the Bayesian optimization literature has a $\mathcal{O}(\beta\sqrt{T\log(1+\lambda)})$ rate against a $\beta$-bounded $\lambda$-Lipschitz adversary.
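To make the idea of a prior over the adversary's future actions concrete, the following is a minimal sketch for the finite-expert case. It assumes, purely for illustration, that the prior treats every unseen loss as an independent standard normal draw; under that assumption the posterior over the future equals the prior and the rule behaves like a follow-the-perturbed-leader update. The function name `thompson_experts` and the constant toy losses are hypothetical, and this is not claimed to be the authors' construction or their Gaussian process prior.

```python
import random

def thompson_experts(loss_rounds, horizon, rng):
    """Play experts by sampling an imagined future from a prior over losses.

    Assumption: each unseen loss is an independent N(0, 1) draw, so the sum of
    the remaining losses for one expert is distributed as N(0, remaining).
    """
    n_experts = len(loss_rounds[0])
    cumulative = [0.0] * n_experts  # observed cumulative loss per expert
    learner_loss = 0.0
    for t, losses in enumerate(loss_rounds):
        remaining = max(horizon - t, 0)  # rounds not yet observed, including this one
        # Imagined total loss = observed past + one sampled future per expert.
        imagined = [c + rng.gauss(0.0, remaining ** 0.5) for c in cumulative]
        # Pick the expert that is best in hindsight on the imagined sequence.
        choice = min(range(n_experts), key=imagined.__getitem__)
        learner_loss += losses[choice]
        # Full feedback: the losses of all experts are revealed.
        for i, loss in enumerate(losses):
            cumulative[i] += loss
    return learner_loss, cumulative

rng = random.Random(0)
T = 200
loss_rounds = [[0.2, 0.8, 0.5] for _ in range(T)]  # hypothetical fixed losses
learner_loss, cumulative = thompson_experts(loss_rounds, T, rng)
regret = learner_loss - min(cumulative)
```

Here `regret` compares the learner's total loss with that of the best single expert in hindsight, which is the quantity the abstract's rates refer to.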

adversary, algorithm, thompson

2502.14790

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Germany > Lower Saxony > Gottingen (0.04)

Genre: Research Report (0.50)

Industry: Education > Educational Setting > Online (0.82)

Technology:

Information Technology > Game Theory (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.82)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.67)

arXiv.org Machine Learning Feb-23-2025

Optimizing Input Data Collection for Ranking and Selection

Song, Eunhye, Kim, Taeho

We study a ranking and selection (R&S) problem when all solutions share common parametric Bayesian input models updated with the data collected from multiple independent data-generating sources. Our objective is to identify the best system by designing a sequential sampling algorithm that collects input and simulation data given a budget. We adopt the most probable best (MPB) as the estimator of the optimum and show that its posterior probability of optimality converges to one at an exponential rate as the sampling budget increases. Assuming that the input parameters belong to a finite set, we characterize the $\epsilon$-optimal static sampling ratios for input and simulation data that maximize the convergence rate. Using these ratios as guidance, we propose the optimal sampling algorithm for R&S (OSAR) that achieves the $\epsilon$-optimal ratios almost surely in the limit. We further extend OSAR by adopting the kernel ridge regression to improve the simulation output mean prediction. This not only improves OSAR's finite-sample performance, but also lets us tackle the case where the input parameters lie in a continuous space with a strong consistency guarantee for finding the optimum. We numerically demonstrate that OSAR outperforms a state-of-the-art competitor.
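The most probable best can be illustrated with a small sketch for a finite set of input parameters. It assumes the posterior weights over the parameters and the mean performance of each system under each parameter are already known, and that larger means are better; in the setting described above these quantities would be estimated from input and simulation data. The function name `most_probable_best` and all numbers are hypothetical, and the sampling algorithm itself (OSAR) is not reproduced here.

```python
def most_probable_best(posterior, means):
    """Return the system most likely to be optimal, plus each system's probability.

    posterior: dict mapping an input parameter to its posterior probability.
    means: dict mapping the same parameters to a list of per-system means
           (assumption: larger is better).
    """
    n_systems = len(next(iter(means.values())))
    prob_best = [0.0] * n_systems
    for theta, weight in posterior.items():
        system_means = means[theta]
        # System that is best if theta is the true input parameter.
        best = max(range(n_systems), key=system_means.__getitem__)
        prob_best[best] += weight
    mpb = max(range(n_systems), key=prob_best.__getitem__)
    return mpb, prob_best

# Hypothetical posterior over three candidate input parameters.
posterior = {"theta_a": 0.5, "theta_b": 0.3, "theta_c": 0.2}
means = {
    "theta_a": [1.0, 2.0, 1.5],
    "theta_b": [3.0, 1.0, 2.0],
    "theta_c": [0.5, 0.7, 0.9],
}
mpb, prob_best = most_probable_best(posterior, means)
```

As more input data concentrates the posterior on one parameter, the probability attached to the selected system approaches one, which is the convergence the abstract describes.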

artificial intelligence, machine learning, optimizing input data collection

2502.16659

Country:

North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Hong Kong (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.45)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.45)