Randomised Postiterations for Calibrated BayesCG
Niall Vyas, Disha Hegde, Jon Cockayne
The Bayesian conjugate gradient method (BayesCG) offers probabilistic solutions to linear systems but suffers from poor calibration, limiting its utility in uncertainty quantification tasks. Recent approaches leveraging postiterations to construct priors have improved computational properties but failed to correct calibration issues. In this work, we propose a novel randomised postiteration strategy that enhances the calibration of the BayesCG posterior while preserving its favourable convergence characteristics. We present theoretical guarantees for the improved calibration, supported by results on the distribution of posterior errors. Numerical experiments demonstrate the efficacy of the method in both synthetic and inverse problem settings, showing enhanced uncertainty quantification and better propagation of uncertainties through computational pipelines.
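To make the object under discussion concrete, here is a minimal NumPy sketch of the vanilla BayesCG posterior, the baseline whose calibration the paper improves, using the standard Gaussian conditioning formulas with Krylov-type search directions. The randomised postiteration itself is not reproduced here, and the choice of directions is illustrative.

```python
import numpy as np

def bayescg_posterior(A, b, x0, Sigma0, m):
    """Vanilla BayesCG posterior after m search directions (illustrative).

    Prior: x ~ N(x0, Sigma0). Conditioning on S^T A x = S^T b gives
      mean = x0 + Sigma0 A^T S (S^T A Sigma0 A^T S)^{-1} S^T r0,
      cov  = Sigma0 - Sigma0 A^T S (S^T A Sigma0 A^T S)^{-1} S^T A Sigma0.
    """
    r0 = b - A @ x0
    M = A @ Sigma0 @ A.T
    # Krylov-type directions; a practical solver would orthogonalise these.
    S = np.column_stack([np.linalg.matrix_power(M, i) @ r0 for i in range(m)])
    Lam = S.T @ M @ S                                   # S^T A Sigma0 A^T S
    gain = Sigma0 @ A.T @ S @ np.linalg.solve(Lam, S.T)
    return x0 + gain @ r0, Sigma0 - gain @ A @ Sigma0

# Tiny example: the posterior mean approaches the true solution as m grows.
rng = np.random.default_rng(0)
Q = rng.normal(size=(6, 6)); A = Q @ Q.T + 6 * np.eye(6)
b = rng.normal(size=6)
mean, cov = bayescg_posterior(A, b, np.zeros(6), np.eye(6), m=4)
```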
Deep Neural Nets as Hamiltonians
Neural networks are complex functions of both their inputs and parameters. Much prior work in deep learning theory analyzes the distribution of network outputs at a fixed set of inputs (e.g. a training dataset) over random initializations of the network parameters. The purpose of this article is to consider the opposite situation: we view a randomly initialized Multi-Layer Perceptron (MLP) as a Hamiltonian over its inputs. For typical realizations of the network parameters, we study the properties of the energy landscape induced by this Hamiltonian, focusing on the structure of near-global minima in the limit of infinite width. Specifically, we use the replica trick to perform an exact analytic calculation giving the entropy (log volume of input space) at a given energy. We further derive saddle point equations that describe the overlaps between inputs sampled iid from the Gibbs distribution induced by the random MLP. For linear activations we solve these saddle point equations exactly, and we solve them numerically for a variety of depths and activation functions, including $\tanh, \sin, \text{ReLU}$, and shaped non-linearities. Even at infinite width, we find a rich range of behaviors: for some non-linearities, such as $\sin$, the landscapes of random MLPs exhibit full replica symmetry breaking, while shallow $\tanh$ and ReLU networks or deep shaped MLPs are instead replica symmetric.
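As a concrete illustration of the setup (not the paper's analytics), the sketch below treats the scalar output of a randomly initialised tanh MLP as an energy and draws approximate Gibbs samples over inputs with unadjusted Langevin dynamics; the empirical overlap matrix it prints is the order parameter whose structure distinguishes replica-symmetric from replica-symmetry-broken landscapes. All sizes, the inverse temperature, and the sampler settings are illustrative choices, and a faithful setup may constrain inputs (e.g. to a sphere), which this sketch omits for brevity.

```python
import torch

torch.manual_seed(0)

# Randomly initialised MLP whose scalar output we read as an energy H(x).
dim, width, depth = 64, 512, 4
layers = []
for i in range(depth):
    layers += [torch.nn.Linear(dim if i == 0 else width, width), torch.nn.Tanh()]
layers += [torch.nn.Linear(width, 1)]
H = torch.nn.Sequential(*layers).requires_grad_(False)  # parameters frozen

def langevin_sample(n_samples=8, beta=2.0, steps=2000, lr=1e-2):
    # Unadjusted Langevin dynamics targeting exp(-beta * H(x)) over inputs.
    x = torch.randn(n_samples, dim, requires_grad=True)
    for _ in range(steps):
        grad, = torch.autograd.grad(H(x).sum(), x)
        with torch.no_grad():
            x += -lr * beta * grad + (2 * lr) ** 0.5 * torch.randn_like(x)
    return x.detach()

# Overlaps q_ab = x_a . x_b / dim between (approximately) iid Gibbs samples.
X = langevin_sample()
print(X @ X.T / dim)
```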
Causal Inference Isn't Special: Why It's Just Another Prediction Problem
Causal inference is often portrayed as fundamentally distinct from predictive modeling, with its own terminology, goals, and intellectual challenges. But at its core, causal inference is simply a structured instance of prediction under distribution shift. In both cases, we begin with labeled data from a source domain and seek to generalize to a target domain where outcomes are not observed. The key difference is that in causal inference, the labels -- potential outcomes -- are selectively observed based on treatment assignment, introducing bias that must be addressed through assumptions. This perspective reframes causal estimation as a familiar generalization problem and highlights how techniques from predictive modeling, such as reweighting and domain adaptation, apply directly to causal tasks. It also clarifies that causal assumptions are not uniquely strong -- they are simply more explicit. By viewing causal inference through the lens of prediction, we demystify its logic, connect it to familiar tools, and make it more accessible to practitioners and educators alike.
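The reweighting point can be made concrete in a few lines. Below is a small, self-contained sketch (illustrative synthetic data, not from the paper) in which inverse propensity weighting, the causal workhorse, is exactly the importance weighting used for covariate shift in predictive modelling.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Synthetic observational data: confounder x drives both treatment t and outcome y.
n = 5000
x = rng.normal(size=(n, 1))
p_t = 1 / (1 + np.exp(-1.5 * x[:, 0]))                  # treatment depends on x
t = rng.binomial(1, p_t)
y = 2.0 * t + x[:, 0] + rng.normal(scale=0.5, size=n)   # true effect = 2.0

# Distribution-shift view: the treated "source" sample is reweighted to look
# like the full population via inverse propensity weights -- the same
# importance weighting used for covariate shift in predictive modelling.
e = LogisticRegression().fit(x, t).predict_proba(x)[:, 1]
ate = np.mean(t * y / e) - np.mean((1 - t) * y / (1 - e))
print(f"IPW estimate of the average treatment effect: {ate:.2f}")  # ~2.0
```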
Opioid Named Entity Recognition (ONER-2025) from Reddit
Muhammad Ahmad, Humaira Farid, Iqra Ameer, Maaz Amjad, Muhammad Muzamil, Ameer Hamza, Muhammad Jalal, Ildar Batyrshin, Grigori Sidorov
The opioid overdose epidemic remains a critical public health crisis, particularly in the United States, leading to significant mortality and societal costs. Social media platforms like Reddit provide vast amounts of unstructured data that offer insights into public perceptions, discussions, and experiences related to opioid use. This study leverages Natural Language Processing (NLP), specifically Opioid Named Entity Recognition (ONER-2025), to extract actionable information from these platforms. Our research makes four key contributions. First, we created a unique, manually annotated dataset sourced from Reddit, where users share self-reported experiences of opioid use via different administration routes. This dataset contains 331,285 tokens and includes eight major opioid entity categories. Second, we detail our annotation process and guidelines while discussing the challenges of labeling the ONER-2025 dataset. Third, we analyze key linguistic challenges, including slang, ambiguity, fragmented sentences, and emotionally charged language, in opioid discussions. Fourth, we propose a real-time monitoring system to process streaming data from social media, healthcare records, and emergency services to identify overdose events. Using 5-fold cross-validation in 11 experiments, our system integrates machine learning, deep learning, and transformer-based language models with advanced contextual embeddings to enhance understanding. Our transformer-based models (bert-base-NER and roberta-base) achieved 97% accuracy and F1-score, outperforming baselines by 10.23% (RF=0.88).
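The dataset and fine-tuned models are described in the paper rather than shown here, but a minimal sketch of the inference side using one of the named backbones (dslim/bert-base-NER on Hugging Face) looks like this. Note that the off-the-shelf checkpoint tags generic entities and would need fine-tuning on the ONER-2025 labels before it could recognise the eight opioid categories.

```python
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

# Generic NER backbone; the paper fine-tunes such models on ONER-2025 labels.
model_name = "dslim/bert-base-NER"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)

# aggregation_strategy="simple" merges word-piece tokens into whole entities.
ner = pipeline("ner", model=model, tokenizer=tokenizer,
               aggregation_strategy="simple")
print(ner("Started taking oxy last month and the withdrawals are brutal."))
```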
Imbalanced malware classification: an approach based on dynamic classifier selection
J. V. S. Souza, C. B. Vieira, G. D. C. Cavalcanti, R. M. O. Cruz
In recent years, the rise of cyber threats has emphasized the need for robust malware detection systems, especially on mobile devices. Malware, which targets vulnerabilities in devices and user data, represents a substantial security risk. A significant challenge in malware detection is the imbalance in datasets, where most applications are benign and only a small fraction pose a threat. This study addresses the often-overlooked issue of class imbalance in malware detection by evaluating various machine learning strategies for detecting malware in Android applications. We assess monolithic classifiers and ensemble methods, focusing on dynamic selection algorithms, which have shown superior performance compared to traditional approaches. In contrast to balancing strategies performed on the whole dataset, we propose a balancing procedure that works individually for each classifier in the pool. Our empirical analysis demonstrates that the KNOP algorithm obtained the best results using a pool of Random Forest classifiers. Additionally, an instance hardness assessment revealed that balancing reduces the difficulty of the minority class (malware) and enhances its detection. The code used for the experiments is available at https://github.com/jvss2/Machine-Learning-Empirical-Evaluation.
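A sketch of this pipeline under stated assumptions: deslib provides KNOP, and imbalanced-learn's BalancedBaggingClassifier balances each pool member's training sample individually, standing in for the paper's per-classifier balancing procedure (the authors' exact code is in the linked repository). X, y, and X_test are placeholder arrays.

```python
from deslib.des.knop import KNOP
from imblearn.ensemble import BalancedBaggingClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

# Pool in which each member is trained on its own balanced resample
# (imbalanced-learn >= 0.10 uses the `estimator` keyword).
X_train, X_dsel, y_train, y_dsel = train_test_split(
    X, y, test_size=0.5, stratify=y, random_state=0)
pool = BalancedBaggingClassifier(
    estimator=RandomForestClassifier(n_estimators=10),
    n_estimators=20, random_state=0).fit(X_train, y_train)

# KNOP selects competent classifiers per query point using the DSEL set.
knop = KNOP(pool_classifiers=pool, k=7).fit(X_dsel, y_dsel)
y_pred = knop.predict(X_test)
```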
Randomised Splitting Methods and Stochastic Gradient Descent
We explore an explicit link between stochastic gradient descent using common batching strategies and splitting methods for ordinary differential equations. From this perspective, we introduce a new minibatching strategy (called Symmetric Minibatching Strategy) for stochastic gradient optimisation which shows greatly reduced stochastic gradient bias (from $\mathcal{O}(h^2)$ to $\mathcal{O}(h^4)$ in the optimiser stepsize $h$), when combined with momentum-based optimisers. We justify why momentum is needed to obtain the improved performance using the theory of backward analysis for splitting integrators and provide a detailed analytic computation of the stochastic gradient bias on a simple example. Further, we provide improved convergence guarantees for this new minibatching strategy using Lyapunov techniques that show reduced stochastic gradient bias for a fixed stepsize (or learning rate) over the class of strongly-convex and smooth objective functions. Via the same techniques we also improve the known results for the Random Reshuffling strategy for stochastic gradient descent methods with momentum. We argue that this also leads to a faster convergence rate when considering a decreasing stepsize schedule. Both the reduced bias and efficacy of decreasing stepsizes are demonstrated numerically on several motivating examples.
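To illustrate the core idea, here is a sketch of a palindromic batch sweep in the spirit of symmetric splitting, not the authors' exact scheme: visiting the minibatches forward and then in reverse within an epoch mirrors a symmetric splitting integrator, whereas a single pass corresponds to a first-order (Lie-Trotter-like) splitting. The half-stepsize weighting is an assumption.

```python
import numpy as np

def momentum_epoch(params, velocity, grad_fn, batches,
                   h=1e-2, mu=0.9, symmetric=True):
    """One epoch of heavy-ball SGD over a fixed list of minibatches.

    symmetric=True sweeps the batches forward then backward with half the
    stepsize -- the palindromic ordering characteristic of symmetric
    splitting methods; symmetric=False is the usual single pass.
    """
    order, step = list(batches), h
    if symmetric:
        order, step = order + order[::-1], h / 2
    for batch in order:
        velocity = mu * velocity - step * grad_fn(params, batch)
        params = params + velocity
    return params, velocity
```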
CATS: Mitigating Correlation Shift for Multivariate Time Series Classification
Xiao Lin, Zhichen Zeng, Tianxin Wei, Zhining Liu, Yuzhong Chen, Hanghang Tong
Unsupervised Domain Adaptation (UDA) leverages labeled source data to train models for unlabeled target data. Given the prevalence of multivariate time series (MTS) data across various domains, the UDA task for MTS classification has emerged as a critical challenge. However, for MTS data, correlations between variables often vary across domains, whereas most existing UDA works for MTS classification have overlooked this essential characteristic. To bridge this gap, we introduce a novel domain shift, {\em correlation shift}, measuring domain differences in multivariate correlation. To mitigate correlation shift, we propose a scalable and parameter-efficient \underline{C}orrelation \underline{A}dapter for M\underline{TS} (CATS). Designed as a plug-and-play technique compatible with various Transformer variants, CATS employs temporal convolution to capture local temporal patterns and a graph attention module to model the changing multivariate correlation. The adapter reweights the target correlations to align them with the source correlations, with theoretically guaranteed precision. A correlation alignment loss is further proposed to mitigate correlation shift, bypassing the alignment challenge posed by the non-i.i.d. nature of MTS data. Extensive experiments on four real-world datasets demonstrate that (1) compared with vanilla Transformer-based models, CATS improves average accuracy by over $10\%$ while adding only around $1\%$ more parameters, and (2) all Transformer variants equipped with CATS either reach or surpass state-of-the-art baselines.
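The architectural recipe in this abstract can be sketched directly. Below is an illustrative PyTorch module under stated assumptions, not the authors' implementation: a depthwise temporal convolution for local patterns, single-head attention across the variable axis as a stand-in for the graph attention module, and a residual connection so it plugs into a Transformer backbone.

```python
import torch
import torch.nn as nn

class CorrelationAdapter(nn.Module):
    """Illustrative CATS-style adapter; layer sizes and the single-head
    attention standing in for graph attention are assumptions."""

    def __init__(self, n_vars: int, seq_len: int, kernel: int = 3):
        super().__init__()
        # Depthwise conv over time: local temporal patterns per variable.
        self.temporal = nn.Conv1d(n_vars, n_vars, kernel,
                                  padding=kernel // 2, groups=n_vars)
        # Attention across variables: each variable is a token, so the
        # attention weights act as a learned cross-variable correlation graph.
        self.var_attn = nn.MultiheadAttention(seq_len, num_heads=1,
                                              batch_first=True)
        self.proj = nn.Linear(seq_len, seq_len)

    def forward(self, x):                      # x: (batch, n_vars, seq_len)
        h = self.temporal(x)
        mixed, corr = self.var_attn(h, h, h)   # corr: (batch, n_vars, n_vars)
        return x + self.proj(mixed), corr      # residual => plug-and-play

out, corr = CorrelationAdapter(n_vars=7, seq_len=96)(torch.randn(8, 7, 96))
```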
Will £75 be the new normal for video games after Switch 2's Mario Kart?
Experts don't think Mario Kart World will be a one-off. Christopher Dring, editor-in-chief and co-founder of The Game Business, said he expected to see price rises elsewhere too - particularly for the most anticipated titles, such as the latest edition of the Grand Theft Auto franchise. "I think if you're going to see a game that's going to be able to charge more, look out for when GTA 6 gets a release date later in the year," he said. He says there are lots of reasons prices might go up, one of which is that modern games are a lot of work. "These games are taking longer to make, they require more people to make them," he said. But there's also the fact, he says, that video game prices have not kept up with inflation.
Sam Altman's AI-generated cricket jersey image gets Indians talking
Yet another user put into words a pattern he seemed to have spotted in Altman's recent social media posts - and a question that seems to be on many Indian users' minds. "Over the past few days, you've been praising India and Indian customers a lot. How did this sudden love for India come about? It feels like there's some deep strategy going on behind the scenes," he wrote on X. While the comment may sound a bit conspiratorial, there's some truth to at least part of it.
OpenAI is offering free ChatGPT Plus for college students
OpenAI is offering two months of free ChatGPT Plus to all college students, as CEO Sam Altman recently announced ahead of a much-anticipated update to the AI chatbot. The offer is available through May for U.S. and Canadian students only, and can be claimed on the ChatGPT student landing page. According to the site, existing ChatGPT Plus subscribers and new students will be verified through a system called SheerID to confirm current enrollment. Make note: the subscription will automatically renew at the ChatGPT Plus monthly rate ($20) if not cancelled before the two months are up. The paid version of ChatGPT includes extended limits on chatting, file uploads, and image generation, as well as advanced voice mode with video and screen sharing, limited Sora access, and new GPT‑4o and o3‑mini models.