Plotting

 Geneva


Dynamic allocation of limited memory resources in reinforcement learning - Appendix

Neural Information Processing Systems

A.1 Gradient of the log policy In order to compute In this section, we abuse the notation slightly to omit the explicit dependence on the state-action pair (s, a) for clarity, and instead place it in the subscript. Next, we relax the limit β in Eq. A.1 so as to differentiate the logarithm of the policy log π for β > 0 with respect to the relevant elements of the resource allocation vector σ(s, a) as follows: β( q A.2 Gradient of the cost In this section, we show how to compute I), and I is the identity matrix. Since the covariance matrix is diagonal, we can take the gradient with respect to elements of the resource allocation vector σ individually. In other words, we can take the gradient of each memory's marginal normal distribution with its standard deviation: ( A.4, A.5, and A.6, we can write our analytically obtained gradient of the cost term with respect to individual elements of the resource allocation vector σ(s, a) as: ( D A.3 Justification for our choice of the gradient of expected reward A potential concern regarding our method of allocating resources may be our choice of the advantage function to compute the gradient of the expected rewards (Eqs.


Dynamic allocation of limited memory resources in reinforcement learning Nisheet Patel

Neural Information Processing Systems

Biological brains are inherently limited in their capacity to process and store information, but are nevertheless capable of solving complex tasks with apparent ease. Intelligent behavior is related to these limitations, since resource constraints drive the need to generalize and assign importance differentially to features in the environment or memories of past experiences. Recently, there have been parallel efforts in reinforcement learning and neuroscience to understand strategies adopted by artificial and biological agents to circumvent limitations in information storage. However, the two threads have been largely separate. In this article, we propose a dynamical framework to maximize expected reward under constraints of limited resources, which we implement with a cost function that penalizes precise representations of action-values in memory, each of which may vary in its precision.


Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search

Neural Information Processing Systems

Computational models in fields such as computational neuroscience are often evaluated via stochastic simulation or numerical approximation. Fitting these models implies a difficult optimization problem over complex, possibly noisy parameter landscapes. Bayesian optimization (BO) has been successfully applied to solving expensive black-box problems in engineering and machine learning. Here we explore whether BO can be applied as a general tool for model fitting. First, we present a novel hybrid BO algorithm, Bayesian adaptive direct search (BADS), that achieves competitive performance with an affordable computational overhead for the running time of typical models. We then perform an extensive benchmark of BADS vs. many common and state-of-the-art nonconvex, derivativefree optimizers, on a set of model-fitting problems with real data and models from six studies in behavioral, cognitive, and computational neuroscience. With default settings, BADS consistently finds comparable or better solutions than other methods, including'vanilla' BO, showing great promise for advanced BO techniques, and BADS in particular, as a general model-fitting tool.


I was Biden's man in the room at the UN Security Council. Don't let Russia, China take over

FOX News

Over the last four years at the United Nations, the international community has witnessed an alarming trend of closer collaboration between Russia and China that poses a significant threat to the "rules-based order" the United States helped design back in 1945. This increased and renewed level of cooperation presents an unprecedented dilemma for the United States and like-minded partners: how to maintain the existing order, warts and all, when two permanent members of the UN Security Council are now working feverishly to subvert it. To many UN observers, China and Russia have now come to the shared conclusion that the UN has become a tool Washington and its allies regularly use to destabilize their regimes and diminish their global influence. Consequently, the United Nations has become a critical battleground in the current era of "Great Power" competition. During my two-plus years as the U.S. ambassador responsible for UN Security Council matters, I have seen first-hand at the UN how these two authoritarian powers repeatedly and energetically spread falsehoods alleging: U.S. Ambassador to the Conference on Disarmament Robert Wood attends a news conference at the United Nations in Geneva, Switzerland, April 19, 2018.


Strategic White Paper on AI Infrastructure for Particle, Nuclear, and Astroparticle Physics: Insights from JENA and EuCAIF

arXiv.org Artificial Intelligence

Artificial intelligence (AI) is transforming scientific research, with deep learning methods playing a central role in data analysis, simulations, and signal detection across particle, nuclear, and astroparticle physics. Within the JENA communities-ECFA, NuPECC, and APPEC-and as part of the EuCAIF initiative, AI integration is advancing steadily. However, broader adoption remains constrained by challenges such as limited computational resources, a lack of expertise, and difficulties in transitioning from research and development (R&D) to production. This white paper provides a strategic roadmap, informed by a community survey, to address these barriers. It outlines critical infrastructure requirements, prioritizes training initiatives, and proposes funding strategies to scale AI capabilities across fundamental physics over the next five years.


$\Lambda$CDM and early dark energy in latent space: a data-driven parametrization of the CMB temperature power spectrum

arXiv.org Artificial Intelligence

Finding the best parametrization for cosmological models in the absence of first-principle theories is an open question. We propose a data-driven parametrization of cosmological models given by the disentangled 'latent' representation of a variational autoencoder (VAE) trained to compress cosmic microwave background (CMB) temperature power spectra. We consider a broad range of $\Lambda$CDM and beyond-$\Lambda$CDM cosmologies with an additional early dark energy (EDE) component. We show that these spectra can be compressed into 5 ($\Lambda$CDM) or 8 (EDE) independent latent parameters, as expected when using temperature power spectra alone, and which reconstruct spectra at an accuracy well within the Planck errors. These latent parameters have a physical interpretation in terms of well-known features of the CMB temperature spectrum: these include the position, height and even-odd modulation of the acoustic peaks, as well as the gravitational lensing effect. The VAE also discovers one latent parameter which entirely isolates the EDE effects from those related to $\Lambda$CDM parameters, thus revealing a previously unknown degree of freedom in the CMB temperature power spectrum. We further showcase how to place constraints on the latent parameters using Planck data as typically done for cosmological parameters, obtaining latent values consistent with previous $\Lambda$CDM and EDE cosmological constraints. Our work demonstrates the potential of a data-driven reformulation of current beyond-$\Lambda$CDM phenomenological models into the independent degrees of freedom to which the data observables are sensitive.


Beyond checkmate: exploring the creative chokepoints in AI text

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized Natural Language Processing (NLP) and Artificial Intelligence (AI), unlocking unprecedented capabilities. This rapid advancement has spurred research into various aspects of LLMs, their text generation & reasoning capability, and potential misuse, fueling the necessity for robust detection methods. While numerous prior research has focused on detecting LLM-generated text (AI text) and thus checkmating them, our study investigates a relatively unexplored territory: portraying the nuanced distinctions between human and AI texts across text segments. Whether LLMs struggle with or excel at incorporating linguistic ingenuity across different text segments carries substantial implications for determining their potential as effective creative assistants to humans. Through an analogy with the structure of chess games-comprising opening, middle, and end games-we analyze text segments (introduction, body, and conclusion) to determine where the most significant distinctions between human and AI texts exist. While AI texts can approximate the body segment better due to its increased length, a closer examination reveals a pronounced disparity, highlighting the importance of this segment in AI text detection. Additionally, human texts exhibit higher cross-segment differences compared to AI texts. Overall, our research can shed light on the intricacies of human-AI text distinctions, offering novel insights for text detection and understanding.


Enhancing generalization in high energy physics using white-box adversarial attacks

arXiv.org Artificial Intelligence

Machine learning is becoming increasingly popular in the context of particle physics. Supervised learning, which uses labeled Monte Carlo (MC) simulations, remains one of the most widely used methods for discriminating signals beyond the Standard Model. However, this paper suggests that supervised models may depend excessively on artifacts and approximations from Monte Carlo simulations, potentially limiting their ability to generalize well to real data. This study aims to enhance the generalization properties of supervised models by reducing the sharpness of local minima. It reviews the application of four distinct white-box adversarial attacks in the context of classifying Higgs boson decay signals. The attacks are divided into weight space attacks, and feature space attacks. To study and quantify the sharpness of different local minima this paper presents two analysis methods: gradient ascent and reduced Hessian eigenvalue analysis. The results show that white-box adversarial attacks significantly improve generalization performance, albeit with increased computational complexity.


Full-waveform earthquake source inversion using simulation-based inference

arXiv.org Artificial Intelligence

This paper presents a novel framework for full-waveform seismic source inversion using simulation-based inference (SBI). Traditional probabilistic approaches often rely on simplifying assumptions about data errors, which we show can lead to inaccurate uncertainty quantification. SBI addresses this limitation by building an empirical probabilistic model of the data errors using machine learning models, known as neural density estimators, which can then be integrated into the Bayesian inference framework. We apply the SBI framework to point-source moment tensor inversions as well as joint moment tensor and time-location inversions. We construct a range of synthetic examples to explore the quality of the SBI solutions, as well as to compare the SBI results with standard Gaussian likelihood-based Bayesian inversions. We then demonstrate that under real seismic noise, common Gaussian likelihood assumptions for treating full-waveform data yield overconfident posterior distributions that underestimate the moment tensor component uncertainties by up to a factor of 3. We contrast this with SBI, which produces well-calibrated posteriors that generally agree with the true seismic source parameters, and offers an order-of-magnitude reduction in the number of simulations required to perform inference compared to standard Monte Carlo techniques. Finally, we apply our methodology to a pair of moderate magnitude earthquakes in the North Atlantic. We utilise seismic waveforms recorded by the recent UPFLOW ocean bottom seismometer array as well as by regional land stations in the Azores, comparing full moment tensor and source-time location posteriors between SBI and a Gaussian likelihood approach. We find that our adaptation of SBI can be directly applied to real earthquake sources to efficiently produce high quality posterior distributions that significantly improve upon Gaussian likelihood approaches.


Practical Bayesian Optimization for Model Fitting with Bayesian Adaptive Direct Search

Neural Information Processing Systems

Computational models in fields such as computational neuroscience are often evaluated via stochastic simulation or numerical approximation. Fitting these models implies a difficult optimization problem over complex, possibly noisy parameter landscapes. Bayesian optimization (BO) has been successfully applied to solving expensive black-box problems in engineering and machine learning. Here we explore whether BO can be applied as a general tool for model fitting. First, we present a novel hybrid BO algorithm, Bayesian adaptive direct search (BADS), that achieves competitive performance with an affordable computational overhead for the running time of typical models. We then perform an extensive benchmark of BADS vs. many common and state-of-the-art nonconvex, derivativefree optimizers, on a set of model-fitting problems with real data and models from six studies in behavioral, cognitive, and computational neuroscience. With default settings, BADS consistently finds comparable or better solutions than other methods, including'vanilla' BO, showing great promise for advanced BO techniques, and BADS in particular, as a general model-fitting tool.