 feynman


Transformer Semantic Genetic Programming for d-dimensional Symbolic Regression Problems

Anthes, Philipp, Sobania, Dominik, Rothlauf, Franz

arXiv.org Artificial Intelligence

Transformer Semantic Genetic Programming (TSGP) is a semantic search approach that uses a pre-trained transformer model as a variation operator to generate offspring programs with controlled semantic similarity to a given parent. Unlike other semantic GP approaches that rely on fixed syntactic transformations, TSGP aims to learn diverse structural variations that lead to solutions with similar semantics. We find that a single transformer model trained on millions of programs is able to generalize across symbolic regression problems of varying dimension. Evaluated on 24 real-world and synthetic datasets, TSGP significantly outperforms standard GP, SLIM_GSGP, Deep Symbolic Regression, and Denoising Autoencoder GP, achieving an average rank of 1.58 across all benchmarks. Moreover, TSGP produces more compact solutions than SLIM_GSGP, despite its higher accuracy. In addition, the target semantic distance $\mathrm{SD}_t$ is able to control the step size in the semantic space: small values of $\mathrm{SD}_t$ enable consistent improvement in fitness but often lead to larger programs, while larger values promote faster convergence and compactness. Thus, $\mathrm{SD}_t$ provides an effective mechanism for balancing exploration and exploitation.
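To make the idea of a target semantic distance concrete, here is a minimal sketch. It assumes a program's semantics is the vector of its outputs on a fixed set of inputs and that distance is plain Euclidean distance between those vectors. The abstract does not give the exact definition of $\mathrm{SD}_t$, and TSGP generates offspring with a transformer rather than choosing among fixed candidates, so `semantics`, `semantic_distance`, and `pick_offspring` are hypothetical names used only for illustration.

```python
import math
from typing import Callable, List, Sequence

Program = Callable[[float], float]


def semantics(program: Program, inputs: Sequence[float]) -> List[float]:
    """Semantics of a program: its outputs on the given inputs (assumed definition)."""
    return [program(x) for x in inputs]


def semantic_distance(a: Program, b: Program, inputs: Sequence[float]) -> float:
    """Euclidean distance between two semantics vectors (an assumption, not the paper's SD)."""
    sa, sb = semantics(a, inputs), semantics(b, inputs)
    return math.sqrt(sum((u - v) ** 2 for u, v in zip(sa, sb)))


def pick_offspring(parent: Program, candidates: Sequence[Program],
                   inputs: Sequence[float], target_distance: float) -> Program:
    """Return the candidate whose distance to the parent is closest to the target."""
    return min(
        candidates,
        key=lambda c: abs(semantic_distance(parent, c, inputs) - target_distance),
    )


inputs = [0.0, 1.0, 2.0, 3.0]
parent = lambda x: x * x
candidates = [
    lambda x: x * x + 0.1,  # small semantic step
    lambda x: x * x + 1.0,  # larger semantic step
    lambda x: 2.0 * x,      # structurally and semantically different
]

small_step = pick_offspring(parent, candidates, inputs, target_distance=0.2)
large_step = pick_offspring(parent, candidates, inputs, target_distance=2.0)
```

In this toy setting a small target selects the near-identical candidate and a larger target selects the candidate shifted by a constant, which mirrors the step-size role the abstract attributes to $\mathrm{SD}_t$.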


A Theory of the Mechanics of Information: Generalization Through Measurement of Uncertainty (Learning is Measuring)

Hazard, Christopher J., Resnick, Michael, Beel, Jacob, Xia, Jack, Mack, Cade, Glennie, Dominic, Fulp, Matthew, Maze, David, Bassett, Andrew, Koistinen, Martin

arXiv.org Machine Learning

Traditional machine learning relies on explicit models and domain assumptions, limiting flexibility and interpretability. We introduce a model-free framework using surprisal (information theoretic uncertainty) to directly analyze and perform inferences from raw data, eliminating distribution modeling, reducing bias, and enabling efficient updates including direct edits and deletion of training data. By quantifying relevance through uncertainty, the approach enables generalizable inference across tasks including generative inference, causal discovery, anomaly detection, and time series forecasting. It emphasizes traceability, interpretability, and data-driven decision making, offering a unified, human-understandable framework for machine learning, and achieves at or near state-of-the-art performance across most common machine learning tasks. The mathematical foundations create a "physics" of information, which enable these techniques to apply effectively to a wide variety of complex data types, including missing data. Empirical results indicate that this may be a viable alternative path to neural networks with regard to scalable machine learning and artificial intelligence that can maintain human understandability of the underlying mechanics.


Operator Feature Neural Network for Symbolic Regression

Deng, Yusong, Wu, Min, Yu, Lina, Liu, Jingyi, Wei, Shu, Li, Yanjie, Li, Weijun

arXiv.org Artificial Intelligence

Symbolic regression is a task aimed at identifying patterns in data and representing them through mathematical expressions, generally involving skeleton prediction and constant optimization. Many methods have achieved some success; however, they treat variables and symbols merely as characters of natural language without considering their mathematical essence. This paper introduces the operator feature neural network (OF-Net), which employs operator representation for expressions and proposes an implicit feature encoding method for the intrinsic mathematical operational logic of operators. By substituting operator features for numeric loss, we can predict the combination of operators of target expressions. We evaluate the model on public datasets, and the results demonstrate that the model achieves superior recovery rates and high $R^2$ scores. In discussing the results, we analyze the merits and demerits of OF-Net and propose optimization schemes.
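For readers unfamiliar with the $R^2$ metric mentioned above, the following is a minimal sketch of the standard coefficient of determination, $R^2 = 1 - \mathrm{SS}_{\mathrm{res}} / \mathrm{SS}_{\mathrm{tot}}$. It is the textbook definition, not code from the OF-Net paper, and the function name `r2_score` and the sample data are chosen here for illustration.

```python
from typing import Sequence


def r2_score(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    # Residual sum of squares: error of the predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the mean.
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


y_true = [1.0, 2.0, 3.0]
perfect = r2_score(y_true, [1.0, 2.0, 3.0])   # predictions match the targets exactly
baseline = r2_score(y_true, [2.0, 2.0, 2.0])  # predictions equal the target mean
```

A score of 1 means the predicted expression reproduces the data exactly, while a score of 0 means it does no better than predicting the mean of the targets.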


A Personalised Learning Tool for Physics Undergraduate Students Built On a Large Language Model for Symbolic Regression

Zhu, Yufan, Khoo, Zi-Yu, Low, Jonathan Sze Choong, Bressan, Stephane

arXiv.org Artificial Intelligence

Interleaved practice enhances the memory and problem-solving ability of students in undergraduate courses. We introduce a personalized learning tool built on a Large Language Model (LLM) that can provide immediate and personalized attention to students as they complete homework containing problems interleaved from undergraduate physics courses. Our tool leverages the dimensional analysis method, enhancing students' qualitative thinking and problem-solving skills for complex phenomena. Our approach combines LLMs for symbolic regression with dimensional analysis via prompt engineering and offers students a unique perspective to comprehend relationships between physics variables. This fosters a broader and more versatile understanding of physics and mathematical principles and complements a conventional undergraduate physics education that relies on interpreting and applying established equations within specific contexts. We test our personalized learning tool on the equations from Feynman's lectures on physics. Our tool can correctly identify relationships between physics variables for most equations, underscoring its value as a complementary personalized learning tool for undergraduate physics students.


Understanding Diffusion Models by Feynman's Path Integral

Hirono, Yuji, Tanaka, Akinori, Fukushima, Kenji

arXiv.org Artificial Intelligence

Score-based diffusion models have proven effective in image generation and have gained widespread usage; however, the underlying factors contributing to the performance disparity between stochastic and deterministic (i.e., the probability flow ODEs) sampling schemes remain unclear. We introduce a novel formulation of diffusion models using Feynman's path integral, which is a formulation originally developed for quantum physics. We find that this formulation provides comprehensive descriptions of score-based generative models, and we demonstrate the derivation of backward stochastic differential equations and loss functions. The formulation accommodates an interpolating parameter connecting stochastic and deterministic sampling schemes, and we identify this parameter as a counterpart of Planck's constant in quantum physics. This analogy enables us to apply the Wentzel-Kramers-Brillouin (WKB) expansion, a well-established technique in quantum physics, for evaluating the negative log-likelihood to assess the performance disparity between stochastic and deterministic sampling schemes.


Latent Lab: Large Language Models for Knowledge Exploration

Dunnell, Kevin, Painter, Trudy, Stoddard, Andrew, Lippman, Andy

arXiv.org Artificial Intelligence

This paper investigates the potential of AI models, particularly large language models (LLMs), to support knowledge exploration and augment human creativity during ideation. We present "Latent Lab," an interactive tool for discovering connections among MIT Media Lab research projects, emphasizing "exploration" over search. The work offers insights into collaborative AI systems by addressing the challenges of organizing, searching, and synthesizing content. In a user study, the tool's success was evaluated based on its ability to introduce users to an unfamiliar knowledge base, ultimately setting the groundwork for the ongoing advancement of human-AI knowledge exploration systems.


Promotion/Inhibition Effects in Networks: A Model with Negative Probabilities

Dong, Anqi, Georgiou, Tryphon T., Tannenbaum, Allen

arXiv.org Artificial Intelligence

Biological networks often encapsulate promotion/inhibition as signed edge-weights of a graph. Nodes may correspond to genes assigned expression levels (mass) of respective proteins. The promotion/inhibition nature of co-expression between nodes is encoded in the sign of the corresponding entry of a sign-indefinite adjacency matrix, though the strength of such co-expression (i.e., the precise value of edge weights) cannot typically be directly measured. Herein we address the inverse problem to determine network edge-weights based on a sign-indefinite adjacency and expression levels at the nodes. While our motivation originates in gene networks, the framework applies to networks where promotion/inhibition dictates a stationary mass distribution at the nodes. In order to identify suitable edge-weights we adopt a framework of "negative probabilities," advocated by P. Dirac and R. Feynman, and we set up a likelihood formalism to obtain values for the sought edge-weights. The proposed optimization problem can be solved via a generalization of the well-known Sinkhorn algorithm; in our setting the Sinkhorn-type "diagonal scalings" are multiplicative or inverse-multiplicative, depending on the sign of the respective entries in the adjacency matrix, with value computed as the positive root of a quadratic polynomial.
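As background for the "diagonal scalings" mentioned above, here is a minimal sketch of the classical Sinkhorn iteration for a matrix with strictly positive entries. It is not the signed generalization proposed in the paper: it only alternates row and column rescaling until the matrix is approximately doubly stochastic. The function name `sinkhorn`, the iteration count, and the example matrix are assumptions made for illustration.

```python
from typing import List


def sinkhorn(matrix: List[List[float]], iterations: int = 200) -> List[List[float]]:
    """Scale a positive square matrix to be approximately doubly stochastic.

    Finds positive vectors r and c such that diag(r) * matrix * diag(c)
    has unit row sums and unit column sums (classical Sinkhorn scaling).
    """
    n, m = len(matrix), len(matrix[0])
    r = [1.0] * n
    c = [1.0] * m
    for _ in range(iterations):
        # Row scaling: make every row of diag(r) * matrix * diag(c) sum to 1.
        r = [1.0 / sum(matrix[i][j] * c[j] for j in range(m)) for i in range(n)]
        # Column scaling: make every column sum to 1.
        c = [1.0 / sum(r[i] * matrix[i][j] for i in range(n)) for j in range(m)]
    return [[r[i] * matrix[i][j] * c[j] for j in range(m)] for i in range(n)]


scaled = sinkhorn([[1.0, 2.0], [3.0, 4.0]])
row_sums = [sum(row) for row in scaled]
col_sums = [sum(scaled[i][j] for i in range(2)) for j in range(2)]
```

In the classical setting both scalings are purely multiplicative; according to the abstract, the paper's variant instead uses multiplicative or inverse-multiplicative scalings depending on the sign of each adjacency entry.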


Guided scenarios with simulated expert personae: a remarkable strategy to perform cognitive work

Van Buren, David

arXiv.org Artificial Intelligence

Large language models (LLMs) trained on a substantial corpus of human knowledge and literature productively work with a large array of facts from that corpus. Surprisingly, they are also able to re-create the behaviors of personae that are captured within the corpus. By forming teams of simulated personae, supplying contexts that set the stage, and providing gentle prompts, one can move through scenarios that elicit expert behavior to perform meaningful cognitive work. The power of this strategy is demonstrated with two examples, one attacking factuality of LLM responses and the other reproducing a very recently published result in quantum optics.


Feynman on Artificial Intelligence and Machine Learning, with Updates

Mjolsness, Eric

arXiv.org Artificial Intelligence

I present my recollections of Richard Feynman's mid-1980s interest in artificial intelligence and neural networks, set in the technical context of the physics-related approaches to neural networks of that time. I attempt to evaluate his ideas in the light of the substantial advances in the field since then, and vice versa. There are aspects of Feynman's interests that I think have been largely achieved and others that remain excitingly open, notably in computational science, and potentially including the revival of symbolic methods therein.


Is Data Science a science?

#artificialintelligence

At its core, all fundamental science is about making predictions in the form of experiments: precise, quantifiable, falsifiable predictions. As Richard P. Feynman put it: "The fundamental principle of science, the definition almost, is this: the sole test of the validity of any idea is experiment." So if science is about making predictions, how is it different from the predictions that astrologers make? The core distinction is in the kinds of predictions each makes. Most horoscopes, for example, will give you general predictions.