AITopics | Tegmark, Max

Collaborating Authors

Tegmark, Max

Provably safe systems: the only path to controllable AGI

Tegmark, Max, Omohundro, Steve

arXiv.org Artificial Intelligence, Sep-4-2023

"Once the machine thinking method had started, it would not take long to outstrip our feeble powers. At some stage therefore we should have to expect the machines to take control" Alan Turing 1951 [35] AGI [91] safety is of the utmost urgency, since corporations and research labs are racing to build AGI despite prominent AI researchers and business leaders warning that it may lead to human extinction [11]. While governments are drafting AI regulations, there's little indication that they will be sufficient to resist competitive pressures and prevent the creation of AGI. Median estimates on the forecasting platform Metaculus of the date of AGI's creation have plummeted over the past few years from many decades away to 2027 [25] or 2032 [24] depending on definitions, with superintelligence expected to follow a few years later [23]. Is Alan Turing correct that we now "have to expect the machines to take control"?

large language model, logic & formal reasoning, machine learning, (24 more...)

arXiv: 2309.01933

Country: North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)

Genre: Research Report (0.82)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Government > Military (0.93)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(5 more...)

Discovering New Interpretable Conservation Laws as Sparse Invariants

Liu, Ziming, Sturm, Patrick Obin, Bharadwaj, Saketh, Silva, Sam, Tegmark, Max

arXiv.org Artificial Intelligence, Jul-4-2023

Discovering conservation laws for a given dynamical system is important but challenging. In a theorist setup (differential equations and basis functions are both known), we propose the Sparse Invariant Detector (SID), an algorithm that auto-discovers conservation laws from differential equations. Its algorithmic simplicity allows robustness and interpretability of the discovered conserved quantities. We show that SID is able to rediscover known and even discover new conservation laws in a variety of systems. For two examples in fluid mechanics and atmospheric chemistry, SID discovers 14 and 3 conserved quantities, respectively, where only 12 and 2 were previously known to domain experts.
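The idea of searching for conserved quantities in the span of known basis functions can be illustrated with a small linear-algebra sketch. This is not the authors' SID implementation: it assumes a candidate invariant of the form H = sum_j theta_j * b_j, requires dH/dt = sum_j theta_j * (grad b_j . f) = 0 at a handful of sample points, and solves for theta as an exact null space. The harmonic-oscillator system, the five basis functions, the sample points, and the names `null_space`, `velocity`, and `sid_sketch` are all illustrative assumptions, and the sparsification step that gives SID its name is omitted.

```python
from fractions import Fraction


def null_space(rows):
    """Exact null-space basis of a matrix, via Gauss-Jordan elimination."""
    m = [[Fraction(v) for v in row] for row in rows]
    n_cols = len(m[0])
    pivots = []  # pivots[i] is the pivot column of row i
    r = 0
    for c in range(n_cols):
        p = next((i for i in range(r, len(m)) if m[i][c] != 0), None)
        if p is None:
            continue  # no pivot in this column: it is a free column
        m[r], m[p] = m[p], m[r]
        piv = m[r][c]
        m[r] = [v / piv for v in m[r]]
        for i in range(len(m)):
            if i != r and m[i][c] != 0:
                factor = m[i][c]
                m[i] = [a - factor * b for a, b in zip(m[i], m[r])]
        pivots.append(c)
        r += 1
        if r == len(m):
            break
    free = [c for c in range(n_cols) if c not in pivots]
    basis = []
    for fc in free:
        vec = [Fraction(0)] * n_cols
        vec[fc] = Fraction(1)
        for row_idx, pc in enumerate(pivots):
            vec[pc] = -m[row_idx][fc]
        basis.append(vec)
    return basis


def velocity(x, v):
    """Assumed example system: harmonic oscillator, dx/dt = v, dv/dt = -x."""
    return (v, -x)


# Assumed basis functions, each paired with its hand-written gradient.
BASIS = [
    ("x", lambda x, v: (1, 0)),
    ("v", lambda x, v: (0, 1)),
    ("x^2", lambda x, v: (2 * x, 0)),
    ("x*v", lambda x, v: (v, x)),
    ("v^2", lambda x, v: (0, 2 * v)),
]


def sid_sketch(points):
    """Return coefficient vectors theta with sum_j theta_j * (grad b_j . f) = 0."""
    g = []
    for x, v in points:
        fx, fv = velocity(x, v)
        g.append([gx * fx + gv * fv for gx, gv in (grad(x, v) for _, grad in BASIS)])
    return null_space(g)


SAMPLE_POINTS = [(1, 2), (2, 1), (-1, 3), (3, -2), (1, 1), (2, 5)]
invariants = sid_sketch(SAMPLE_POINTS)
for theta in invariants:
    terms = [f"{c}*{name}" for c, (name, _) in zip(theta, BASIS) if c != 0]
    print("conserved:", " + ".join(terms))
```

For this toy system the only null-space direction is x^2 + v^2, which is the oscillator's energy up to a constant factor. Exact rational arithmetic is used here only to keep the example free of numerical tolerances.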

artificial intelligence, conserved quantity, machine learning, (17 more...)

arXiv: 2305.19525

Country: North America > United States > California (0.28)

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Seeing is Believing: Brain-Inspired Modular Training for Mechanistic Interpretability

Liu, Ziming, Gan, Eric, Tegmark, Max

arXiv.org Artificial Intelligence, Jun-6-2023

We introduce Brain-Inspired Modular Training (BIMT), a method for making neural networks more modular and interpretable. Inspired by brains, BIMT embeds neurons in a geometric space and augments the loss function with a cost proportional to the length of each neuron connection. We demonstrate that BIMT discovers useful modular neural networks for many simple tasks, revealing compositional structures in symbolic formulas, interpretable decision boundaries and features for classification, and mathematical structure in algorithmic datasets. The ability to directly see modules with the naked eye can complement current mechanistic interpretability strategies such as probes, interventions or staring at all weights.
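The length-dependent penalty described in the abstract can be sketched in a few lines. This is an illustration under stated assumptions, not the BIMT code: it assumes each connection is charged its absolute weight times the Euclidean distance between the two neurons it joins, and the neuron coordinates, the weight matrices, the coefficient `lam`, and the function names are all hypothetical.

```python
import math


def connection_cost(weights, pos_in, pos_out):
    """Sum of |w_ij| * distance(out_i, in_j) over all connections.

    weights[i][j] is the weight from input neuron j to output neuron i.
    """
    total = 0.0
    for i, p_out in enumerate(pos_out):
        for j, p_in in enumerate(pos_in):
            total += abs(weights[i][j]) * math.dist(p_out, p_in)
    return total


def regularized_loss(task_loss, weights, pos_in, pos_out, lam=0.01):
    """Task loss plus the length penalty, scaled by an assumed coefficient lam."""
    return task_loss + lam * connection_cost(weights, pos_in, pos_out)


# Two layers of two neurons each, laid out on a unit square (assumed geometry).
POS_IN = [(0.0, 0.0), (1.0, 0.0)]
POS_OUT = [(0.0, 1.0), (1.0, 1.0)]

LOCAL = [[1.0, 0.0], [0.0, 1.0]]    # each output connects to the input below it
CROSSED = [[0.0, 1.0], [1.0, 0.0]]  # each output connects to the far input

local_cost = connection_cost(LOCAL, POS_IN, POS_OUT)
crossed_cost = connection_cost(CROSSED, POS_IN, POS_OUT)
print(local_cost, crossed_cost)
```

The two weight matrices compute equivalent functions up to a relabeling of inputs, but the crossed wiring is charged more because its connections are longer. A penalty of this kind favors local wiring, which is what would make modules visible when the network is drawn.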

artificial intelligence, machine learning, neural network, (18 more...)

arXiv: 2305.08746

Country: North America > United States (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

GenPhys: From Physical Processes to Generative Models

Liu, Ziming, Luo, Di, Xu, Yilun, Jaakkola, Tommi, Tegmark, Max

arXiv.org Artificial Intelligence, Apr-5-2023

Since diffusion models (DM) and the more recent Poisson flow generative models (PFGM) are inspired by physical processes, it is reasonable to ask: Can physical processes offer additional new generative models? We show that the answer is yes. We introduce a general family, Generative Models from Physical Processes (GenPhys), where we translate partial differential equations (PDEs) describing physical processes to generative models. We show that generative models can be constructed from s-generative PDEs (s for smooth). GenPhys subsumes the two existing generative models (DM and PFGM) and even gives rise to new families of generative models, e.g., "Yukawa Generative Models" inspired by weak interactions. On the other hand, some physical processes do not belong to the GenPhys family by default, e.g., the wave equation and the Schrödinger equation, but could be brought into the GenPhys family with some modifications. Our goal with GenPhys is to explore and expand the design space of generative models.

artificial intelligence, equation, machine learning, (16 more...)

arXiv: 2304.02637

Country: North America > United States (0.46)

Genre: Research Report (0.64)

Industry: Energy > Oil & Gas > Upstream (0.47)

Technology: Information Technology > Artificial Intelligence > Natural Language > Generation (1.00)

Omnigrok: Grokking Beyond Algorithmic Data

Liu, Ziming, Michaud, Eric J., Tegmark, Max

arXiv.org Artificial Intelligence, Mar-23-2023

Grokking, the unusual phenomenon for algorithmic datasets where generalization happens long after overfitting the training data, has remained elusive. We aim to understand grokking by analyzing the loss landscapes of neural networks, identifying the mismatch between training and test loss landscapes as the cause for grokking. We refer to this as the "LU mechanism" because training and test losses (against model weight norm) typically resemble "L" and "U", respectively. This simple mechanism can nicely explain many aspects of grokking: data size dependence, weight decay dependence, the emergence of representations, etc. Guided by the intuitive picture, we are able to induce grokking on tasks involving images, language and molecules. In the reverse direction, we are able to eliminate grokking for algorithmic datasets. We attribute the dramatic nature of grokking for algorithmic datasets to representation learning.

Generalization lies at the heart of machine learning. A good machine learning model should arguably be able to generalize fast, and behave in a smooth/predictable way under changes of (hyper)parameters. Grokking, the phenomenon where the model generalizes long after overfitting the training set, has raised interesting questions after it was observed on algorithmic datasets by Power et al. (2022):

Q1. The origin of grokking: Why is generalization much delayed after overfitting?

artificial intelligence, machine learning, weight norm, (19 more...)

arXiv: 2210.01117

Country:

North America > United States > Oregon (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PFGM++: Unlocking the Potential of Physics-Inspired Generative Models

Xu, Yilun, Liu, Ziming, Tian, Yonglong, Tong, Shangyuan, Tegmark, Max, Jaakkola, Tommi

arXiv.org Artificial Intelligence, Feb-10-2023

We introduce a new family of physics-inspired generative models termed PFGM++ that unifies diffusion models and Poisson Flow Generative Models (PFGM). These models realize generative trajectories for $N$ dimensional data by embedding paths in $N{+}D$ dimensional space while still controlling the progression with a simple scalar norm of the $D$ additional variables. The new models reduce to PFGM when $D{=}1$ and to diffusion models when $D{\to}\infty$. The flexibility of choosing $D$ allows us to trade off robustness against rigidity as increasing $D$ results in more concentrated coupling between the data and the additional variable norms. We dispense with the biased large batch field targets used in PFGM and instead provide an unbiased perturbation-based objective similar to diffusion models. To explore different choices of $D$, we provide a direct alignment method for transferring well-tuned hyperparameters from diffusion models ($D{\to} \infty$) to any finite $D$ values. Our experiments show that models with finite $D$ can be superior to previous state-of-the-art diffusion models on CIFAR-10/FFHQ $64{\times}64$ datasets, with FID scores of $1.91/2.43$ when $D{=}2048/128$. In the class-conditional setting, $D{=}2048$ yields the current state-of-the-art FID of $1.74$ on CIFAR-10. In addition, we demonstrate that models with smaller $D$ exhibit improved robustness against modeling errors. Code is available at https://github.com/Newbeeer/pfgmpp

artificial intelligence, diffusion model, machine learning, (16 more...)

arXiv: 2302.04265

Country:

North America > United States > Montana > Roosevelt County (0.34)
North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Generation (0.83)

Precision Machine Learning

Michaud, Eric J., Liu, Ziming, Tegmark, Max

arXiv.org Artificial Intelligence, Oct-24-2022

We explore unique considerations involved in fitting ML models to data with very high precision, as is often required for science applications. We empirically compare various function approximation methods and study how they scale with increasing parameters and data. We find that neural networks can often outperform classical approximation methods on high-dimensional examples, by auto-discovering and exploiting modular structures therein. However, neural networks trained with common optimizers are less powerful for low-dimensional cases, which motivates us to study the unique properties of neural network loss landscapes and the corresponding optimization challenges that arise in the high precision regime. To address the optimization issue in low dimensions, we develop training tricks which enable us to train neural networks to extremely low loss, close to the limits allowed by numerical precision.

artificial intelligence, machine learning, neural network, (13 more...)

doi: 10.3390/e25010175

arXiv: 2210.13447

Country: North America > United States (0.28)

Genre: Research Report (0.82)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Pareto-optimal clustering with the primal deterministic information bottleneck

Tan, Andrew K., Tegmark, Max, Chuang, Isaac L.

arXiv.org Artificial Intelligence, Jul-27-2022

At the heart of both lossy compression and clustering is a trade-off between the fidelity and size of the learned representation. Our goal is to map out and study the Pareto frontier that quantifies this trade-off. We focus on the optimization of the Deterministic Information Bottleneck (DIB) objective over the space of hard clusterings. To this end, we introduce the primal DIB problem, which we show results in a much richer frontier than its previously studied Lagrangian relaxation when optimized over discrete search spaces. We present an algorithm for mapping out the Pareto frontier of the primal DIB trade-off that is also applicable to other two-objective clustering problems. We study general properties of the Pareto frontier, and we give both analytic and numerical evidence for logarithmic sparsity of the frontier in general. We provide evidence that our algorithm has polynomial scaling despite the super-exponential search space, and additionally, we propose a modification to the algorithm that can be used where sampling noise is expected to be significant. Finally, we use our algorithm to map the DIB frontier of three different tasks: compressing the English alphabet, extracting informative color classes from natural images, and compressing a group theory-inspired dataset, revealing interesting features of the frontier and demonstrating how the structure of the frontier can be used for model selection, with a focus on points previously hidden by the cloak of the convex hull.
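The trade-off between representation size and fidelity can be made concrete by scoring a single hard clustering. The sketch below assumes the standard DIB formulation, in which a clustering Z = f(X) is judged by its entropy H(Z) (size) and its mutual information I(Z;Y) with a target variable Y (fidelity); the abstract does not spell out these quantities, so treat that choice as an assumption. The toy joint distribution and the names `entropy` and `dib_point` are illustrative and not taken from the paper.

```python
import math


def entropy(probs):
    """Shannon entropy in bits, skipping zero-probability entries."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def dib_point(joint, assign):
    """Score a hard clustering as the pair (H(Z), I(Z;Y)).

    joint[x][y] is p(x, y); assign[x] is the cluster index of symbol x.
    """
    n_y = len(joint[0])
    n_z = max(assign) + 1
    # Merge the rows of p(x, y) that share a cluster to obtain p(z, y).
    p_zy = [[0.0] * n_y for _ in range(n_z)]
    for x, row in enumerate(joint):
        for y, p in enumerate(row):
            p_zy[assign[x]][y] += p
    p_z = [sum(row) for row in p_zy]
    p_y = [sum(p_zy[z][y] for z in range(n_z)) for y in range(n_y)]
    h_z = entropy(p_z)
    h_zy = entropy([p for row in p_zy for p in row])
    # I(Z;Y) = H(Z) + H(Y) - H(Z,Y)
    return h_z, h_z + entropy(p_y) - h_zy


# Toy joint distribution: symbols 0 and 1 always give y = 0, symbols 2 and 3 give y = 1.
JOINT = [[0.25, 0.0], [0.25, 0.0], [0.0, 0.25], [0.0, 0.25]]

for assign in ([0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 2, 3]):
    h_z, i_zy = dib_point(JOINT, assign)
    print(assign, round(h_z, 3), round(i_zy, 3))
```

In this toy case the two-cluster assignment keeps all the information about Y at half the entropy of the four-cluster one, so the four-cluster point is dominated. Enumerating every clustering this way is only feasible for tiny alphabets, which is why an efficient frontier-mapping algorithm matters.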

artificial intelligence, frontier, machine learning, (16 more...)

doi: 10.3390/e24060771

arXiv: 2204.02489

Country: North America > United States (0.93)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.88)

Symbolic Pregression: Discovering Physical Laws from Distorted Video

Udrescu, Silviu-Marian, Tegmark, Max

arXiv.org Artificial Intelligence, Sep-11-2020

We present a method for unsupervised learning of equations of motion for objects in raw and optionally distorted unlabeled video. We first train an autoencoder that maps each video frame into a low-dimensional latent space where the laws of motion are as simple as possible, by minimizing a combination of non-linearity, acceleration and prediction error. Differential equations describing the motion are then discovered using Pareto-optimal symbolic regression. We find that our pre-regression ("pregression") step is able to rediscover Cartesian coordinates of unlabeled moving objects even when the video is distorted by a generalized lens. Using intuition from multidimensional knot-theory, we find that the pregression step is facilitated by first adding extra latent space dimensions to avoid topological problems during training and then removing these extra dimensions via principal component analysis.

deep learning, latent space, neural network, (15 more...)

arXiv: 2005.11212

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)

AI Feynman 2.0: Pareto-optimal symbolic regression exploiting graph modularity

Udrescu, Silviu-Marian, Tan, Andrew, Feng, Jiahai, Neto, Orisvaldo, Wu, Tailin, Tegmark, Max

arXiv.org Artificial Intelligence, Jun-18-2020

We present an improved method for symbolic regression that seeks to fit data to formulas that are Pareto-optimal, in the sense of having the best accuracy for a given complexity. It improves on the previous state-of-the-art by typically being orders of magnitude more robust toward noise and bad data, and also by discovering many formulas that stumped previous methods. We develop a method for discovering generalized symmetries (arbitrary modularity in the computational graph of a formula) from gradient properties of a neural network fit. We use normalizing flows to generalize our symbolic regression method to probability distributions from which we only have samples, and employ statistical hypothesis testing to accelerate robust brute-force search.
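Pareto-optimality in the accuracy-versus-complexity sense can be shown with a short filter over candidate formulas. This is only a sketch of the selection criterion, not of AI Feynman itself: the candidate formulas, their complexity scores, and their error values are made-up numbers, and `pareto_front` is a hypothetical helper name.

```python
def pareto_front(candidates):
    """Keep candidates that no other candidate beats on both complexity and error.

    Each candidate is a (complexity, error, label) tuple; lower is better for both.
    """
    front = []
    best_error = float("inf")
    # Scan from simplest to most complex; a candidate survives only if it
    # achieves strictly lower error than everything at most as complex.
    for complexity, error, label in sorted(candidates, key=lambda t: (t[0], t[1])):
        if error < best_error:
            front.append((complexity, error, label))
            best_error = error
    return front


# Hypothetical candidates: (complexity, fit error, formula).
CANDIDATES = [
    (5, 0.10, "a*x + b"),
    (1, 0.90, "c"),
    (4, 0.60, "x + b"),
    (3, 0.50, "a*x"),
    (9, 0.10, "a*x + b + 0*x^2"),
]

front = pareto_front(CANDIDATES)
for complexity, error, label in front:
    print(complexity, error, label)
```

Here "x + b" is dropped because "a*x" is both simpler and more accurate, and the padded quadratic is dropped because it adds complexity without reducing the error. What remains is the set of formulas with the best accuracy at each complexity level.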

artificial intelligence, neural network, symbolic regression, (17 more...)

arXiv: 2006.10782

Country: North America > United States (0.28)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)