George, Robert Joseph
LeanProgress: Guiding Search for Neural Theorem Proving via Proof Progress Prediction
Huang, Suozhi, Song, Peiyang, George, Robert Joseph, Anandkumar, Anima
Mathematical reasoning remains a significant challenge for Large Language Models (LLMs) due to hallucinations. When combined with formal proof assistants like Lean, these hallucinations can be eliminated through rigorous verification, making theorem proving reliable. However, even with formal verification, LLMs still struggle with long proofs and complex mathematical formalizations. While Lean with LLMs offers valuable assistance with retrieving lemmas, generating tactics, or even complete proofs, it lacks a crucial capability: providing a sense of proof progress. This limitation particularly impacts the overall development efficiency in large formalization projects. We introduce LeanProgress, a method that predicts the progress of a proof, i.e., how many steps remain to complete it. We train and evaluate our models on a large corpus of Lean proofs from Lean Workbook Plus and Mathlib4, employing data preprocessing and balancing techniques to handle the skewed distribution of proof lengths. Our experiments show that LeanProgress achieves an overall prediction accuracy of 75.1\% in predicting the amount of progress and, hence, the remaining number of steps. When integrated into a best-first search framework using Reprover, our method yields a 3.8\% improvement on Mathlib4 over the baseline performance of 41.2\%, particularly for longer proofs. These results demonstrate how proof progress prediction can enhance both automated and interactive theorem proving, enabling users to make more informed decisions about proof strategies.
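As a rough illustration of how a progress signal could guide search, the sketch below shows a best-first loop whose priority combines a tactic generator's cumulative log-probability with a predicted number of remaining steps. The functions `generate_tactics`, `apply_tactic`, `is_solved`, and `predict_remaining_steps` are hypothetical stand-ins for a tactic generator such as Reprover, the Lean interaction layer, and a LeanProgress-style progress model; the weighting is an assumption, not the paper's exact scoring rule.

```python
import heapq
import itertools

def best_first_search(init_state, generate_tactics, apply_tactic, is_solved,
                      predict_remaining_steps, alpha=1.0, max_expansions=1000):
    # Heap entries: (priority, tie-breaker, cumulative log-prob, proof state).
    counter = itertools.count()  # tie-breaker so proof states are never compared
    frontier = [(alpha * predict_remaining_steps(init_state), next(counter), 0.0, init_state)]
    for _ in range(max_expansions):
        if not frontier:
            break
        _, _, log_prob, state = heapq.heappop(frontier)
        if is_solved(state):
            return state
        for tactic, tactic_log_prob in generate_tactics(state):
            next_state = apply_tactic(state, tactic)
            if next_state is None:  # tactic rejected by the Lean kernel
                continue
            new_log_prob = log_prob + tactic_log_prob
            # Lower priority = explored earlier: penalize unlikely tactics and
            # states predicted to be many steps away from a complete proof.
            priority = -new_log_prob + alpha * predict_remaining_steps(next_state)
            heapq.heappush(frontier, (priority, next(counter), new_log_prob, next_state))
    return None  # search budget exhausted
```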
Tensor-GaLore: Memory-Efficient Training via Gradient Tensor Decomposition
George, Robert Joseph, Pitt, David, Zhao, Jiawei, Kossaifi, Jean, Luo, Cheng, Tian, Yuandong, Anandkumar, Anima
We present Tensor-GaLore, a novel method for efficient training of neural networks with higher-order tensor weights. Many models, particularly those used in scientific computing, employ tensor-parameterized layers to capture complex, multidimensional relationships. Scaling these methods to high-resolution problems makes memory usage grow intractably, and matrix-based optimization methods lead to suboptimal performance and compression. We propose to work directly in the high-order complex tensor parameter space, applying a tensor factorization to the gradients during optimization. We showcase its effectiveness on Fourier Neural Operators (FNOs), a class of models crucial for solving partial differential equations (PDEs), and provide a theoretical analysis of the method. Across various PDE tasks such as the Navier-Stokes and Darcy Flow equations, Tensor-GaLore achieves substantial memory savings, reducing optimizer memory usage by up to 75%. These memory savings across AI for science demonstrate Tensor-GaLore's potential.
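The core idea can be sketched as projecting each gradient tensor onto a low-rank Tucker-style subspace, running the optimizer update in the small core, and projecting back. The sketch below uses an HOSVD-style projection in plain PyTorch; the ranks, the projection refresh schedule, and the handling of complex FNO weights are simplifications of mine, not the paper's exact algorithm.

```python
import torch

def mode_unfold(t, mode):
    # Matricize a tensor along one mode: (dim_mode, product of the other dims).
    return torch.movedim(t, mode, 0).reshape(t.shape[mode], -1)

def tucker_projectors(grad, ranks):
    # One orthonormal factor per mode via an SVD of each unfolding (HOSVD).
    factors = []
    for mode, r in enumerate(ranks):
        U, _, _ = torch.linalg.svd(mode_unfold(grad, mode), full_matrices=False)
        factors.append(U[:, :r])  # (dim_mode, r)
    return factors

def project_to_core(grad, factors):
    # Core coefficients: contract each mode with the conjugated factor.
    core = grad
    for mode, U in enumerate(factors):
        core = torch.movedim(torch.tensordot(core, U.conj(), dims=([mode], [0])), -1, mode)
    return core

def project_back(core, factors):
    # Reconstruct a full-size (low-rank) tensor from the small core.
    full = core
    for mode, U in enumerate(factors):
        full = torch.movedim(torch.tensordot(full, U.T, dims=([mode], [0])), -1, mode)
    return full
```

In an Adam-style update, the first and second moments would then be stored at the size of the small core rather than the full gradient tensor, which is where the reported optimizer memory savings would come from in this sketch.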
A Library for Learning Neural Operators
Kossaifi, Jean, Kovachki, Nikola, Li, Zongyi, Pitt, David, Liu-Schiaffini, Miguel, George, Robert Joseph, Bonev, Boris, Azizzadenesheli, Kamyar, Berner, Julius, Anandkumar, Anima
We present NeuralOperator, an open-source Python library for operator learning. Neural operators generalize neural networks to maps between function spaces instead of finite-dimensional Euclidean spaces. They can be trained and used for inference on input and output functions given at various discretizations, satisfying discretization-convergence properties. Built on top of PyTorch, NeuralOperator provides all the tools for training and deploying neural operator models, as well as developing new ones, in a high-quality, tested, open-source package. It combines cutting-edge models and customizability with a gentle learning curve and simple user interface for newcomers.
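A minimal usage sketch is shown below; it assumes the import path and constructor arguments of a recent neuraloperator release, which may differ slightly across versions.

```python
import torch
from neuralop.models import FNO

# Instantiate a 2D Fourier Neural Operator with 16 retained modes per dimension.
model = FNO(n_modes=(16, 16), hidden_channels=32, in_channels=1, out_channels=1)

x = torch.randn(4, 1, 64, 64)   # a batch of input functions sampled on a 64x64 grid
y = model(x)                    # predicted output functions on the same grid
print(y.shape)                  # torch.Size([4, 1, 64, 64])
```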
LeanAgent: Lifelong Learning for Formal Theorem Proving
Kumarappan, Adarsh, Tiwari, Mo, Song, Peiyang, George, Robert Joseph, Xiao, Chaowei, Anandkumar, Anima
Large Language Models (LLMs) have been successful in mathematical reasoning tasks such as formal theorem proving when integrated with interactive proof assistants like Lean. Existing approaches involve training or fine-tuning an LLM on a specific dataset to perform well on particular domains, such as undergraduate-level mathematics. These methods struggle with generalizability to advanced mathematics. A fundamental limitation is that these approaches operate on static domains, failing to capture how mathematicians often work across multiple domains and projects simultaneously or cyclically. We present LeanAgent, a novel lifelong learning framework for formal theorem proving that continuously generalizes to and improves on ever-expanding mathematical knowledge without forgetting previously learned knowledge. LeanAgent introduces several key innovations, including a curriculum learning strategy that optimizes the learning trajectory in terms of mathematical difficulty, a dynamic database for efficient management of evolving mathematical knowledge, and progressive training to balance stability and plasticity. LeanAgent successfully proves 155 theorems previously unproved formally by humans across 23 diverse Lean repositories, many from advanced mathematics. It performs significantly better than the static LLM baseline, proving challenging theorems in domains like abstract algebra and algebraic topology while showcasing a clear progression of learning from basic concepts to advanced topics. In addition, we analyze LeanAgent's superior performance on key lifelong learning metrics. LeanAgent achieves exceptional scores in stability and backward transfer, where learning new tasks improves performance on previously learned tasks. This emphasizes LeanAgent's continuous generalizability and improvement, explaining its superior theorem-proving performance.
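To make the curriculum idea concrete, the sketch below orders repositories by a difficulty proxy and feeds them to a lifelong training loop. The proxy (mean annotated proof length), the repository data layout, and the `fine_tune`/`update_database` callbacks are illustrative stand-ins of mine, not LeanAgent's exact strategy.

```python
def build_curriculum(repositories, difficulty=lambda thm: len(thm["proof_steps"])):
    # Score each repository by the average difficulty of its proved theorems.
    scored = []
    for repo in repositories:
        lengths = [difficulty(t) for t in repo["theorems"] if t.get("proof_steps")]
        score = sum(lengths) / max(len(lengths), 1)
        scored.append((score, repo["name"]))
    return [name for _, name in sorted(scored)]  # easiest repositories first

def lifelong_train(model, repositories, fine_tune, update_database):
    for repo_name in build_curriculum(repositories):
        update_database(repo_name)   # add the repository's premises to the dynamic database
        fine_tune(model, repo_name)  # progressive training on the next repository
```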
Pretraining Codomain Attention Neural Operators for Solving Multiphysics PDEs
Rahman, Md Ashiqur, George, Robert Joseph, Elleithy, Mogab, Leibovici, Daniel, Li, Zongyi, Bonev, Boris, White, Colin, Berner, Julius, Yeh, Raymond A., Kossaifi, Jean, Azizzadenesheli, Kamyar, Anandkumar, Anima
Existing neural operator architectures face challenges when solving multiphysics problems with coupled partial differential equations (PDEs), due to complex geometries, interactions between physical variables, and the lack of large amounts of high-resolution training data. To address these issues, we propose Codomain Attention Neural Operator (CoDA-NO), which tokenizes functions along the codomain or channel space, enabling self-supervised learning or pretraining of multiple PDE systems. Specifically, we extend positional encoding, self-attention, and normalization layers to the function space. CoDA-NO can learn representations of different PDE systems with a single model. We evaluate CoDA-NO's potential as a backbone for learning multiphysics PDEs over multiple systems by considering few-shot learning settings. On complex downstream tasks with limited data, such as fluid flow simulations and fluid-structure interactions, we found CoDA-NO to outperform existing methods on the few-shot learning task by over $36\%$. The code is available at https://github.com/ashiq24/CoDA-NO.
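A simplified, hypothetical illustration of codomain tokenization is sketched below: each physical variable (channel) of a discretized function becomes one token, and self-attention mixes information across variables. The real CoDA-NO lifts attention, positional encoding, and normalization to function space; this sketch only shows the token layout, and the per-variable linear encoder is my simplification.

```python
import torch
import torch.nn as nn

class CodomainAttentionBlock(nn.Module):
    def __init__(self, grid_size, embed_dim=128, num_heads=4):
        super().__init__()
        self.encode = nn.Linear(grid_size, embed_dim)   # shared per-variable encoder
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.decode = nn.Linear(embed_dim, grid_size)

    def forward(self, u):                  # u: (batch, n_variables, n_gridpoints)
        tokens = self.encode(u)            # one token per physical variable
        mixed, _ = self.attn(tokens, tokens, tokens)
        return self.decode(mixed)          # back to (batch, n_variables, n_gridpoints)

# Example: two velocity components + pressure + displacement = 4 codomain tokens.
u = torch.randn(8, 4, 64 * 64)
out = CodomainAttentionBlock(grid_size=64 * 64)(u)
```

Because tokens are indexed by physical variables rather than spatial locations, the same block can, in principle, be reused across PDE systems with different numbers of variables, which is the property the pretraining setup relies on.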
Incremental Spectral Learning in Fourier Neural Operator
Zhao, Jiawei, George, Robert Joseph, Li, Zongyi, Anandkumar, Anima
Recently, neural networks have proven their impressive ability to solve partial differential equations (PDEs). Among them, the Fourier neural operator (FNO) has shown success in learning solution operators for highly non-linear problems such as turbulent flow. FNO learns weights over different frequencies and, as a regularization procedure, retains only frequencies below a fixed threshold. However, manually selecting an appropriate frequency threshold can be challenging, as an incorrect threshold can lead to underfitting or overfitting. To this end, we propose the Incremental Fourier Neural Operator (IFNO), which incrementally adds frequency modes by increasing the truncation threshold adaptively during training. We show that IFNO reduces the testing loss by more than 10% while using 20% fewer frequency modes, compared to standard FNO training on Kolmogorov Flow (with Reynolds number up to 5000) in the few-data regime.
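A minimal sketch of incremental spectral truncation is given below: start with a small number of retained Fourier modes and grow the truncation threshold whenever the highest retained frequencies still carry a non-negligible share of the spectral weight energy. The growth criterion, the `tail_frac` threshold, and the step size are illustrative assumptions, not IFNO's exact schedule.

```python
import torch

def maybe_grow_modes(spectral_weight, n_modes, max_modes, tail_frac=0.05, step=2):
    # spectral_weight: complex spectral weights, last dim indexes frequency modes.
    energy = spectral_weight.abs().pow(2).sum(dim=tuple(range(spectral_weight.dim() - 1)))
    active = energy[:n_modes]
    tail = active[-step:].sum()            # energy in the highest currently active modes
    if tail > tail_frac * active.sum():    # high modes still matter: add more of them
        n_modes = min(n_modes + step, max_modes)
    return n_modes

# In the training loop, the spectral convolution would then be truncated to the
# current n_modes instead of a fixed, manually chosen threshold.
```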