Collaborating Authors

Toscano, Juan Diego


KKANs: Kurkova-Kolmogorov-Arnold Networks and Their Learning Dynamics

arXiv.org Machine Learning

Inspired by the Kolmogorov-Arnold representation theorem and Kurkova's principle of using approximate representations, we propose the Kurkova-Kolmogorov-Arnold Network (KKAN), a new two-block architecture that combines robust multi-layer perceptron (MLP) based inner functions with flexible linear combinations of basis functions as outer functions. We first prove that KKAN is a universal approximator, and then we demonstrate its versatility across scientific machine-learning applications, including function regression, physics-informed machine learning (PIML), and operator-learning frameworks. The benchmark results show that KKANs outperform MLPs and the original Kolmogorov-Arnold Networks (KANs) in function-approximation and operator-learning tasks, and achieve performance comparable to fully optimized MLPs for PIML. To better understand the behavior of the new representation models, we analyze their geometric complexity and learning dynamics using information bottleneck theory, identifying three universal learning stages (fitting, transition, and diffusion) across all architectures. We find a strong correlation between geometric complexity and signal-to-noise ratio (SNR), with optimal generalization achieved during the diffusion stage. Additionally, we propose self-scaled residual-based attention weights to maintain a high SNR dynamically, ensuring uniform convergence and prolonged learning.
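
To make the two-block idea concrete, the sketch below implements a KKAN-style layer, assuming a tanh MLP for the inner block and a Chebyshev polynomial basis for the outer block; both choices, and all sizes, are illustrative assumptions rather than the authors' reference implementation.

```python
import torch
import torch.nn as nn

class KKANLayer(nn.Module):
    """Sketch of a two-block KKAN-style layer: an MLP inner block followed by
    learnable linear combinations of basis functions as the outer block."""
    def __init__(self, in_dim, out_dim, hidden=32, n_inner=16, degree=5):
        super().__init__()
        # Inner block: a standard MLP producing n_inner intermediate features in [-1, 1]
        self.inner = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, hidden), nn.Tanh(),
            nn.Linear(hidden, n_inner), nn.Tanh(),
        )
        # Outer block: per-output learnable coefficients over a polynomial basis
        self.coeff = nn.Parameter(0.1 * torch.randn(out_dim, n_inner, degree + 1))
        self.degree = degree

    def forward(self, x):
        z = self.inner(x)                                  # (batch, n_inner)
        # Chebyshev basis T_0..T_degree of each inner feature (illustrative choice)
        basis = [torch.ones_like(z), z]
        for _ in range(2, self.degree + 1):
            basis.append(2 * z * basis[-1] - basis[-2])
        B = torch.stack(basis, dim=-1)                     # (batch, n_inner, degree+1)
        # Outer functions: linear combination over inner features and basis terms
        return torch.einsum('bnd,ond->bo', B, self.coeff)  # (batch, out_dim)
```

The layer maps a batch of shape (batch, in_dim) to (batch, out_dim), so it can stand in for an MLP block in the regression, PIML, and operator-learning settings mentioned above.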


From PINNs to PIKANs: Recent Advances in Physics-Informed Machine Learning

arXiv.org Artificial Intelligence

Physics-Informed Neural Networks (PINNs) have emerged as a key tool in Scientific Machine Learning since their introduction in 2017, enabling the efficient solution of ordinary and partial differential equations using sparse measurements. Over the past few years, significant advancements have been made in the training and optimization of PINNs, covering aspects such as network architectures, adaptive refinement, domain decomposition, and the use of adaptive weights and activation functions. A notable recent development is Physics-Informed Kolmogorov-Arnold Networks (PIKANs), which leverage a representation model originally proposed by Kolmogorov in 1957, offering a promising alternative to traditional PINNs. In this review, we provide a comprehensive overview of the latest advancements in PINNs, focusing on improvements in network design, feature expansion, optimization techniques, uncertainty quantification, and theoretical insights. We also survey key applications across a range of fields, including biomedicine, fluid and solid mechanics, geophysics, dynamical systems, heat transfer, chemical engineering, and beyond. Finally, we review computational frameworks and software tools developed by both academia and industry to support PINN research and applications.
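
As a concrete reminder of the physics-informed loss at the core of the methods surveyed here, the sketch below assembles a PINN loss for a 1D Poisson problem u''(x) = f(x) with homogeneous Dirichlet boundary conditions; the network size, manufactured source term, and collocation sampling are illustrative assumptions.

```python
import torch
import torch.nn as nn

# Minimal PINN sketch for u''(x) = f(x) on (0, 1) with u(0) = u(1) = 0.
net = nn.Sequential(nn.Linear(1, 32), nn.Tanh(),
                    nn.Linear(32, 32), nn.Tanh(),
                    nn.Linear(32, 1))

def pde_residual(x):
    x = x.requires_grad_(True)
    u = net(x)
    du = torch.autograd.grad(u, x, torch.ones_like(u), create_graph=True)[0]
    d2u = torch.autograd.grad(du, x, torch.ones_like(du), create_graph=True)[0]
    f = -(torch.pi ** 2) * torch.sin(torch.pi * x)   # manufactured source term
    return d2u - f                                   # pointwise PDE residual

x_f = torch.rand(128, 1)                 # collocation points in (0, 1)
x_b = torch.tensor([[0.0], [1.0]])       # boundary points
loss = pde_residual(x_f).pow(2).mean() + net(x_b).pow(2).mean()
loss.backward()                          # gradients for a standard optimizer step
```

Many of the advances reviewed (adaptive weights, feature expansion, PIKAN representations) amount to modifying either the network `net` or how the residual and boundary terms in this loss are weighted.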


A comprehensive and FAIR comparison between MLP and KAN representations for differential equations and operator networks

arXiv.org Artificial Intelligence

Kolmogorov-Arnold Networks (KANs) were recently introduced as an alternative representation model to MLPs. Herein, we employ KANs to construct physics-informed machine learning models (PIKANs) and deep operator models (DeepOKANs) for solving forward and inverse problems governed by differential equations. In particular, we compare them with physics-informed neural networks (PINNs) and deep operator networks (DeepONets), which are based on the standard MLP representation. We find that although the original KANs based on the B-spline parameterization lack accuracy and efficiency, modified versions based on low-order orthogonal polynomials achieve performance comparable to PINNs and DeepONets, although they still lack robustness, as they may diverge for different random seeds or for higher-order orthogonal polynomials. We visualize their corresponding loss landscapes and analyze their learning dynamics using information bottleneck theory. Our study follows the FAIR principles so that other researchers can use our benchmarks to further advance this emerging topic.
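
For readers unfamiliar with the operator-learning baseline in this comparison, the sketch below shows the standard DeepONet structure (a branch network over samples of the input function and a trunk network over the query coordinate, combined by an inner product); layer sizes are illustrative assumptions, and, roughly speaking, swapping the MLP sub-networks for KAN-type layers yields a DeepOKAN.

```python
import torch
import torch.nn as nn

class DeepONet(nn.Module):
    """Vanilla DeepONet sketch: the branch net encodes the input function u
    sampled at m sensor locations, the trunk net encodes the query point y,
    and the operator output G(u)(y) is their inner product."""
    def __init__(self, m=100, p=64):
        super().__init__()
        self.branch = nn.Sequential(nn.Linear(m, 128), nn.Tanh(), nn.Linear(128, p))
        self.trunk = nn.Sequential(nn.Linear(1, 128), nn.Tanh(), nn.Linear(128, p))
        self.bias = nn.Parameter(torch.zeros(1))

    def forward(self, u_sensors, y):
        # u_sensors: (batch, m) input-function samples; y: (batch, 1) query points
        b = self.branch(u_sensors)                             # (batch, p)
        t = self.trunk(y)                                      # (batch, p)
        return (b * t).sum(dim=-1, keepdim=True) + self.bias   # (batch, 1)
```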


Learning in PINNs: Phase transition, total diffusion, and generalization

arXiv.org Artificial Intelligence

Phase transitions in deep learning. The optimization process in deep learning can vary significantly in smoothness and convergence rate, depending on factors such as the complexity of the model, the quality and quantity of the data, or the characteristics of the loss landscape. For non-convex problems, however, this process has often been observed to be far from smooth and steady; instead, it is dominated by discrete, successive phases. Recent studies have shed light on several key aspects influencing these phases and the overall optimization dynamics [1-10]. The importance of gradient noise in escaping local optima in non-convex optimization has been explored, demonstrating its role in guaranteeing polynomial-time convergence to a global optimum [1]; the same work suggests the existence of a phase transition for a perturbed gradient descent (GD) algorithm, from escaping local optima to converging to a global solution as the artificial noise decreases. A later work highlighted a phenomenon called "super-convergence", in which models trained with a two-phase cyclical learning rate may achieve an improved regularization balance and generalization [2]. Furthermore, recent investigations have identified a two-phase learning regime for full-batch GD, characterized by distinct behaviors [3].

Figure 1 (caption): Phase transition in PINNs. The test error between the prediction and the exact solution converges faster after total diffusion (dashed lines), which occurs with an abrupt phase transition defined by homogeneous residuals. Although convergence begins at the onset of the diffusion phase, optimal training performance is reached when the gradients of different batches become equivalent, indicating general agreement on the direction of the optimizer steps (total diffusion).
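
One way to quantify the batch-gradient agreement that marks total diffusion is a layer-wise gradient signal-to-noise ratio, the norm of the mean per-batch gradient divided by the norm of its standard deviation; the sketch below computes this diagnostic, with `loss_fn` standing in for whatever training loss is used (an illustrative placeholder, not code from the paper).

```python
import torch

def gradient_snr(model, loss_fn, batches):
    """Per-parameter gradient SNR across mini-batches: ||mean g|| / ||std g||.
    High SNR suggests the fitting phase (batches agree on the step direction);
    low SNR suggests the diffusion phase. `loss_fn(model, batch)` is a placeholder."""
    per_batch = []
    for batch in batches:
        model.zero_grad()
        loss_fn(model, batch).backward()
        per_batch.append([p.grad.detach().clone() if p.grad is not None
                          else torch.zeros_like(p) for p in model.parameters()])
    snr = {}
    for i, (name, _) in enumerate(model.named_parameters()):
        g = torch.stack([grads[i] for grads in per_batch])  # (n_batches, *param_shape)
        snr[name] = (g.mean(0).norm() / (g.std(0).norm() + 1e-12)).item()
    return snr
```

Tracking this quantity during training makes the abrupt transition described above visible as a drop from high to low SNR, after which the test error converges fastest.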


Residual-based attention and connection to information bottleneck theory in PINNs

arXiv.org Artificial Intelligence

Driven by the need for more efficient and seamless integration of physical models and data, physics-informed neural networks (PINNs) have seen a surge of interest in recent years. However, ensuring the reliability of their convergence and accuracy remains a challenge. In this work, we propose an efficient, gradient-less weighting scheme for PINNs that accelerates the convergence of dynamic or static systems. This simple yet effective attention mechanism is a function of the evolving cumulative residuals and aims to make the optimizer aware of problematic regions at no extra computational cost and without adversarial learning. We illustrate that this general method consistently achieves a relative $L^{2}$ error of the order of $10^{-5}$ using standard optimizers on typical benchmark cases from the literature. Furthermore, by investigating the evolution of the weights during training, we identify two distinct learning phases reminiscent of the fitting and diffusion phases proposed by information bottleneck (IB) theory. Subsequent gradient analysis supports this hypothesis by aligning the transition from a high to a low signal-to-noise ratio (SNR) with the transition from the fitting to the diffusion regime of the adopted weights. This novel correlation between PINNs and IB theory could open future possibilities for understanding the underlying mechanisms behind the training and stability of PINNs and, more broadly, of neural operators.
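
A minimal sketch of the residual-based weighting described above is given below: the pointwise weights accumulate the normalized residual magnitudes with exponential decay, and they multiply the residuals in the loss without entering the computational graph (hence gradient-less). The decay factor and update rate shown are illustrative values, not the paper's settings.

```python
import torch

def update_rba_weights(weights, residuals, gamma=0.999, eta=0.01):
    """Residual-based attention sketch: weights grow where the PDE residual
    stays large, steering the optimizer toward problematic regions."""
    r = residuals.detach().abs()                  # detached: no extra gradients
    return gamma * weights + eta * r / (r.max() + 1e-12)

# Usage inside a training step (sketch):
#   weights  = torch.ones(n_collocation)           # initialized once
#   residual = pde_residual(x_f).squeeze()         # pointwise PDE residuals
#   weights  = update_rba_weights(weights, residual)
#   loss     = (weights * residual.pow(2)).mean()  # weighted residual loss
```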