Azizpour, Hossein
Hessian-Informed Flow Matching
Sprague, Christopher Iliffe, Elofsson, Arne, Azizpour, Hossein
Modeling complex systems that evolve toward equilibrium distributions is important in various physical applications, including molecular dynamics and robotic control. These systems often follow stochastic gradient descent on an underlying energy function, converging to stationary distributions around energy minima. The local covariance of these distributions is shaped by the energy landscape's curvature, often resulting in anisotropic characteristics. While flow-based generative models have gained traction in generating samples from equilibrium distributions in such applications, they predominantly employ isotropic conditional probability paths, limiting their ability to capture such covariance structures. In this paper, we introduce Hessian-Informed Flow Matching (HI-FM), a novel approach that integrates the Hessian of an energy function into conditional flows within the flow matching framework. This integration allows HI-FM to account for local curvature and anisotropic covariance structures. Our approach leverages the linearization theorem from dynamical systems and incorporates additional considerations such as time transformations and equivariance. Empirical evaluations on the MNIST and Lennard-Jones particles datasets demonstrate that HI-FM improves the likelihood of test samples.
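To make the idea concrete, here is a minimal, hypothetical sketch (not the authors' code) of how an energy Hessian can shape a conditional Gaussian path: by the linearization argument, the stationary covariance of gradient-descent dynamics near a minimum scales with the inverse Hessian, so the path's noise can be made anisotropic rather than isotropic. The toy energy, the schedule sigma_t, and the sampler below are all illustrative assumptions.

    # Hypothetical sketch of the core idea (not the authors' code): near an
    # energy minimum x*, the stationary covariance of a gradient-descent
    # diffusion is shaped by the inverse Hessian, Sigma ~ H(x*)^{-1}, so the
    # conditional probability path can be made anisotropic.
    import numpy as np

    def energy(x):                  # toy anisotropic quadratic energy landscape
        return 0.5 * x[0]**2 + 5.0 * x[1]**2

    def hessian(f, x, eps=1e-4):    # finite-difference Hessian (illustrative)
        d = len(x)
        H = np.zeros((d, d))
        for i in range(d):
            for j in range(d):
                e_i, e_j = np.eye(d)[i], np.eye(d)[j]
                H[i, j] = (f(x + eps*e_i + eps*e_j) - f(x + eps*e_i - eps*e_j)
                           - f(x - eps*e_i + eps*e_j) + f(x - eps*e_i - eps*e_j)) / (4 * eps**2)
        return H

    x_star = np.zeros(2)                        # a data point at an energy minimum
    H = hessian(energy, x_star)
    L = np.linalg.cholesky(np.linalg.inv(H))    # anisotropic scale, Sigma = H^{-1}

    def conditional_sample(t, x1, rng):
        # Gaussian path toward x1 whose noise is H^{-1}-shaped, in contrast to
        # the isotropic sigma_t * I noise of standard flow matching.
        sigma_t = 1.0 - 0.99 * t                # hypothetical schedule
        return t * x1 + sigma_t * (L @ rng.standard_normal(2))

    rng = np.random.default_rng(0)
    print(conditional_sample(0.5, x_star, rng))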
Opportunities for machine learning in scientific discovery
Vinuesa, Ricardo, Rabault, Jean, Azizpour, Hossein, Bauer, Stefan, Brunton, Bingni W., Elofsson, Arne, Jarlebring, Elias, Kjellstrom, Hedvig, Markidis, Stefano, Marlevi, David, Cinnella, Paola, Brunton, Steven L.
Technological advancements have substantially increased computational power and data availability, enabling the application of powerful machine-learning (ML) techniques across various fields. However, our ability to leverage ML methods for scientific discovery, i.e., to obtain fundamental and formalized knowledge about natural processes, is still in its infancy. In this review, we explore how the scientific community can increasingly leverage ML techniques to achieve scientific discoveries. We observe that the applicability and opportunity of ML depend strongly on the nature of the problem domain, and on whether we have full (e.g., turbulence), partial (e.g., computational biochemistry), or no (e.g., neuroscience) a priori knowledge about the governing equations and physical properties of the system. Although challenges remain, principled use of ML is opening up new avenues for fundamental scientific discoveries. Throughout these diverse fields, there is a common theme: ML is enabling researchers to embrace complexity in observational data that was previously intractable to classical analysis and numerical investigation.
Indirectly Parameterized Concrete Autoencoders
Nilsson, Alfred, Wijk, Klas, Gutha, Sai Bharath Chandra, Englesson, Erik, Hotti, Alexandra, Saccardi, Carlo, Kviman, Oskar, Lagergren, Jens, Vinuesa, Ricardo, Azizpour, Hossein
Feature selection is a crucial task in settings where data is high-dimensional or acquiring the full set of features is costly. Recent developments in neural network-based embedded feature selection show promising results across a wide range of applications. Concrete Autoencoders (CAEs), considered state-of-the-art in embedded feature selection, may struggle to achieve stable joint optimization, hurting their training time and generalization. In this work, we identify that this instability is correlated with the CAE learning duplicate selections. To remedy this, we propose a simple and effective improvement: Indirectly Parameterized CAEs (IP-CAEs). IP-CAEs learn an embedding and a mapping from it to the Gumbel-Softmax distributions' parameters. Despite being simple to implement, IP-CAE exhibits significant and consistent improvements over CAE in both generalization and training time across several datasets for reconstruction and classification. Unlike CAE, IP-CAE effectively leverages non-linear relationships and does not require retraining the jointly optimized decoder. Furthermore, our approach is, in principle, generalizable to Gumbel-Softmax distributions beyond feature selection.
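As an illustration of the indirect parameterization, here is a minimal sketch based on the abstract (not the released implementation; the embedding size and the linear mapping are assumptions): the selection logits are produced from a learned embedding rather than learned directly.

    # Minimal sketch (my reading of the abstract, not the authors' code): a
    # Concrete selection layer parameterized *indirectly* -- logits come from
    # a learned embedding through a mapping, not from free parameters.
    import numpy as np

    rng = np.random.default_rng(0)
    d, k, m = 20, 5, 8                 # input features, selected features, embedding dim

    E = rng.standard_normal((k, m))    # learned embedding (trainable in practice)
    W = rng.standard_normal((m, d))    # learned mapping to logits (hypothetical linear case)

    def gumbel_softmax(logits, tau=0.5):
        g = -np.log(-np.log(rng.uniform(size=logits.shape)))   # Gumbel(0,1) noise
        y = (logits + g) / tau
        y = np.exp(y - y.max(axis=-1, keepdims=True))
        return y / y.sum(axis=-1, keepdims=True)

    logits = E @ W                     # indirect parameterization: (k, d) logits
    S = gumbel_softmax(logits)         # relaxed one-hot selection matrix
    x = rng.standard_normal(d)
    selected = S @ x                   # k soft-selected features, fed to the decoder
    print(selected.shape)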
Stable Autonomous Flow Matching
Sprague, Christopher Iliffe, Elofsson, Arne, Azizpour, Hossein
In contexts where data samples represent a physically stable state, it is often assumed that the data points represent the local minima of an energy landscape. In control theory, it is well-known that energy can serve as an effective Lyapunov function. Despite this, connections between control theory and generative models in the literature are sparse, even though there are several machine learning applications with physically stable data points. In this paper, we focus on such data and a recent class of deep generative models called flow matching. We apply tools of stochastic stability for time-independent systems to flow matching models. In doing so, we characterize the space of flow matching models that are amenable to this treatment, as well as draw connections to other control theory principles. We demonstrate our theoretical results on two examples.
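For intuition, the following toy check (illustrative only, not the paper's construction) shows the textbook Lyapunov condition for a time-independent flow that the abstract alludes to: an energy V certifies stability when its derivative along trajectories, grad V . f(x), is negative away from equilibrium.

    # Illustrative only: numerically checking the Lyapunov condition for an
    # autonomous (time-independent) vector field f, with energy V as the
    # candidate Lyapunov function.
    import numpy as np

    def V(x):                     # candidate energy / Lyapunov function
        return 0.5 * np.dot(x, x)

    def f(x):                     # gradient flow of V, so V should decrease
        return -x

    def lyapunov_derivative(x, eps=1e-6):
        grad = np.array([(V(x + eps*np.eye(len(x))[i]) - V(x - eps*np.eye(len(x))[i])) / (2*eps)
                         for i in range(len(x))])
        return grad @ f(x)        # dV/dt along trajectories

    for x in np.random.default_rng(0).standard_normal((5, 2)):
        assert lyapunov_derivative(x) < 0   # energy decreases, equilibrium is stable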
On the Lipschitz Constant of Deep Networks and Double Descent
Gamba, Matteo, Azizpour, Hossein, Björkman, Mårten
A longstanding question towards understanding the remarkable generalization ability of deep networks is characterizing the hypothesis class of models trained in practice, thus isolating properties of the networks' model function that capture generalization (Hanin & Rolnick, 2019; Neyshabur et al., 2015). A central problem is understanding the role played by overparameterization (Arora et al., 2018; Neyshabur et al., 2018; Zhang et al., 2018) - a key design choice of state-of-the-art models - in promoting regularization of the model function. Modern overparameterized networks can achieve good generalization while perfectly interpolating the training set (Nakkiran et al., 2019). This phenomenon is described by the double descent curve of the test error (Belkin et al., 2019; Geiger et al., 2019): as model size increases, the error follows the classical bias-variance trade-off curve (Geman et al., 1992), peaks when a model is just large enough to interpolate the training data, and then decreases again as model size grows further.
Logistic-Normal Likelihoods for Heteroscedastic Label Noise
Englesson, Erik, Mehrpanah, Amir, Azizpour, Hossein
A natural way of estimating heteroscedastic label noise in regression is to model the observed (potentially noisy) target as a sample from a normal distribution, whose parameters can be learned by minimizing the negative log-likelihood. This formulation has desirable loss attenuation properties, as it reduces the contribution of high-error examples. Intuitively, this behavior can improve robustness against label noise by reducing overfitting. We propose an extension of this simple and probabilistic approach to classification that has the same desirable loss attenuation properties. Furthermore, we discuss and address some practical challenges of this extension. We evaluate the effectiveness of the method by measuring its robustness against label noise in classification. We perform enlightening experiments exploring the inner workings of the method, including sensitivity to hyperparameters, ablation studies, and other insightful analyses.
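For reference, the regression formulation the abstract starts from can be sketched as follows (the classification extension itself is the paper's contribution and is not reproduced here); the numbers are purely illustrative.

    # Sketch of the heteroscedastic Gaussian NLL described above: a learned
    # variance attenuates the loss contribution of high-error examples.
    import numpy as np

    def gaussian_nll(y, mu, log_var):
        # 0.5 * [ (y - mu)^2 / sigma^2 + log sigma^2 ]  (additive constant dropped)
        return 0.5 * (np.exp(-log_var) * (y - mu) ** 2 + log_var)

    # A large residual can be "explained away" by predicting a large variance,
    # which is the loss-attenuation behavior that helps against label noise.
    print(gaussian_nll(y=5.0, mu=0.0, log_var=0.0))   # high penalty
    print(gaussian_nll(y=5.0, mu=0.0, log_var=3.0))   # attenuated penalty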
Hyperplane Arrangements of Trained ConvNets Are Biased
Gamba, Matteo, Carlsson, Stefan, Azizpour, Hossein, Björkman, Mårten
In recent years, understanding and interpreting the inner workings of deep networks has drawn considerable attention from the community [7, 15, 16, 13]. One long-standing question is the problem of identifying the inductive bias of state-of-the-art networks and the form of implicit regularization that is performed by the optimizer [22, 31, 2] and possibly by natural data itself [3]. While earlier studies focused on the theoretical expressivity of deep networks and the advantage of deeper representations [20, 25, 26], a recent trend in the literature is the study of the effective capacity of trained networks [31, 32, 9, 10]. In fact, while state-of-the-art deep networks are largely overparametrized, it is hypothesized that the full theoretical capacity of a model might not be realized in practice, due to some form of self-regulation at play during learning. Some recent works have, thus, tried to find statistical bias consistently present in trained state-of-the-art models that is interpretable and correlates well with generalization [14, 24]. In this work, we take a geometrical perspective and look for statistical bias in the weights of trained convolutional networks, in terms of hyperplane arrangements induced by convolutional layers with ReLU activations.
Deep Double Descent via Smooth Interpolation
Gamba, Matteo, Englesson, Erik, Björkman, Mårten, Azizpour, Hossein
The ability of overparameterized deep networks to interpolate noisy data, while at the same time showing good generalization performance, has been recently characterized in terms of the double descent curve for the test error. Common intuition from polynomial regression suggests that overparameterized networks are able to sharply interpolate noisy data, without considerably deviating from the ground-truth signal, thus preserving generalization ability. At present, a precise characterization of the relationship between interpolation and generalization for deep networks is missing. In this work, we quantify sharpness of fit of the training data interpolated by neural network functions, by studying the loss landscape w.r.t. the input variable locally around each training point, over volumes around cleanly- and noisily-labelled training samples, as we systematically increase the number of model parameters and training epochs. Our findings show that loss sharpness in the input space follows both model- and epoch-wise double descent, with worse peaks observed around noisy labels. While small interpolating models sharply fit both clean and noisy data, large interpolating models express a smooth loss landscape, where noisy targets are predicted over large volumes around training data points, in contrast to existing intuition.
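A hedged sketch of such an input-space sharpness probe (my illustration with a toy model; the paper's exact estimator may differ): sample perturbations in a ball around a training point and average the resulting loss deviation.

    # Illustrative input-space sharpness probe around a single training point.
    import numpy as np

    def loss(model, x, y):
        p = model(x)
        return -np.log(p[y] + 1e-12)               # cross-entropy of a toy model

    def sharpness(model, x, y, radius=0.1, n=100, rng=None):
        rng = rng or np.random.default_rng(0)
        base = loss(model, x, y)
        deltas = []
        for _ in range(n):
            u = rng.standard_normal(x.shape)
            u *= radius / np.linalg.norm(u)        # perturbation on a ball of given radius
            deltas.append(abs(loss(model, x + u, y) - base))
        return np.mean(deltas)                     # average loss deviation in the volume

    def toy_model(x):                              # softmax over a fixed nonlinearity
        z = np.tanh(x[:3])
        e = np.exp(z - z.max())
        return e / e.sum()

    x, y = np.random.default_rng(1).standard_normal(8), 0
    print(sharpness(toy_model, x, y))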
Predicting the wall-shear stress and wall pressure through convolutional neural networks
Balasubramanian, Arivazhagan G., Guastoni, Luca, Schlatter, Philipp, Azizpour, Hossein, Vinuesa, Ricardo
The objective of this study is to assess the capability of convolution-based neural networks to predict wall quantities in a turbulent open channel flow. The first tests are performed by training a fully-convolutional network (FCN) to predict the 2D velocity-fluctuation fields at the inner-scaled wall-normal location $y^{+}_{\rm target}$, using the sampled velocity fluctuations in wall-parallel planes located farther from the wall, at $y^{+}_{\rm input}$. The predictions from the FCN are compared against the predictions from a proposed R-Net architecture. Since the R-Net model is found to perform better than the FCN model, the former architecture is optimized to predict the 2D streamwise and spanwise wall-shear-stress components and the wall pressure from the sampled velocity-fluctuation fields farther from the wall. The dataset is obtained from DNS of open channel flow at $Re_{\tau} = 180$ and $550$. The turbulent velocity-fluctuation fields are sampled at various inner-scaled wall-normal locations, along with the wall-shear stress and the wall pressure. At $Re_{\tau}=550$, both FCN and R-Net can take advantage of the self-similarity in the logarithmic region of the flow and predict the velocity-fluctuation fields at $y^{+} = 50$ using the velocity-fluctuation fields at $y^{+} = 100$ as input, with about 10% error in the prediction of the streamwise-fluctuation intensity. Furthermore, the R-Net is able to predict the wall-shear-stress and wall-pressure fields using the velocity-fluctuation fields at $y^+ = 50$, with around 10% error in the intensity of the corresponding fluctuations at both $Re_{\tau} = 180$ and $550$. These results are an encouraging starting point to develop neural-network-based approaches for modelling turbulence near the wall in large-eddy simulations.
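For concreteness, a minimal fully-convolutional mapping of the kind described (an illustrative stand-in, not the paper's FCN or R-Net; the channel counts, depth, and width are assumptions) takes wall-parallel fluctuation planes as input and returns the two wall-shear-stress components plus the wall pressure on the same grid.

    # Illustrative fully-convolutional network: 3 velocity-fluctuation
    # channels in, 3 wall fields out, resolution preserved.
    import torch
    import torch.nn as nn

    class TinyFCN(nn.Module):
        def __init__(self, in_ch=3, out_ch=3, width=32):
            super().__init__()
            self.net = nn.Sequential(
                nn.Conv2d(in_ch, width, 3, padding=1), nn.ReLU(),
                nn.Conv2d(width, width, 3, padding=1), nn.ReLU(),
                nn.Conv2d(width, out_ch, 3, padding=1),   # tau_x, tau_z, wall pressure
            )

        def forward(self, u):          # u: (batch, 3, Nx, Nz) fluctuations at y+_input
            return self.net(u)         # (batch, 3, Nx, Nz) wall fields

    x = torch.randn(2, 3, 64, 64)
    print(TinyFCN()(x).shape)          # torch.Size([2, 3, 64, 64])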
Consistency Regularization Can Improve Robustness to Label Noise
Englesson, Erik, Azizpour, Hossein
Consistency regularization is a commonly-used technique for semi-supervised and self-supervised learning. It is an auxiliary objective function that encourages the prediction of the network to be similar in the vicinity of the observed training samples. Hendrycks et al. (2020) have recently shown that such regularization naturally brings test-time robustness to corrupted data and helps with calibration. This paper empirically studies the relevance of consistency regularization for training-time robustness to noisy labels. First, we make two interesting and useful observations regarding the consistency of networks trained with the standard cross-entropy loss on noisy datasets: (i) networks trained on noisy data have lower consistency than those trained on clean data, and (ii) consistency reduces more significantly around noisy-labelled training data points than around correctly-labelled ones. Then, we show that a simple loss function that encourages consistency improves the robustness of the models to label noise on both synthetic (CIFAR-10, CIFAR-100) and real-world (WebVision) noise, across different noise rates and types, and achieves state-of-the-art results.
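A minimal sketch in the spirit of such a consistency objective (illustrative; the Gaussian perturbation and KL divergence below are assumptions, not necessarily the paper's exact loss): penalize divergence between the model's predictive distributions at a sample and at a perturbed copy, added to the usual cross-entropy term.

    # Illustrative consistency objective alongside cross-entropy.
    import torch
    import torch.nn.functional as F

    def consistency_loss(model, x, noise_std=0.1):
        p = F.log_softmax(model(x), dim=-1)                 # prediction at x
        x_aug = x + noise_std * torch.randn_like(x)         # a nearby "view" of x
        q = F.log_softmax(model(x_aug), dim=-1)
        return F.kl_div(q, p, log_target=True, reduction="batchmean")

    model = torch.nn.Linear(10, 3)
    x = torch.randn(4, 10)
    total = F.cross_entropy(model(x), torch.tensor([0, 1, 2, 0])) + consistency_loss(model, x)
    print(total.item())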