Goto

Collaborating Authors

 feed forward neural network


Combining physics-based and data-driven models: advancing the frontiers of research with Scientific Machine Learning

arXiv.org Artificial Intelligence

Scientific Machine Learning (SciML) is a recently emerged research field which combines physics-based and data-driven models for the numerical approximation of differential problems. Physics-based models rely on the physical understanding of the problem at hand, subsequent mathematical formulation, and numerical approximation. Data-driven models instead aim to extract relations between input and output data without arguing any causality principle underlining the available data distribution. In recent years, data-driven models have been rapidly developed and popularized. Such a diffusion has been triggered by a huge availability of data (the so-called big data), an increasingly cheap computing power, and the development of powerful machine learning algorithms. SciML leverages the physical awareness of physics-based models and, at the same time, the efficiency of data-driven algorithms. With SciML, we can inject physics and mathematical knowledge into machine learning algorithms. Yet, we can rely on data-driven algorithms' capability to discover complex and non-linear patterns from data and improve the descriptive capacity of physics-based models. After recalling the mathematical foundations of digital modelling and machine learning algorithms, and presenting the most popular machine learning architectures, we discuss the great potential of a broad variety of SciML strategies in solving complex problems governed by partial differential equations. Finally, we illustrate the successful application of SciML to the simulation of the human cardiac function, a field of significant socio-economic importance that poses numerous challenges on both the mathematical and computational fronts. The corresponding mathematical model is a complex system of non-linear ordinary and partial differential equations describing the electromechanics, valve dynamics, blood circulation, perfusion in the coronary tree, and torso potential. Despite the robustness and accuracy of physics-based models, certain aspects, such as unveiling constitutive laws for cardiac cells and myocardial material properties, as well as devising efficient reduced order models to dominate the extraordinary computational complexity, have been successfully tackled by leveraging data-driven models.


Recurrent Neural Network

#artificialintelligence

Suppose we have a sentence and we have to predict whether the sentence is positive or negative, we can do it with the help of RNN. Time series data, stock fore-casting, Spam Classifier is where RNN is extensively used. Before we deep dive into neural network, lets mull over why do we need to deal with the sequential neural network? What we can obtain using neural network with sequences? Lets take an example of Spam classifier, we pass an input text data to the model.


An FNet based Auto Encoder for Long Sequence News Story Generation

arXiv.org Artificial Intelligence

In this paper, we design an auto encoder based off of Google's FNet Architecture in order to generate text from a subset of news stories contained in Google's C4 dataset. We discuss previous attempts and methods to generate text from autoencoders and non LLM Models. FNET poses multiple advantages to BERT based encoders in the realm of efficiency which train 80% faster on GPUs and 70% faster on TPUs. We then compare outputs of how this autencoder perfroms on different epochs. Finally, we analyze what outputs the encoder produces with different seed text.


On the High Symmetry of Neural Network Functions

arXiv.org Artificial Intelligence

Training neural networks means solving a high-dimensional optimization problem. Normally the goal is to minimize a loss function that depends on what is called the network function, or in other words the function that gives the network output given a certain input. This function depends on a large number of parameters, also known as weights, that depends on the network architecture. In general the goal of this optimization problem is to find the global minimum of the network function. In this paper it is discussed how due to how neural networks are designed, the neural network function present a very large symmetry in the parameter space. This work shows how the neural network function has a number of equivalent minima, in other words minima that give the same value for the loss function and the same exact output, that grows factorially with the number of neurons in each layer for feed forward neural network or with the number of filters in a convolutional neural networks. When the number of neurons and layers is large, the number of equivalent minima grows extremely fast. This will have of course consequences for the study of how neural networks converges to minima during training. This results is known, but in this paper for the first time a proper mathematical discussion is presented and an estimate of the number of equivalent minima is derived.


A new family of Constitutive Artificial Neural Networks towards automated model discovery

arXiv.org Artificial Intelligence

For more than 100 years, chemical, physical, and material scientists have proposed competing constitutive models to best characterize the behavior of natural and man-made materials in response to mechanical loading. Now, computer science offers a universal solution: Neural Networks. Neural Networks are powerful function approximators that can learn constitutive relations from large data without any knowledge of the underlying physics. However, classical Neural Networks ignore a century of research in constitutive modeling, violate thermodynamic considerations, and fail to predict the behavior outside the training regime. Here we design a new family of Constitutive Artificial Neural Networks that inherently satisfy common kinematic, thermodynamic, and physic constraints and, at the same time, constrain the design space of admissible functions to create robust approximators, even in the presence of sparse data. We revisit the non-linear field theories of mechanics and reverse-engineer the network input to account for material objectivity, symmetry, and incompressibility; the network output to enforce thermodynamic consistency; the activation functions to implement physically reasonable restrictions; and the network architecture to ensure polyconvexity. We demonstrate that this new class of models is a generalization of the classical neo Hooke, Blatz Ko, Mooney Rivlin, Yeoh, and Demiray models and that the network weights have a clear physical interpretation. When trained with classical benchmark data for rubber, our network autonomously selects the best constitutive model and learns its parameters. Our findings suggests that Constitutive Artificial Neural Networks have the potential to induce a paradigm shift in constitutive modeling, from user-defined model selection to automated model discovery. Our source code, data, and examples are available at https://github.com/LivingMatterLab/CANN.


Building Feedforward Neural Networks from Scratch

#artificialintelligence

Originally published on Towards AI the World's Leading AI and Technology News and Media Company. If you are building an AI-related product or service, we invite you to consider becoming an AI sponsor. At Towards AI, we help scale AI and technology startups. Let us help you unleash your technology to the masses. This article will give you a general idea of what Feed Forward Neural Networks (FFNNs) are.


Deep Learning Interviews: Hundreds of fully solved job interview questions from a wide range of key topics in AI

arXiv.org Artificial Intelligence

The second edition of Deep Learning Interviews is home to hundreds of fully-solved problems, from a wide range of key topics in AI. It is designed to both rehearse interview or exam specific topics and provide machine learning MSc / PhD. students, and those awaiting an interview a well-organized overview of the field. The problems it poses are tough enough to cut your teeth on and to dramatically improve your skills-but they're framed within thought-provoking questions and engaging stories. That is what makes the volume so specifically valuable to students and job seekers: it provides them with the ability to speak confidently and quickly on any relevant topic, to answer technical questions clearly and correctly, and to fully understand the purpose and meaning of interview questions and answers. Those are powerful, indispensable advantages to have when walking into the interview room. The book's contents is a large inventory of numerous topics relevant to DL job interviews and graduate level exams. That places this work at the forefront of the growing trend in science to teach a core set of practical mathematical and computational skills. It is widely accepted that the training of every computer scientist must include the fundamental theorems of ML, and AI appears in the curriculum of nearly every university. This volume is designed as an excellent reference for graduates of such programs.


Feed Forward Neural Network

#artificialintelligence

A Feed Forward Neural Network is commonly seen in its simplest form as a single layer perceptron. In this model, a series of inputs enter the layer and are multiplied by the weights. Each value is then added together to get a sum of the weighted input values. If the sum of the values is above a specific threshold, usually set at zero, the value produced is often 1, whereas if the sum falls below the threshold, the output value is -1. The single layer perceptron is an important model of feed forward neural networks and is often used in classification tasks. Furthermore, single layer perceptrons can incorporate aspects of machine learning.


The Race for Intelligent AI

#artificialintelligence

Many companies including: OpenAI, Google/DeepMind, Microsoft, and countless others, have started the race for "truly" intelligent AI. For the majority of this article I'll be referencing OpenAI's GPT series of machine learning models. However, the question: "what does it mean to be truly intelligent?" OpenAI have modeled this problem as a text transformer model. This model takes sequences of pieces of words (two character pairs) and tries to predict the next set of word parts.


Mathematical Perspective of Machine Learning

arXiv.org Machine Learning

We take a closer look at some theoretical challenges of Machine Learning as a function approximation, gradient descent as the default optimization algorithm, limitations of fixed length and width networks and a different approach to RNNs from a mathematical perspective.