AITopics | hyperbolic tangent activation function

Collaborating Authors

hyperbolic tangent activation function

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Partition of Unity Physics-Informed Neural Networks (POU-PINNs): An Unsupervised Framework for Physics-Informed Domain Decomposition and Mixtures of Experts

Rodriguez, Arturo, Chattopadhyay, Ashesh, Kumar, Piyush, Rodriguez, Luis F., Kumar, Vinod

arXiv.org Artificial IntelligenceDec-7-2024

Physics-informed neural networks (PINNs) commonly address ill-posed inverse problems by uncovering unknown physics. This study presents a novel unsupervised learning framework that identifies spatial subdomains with specific governing physics. It uses the partition of unity networks (POUs) to divide the space into subdomains, assigning unique nonlinear model parameters to each, which are integrated into the physics model. A vital feature of this method is a physics residual-based loss function that detects variations in physical properties without requiring labeled data. This approach enables the discovery of spatial decompositions and nonlinear parameters in partial differential equations (PDEs), optimizing the solution space by dividing it into subdomains and improving accuracy. Its effectiveness is demonstrated through applications in porous media thermal ablation and ice-sheet modeling, showcasing its potential for tackling real-world physics challenges.

artificial intelligence, machine learning, partition, (14 more...)

arXiv.org Artificial Intelligence

2412.06842

Country: North America > United States (1.00)

Genre: Research Report (0.64)

Industry:

Energy > Oil & Gas > Upstream (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Elliot Activation Function: What Is It and Is It Effective?

#artificialintelligenceFeb-5-2023, 09:10:46 GMT

Are you in the middle of creating a new machine-learning model and unsure of what activation function you should be using? But wait, what is an activation function? Activation functions allow machine learning models to understand and solve nonlinear problems. Using an activation function in neural networks specifically helps with the passing of the most important information from each neuron to the next. Today, the ReLU Activation Function is generally used in the architecture of Neural Networks, however, that does not necessarily mean it is always the best choice.

activation function, elliot activation function, relu activation function, (11 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Add feedback

Multidimensional analysis using sensor arrays with deep learning for high-precision and high-accuracy diagnosis

Payette, Julie, Cloutier, Sylvain G., Vaussenat, Fabrice

arXiv.org Artificial IntelligenceDec-6-2022

In the upcoming years, artificial intelligence (AI) is going to transform the practice of medicine in most of its specialties. Deep learning can help achieve better and earlier problem detection, while reducing errors on diagnosis. By feeding a deep neural network (DNN) with the data from a low-cost and low-accuracy sensor array, we demonstrate that it becomes possible to significantly improve the measurements' precision and accuracy. The data collection is done with an array composed of 32 temperature sensors, including 16 analog and 16 digital sensors. All sensors have accuracies between 0.5-2.0$^\circ$C. 800 vectors are extracted, covering a range from to 30 to 45$^\circ$C. In order to improve the temperature readings, we use machine learning to perform a linear regression analysis through a DNN. In an attempt to minimize the model's complexity in order to eventually run inferences locally, the network with the best results involves only three layers using the hyperbolic tangent activation function and the Adam Stochastic Gradient Descent (SGD) optimizer. The model is trained with a randomly-selected dataset using 640 vectors (80% of the data) and tested with 160 vectors (20%). Using the mean squared error as a loss function between the data and the model's prediction, we achieve a loss of only 1.47x10$^{-4}$ on the training set and 1.22x10$^{-4}$ on the test set. As such, we believe this appealing approach offers a new pathway towards significantly better datasets using readily-available ultra low-cost sensors.

artificial intelligence, machine learning, sensor, (17 more...)

arXiv.org Artificial Intelligence

doi: 10.1038/s41598-023-38290-8

2211.17139

Country:

North America > Canada > Quebec > Montreal (0.04)
North America > United States (0.04)
Africa > Uganda (0.04)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Applications of Deep Neural Networks

Heaton, Jeff

arXiv.org Artificial IntelligenceSep-11-2020

Deep learning is a group of exciting new technologies for neural networks. Through a combination of advanced training techniques and neural network architectural components, it is now possible to create neural networks that can handle tabular data, images, text, and audio as both input and output. Deep learning allows a neural network to learn hierarchies of information in a way that is like the function of the human brain. This course will introduce the student to classic neural network structures, Convolution Neural Networks (CNN), Long Short-Term Memory (LSTM), Gated Recurrent Neural Networks (GRU), General Adversarial Networks (GAN), and reinforcement learning. Application of these architectures to computer vision, time series, security, natural language processing (NLP), and data generation will be covered. High-Performance Computing (HPC) aspects will demonstrate how deep learning can be leveraged both on graphical processing units (GPUs), as well as grids. Focus is primarily upon the application of deep learning to problems, with some introduction to mathematical foundations. Readers will use the Python programming language to implement deep learning using Google TensorFlow and Keras. It is not necessary to know Python prior to this book; however, familiarity with at least one programming language is assumed.

computer game, deep learning, hyperbolic tangent activation function, (26 more...)

arXiv.org Artificial Intelligence

2009.05673

Country: North America > United States > Pennsylvania (0.27)

Genre:

Workflow (1.00)
Research Report (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Transportation > Passenger (1.00)
Transportation > Ground > Road (1.00)
Media > Film (1.00)
(12 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Consistent feature selection for neural networks via Adaptive Group Lasso

Ho, Lam Si Tung, Dinh, Vu

arXiv.org Machine LearningJun-10-2020

One main obstacle for the wide use of deep learning in medical and engineering sciences is its interpretability. While neural network models are strong tools for making predictions, they often provide little information about which features play significant roles in influencing the prediction accuracy. To overcome this issue, many regularization procedures for learning with neural networks have been proposed for dropping non-significant features. Unfortunately, the lack of theoretical results casts doubt on the applicability of such pipelines. In this work, we propose and establish a theoretical guarantee for the use of the adaptive group lasso for selecting important features of neural networks. Specifically, we show that our feature selection method is consistent for single-output feed-forward neural networks with one hidden layer and hyperbolic tangent activation function. We demonstrate its applicability using both simulation and data analysis.

artificial intelligence, group lasso, machine learning, (16 more...)

arXiv.org Machine Learning

2006.00334

Country:

North America > United States > Delaware > New Castle County > Newark (0.14)
North America > Canada > Nova Scotia > Halifax Regional Municipality > Halifax (0.04)
Atlantic Ocean > Black Sea (0.04)
Asia > Middle East > Republic of Türkiye > Nevsehir Province > Nevsehir (0.04)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

How to Fix Vanishing Gradients Using the Rectified Linear Activation Function

#artificialintelligenceJan-12-2019, 05:45:58 GMT

The vanishing gradients problem is one example of unstable behavior that you may encounter when training a deep neural network. It describes the situation where a deep multilayer feed-forward network or a recurrent neural network is unable to propagate useful gradient information from the output end of the model back to the layers near the input end of the model. The result is the general inability of models with many layers to learn on a given dataset or to prematurely converge to a poor solution. Many fixes and workarounds have been proposed and investigated, such as alternate weight initialization schemes, unsupervised pre-training, layer-wise training, and variations on gradient descent. Perhaps the most common change is the use of the rectified linear activation function that has become the new default, instead of the hyperbolic tangent activation function that was the default through the late 1990s and 2000s. In this tutorial, you will discover how to diagnose a vanishing gradient problem when training a neural network model and how to fix it using an alternate activation function and weight initialization scheme.

artificial intelligence, deep learning, machine learning, (16 more...)

#artificialintelligence

Genre: Instructional Material > Course Syllabus & Notes (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)

Add feedback