hyperbolic tangent
Latent Assistance Networks: Rediscovering Hyperbolic Tangents in RL
Kooi, Jacob E., Hoogendoorn, Mark, François-Lavet, Vincent
Activation functions are one of the key components of a neural network. The most commonly used activation functions can be grouped into two categories: continuously differentiable functions (e.g. tanh) and linear-unit functions (e.g. ReLU), each with its own strengths and drawbacks with respect to downstream performance and representation capacity through learning (e.g. measured by the number of dead neurons and the effective rank). In reinforcement learning, the performance of continuously differentiable activations often falls short of that of linear-unit functions. From the perspective of the activations in the last hidden layer, this paper provides insights into this sub-optimality and explores how activation functions influence the occurrence of dead neurons and the magnitude of the effective rank. Additionally, a novel neural architecture is proposed that leverages the product of independent activation values. In the Atari domain, we show faster learning, a reduction in dead neurons, and increased effective rank.
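The two diagnostics this abstract relies on, dead neurons and effective rank, can both be computed from a batch of hidden-layer activations. A minimal NumPy sketch (the paper's exact definitions may differ in detail; the effective rank here follows the common entropy-of-singular-values formulation):

```python
import numpy as np

def dead_neuron_fraction(activations, eps=0.0):
    # activations: (batch, units) post-ReLU hidden-layer outputs.
    # A unit counts as "dead" if it never fires on any input in the batch.
    return float(np.mean(np.all(activations <= eps, axis=0)))

def effective_rank(features):
    # Exponential of the entropy of the normalized singular-value
    # distribution: ranges from 1 (rank-1 features) to min(batch, units).
    s = np.linalg.svd(features, compute_uv=False)
    p = s / s.sum()
    p = p[p > 0]
    return float(np.exp(-np.sum(p * np.log(p))))
```

For an identity feature matrix every singular value is equal, so the effective rank equals the full dimension; for highly correlated features it collapses toward 1.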
Physics Informed Piecewise Linear Neural Networks for Process Optimization
Constructing first-principles models is usually a challenging and time-consuming task due to the complexity of real-life processes. On the other hand, data-driven modeling, and in particular neural network models, often suffers from issues such as overfitting and a lack of useful, high-quality data. At the same time, embedding trained machine learning models directly into optimization problems has become an effective and state-of-the-art approach for surrogate optimization, whose performance can be improved by physics-informed training. In this study, it is proposed to upgrade piecewise-linear neural network models with physics-informed knowledge for optimization problems with embedded neural network models. In addition to using the widely accepted and naturally piecewise-linear rectified linear unit (ReLU) activation function, this study also suggests piecewise-linear approximations of the hyperbolic tangent activation function to widen the applicable domain. Optimization is investigated for three case studies: a blending process, an industrial distillation column, and a crude oil column. For all cases, the optimal results from physics-informed trained neural networks are closer to global optimality. Finally, the associated CPU times for the optimization problems are much shorter than with standard optimization.
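A piecewise-linear surrogate for tanh, of the kind the abstract suggests, can be built by linearly interpolating tanh between a handful of breakpoints, saturating outside the breakpoint range. This is only an illustrative sketch: the breakpoints below are arbitrary choices, and the paper's actual construction may differ.

```python
import numpy as np

def pwl_tanh(x, breakpoints=None):
    # Piecewise-linear approximation of tanh: exact at the breakpoints,
    # linear in between, and clamped to the endpoint values outside
    # the breakpoint range (np.interp saturates at the ends).
    if breakpoints is None:
        breakpoints = np.array([-3.0, -1.5, -0.5, 0.0, 0.5, 1.5, 3.0])
    values = np.tanh(breakpoints)
    return np.interp(x, breakpoints, values)
```

With these seven breakpoints the worst-case error on [-3, 3] stays below 0.1; adding breakpoints tightens the approximation at the cost of more linear segments (and more binary variables in a MILP formulation).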
Activation Functions (updated) – The Code-It List
This entry has been updated to TensorFlow v2.10.0 and PyTorch 1.12.1. An activation function is a function applied to a neuron in a neural network to help it learn complex patterns in data, deciding what should be transmitted to the next neuron in the network. A perceptron is a neural network unit that feeds the data to be learned into a neuron and processes it according to its activation function. The perceptron is a simple algorithm that, given an input vector x of m values (x_1, x_2, ..., x_m), outputs a 1 or a 0 (a step function). Its function is defined as f(x) = 1 if ωx + b > 0, and f(x) = 0 otherwise. Here, ω is a vector of weights, ωx is the dot product, and b is the bias. The equation ωx + b = 0 defines a line (a hyperplane in higher dimensions): if x lies on the positive side of this line, the answer is 1; otherwise, it is 0.
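The step-function perceptron described above fits in a few lines (a minimal sketch; the function name and the AND-gate weights are illustrative, not from the original post):

```python
import numpy as np

def perceptron(x, w, b):
    # Step-function perceptron: output 1 if the weighted sum of the
    # inputs plus the bias is positive, otherwise output 0.
    return 1 if np.dot(w, x) + b > 0 else 0
```

For example, weights w = [1, 1] with bias b = -1.5 implement an AND gate: only the input (1, 1) pushes the weighted sum past zero.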
TensorFlow 101: Introduction to Deep Learning - CouponED
However, we have to stop training after a reasonable number of iterations; in this example we set that to ten thousand. Finally, we can make predictions. I would like to predict for the input instances directly, so we need to define a method for that. Predictions are made, and now we can dump these predictions for these inputs alongside their actual values; we also need to define a new variable to increment the index. Our program is ready to run... yes! Our machine learning classifier works with 100% accuracy: expected 0, predicted 0 for 0 XOR 0; the actual value is 1 and the predicted value is also 1 for 0 XOR 1; and likewise for the other cases. So, we have developed an exclusive-OR classifier with TensorFlow, a "hello world" program. I hope you enjoyed and understood it. Have a good day!
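The transcript's TensorFlow code is not reproduced here, but the same exclusive-OR experiment can be sketched framework-free in NumPy: one hidden layer of sigmoid units trained by full-batch gradient descent on squared error for the ten thousand iterations the video mentions. The hyperparameters (8 hidden units, learning rate 1.0, the random seed) are illustrative choices, not taken from the video.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# XOR truth table: inputs and target outputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [1], [1], [0]], dtype=float)

# one hidden layer of 8 sigmoid units, one sigmoid output unit
W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)

lr = 1.0
for _ in range(10_000):          # training capped at ten thousand steps
    h = sigmoid(X @ W1 + b1)     # forward pass
    out = sigmoid(h @ W2 + b2)
    # backpropagation of the squared-error loss
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

preds = (out > 0.5).astype(int).ravel()
print(preds)  # should recover [0 1 1 0] once trained
```

A single linear layer cannot separate XOR, which is exactly why the hidden layer (and its nonlinear activation) is needed here.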
Recurrent Neural Networks -- Part 1
These are the lecture notes for FAU's YouTube Lecture "Deep Learning". This is a full transcript of the lecture video & matching slides. We hope, you enjoy this as much as the videos. Of course, this transcript was created with deep learning techniques largely automatically and only minor manual modifications were performed. If you spot mistakes, please let us know!
Neural Network Verification through Replication
Sanchirico, Mauro J. III, Jiao, Xun, Nataraj, C.
A system identification based approach to neural network model replication is presented and the application of model replication to verification of fundamental, single hidden layer, neural network systems is demonstrated. The presented approach serves as a means to partially address the problem of verifying that a neural network implementation meets a provided specification given only grey-box access to the implemented network. The procedure developed involves stimulating a neural network with a chosen signal, extracting a replicated model from the response, and systematically checking that the replicated model is output-equivalent to a specified model in order to verify that the grey-box system under test is implemented to specification without direct access to its hidden parameters. The replication step is introduced to provide an inherent guarantee that the stimulus signals employed yield sufficient test coverage. This method is investigated as a neural network focused nonlinear counterpart to the traditional verification of circuits through system identification. A strategy for choosing the stimulus is provided and an algorithm for verifying that the resulting response is indicative of a specification-compliant neural network system under test is derived. We find that the method can reliably detect defects in small neural networks or in small sub-circuits within larger neural networks.
Activation Functions for Deep Learning
Activation functions play a major role in the learning process of a neural network. So far, we have used only the sigmoid function as the activation function in our networks, but we saw that the sigmoid function has its shortcomings, since it can lead to the vanishing gradient problem in the earlier layers. In this blog, we will discuss other activation functions; ones that are more efficient to use and more applicable to deep learning applications. There are seven types of activation functions that you can use when building a neural network: the binary step function, the linear (identity) function, our old friend the sigmoid (logistic) function, the hyperbolic tangent (tanh) function, the rectified linear unit (ReLU) function, the leaky ReLU function, and the softmax function.
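The seven activations listed above are essentially one-liners in NumPy (a sketch; softmax is given in its vector form, with the usual max-subtraction for numerical stability):

```python
import numpy as np

def binary_step(x):          # 1 where x >= 0, else 0
    return (x >= 0).astype(float)

def identity(x):             # linear / identity
    return x

def sigmoid(x):              # logistic, squashes to (0, 1)
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):                 # hyperbolic tangent, squashes to (-1, 1)
    return np.tanh(x)

def relu(x):                 # rectified linear unit
    return np.maximum(0.0, x)

def leaky_relu(x, a=0.01):   # small slope a for negative inputs
    return np.where(x > 0, x, a * x)

def softmax(x):              # normalizes a vector into a distribution
    e = np.exp(x - np.max(x))  # subtract max for numerical stability
    return e / e.sum()
```

Note that softmax differs from the others: it acts on a whole vector (typically the output layer) rather than elementwise.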
what_nns_learn.html
Neural networks are famously difficult to interpret. It's hard to know what they are actually learning when we train them. Let's take a closer look and see whether we can build a good picture of what's going on inside. Just like every other supervised machine learning model, neural networks learn relationships between input variables and output variables. In fact, we can even see how they relate to the most iconic model of all: linear regression. Linear regression assumes a straight-line relationship between an input variable x and an output variable y. x is multiplied by a constant m, which also happens to be the slope of the line, and added to another constant b, which happens to be where the line crosses the y axis. We can represent this in a picture. Our input value x is multiplied by m. Our constant b is multiplied by one. And then they are added together to get y.
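That picture is literally a one-neuron network with an identity activation: two inputs (x and a constant 1), two weights (m and b), and a sum. A toy sketch with an illustrative function name:

```python
def linear_neuron(x, m, b):
    # Input x weighted by m, plus a bias input fixed at 1 weighted by b:
    # exactly the straight line y = m * x + b of linear regression.
    return m * x + b * 1.0
```

For m = 3 and b = 1, an input of 2 gives 3 * 2 + 1 * 1 = 7.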
From Hard to Soft: Understanding Deep Network Nonlinearities via Vector Quantization and Statistical Inference
Balestriero, Randall, Baraniuk, Richard G.
Nonlinearity is crucial to the performance of a deep (neural) network (DN). To date there has been little progress understanding the menagerie of available nonlinearities, but recently progress has been made on understanding the role played by piecewise affine and convex nonlinearities like the ReLU and absolute value activation functions and max-pooling. In particular, DN layers constructed from these operations can be interpreted as max-affine spline operators (MASOs) that have an elegant link to vector quantization (VQ) and K-means. While this is good theoretical progress, the entire MASO approach is predicated on the requirement that the nonlinearities be piecewise affine and convex, which precludes important activation functions like the sigmoid, hyperbolic tangent, and softmax. This paper extends the MASO framework to these and an infinitely large class of new nonlinearities by linking deterministic MASOs with probabilistic Gaussian Mixture Models (GMMs). We show that, under a GMM, piecewise affine, convex nonlinearities like ReLU, absolute value, and max-pooling can be interpreted as solutions to certain natural "hard" VQ inference problems, while sigmoid, hyperbolic tangent, and softmax can be interpreted as solutions to corresponding "soft" VQ inference problems. We further extend the framework by hybridizing the hard and soft VQ optimizations to create a β-VQ inference that interpolates between hard, soft, and linear VQ inference. A prime example of a β-VQ DN nonlinearity is the swish nonlinearity, which offers state-of-the-art performance in a range of computer vision tasks but was developed ad hoc by experimentation. Finally, we validate with experiments an important assertion of our theory, namely that DN performance can be significantly improved by enforcing orthogonality in its linear filters.
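The swish nonlinearity the abstract highlights is simply x times a sigmoid with temperature β, i.e. x · σ(βx) = x / (1 + e^(-βx)); as β grows it approaches ReLU, and β = 0 gives the linear map x/2. A minimal sketch:

```python
import numpy as np

def swish(x, beta=1.0):
    # x * sigmoid(beta * x), written in the algebraically equivalent
    # form x / (1 + exp(-beta * x)); beta -> infinity approaches ReLU,
    # beta = 0 reduces to the linear function x / 2.
    return x / (1.0 + np.exp(-beta * x))
```

This interpolation in β is exactly the hard/soft/linear spectrum the abstract's β-VQ inference formalizes.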
Perceptron Neural Designer
One of the hottest topics in artificial intelligence is neural networks. Neural networks are computational models based on the structure of the brain. They are information-processing structures whose most significant property is their ability to learn from data. These techniques have achieved great success in domains ranging from marketing to engineering. There are many different types of neural networks, of which the multilayer perceptron is the most important one.