Machine Learning Basics Everyone Should Know - InformationWeek

#artificialintelligence

AI is seeping into just about everything, from consumer products to industrial equipment. As enterprises use AI to become more competitive, more of them are taking advantage of machine learning to accomplish more in less time, reduce costs, and discover something new, whether a drug or a latent market desire. While there's no need for non-data scientists to understand how machine learning (ML) works, they should understand enough to use basic terminology correctly. Although the scope of ML extends considerably past what's possible to cover in this short article, the following are some of the fundamentals. Before one can grasp machine learning concepts, one needs to understand what machine learning terms mean.


Explainable Deep Neural Networks

#artificialintelligence

The emerging subject of the mathematical analysis of deep learning [1] has been tasked with answering some "mysterious" facts that appear to be inexplicable using traditional mathematical methodologies. Researchers are attempting to comprehend what a neural network actually does. Deep Neural Networks (DNNs) transform data at each layer, producing a new representation as output. In a classification problem, a DNN attempts to divide the data, refining this separation layer by layer until it reaches the output layer, where the DNN provides its best possible result. Under the manifold hypothesis (natural data lies on lower-dimensional manifolds in its embedding space), this task can be viewed as the separation of lower-dimensional manifolds in the data space. DNN layers are linked by a realization function Φ (an affine transformation) followed by a component-wise activation function ρ. Consider the fully connected feedforward neural network depicted in Figure 2. The network architecture can be described by the number of layers, the number of neurons in each layer, and the activation function.
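
To make the layer-by-layer picture concrete, here is a minimal sketch (my own illustration, not code from the article) of a fully connected feedforward network in which each layer applies an affine map followed by a component-wise activation ρ; the layer sizes and the choice of ReLU are assumptions for demonstration.

```python
# Illustrative sketch only: a fully connected feedforward network where each
# layer applies an affine map W x + b followed by a component-wise activation.
import numpy as np

def rho(x):
    # component-wise activation (ReLU chosen here purely for illustration)
    return np.maximum(x, 0.0)

def forward(x, weights, biases):
    """Realize the network: alternate affine maps and the activation rho."""
    for W, b in zip(weights[:-1], biases[:-1]):
        x = rho(W @ x + b)          # hidden layers: affine map + activation
    W, b = weights[-1], biases[-1]
    return W @ x + b                # output layer: affine map only

# architecture: number of layers, neurons per layer, and the activation
rng = np.random.default_rng(0)
sizes = [4, 16, 16, 3]              # input dim, two hidden layers, output dim
weights = [rng.standard_normal((m, n)) * 0.1 for n, m in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros(m) for m in sizes[1:]]
print(forward(rng.standard_normal(4), weights, biases))
```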


How many neurons for a neural network?

#artificialintelligence

A neural network is a particular model that tries to capture the correlation between the features and the target by transforming the dataset through layers of neurons. Several books have been written about neural networks, and it's not in the scope of this article to give you a complete overview of this kind of model. Let me just say that a neural network is made of some layers of neurons. Each neuron gets some inputs, transforms them and returns an output. The output of a neuron can become the input of the neurons of the next layer and so on, building more and more complex architectures.
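
As a rough illustration of that description (my own example, not from the article), the sketch below shows a single neuron computing a weighted sum plus bias followed by an activation, with one layer's outputs feeding the next; the weights and activation are arbitrary choices.

```python
# Minimal sketch: one neuron takes inputs, applies a weighted sum plus bias,
# passes the result through an activation, and the outputs of one layer of
# such neurons become the inputs of the next layer.
import numpy as np

def neuron(inputs, weights, bias):
    return np.tanh(np.dot(weights, inputs) + bias)   # transform inputs -> output

x = np.array([0.5, -1.2, 3.0])                       # features for one sample
layer1 = [neuron(x, np.array(w), b)                  # first layer: two neurons
          for w, b in [([0.1, 0.4, -0.2], 0.0), ([-0.3, 0.2, 0.1], 0.5)]]
output = neuron(np.array(layer1), np.array([1.0, -1.0]), 0.0)  # next layer reuses them
print(output)
```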


KdeHumor at SemEval-2020 Task 7: A Neural Network Model for Detecting Funniness in Dataset Humicroedit

arXiv.org Artificial Intelligence

This paper describes our contribution to SemEval-2020 Task 7: Assessing Humor in Edited News Headlines. Here we present a method based on a deep neural network. In recent years, considerable attention has been devoted to humor production and perception. Our team KdeHumor employs recurrent neural network models, including Bi-Directional LSTMs (BiLSTMs). Moreover, we utilize state-of-the-art pre-trained sentence embedding techniques. We analyze the performance of our method and demonstrate the contribution of each component of our architecture.
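
For readers unfamiliar with the setup, here is a hedged sketch of what a BiLSTM over pre-trained embeddings can look like; this is not the authors' code, and the embedding dimension, hidden size, and regression head are assumptions.

```python
# Hedged sketch: a bidirectional LSTM regressor over a sequence of
# pre-trained embeddings, predicting a single funniness score per headline.
import torch
import torch.nn as nn

class BiLSTMRegressor(nn.Module):
    def __init__(self, emb_dim=768, hidden=128):
        super().__init__()
        self.bilstm = nn.LSTM(emb_dim, hidden, batch_first=True, bidirectional=True)
        self.head = nn.Linear(2 * hidden, 1)     # single funniness score

    def forward(self, emb_seq):                  # emb_seq: (batch, seq_len, emb_dim)
        out, _ = self.bilstm(emb_seq)
        return self.head(out[:, -1, :])          # score from the last time step

model = BiLSTMRegressor()
scores = model(torch.randn(2, 12, 768))          # two headlines, 12 tokens each
print(scores.shape)                              # torch.Size([2, 1])
```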


Why are neural networks so powerful?

#artificialintelligence

It is common knowledge that neural networks are very powerful and can be used for almost any statistical learning problem with great results. But have you thought about why this is the case? Why is this method more powerful in most scenarios than many other algorithms? As always with machine learning, there is a precise mathematical reason for this. Simply put, the set of functions described by a neural network model is very large.
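
As a small, hedged illustration of that expressiveness (not taken from the article), even a single hidden layer fitted with scikit-learn can approximate a nonlinear target reasonably well; the target function and network size below are arbitrary choices.

```python
# Illustration: a one-hidden-layer network already covers a rich set of
# functions, here fitting a nonlinear target via scikit-learn's MLPRegressor.
import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(500, 1))
y = np.sin(2 * X[:, 0])                          # nonlinear target function

net = MLPRegressor(hidden_layer_sizes=(64,), max_iter=5000, random_state=0)
net.fit(X, y)
print("train MSE:", np.mean((net.predict(X) - y) ** 2))
```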


Optimal Stopping via Randomized Neural Networks

arXiv.org Machine Learning

This paper presents new machine learning approaches to approximate the solution of optimal stopping problems. The key idea of these methods is to use neural networks, where the hidden layers are generated randomly and only the last layer is trained, in order to approximate the continuation value. Our approaches are applicable to high-dimensional problems where existing approaches become increasingly impractical. In addition, since our approaches can be optimized using a simple linear regression, they are very easy to implement and theoretical guarantees can be provided. In Markovian examples our randomized reinforcement learning approach, and in non-Markovian examples our randomized recurrent neural network approach, outperform the state-of-the-art and other relevant machine learning approaches.
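
A hedged sketch of the core idea only: a randomly generated hidden layer provides features, and a simple least-squares regression fits the last layer that approximates the continuation value. The variable names, dimensions, and the placeholder payoffs below are my own assumptions, not code from the paper.

```python
# Randomized neural network: the hidden layer is drawn once at random and
# never trained; only the last (linear) layer is fit, by linear regression.
import numpy as np

rng = np.random.default_rng(0)

def random_features(x, W, b):
    return np.tanh(x @ W + b)                    # fixed random hidden layer

d, hidden, n = 5, 200, 1000
W = rng.standard_normal((d, hidden))
b = rng.standard_normal(hidden)

states = rng.standard_normal((n, d))             # simulated states at one exercise date
future_payoffs = rng.standard_normal(n)          # placeholder for realized discounted payoffs
beta = np.linalg.lstsq(                          # only the last layer is trained,
    random_features(states, W, b),               # via a simple linear regression
    future_payoffs, rcond=None)[0]

def continuation_value(x):
    return random_features(x, W, b) @ beta
print(continuation_value(states[:3]))
```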


Backpropagation in Neural Networks

#artificialintelligence

Do you know how a neural network trains itself to do a job? In this article, we will see the whole process of how a neural network learns. The main goal of a network is to reduce the loss incurred while predicting the outputs. To minimize this loss, we apply an optimization technique called gradient descent. In this technique, we update the values of the parameters while backpropagating through the network, i.e., we find the derivatives of the error function with respect to the weights and use this gradient to update the current weights, decreasing the loss function.
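
The update rule described above can be shown in a few lines. The following is a minimal, self-contained illustration (my own example, not from the article) that computes the derivative of a mean squared error loss with respect to the weights of a linear model and takes gradient descent steps; the data and learning rate are made up.

```python
# Gradient descent in miniature: compute d(loss)/d(weights) and step against it.
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 3))
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w                                   # targets from a known linear rule

w = np.zeros(3)                                  # initial weights
lr = 0.1                                         # learning rate
for _ in range(200):
    pred = X @ w
    grad = 2 * X.T @ (pred - y) / len(y)         # derivative of MSE w.r.t. the weights
    w -= lr * grad                               # gradient descent update
print(w)                                         # approaches [2.0, -1.0, 0.5]
```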


DisCo: Remedy Self-supervised Learning on Lightweight Models with Distilled Contrastive Learning

arXiv.org Artificial Intelligence

While self-supervised representation learning (SSL) has received widespread attention from the community, recent research argues that its performance suffers a cliff fall when the model size decreases. Current methods mainly rely on contrastive learning to train the network, and in this work we propose a simple yet effective Distilled Contrastive Learning (DisCo) approach to ease the issue by a large margin. Specifically, we find that the final embedding obtained by mainstream SSL methods contains the most fruitful information, and propose to distill the final embedding to maximally transmit a teacher's knowledge to a lightweight model by constraining the last embedding of the student to be consistent with that of the teacher. In addition, in the experiments we find that there exists a phenomenon termed Distilling BottleNeck, and propose to enlarge the embedding dimension to alleviate this problem. Our method does not introduce any extra parameters to lightweight models during deployment. Experimental results demonstrate that our method achieves state-of-the-art results on all lightweight models. In particular, when ResNet-101/ResNet-50 is used as the teacher to teach EfficientNet-B0, the linear evaluation result of EfficientNet-B0 on ImageNet is very close to that of ResNet-101/ResNet-50, while the number of parameters of EfficientNet-B0 is only 9.4%/16.3% of ResNet-101/ResNet-50.
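
To clarify the consistency constraint, here is a hedged sketch of a distillation loss that pushes the student's final embedding toward the teacher's; the loss choice, projection layer, and tensor shapes are my assumptions, not the paper's exact formulation.

```python
# Sketch of embedding distillation: the student's last embedding is constrained
# to be consistent with the teacher's last embedding.
import torch
import torch.nn.functional as F

teacher_emb = torch.randn(32, 256)               # final embeddings from the (frozen) teacher
student_emb = torch.randn(32, 128, requires_grad=True)
proj = torch.nn.Linear(128, 256)                 # map student dim to teacher dim (assumption)

distill_loss = F.mse_loss(
    F.normalize(proj(student_emb), dim=1),       # student's final embedding ...
    F.normalize(teacher_emb, dim=1))             # ... pulled toward the teacher's
distill_loss.backward()
print(distill_loss.item())
```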


Non-linear Functional Modeling using Neural Networks

arXiv.org Machine Learning

We introduce a new class of non-linear models for functional data based on neural networks. Deep learning has been very successful in non-linear modeling, but little work has been done in the functional data setting. We propose two variations of our framework: a functional neural network with continuous hidden layers, called the Functional Direct Neural Network (FDNN), and a second version that utilizes basis expansions and continuous hidden layers, called the Functional Basis Neural Network (FBNN). Both are designed explicitly to exploit the structure inherent in functional data. To fit these models, we derive a functional gradient-based optimization algorithm. The effectiveness of the proposed methods in handling complex functional models is demonstrated by comprehensive simulation studies and real data examples.
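
As a hedged sketch of the basis-expansion idea behind the FBNN variant (not the authors' code), each observed curve can be summarized by coefficients on a fixed basis and those coefficients fed to an ordinary network; the sine basis and sizes below are arbitrary choices.

```python
# Basis expansion for functional data: project each observed curve onto a
# fixed basis and use the coefficients as network inputs.
import numpy as np

t = np.linspace(0, 1, 100)                           # common evaluation grid
basis = np.stack([np.sin((k + 1) * np.pi * t) for k in range(8)], axis=1)

def basis_coefficients(curve):
    # least-squares projection of one functional observation onto the basis
    return np.linalg.lstsq(basis, curve, rcond=None)[0]

curves = np.sin(3 * np.pi * t) + 0.1 * np.random.default_rng(0).standard_normal((20, 100))
coeffs = np.array([basis_coefficients(c) for c in curves])   # shape (20, 8)
print(coeffs.shape)                                          # feed these into an ordinary network
```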