 Perceptrons


An Ensemble Approach for Multiple Emotion Descriptors Estimation Using Multi-task Learning

arXiv.org Artificial Intelligence

This paper describes our submission to the fourth Affective Behavior Analysis in-the-Wild (ABAW) Competition, in the Multi-Task Learning (MTL) Challenge. Instead of using only face information, we employ the full information from the provided dataset, which contains the face and the context around it. We use the InceptionNet V3 model to extract deep features, then apply an attention mechanism to refine them. After that, we feed those features into a transformer block and multi-layer perceptron networks to obtain the final emotion descriptors. Our model predicts arousal and valence, classifies the emotional expression, and estimates the action units simultaneously. The proposed system achieves a performance of 0.917 on the MTL Challenge validation dataset.


GitHub - eaplatanios/tensorflow_scala: TensorFlow API for the Scala Programming Language

#artificialintelligence

It attempts to provide most of the functionality provided by the official Python API, while at the same time being strongly typed and adding some new features. It is a work in progress and a project I started working on for my personal research purposes. Much of the API should be relatively stable by now, but things are still likely to change. Please refer to the main website for documentation and tutorials. For example, the following code shows how simple it is to train a multi-layer perceptron for MNIST using TensorFlow for Scala.


Perceptron: AI that can solve math problems and translate over 200 different languages - TechCrunch

#artificialintelligence

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone to read it all. This column, Perceptron, aims to collect some of the most relevant recent discoveries and papers -- particularly in, but not limited to, artificial intelligence -- and explain why they matter. In this batch of recent research, Meta open-sourced a language system that it claims is the first capable of translating 200 different languages with "state-of-the-art" results. Not to be outdone, Google detailed a machine learning model, Minerva, that can solve quantitative reasoning problems including mathematical and scientific questions. And Microsoft released a language model, Godel, for generating "realistic" conversations that's along the lines of Google's widely publicized Lamda. And then we have some new text-to-image generators with a twist.


Perceptron: AI that solves math problems, translates 200 languages, and draws kangaroos

#artificialintelligence

Research in the field of machine learning and AI, now a key technology in practically every industry and company, is far too voluminous for anyone …


Approximation Capabilities of Neural Networks using Morphological Perceptrons and Generalizations

arXiv.org Artificial Intelligence

Standard artificial neural networks (ANNs) use sum-product or multiply-accumulate node operations with a memoryless nonlinear activation. These neural networks are known to have universal function approximation capabilities. Previously proposed morphological perceptrons use max-sum, in place of sum-product, node processing and have promising properties for circuit implementations. In this paper we show that these max-sum ANNs do not have universal approximation capabilities. Furthermore, we consider proposed signed-max-sum and max-star-sum generalizations of morphological ANNs and show that these variants also do not have universal approximation capabilities. We contrast these variations to log-number system (LNS) implementations which also avoid multiplications, but do exhibit universal approximation capabilities.
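The difference between the two node types can be sketched in a few lines. This is a minimal illustration, assuming the commonly used definitions: a sum-product node computes the weighted sum of its inputs, and a max-sum node takes the maximum of input-plus-weight terms. The function names and sample values are hypothetical, and the nonlinear activation is left out for brevity.

```python
from typing import Sequence


def sum_product_node(x: Sequence[float], w: Sequence[float], bias: float = 0.0) -> float:
    """Standard node: multiply each input by its weight, then accumulate."""
    return sum(wi * xi for wi, xi in zip(w, x)) + bias


def max_sum_node(x: Sequence[float], w: Sequence[float]) -> float:
    """Max-sum (morphological) node: add each weight to its input, then take the maximum."""
    return max(wi + xi for wi, xi in zip(w, x))


# Hypothetical sample values, chosen only for illustration.
x = [1.0, -2.0, 0.5]
w = [0.5, 1.0, -1.0]

sp = sum_product_node(x, w)  # 0.5 - 2.0 - 0.5
ms = max_sum_node(x, w)      # max(1.5, -1.0, -0.5)
print(sp, ms)
```

Note that the max-sum node needs only additions and comparisons, which is the property that makes it attractive for circuit implementations.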


ODFNet: Using orientation distribution functions to characterize 3D point clouds

arXiv.org Artificial Intelligence

Learning new representations of 3D point clouds is an active research area in 3D vision, as the order-invariant point cloud structure still presents challenges to the design of neural network architectures. Recent works explored learning either global or local features, or both, for point clouds. However, none of the earlier methods focused on capturing contextual shape information by analysing the local orientation distribution of points. In this paper, we leverage point orientation distributions around a point in order to obtain an expressive local neighborhood representation for point clouds. We achieve this by dividing the spherical neighborhood of a given point into predefined cone volumes, and statistics inside each volume are used as point features. In this way, a local patch can be represented not only by the selected point's nearest neighbors, but also by a point density distribution defined along multiple orientations around the point. We are then able to construct an orientation distribution function (ODF) neural network that involves an ODFBlock which relies on MLP (multi-layer perceptron) layers. The new ODFNet model achieves state-of-the-art accuracy for object classification on the ModelNet40 and ScanObjectNN datasets, and for segmentation on the ShapeNet and S3DIS datasets.
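The idea of cone-based orientation statistics can be sketched as follows. This is not the paper's implementation: the abstract does not specify the cone layout or the statistics used, so the six axis-aligned cones, the 30-degree half-angle, the neighborhood radius, and the function name `cone_density` are all assumptions made for illustration. The statistic here is simply the fraction of neighbors falling inside each cone.

```python
import math
from typing import List, Sequence

# Hypothetical cone axes: the six axis-aligned unit directions (+x, -x, +y, -y, +z, -z).
AXES = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def cone_density(center: Sequence[float], points: Sequence[Sequence[float]],
                 axes=AXES, half_angle_deg: float = 30.0, radius: float = 1.0) -> List[float]:
    """Return, per cone, the fraction of neighbors within `radius` that lie inside it."""
    cos_limit = math.cos(math.radians(half_angle_deg))
    counts = [0] * len(axes)
    total = 0
    for p in points:
        d = [pi - ci for pi, ci in zip(p, center)]
        dist = math.sqrt(sum(v * v for v in d))
        if dist == 0.0 or dist > radius:
            continue  # skip the center itself and points outside the spherical neighborhood
        total += 1
        for i, axis in enumerate(axes):
            # Cosine of the angle between the neighbor direction and the (unit) cone axis.
            cos_angle = sum(dv * av for dv, av in zip(d, axis)) / dist
            if cos_angle >= cos_limit:
                counts[i] += 1
    return [c / total if total else 0.0 for c in counts]


# Hypothetical neighborhood: two points near +x, one along +y, one outside the radius.
neighbors = [(0.5, 0.0, 0.0), (0.0, 0.5, 0.0), (0.4, 0.1, 0.0), (3.0, 0.0, 0.0)]
features = cone_density((0.0, 0.0, 0.0), neighbors)
print(features)
```

With these assumed cones, neighboring cones leave gaps between them, so the fractions need not sum to one. A real design would choose the cone layout to cover the sphere as required.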


The Mechanical Neural Network (MNN) -- A physical implementation of a multilayer perceptron for education and hands-on experimentation

arXiv.org Artificial Intelligence

In this paper the Mechanical Neural Network (MNN) is introduced, a physical implementation of a multilayer perceptron (MLP) with ReLU activation functions, two input neurons, four hidden neurons and two output neurons. This physical model of an MLP is used in education to give hands-on experience and to let students observe the effect of changing the parameters of the network on the output. Neurons are small wooden levers which are connected by threads. Students can adapt the weights between the neurons by moving the clamps connecting a neuron via a thread to the next. The MNN can model real-valued functions and logical operators including XOR.
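To see why a network of this shape can represent XOR, here is a minimal software sketch of a 2-4-2 ReLU network. The weights are hand-picked for illustration and are not taken from the paper. The sketch also assumes no bias terms and linear output neurons, which the abstract does not specify. The first output computes XOR and the second computes OR for binary inputs.

```python
from typing import List, Sequence


def relu(v: float) -> float:
    return max(0.0, v)


def layer(weights: Sequence[Sequence[float]], inputs: Sequence[float]) -> List[float]:
    """Weighted sums for one layer; each row holds the weights of one neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) for row in weights]


# Hypothetical hand-picked weights (not from the paper).
W_HIDDEN = [
    [1.0, -1.0],   # h1 = relu(x1 - x2)
    [-1.0, 1.0],   # h2 = relu(x2 - x1)
    [0.0, 1.0],    # h3 = relu(x2)
    [1.0, 0.0],    # h4 = relu(x1), unused by the outputs below
]
W_OUT = [
    [1.0, 1.0, 0.0, 0.0],  # XOR = h1 + h2
    [1.0, 0.0, 1.0, 0.0],  # OR  = h1 + h3
]


def mlp_2_4_2(x1: float, x2: float) -> List[float]:
    hidden = [relu(v) for v in layer(W_HIDDEN, [x1, x2])]
    return layer(W_OUT, hidden)


for a in (0.0, 1.0):
    for b in (0.0, 1.0):
        print(a, b, mlp_2_4_2(a, b))
```

Changing a single entry of `W_HIDDEN` or `W_OUT` and rerunning the loop is the software analogue of moving a clamp on the physical model.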


Digit Classification with Single-Layer Perceptron

#artificialintelligence

Generally, the first thought that comes to mind when applying supervised learning techniques to images is to use Convolutional Neural Networks (CNNs). Indeed, this type of neural network is the most suitable for this type of task, mainly due to the reduction of dimensionality. If we imagine a dataset of images where the images have been flattened (for example, an image that is a 4x4 matrix is converted to a 16-dimensional vector, as shown in Figure 1), the images are data points in an n-dimensional space, where n is the number of pixels in the image. The dimensionality of image data is therefore enormous, which implies an immense number of parameters in the neural network and, in turn, a higher computational cost and execution time. CNNs reduce the dimensionality of the image in each layer of the neural network, which also reduces the number of parameters required in training and optimizes the performance of the model for this type of task.
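The flattening step and the resulting parameter count can be made concrete with a short sketch. The helper names are hypothetical, and the 28x28 input size with 10 output classes is an assumption (typical of digit datasets such as MNIST), not a figure from the article.

```python
from typing import List, Sequence


def flatten(image: Sequence[Sequence[float]]) -> List[float]:
    """Convert a 2D image (list of rows) into a 1D vector, row by row."""
    return [pixel for row in image for pixel in row]


def dense_param_count(n_inputs: int, n_outputs: int) -> int:
    """Weights plus biases of one fully connected layer."""
    return n_inputs * n_outputs + n_outputs


# A 4x4 image becomes a 16-dimensional vector.
image = [[r * 4 + c for c in range(4)] for r in range(4)]
vector = flatten(image)
print(len(vector))

# Assumed example: 28x28 grayscale digits, 10 classes, single-layer perceptron.
print(dense_param_count(28 * 28, 10))
```

The count grows linearly with the number of pixels for a single layer, and much faster once wide hidden layers are stacked on top of the flattened input.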


Camera Pose Auto-Encoders for Improving Pose Regression

arXiv.org Artificial Intelligence

Absolute pose regressor (APR) networks are trained to estimate the pose of the camera given a captured image. They compute latent image representations from which the camera position and orientation are regressed. APRs provide a different tradeoff between localization accuracy, runtime, and memory, compared to structure-based localization schemes that provide state-of-the-art accuracy. In this work, we introduce Camera Pose Auto-Encoders (PAEs), multilayer perceptrons that are trained via a Teacher-Student approach to encode camera poses using APRs as their teachers. We show that the resulting latent pose representations can closely reproduce APR performance and demonstrate their effectiveness for related tasks. Specifically, we propose a light-weight test-time optimization in which the closest train poses are encoded and used to refine camera position estimation. This procedure achieves a new state-of-the-art position accuracy for APRs, on both the CambridgeLandmarks and 7Scenes benchmarks. We also show that train images can be reconstructed from the learned pose encoding, paving the way for integrating visual information from the train set at a low memory cost.


Amazon.com: Introduction to Machine Learning, fourth edition (Adaptive Computation and Machine Learning series) eBook : Alpaydin, Ethem: Kindle Store

#artificialintelligence

The book covers a broad array of topics not usually included in introductory machine learning texts, including supervised learning, Bayesian decision theory, parametric methods, semiparametric methods, nonparametric methods, multivariate analysis, hidden Markov models, reinforcement learning, kernel machines, graphical models, Bayesian estimation, and statistical testing. The fourth edition offers a new chapter on deep learning that discusses training, regularizing, and structuring deep neural networks such as convolutional and generative adversarial networks; new material in the chapter on reinforcement learning that covers the use of deep networks, the policy gradient methods, and deep reinforcement learning; new material in the chapter on multilayer perceptrons on autoencoders and the word2vec network; and discussion of a popular method of dimensionality reduction, t-SNE. New appendixes offer background material on linear algebra and optimization. End-of-chapter exercises help readers to apply concepts learned. Introduction to Machine Learning can be used in courses for advanced undergraduate and graduate students and as a reference for professionals.