AITopics | feed forward network

feed forward network

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question "What is Artificial Intelligence?" and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Recent Advances in Non-convex Smoothness Conditions and Applicability to Deep Linear Neural Networks

Patel, Vivak, Varner, Christian

arXiv.org Artificial Intelligence | Sep-20-2024

The presence of non-convexity in smooth optimization problems arising from deep learning has sparked new smoothness conditions in the literature and corresponding convergence analyses. We discuss these smoothness conditions, order them, provide conditions for determining whether they hold, and evaluate their applicability to training a deep linear neural network for binary classification.

globally lipschitz, gradient function, smoothness condition, (13 more...)

arXiv.org Artificial Intelligence

2409.13672

Country:

North America > United States > Wisconsin (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > New Hampshire > Hillsborough County > Nashua (0.04)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
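
The deep linear network for binary classification mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a logistic loss and toy weight values, neither of which is taken from the paper; the function names are hypothetical. The point is only that the model is linear in the input yet the loss is non-convex in the weight matrices jointly, because they enter as a product.

```python
import math

# Sketch of a deep linear network: f(x) = W_L ... W_1 x, no activations.
# The logistic loss and all weight values are assumptions for illustration.


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def deep_linear_output(layers, x):
    """Apply W_1, then W_2, ... to x with no nonlinearity in between."""
    activation = x
    for weight_matrix in layers:
        activation = matvec(weight_matrix, activation)
    return activation[0]  # the last layer has one row, giving a scalar score


def logistic_loss(score, label):
    """Loss for a label in {-1, +1}: log(1 + exp(-label * score))."""
    return math.log1p(math.exp(-label * score))


layers = [
    [[1.0, 0.0], [0.0, 2.0]],  # W_1 maps 2 inputs to 2 hidden units
    [[0.5, 0.5]],              # W_2 maps 2 hidden units to 1 score
]
score = deep_linear_output(layers, [2.0, 1.0])
loss_if_positive = logistic_loss(score, 1)
loss_if_negative = logistic_loss(score, -1)
```

A positive score gives a smaller loss for the label +1 than for the label -1, as expected for a classifier that thresholds the score at zero.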


Bi-Level Spatial and Channel-aware Transformer for Learned Image Compression

Soltani, Hamidreza, Ghasemi, Erfan

arXiv.org Artificial Intelligence | Aug-7-2024

Recent advancements in learned image compression (LIC) methods have demonstrated superior performance over traditional hand-crafted codecs. These learning-based methods often employ convolutional neural networks (CNNs) or Transformer-based architectures. However, these nonlinear approaches frequently overlook the frequency characteristics of images, which limits their compression efficiency. To address this issue, we propose a novel Transformer-based image compression method that enhances the transformation stage by considering frequency components within the feature map. Our method integrates a novel Hybrid Spatial-Channel Attention Transformer Block (HSCATB), where a spatial-based branch independently handles high and low frequencies at the attention layer, and a Channel-aware Self-Attention (CaSA) module captures information across channels, significantly improving compression performance. Additionally, we introduce a Mixed Local-Global Feed Forward Network (MLGFFN) within the Transformer block to enhance the extraction of diverse and rich information, which is crucial for effective compression. These innovations collectively improve the transformation's ability to project data into a more decorrelated latent space, thereby boosting overall compression efficiency. Experimental results demonstrate that our framework surpasses state-of-the-art LIC methods in rate-distortion performance.

compression, image compression, transformer, (14 more...)

arXiv.org Artificial Intelligence

2408.03842

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


The CHIR Algorithm for Feed Forward Networks with Binary Weights

Neural Information Processing Systems | Apr-6-2023, 19:52:41 GMT

A new learning algorithm, Learning by Choice of Internal Representations (CHIR), was recently introduced. Whereas many algorithms reduce the learning process to minimizing a cost function over the weights, our method treats the internal representations as the fundamental entities to be determined. The algorithm applies a search procedure in the space of internal representations, and a cooperative adaptation of the weights (e.g. by using the perceptron learning rule). Since the introduction of its basic, single output version, the CHIR algorithm was generalized to train any feed forward network of binary neurons. Here we present the generalised version of the CHIR algorithm, and further demonstrate its versatility by describing how it can be modified in order to train networks with binary (±1) weights.

binary weight, chir algorithm, feed forward network, (2 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (1.00)
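
The CHIR abstract above mentions the perceptron learning rule as one way to adapt the weights. As a minimal sketch of that classic rule only (not of the CHIR search over internal representations), the following trains a single binary neuron with outputs in {-1, +1}. The function names, learning rate, and the AND example are illustrative assumptions.

```python
# Sketch of the classic perceptron learning rule for one binary neuron.
# This is not the CHIR algorithm; all names and values are illustrative.


def sign(value):
    """Binary neuron output in {-1, +1}; zero maps to +1 by convention."""
    return 1 if value >= 0 else -1


def predict(weights, bias, x):
    """Output of the neuron for input vector x."""
    return sign(sum(w * xi for w, xi in zip(weights, x)) + bias)


def train_perceptron(samples, targets, epochs=100, learning_rate=1.0):
    """Update weights only on misclassified samples; return (weights, bias)."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, target in zip(samples, targets):
            if predict(weights, bias, x) != target:
                # Move the weights toward the misclassified target.
                weights = [w + learning_rate * target * xi
                           for w, xi in zip(weights, x)]
                bias += learning_rate * target
                mistakes += 1
        if mistakes == 0:  # every sample is classified correctly
            break
    return weights, bias


# Logical AND on +/-1 inputs is linearly separable, so the rule converges.
and_inputs = [[-1, -1], [-1, 1], [1, -1], [1, 1]]
and_targets = [-1, -1, -1, 1]
and_weights, and_bias = train_perceptron(and_inputs, and_targets)
```

After training, `predict(and_weights, and_bias, x)` reproduces the AND targets for all four inputs. Note that the learned weights here are real-valued; constraining them to binary values is the subject of the paper and is not attempted in this sketch.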


Analytical Study of the Interplay between Architecture and Predictability

Neural Information Processing Systems | Apr-6-2023, 17:57:43 GMT

We study model feed forward networks as time series predictors in the stationary limit. The focus is on complex, yet non-chaotic, behavior. The main question we address is whether the asymptotic behavior is governed by the architecture, regardless of the details of the weights. We find hierarchies among classes of architectures with respect to the attractor dimension of the long term sequence they are capable of generating; a larger number of hidden units can generate higher dimensional attractors. In the case of a perceptron, we develop the stationary solution for general weights, and show that the flow is typically one dimensional.

analytical study, architecture and predictability, interplay, (5 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.65)


Movement Analytics: Current Status, Application to Manufacturing, and Future Prospects from an AI Perspective

Baumgartner, Peter, Smith, Daniel, Rana, Mashud, Kapoor, Reena, Tartaglia, Elena, Schutt, Andreas, Rahman, Ashfaqur, Taylor, John, Dunstall, Simon

arXiv.org Artificial Intelligence | Oct-3-2022

Data-driven decision making is becoming an integral part of manufacturing companies. Data is collected and commonly used to improve efficiency and produce high quality items for the customers. IoT-based and other forms of object tracking are an emerging tool for collecting movement data of objects/entities (e.g. human workers, moving vehicles, trolleys etc.) over space and time. Movement data can provide valuable insights like process bottlenecks, resource utilization, effective working time etc. that can be used for decision making and improving efficiency. Turning movement data into valuable information for industrial management and decision making requires analysis methods. We refer to this process as movement analytics. The purpose of this document is to review the current state of work for movement analytics both in manufacturing and more broadly. We survey relevant work from both a theoretical perspective and an application perspective. From the theoretical perspective, we put an emphasis on useful methods from two research areas: machine learning, and logic-based knowledge representation. We also review their combinations in view of movement analytics, and we discuss promising areas for future development and application. Furthermore, we touch on constraint optimization. From an application perspective, we review applications of these methods to movement analytics in a general sense and across various industries. We also describe currently available commercial off-the-shelf products for tracking in manufacturing, and we overview main concepts of digital twins and their applications.

logic & formal reasoning, machine learning, real time system, (29 more...)

arXiv.org Artificial Intelligence

2210.01344

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(31 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
Transportation > Ground > Rail (1.00)
(8 more...)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
(18 more...)


Kformer: Knowledge Injection in Transformer Feed-Forward Layers

Yao, Yunzhi, Huang, Shaohan, Dong, Li, Wei, Furu, Chen, Huajun, Zhang, Ningyu

arXiv.org Artificial Intelligence | Aug-10-2022

Recent days have witnessed a diverse set of knowledge injection models for pre-trained language models (PTMs); however, most previous studies neglect the PTMs' own ability with quantities of implicit knowledge stored in parameters. A recent study has observed knowledge neurons in the Feed Forward Network (FFN), which are responsible for expressing factual knowledge. In this work, we propose a simple model, Kformer, which takes advantage of the knowledge stored in PTMs and external knowledge via knowledge injection in Transformer FFN layers. Empirical results on two knowledge-intensive tasks, commonsense reasoning (i.e., SocialIQA) and medical question answering (i.e., MedQA-USMLE), demonstrate that Kformer can yield better performance than other knowledge injection technologies such as concatenation or attention-based injection. We think the proposed simple model and empirical findings may be helpful for the community to develop more powerful knowledge injection methods. Code is available at https://github.com/zjunlp/Kformer.

artificial intelligence, machine learning, natural language, (17 more...)

arXiv.org Artificial Intelligence

2201.05742

Country:

North America > United States (0.14)
Asia > Taiwan (0.04)
Asia > China > Zhejiang Province > Ningbo (0.04)
(3 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.90)
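
For readers unfamiliar with the Transformer FFN layer that the Kformer abstract above refers to, here is a minimal sketch of the standard position-wise feed-forward layer, FFN(x) = W2 · relu(W1 · x + b1) + b2. It does not implement Kformer's knowledge injection, and the weights and sizes are made-up toy values chosen for illustration.

```python
# Sketch of a standard position-wise Transformer feed-forward layer:
#   FFN(x) = W2 * relu(W1 * x + b1) + b2
# This is NOT Kformer's injection mechanism; all weights are toy values.


def relu(vector):
    """Elementwise max(0, v)."""
    return [max(0.0, v) for v in vector]


def linear(matrix, bias, vector):
    """Compute matrix @ vector + bias, with the matrix given as rows."""
    return [sum(m * v for m, v in zip(row, vector)) + b
            for row, b in zip(matrix, bias)]


def feed_forward(x, w1, b1, w2, b2):
    """Apply the two-layer FFN to a single token vector x."""
    hidden = relu(linear(w1, b1, x))  # expand to the hidden width
    return linear(w2, b2, hidden)     # project back to the model width


def feed_forward_sequence(tokens, w1, b1, w2, b2):
    """The same FFN is applied to every position independently."""
    return [feed_forward(x, w1, b1, w2, b2) for x in tokens]


# Toy sizes (assumed): model width 2, hidden width 3.
W1 = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
B1 = [0.0, 0.0, 0.0]
W2 = [[1.0, 0.0, 1.0], [0.0, 1.0, 1.0]]
B2 = [0.5, -0.5]

outputs = feed_forward_sequence([[1.0, 2.0], [-1.0, -2.0]], W1, B1, W2, B2)
```

Each row of `W1` acts on the input and each column of `W2` contributes to the output only when the matching hidden unit is active, which is the structure that makes individual hidden units a natural place to look for stored knowledge.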


Transformers oversimplified

#artificialintelligence | Jan-7-2022, 14:45:31 GMT

Deep learning has kept evolving over the years, and that is an important reason for its reputation. Deep learning practice emphasizes the use of large sets of parameters to extract useful information about the dataset we're dealing with. With a large set of parameters, it becomes easier to classify or detect something, because we have more information with which to tell things apart. One notable milestone in the journey of Deep Learning so far, and specifically in Natural Language Processing, was the introduction of Language Models that greatly improved the accuracy and efficiency of various NLP tasks. A sequence-to-sequence model is an encoder-decoder model that takes a sequence of inputs and returns a sequence of outputs as a result.

attention layer, attention value, information, (16 more...)

#artificialintelligence

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


A Self-Attention Network for Hierarchical Data Structures with an Application to Claims Management

Löw, Leander, Spindler, Martin, Brechmann, Eike

arXiv.org Machine Learning | Aug-30-2018

Insurance companies must manage millions of claims per year. While most of these claims are non-fraudulent, fraud detection is core for insurance companies. The ultimate goal is a predictive model to single out the fraudulent claims and pay out the non-fraudulent ones immediately. Modern machine learning methods are well suited for this kind of problem. Health care claims often have a data structure that is hierarchical and of variable length. We propose one model based on piecewise feed forward neural networks (deep learning) and another model based on self-attention neural networks for the task of claim management. We show that the proposed methods outperform bag-of-words based models, hand designed features, and models based on convolutional neural networks, on a data set of two million health care claims. The proposed self-attention method performs the best.

artificial intelligence, machine learning, sequence, (16 more...)

arXiv.org Machine Learning

1808.10543

Country: Europe > Germany > Hamburg (0.04)

Genre: Research Report (0.41)

Industry:

Health & Medicine (1.00)
Banking & Finance (1.00)
Law Enforcement & Public Safety > Fraud (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)


Hear and Speak Your Natural -- NLP keras – Data Driven Investor – Medium

#artificialintelligence | Aug-19-2018, 18:49:56 GMT

Humans evolved about 2.3 to 2.4 million years ago. Since the 18th century, scientists have considered the great apes to be closely related to human beings. In the 19th century, they speculated that the closest living relatives of humans were either chimpanzees or gorillas. Do you know what made us different from our closest living relatives? Humans have a persistent process of thinking.

artificial intelligence, machine learning, neural network, (18 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.74)


Under The Hood of Neural Networks. Part 2: Recurrent.

#artificialintelligence | Jul-18-2018, 12:22:08 GMT

In Part 1 of this series, we studied the forward and backward passes of a Feed Forward Fully-Connected network. Although Feed Forward networks are widespread and find many real-world applications, they have a major limitation: they cannot handle sequential data. This means that they cannot work with inputs of different sizes and they do not store information about previous states (memory). Thus, in this article, we will talk about Recurrent Neural Networks (RNNs), which overcome these limitations.

artificial intelligence, machine learning, neural network, (17 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.52)
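
The contrast drawn in the excerpt above, fixed-size input with no memory versus a state carried across steps, can be sketched as follows. This is a scalar, Elman-style recurrence chosen for brevity, h_t = tanh(w_in * x_t + w_rec * h_{t-1} + bias); the function names and weight values are assumptions and are not taken from the article.

```python
import math

# Sketch contrasting a fixed-size feed forward layer with a simple recurrence.
# Scalar Elman-style RNN; all names and weights are illustrative assumptions.


def feed_forward(x, weights, bias):
    """One fixed-size unit: the input length must match len(weights)."""
    if len(x) != len(weights):
        raise ValueError("feed forward layer expects a fixed input size")
    return math.tanh(sum(w * xi for w, xi in zip(weights, x)) + bias)


def rnn_final_state(sequence, w_in, w_rec, bias, initial_state=0.0):
    """Run h_t = tanh(w_in * x_t + w_rec * h_{t-1} + bias) over a sequence."""
    state = initial_state  # the "memory" carried between time steps
    for x_t in sequence:
        state = math.tanh(w_in * x_t + w_rec * state + bias)
    return state


# The same recurrent parameters handle sequences of different lengths.
short_state = rnn_final_state([1.0, 0.5], w_in=0.8, w_rec=0.5, bias=0.0)
long_state = rnn_final_state([1.0, 0.5, -0.3, 0.2, 0.9],
                             w_in=0.8, w_rec=0.5, bias=0.0)
```

Because the state depends on earlier inputs, feeding the same values in a different order generally yields a different final state, whereas `feed_forward` rejects any input whose length differs from its weight vector.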
