AITopics | Perceptrons

Perceptrons

WavPool: A New Block for Deep Neural Networks

McDermott, Samuel D., Voetberg, M., Nord, Brian

arXiv.org Artificial Intelligence, Jun-14-2023

Modern deep neural networks comprise many operational layers, such as dense or convolutional layers, which are often collected into blocks. In this work, we introduce a new, wavelet-transform-based network architecture that we call the multi-resolution perceptron; by adding a pooling layer, we create a new network block, the WavPool. The first step of the multi-resolution perceptron is transforming the data into its multi-resolution decomposition form by convolving the input data with filters of fixed coefficients but increasing size. Following image processing techniques, we are able to make scale and spatial information simultaneously accessible to the network without increasing the size of the data vector. WavPool outperforms a similar multilayer perceptron while using fewer parameters, and outperforms a comparable convolutional neural network by ~10% in relative accuracy on CIFAR-10.
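
The multi-resolution decomposition described in the abstract above can be illustrated with a small sketch. It assumes the Haar wavelet and a 1D input whose length is a power of two; the abstract names neither, and the function name `haar_decompose` is hypothetical. Repeated pairwise averaging and differencing is equivalent to filtering the original signal with fixed-coefficient filters of width 2, 4, 8, and so on, and the output has the same length as the input.

```python
import math

def haar_decompose(signal):
    """Full 1D Haar decomposition of a signal whose length is a power of two.

    Returns a list of the same length as the input: the coarsest average
    first, followed by detail coefficients ordered from coarse to fine.
    """
    n = len(signal)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    approx = [float(v) for v in signal]
    details = []  # collected from fine to coarse
    while len(approx) > 1:
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Fixed filter coefficients: (1, -1)/sqrt(2) for detail, (1, 1)/sqrt(2) for average.
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        details.append(detail)
    out = approx
    for d in reversed(details):  # coarse details first
        out = out + d
    return out

coeffs = haar_decompose([4, 2, 6, 8])
print(coeffs)
```

Because the transform is orthonormal, the sum of squared coefficients equals the sum of squared inputs, so no information is lost while scale and position are separated.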

artificial intelligence, deep learning, machine learning, (19 more...)

2306.08734

Country:

North America > United States > Illinois > Cook County > Chicago (0.05)
North America > United States > Illinois > Kane County > Batavia (0.04)
Europe > Italy > Piedmont > Turin Province > Turin (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Genre: Research Report > New Finding (0.68)

Industry:

Energy (0.46)
Government > Regional Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.95)

Automating Microservices Test Failure Analysis using Kubernetes Cluster Logs

Sarika, Pawan Kumar, Badampudi, Deepika, Josyula, Sai Prashanth, Usman, Muhammad

arXiv.org Artificial Intelligence, Jun-13-2023

Kubernetes is a free, open-source container orchestration system for deploying and managing Docker containers that host microservices. When a test fails, Kubernetes cluster logs help in determining the reason for the failure. However, as systems become more complex, identifying failure reasons manually becomes more difficult and time-consuming. This study aims to identify effective and efficient classification algorithms to automatically determine the failure reason. We compare five classification algorithms: Support Vector Machines, K-Nearest Neighbors, Random Forest, Gradient Boosting Classifier, and Multilayer Perceptron. Our results indicate that Random Forest produces good accuracy while requiring fewer computational resources than the other algorithms.

algorithm, developer, microservice, (13 more...)

doi: 10.1145/3593434.3593472

2306.07653

Country:

Europe > Finland > Northern Ostrobothnia > Oulu (0.06)
Europe > Sweden > Blekinge County > Karlskrona (0.05)
North America > United States (0.04)
Europe > Sweden > Stockholm > Stockholm (0.04)

Genre:

Research Report > New Finding (0.49)
Research Report > Experimental Study (0.49)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.55)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.54)

Comparing machine learning models for tau triggers

Yaary, Maayan, Barron, Uriel, Domínguez, Luis Pascual, Chen, Boping, Barak, Liron, Etzion, Erez, Giryes, Raja

arXiv.org Artificial Intelligence, Jun-11-2023

This paper introduces novel supervised learning techniques for real-time selection (triggering) of hadronically decaying tau leptons in proton-proton colliders. By implementing classic machine learning decision trees and advanced deep learning models, such as Multi-Layer Perceptron or residual NN, visible improvements in performance compared to standard tau triggers are observed. We show how such an implementation may lower the current energy thresholds, thus contributing to increasing the sensitivity of searches for new phenomena in proton-proton collisions classified by low-energy tau leptons.

artificial intelligence, machine learning, tob, (20 more...)

2306.06743

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.04)
North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Quantifying the Knowledge in GNNs for Reliable Distillation into MLPs

Wu, Lirong, Lin, Haitao, Huang, Yufei, Li, Stan Z.

arXiv.org Artificial Intelligence, Jun-8-2023

To bridge the gaps between topology-aware Graph Neural Networks (GNNs) and inference-efficient Multi-Layer Perceptrons (MLPs), GLNN proposes to distill knowledge from a well-trained teacher GNN into a student MLP. Despite this progress, comparatively little work has been done to explore the reliability of different knowledge points (nodes) in GNNs, especially the roles they play during distillation. In this paper, we first quantify the knowledge reliability in GNNs by measuring the invariance of their information entropy to noise perturbations, from which we observe that different knowledge points (1) show different distillation speeds (temporally) and (2) are differentially distributed in the graph (spatially). To achieve reliable distillation, we propose an effective approach, namely Knowledge-inspired Reliable Distillation (KRD), that models the probability of each node being an informative and reliable knowledge point, based on which we sample a set of additional reliable knowledge points as supervision for training student MLPs. Extensive experiments show that KRD improves over the vanilla MLPs by 12.62% and outperforms its corresponding teacher GNNs by 2.16% averaged over 7 datasets and 3 GNN architectures.
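
One way to read "invariance of information entropy to noise perturbations" is sketched below. This is an illustration under stated assumptions, not the paper's definition: the teacher is a hypothetical linear softmax classifier rather than a GNN, the noise is Gaussian on the input features, and the score is the mean squared change in prediction entropy. The names `entropy_shift` and `teacher` are invented for the example.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def entropy(p):
    """Shannon entropy of each row of a matrix of class probabilities."""
    return -(p * np.log(p + 1e-12)).sum(axis=1)

def entropy_shift(predict, features, noise_std=0.1, trials=20, seed=0):
    """Mean squared change in prediction entropy per node under Gaussian feature noise."""
    rng = np.random.default_rng(seed)
    base = entropy(predict(features))
    shift = np.zeros(len(features))
    for _ in range(trials):
        noisy = features + rng.normal(0.0, noise_std, size=features.shape)
        shift += (entropy(predict(noisy)) - base) ** 2
    return shift / trials

# Hypothetical stand-in for a trained teacher: a random linear classifier.
rng = np.random.default_rng(1)
W = rng.normal(size=(8, 3))
teacher = lambda x: softmax(x @ W)
X = rng.normal(size=(5, 8))  # 5 nodes, 8 features each

scores = entropy_shift(teacher, X)
print(scores)
```

Under this reading, nodes with a smaller score have predictions whose entropy is more invariant to noise, and would be treated as more reliable knowledge points.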

distillation, knowledge point, student mlp, (14 more...)

2306.05628

Country:

North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)

Genre: Research Report (0.40)

Industry: Education (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

Mixture-of-Domain-Adapters: Decoupling and Injecting Domain Knowledge to Pre-trained Language Models' Memories

Diao, Shizhe, Xu, Tianyang, Xu, Ruijia, Wang, Jiawei, Zhang, Tong

arXiv.org Artificial Intelligence, Jun-8-2023

Pre-trained language models (PLMs) demonstrate excellent abilities to understand texts in the generic domain while struggling in a specific domain. Although continued pre-training on a large domain-specific corpus is effective, it is costly to tune all the parameters on the domain. In this paper, we investigate whether we can adapt PLMs both effectively and efficiently by only tuning a few parameters. Specifically, we decouple the feed-forward networks (FFNs) of the Transformer architecture into two parts: the original pre-trained FFNs to maintain the old-domain knowledge and our novel domain-specific adapters to inject domain-specific knowledge in parallel. Then we adopt a mixture-of-adapters gate to fuse the knowledge from different domain adapters dynamically. Our proposed Mixture-of-Domain-Adapters (MixDA) employs a two-stage adapter-tuning strategy that leverages both unlabeled data and labeled data to help the domain adaptation: i) domain-specific adapter on unlabeled data; followed by ii) the task-specific adapter on labeled data. MixDA can be seamlessly plugged into the pretraining-finetuning paradigm and our experiments demonstrate that MixDA achieves superior performance on in-domain tasks (GLUE), out-of-domain tasks (ChemProt, RCT, IMDB, Amazon), and knowledge-intensive tasks (KILT). Further analyses demonstrate the reliability, scalability, and efficiency of our method. The code is available at https://github.com/Amano-Aki/Mixture-of-Domain-Adapters.
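
The decoupled design described in the abstract above (a frozen pre-trained FFN with domain adapters running in parallel, fused by a gate) might look roughly like the following forward pass. This is a sketch, not the code from the linked repository: the class name `ParallelAdapterFFN`, the bottleneck adapter shape, the softmax gate, the additive fusion, and the zero initialization of the adapter up-projections are all assumptions made for illustration.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class ParallelAdapterFFN:
    """Pre-trained FFN plus gated parallel adapters (illustrative shapes only)."""

    def __init__(self, d_model, d_ff, d_adapter, n_adapters, seed=0):
        rng = np.random.default_rng(seed)
        # Stand-ins for the frozen, pre-trained FFN weights.
        self.w1 = rng.normal(scale=0.02, size=(d_model, d_ff))
        self.w2 = rng.normal(scale=0.02, size=(d_ff, d_model))
        # One bottleneck adapter per domain.
        self.down = rng.normal(scale=0.02, size=(n_adapters, d_model, d_adapter))
        # Assumption: zero-initialized up-projections, so adapters start as a no-op.
        self.up = np.zeros((n_adapters, d_adapter, d_model))
        self.gate = rng.normal(scale=0.02, size=(d_model, n_adapters))

    def ffn(self, x):
        return relu(x @ self.w1) @ self.w2

    def __call__(self, x):
        # x: (tokens, d_model)
        weights = softmax(x @ self.gate)  # (tokens, n_adapters)
        adapter_out = np.stack(
            [relu(x @ d) @ u for d, u in zip(self.down, self.up)], axis=1
        )  # (tokens, n_adapters, d_model)
        mixed = (weights[:, :, None] * adapter_out).sum(axis=1)
        return self.ffn(x) + mixed

layer = ParallelAdapterFFN(d_model=16, d_ff=32, d_adapter=4, n_adapters=3)
x = np.random.default_rng(1).normal(size=(5, 16))
y = layer(x)
print(y.shape)
```

With the up-projections at zero, the layer reproduces the original FFN output exactly, which is one common way to keep old-domain behavior intact before any adapter training.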

artificial intelligence, knowledge, machine learning, (19 more...)

2306.05406

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
North America > United States > Oregon (0.04)
(6 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.68)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.34)

Attention Weighted Mixture of Experts with Contrastive Learning for Personalized Ranking in E-commerce

Gong, Juan, Chen, Zhenlin, Ma, Chaoyi, Xiao, Zhuojian, Wang, Haonan, Tang, Guoyu, Liu, Lin, Xu, Sulong, Long, Bo, Jiang, Yunjiang

arXiv.org Artificial Intelligence, Jun-8-2023

Ranking models play an essential role in e-commerce search and recommendation. An effective ranking model should give a personalized ranking list for each user according to the user's preferences. Existing algorithms usually extract a user representation vector from the user behavior sequence, then feed the vector into a feed-forward network (FFN) together with other features for feature interactions, and finally produce a personalized ranking score. Despite tremendous progress in the past, there is still room for improvement. Firstly, the personalized patterns of feature interactions for different users are not explicitly modeled. Secondly, most existing algorithms produce poor personalized ranking results for long-tail users with few historical behaviors, due to data sparsity. To overcome these two challenges, we propose Attention Weighted Mixture of Experts (AW-MoE) with contrastive learning for personalized ranking. Firstly, AW-MoE leverages the MoE framework to capture personalized feature interactions for different users. To model the user preference, the user behavior sequence is simultaneously fed into the expert networks and the gate network. Within the gate network, one gate unit and one activation unit are designed to adaptively learn the fine-grained activation vector for experts using an attention mechanism. Secondly, a random masking strategy is applied to the user behavior sequence to simulate long-tail users, and an auxiliary contrastive loss is imposed on the output of the gate network to improve the model's generalization for these users. This is validated by a higher performance gain on the long-tail user test set. Experimental results on a JD real production dataset and a public dataset demonstrate the effectiveness of AW-MoE, which significantly outperforms state-of-the-art methods. Notably, AW-MoE has been successfully deployed in the JD e-commerce search engine, ...
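
Two of the ingredients described in the abstract above, random masking of the behavior sequence and an attention-driven gate over experts, can be sketched as follows. Everything concrete here is an assumption for illustration: the function names, the scaled dot-product attention, the softmax gate, and the linear experts are not taken from the paper, and the contrastive loss is not shown.

```python
import numpy as np

def random_mask(length, keep_prob, rng):
    """Boolean mask keeping each behavior with probability keep_prob (at least one is kept)."""
    mask = rng.random(length) < keep_prob
    if not mask.any():
        mask[rng.integers(length)] = True
    return mask

def attention_pool(seq, query, mask):
    """Attention-weighted sum of behavior embeddings; masked-out rows get zero weight.

    Assumes at least one entry of mask is True.
    """
    scores = seq @ query / np.sqrt(seq.shape[1])
    scores = np.where(mask, scores, -np.inf)
    weights = np.exp(scores - scores.max())
    return (weights / weights.sum()) @ seq

def moe_score(seq, mask, query, gate_w, expert_ws):
    """Ranking score from a softmax gate over linear experts (both are assumptions)."""
    user = attention_pool(seq, query, mask)  # (d,)
    logits = user @ gate_w                   # (n_experts,)
    gate = np.exp(logits - logits.max())
    gate = gate / gate.sum()
    expert_out = expert_ws @ user            # (n_experts,)
    return float(gate @ expert_out)

rng = np.random.default_rng(0)
d, n_behaviors, n_experts = 8, 6, 3
seq = rng.normal(size=(n_behaviors, d))   # hypothetical behavior embeddings
query = rng.normal(size=d)                # hypothetical target-item embedding
gate_w = rng.normal(size=(d, n_experts))
expert_ws = rng.normal(size=(n_experts, d))

full = np.ones(n_behaviors, dtype=bool)
sparse = random_mask(n_behaviors, keep_prob=0.3, rng=rng)  # simulated long-tail user
score_full = moe_score(seq, full, query, gate_w, expert_ws)
score_sparse = moe_score(seq, sparse, query, gate_w, expert_ws)
print(score_full, score_sparse)
```

In a training setup along the lines of the abstract, the gate outputs for the full and masked sequences of the same user would be the pair compared by the auxiliary contrastive loss.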

behavior sequence, machine learning, natural language, (18 more...)

2306.05011

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.05)
North America > United States > California > Santa Clara County > Mountain View (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Services > e-Commerce Services (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Personal Assistant Systems (0.87)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.48)

Teaching Yourself: Graph Self-Distillation on Neighborhood for Node Classification

Wu, Lirong, Xia, Jun, Lin, Haitao, Gao, Zhangyang, Liu, Zicheng, Zhao, Guojiang, Li, Stan Z.

arXiv.org Artificial Intelligence, Jun-4-2023

Recent years have witnessed great success in handling graph-related tasks with Graph Neural Networks (GNNs). Despite the great academic success of GNNs, Multi-Layer Perceptrons (MLPs) remain the primary workhorse for practical industrial applications. One reason for this academic-industrial gap is the neighborhood-fetching latency incurred by data dependency in GNNs, which makes them hard to deploy for latency-sensitive applications that require fast inference. Conversely, without involving any feature aggregation, MLPs have no data dependency and infer much faster than GNNs, but their performance is less competitive. Motivated by these complementary strengths and weaknesses, we propose a Graph Self-Distillation on Neighborhood (GSDN) framework to reduce the gap between GNNs and MLPs. Specifically, the GSDN framework is based purely on MLPs, where structural information is only implicitly used as a prior to guide knowledge self-distillation between the neighborhood and the target, substituting for the explicit neighborhood information propagation used in GNNs. As a result, GSDN enjoys the benefits of graph topology-awareness in training but has no data dependency in inference. Extensive experiments have shown that the performance of vanilla MLPs can be greatly improved with self-distillation, e.g., GSDN improves over stand-alone MLPs by 15.54% on average and outperforms the state-of-the-art GNNs on six datasets. Regarding inference speed, GSDN infers 75X-89X faster than existing GNNs and 16X-25X faster than other inference acceleration methods.

dataset, information, node, (14 more...)

2210.02097

Genre: Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.68)

Are We Really Making Much Progress in Text Classification? A Comparative Review

Galke, Lukas, Diera, Andor, Lin, Bao Xin, Khera, Bhakti, Meuser, Tim, Singhal, Tushar, Karl, Fabian, Scherp, Ansgar

arXiv.org Artificial Intelligence, Jun-4-2023

This study reviews and compares methods for single-label and multi-label text classification, categorized into bag-of-words, sequence-based, graph-based, and hierarchical methods. The comparison aggregates results from the literature over five single-label and seven multi-label datasets and complements them with new experiments. The findings reveal that all recently proposed graph-based and hierarchy-based methods fail to outperform pre-trained language models and sometimes perform worse than standard machine learning methods like a multilayer perceptron on a bag-of-words. To assess the true scientific progress in text classification, future work should thoroughly test against strong bag-of-words baselines and state-of-the-art pre-trained language models.

machine learning, natural language, text classification, (17 more...)

2204.03954

Country:

North America > United States > Texas > Travis County > Austin (0.14)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.14)
(44 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
(2 more...)

Chemical Property-Guided Neural Networks for Naphtha Composition Prediction

Joo, Chonghyo, Kim, Jeongdong, Cho, Hyungtae, Lee, Jaewon, Suh, Sungho, Kim, Junghwan

arXiv.org Artificial Intelligence, Jun-2-2023

The naphtha cracking process heavily relies on the composition of naphtha, which is a complex blend of different hydrocarbons. Predicting the naphtha composition accurately is crucial for efficiently controlling the cracking process and achieving maximum performance. Traditional methods, such as gas chromatography and true boiling curve, are not feasible due to the need for pilot-plant-scale experiments or cost constraints. In this paper, we propose a neural network framework that utilizes chemical property information to improve the performance of naphtha composition prediction. Our proposed framework comprises two parts: a Watson K factor estimation network and a naphtha composition prediction network. Both networks share a feature extraction network based on Convolutional Neural Network (CNN) architecture, while the output layers use Multi-Layer Perceptron (MLP) based networks to generate two different outputs - Watson K factor and naphtha composition. The naphtha composition is expressed in percentages, and its sum should be 100%. To enhance the naphtha composition prediction, we utilize a distillation simulator to obtain the distillation curve from the naphtha composition, which is dependent on its chemical properties. By designing a loss function between the estimated and simulated Watson K factors, we improve the performance of both Watson K estimation and naphtha composition prediction. The experimental results show that our proposed framework can predict the naphtha composition accurately while reflecting real naphtha chemical properties.
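
The constraint that the predicted composition is expressed in percentages summing to 100% can be enforced at the output layer. The sketch below uses a softmax scaled by 100, which is one common choice and an assumption here; the abstract does not say how the authors impose the constraint, and the function name `to_percentages` is hypothetical.

```python
import math

def to_percentages(logits):
    """Map unconstrained network outputs to positive percentages that sum to 100."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [100.0 * e / total for e in exps]

# Hypothetical raw outputs for four hydrocarbon groups.
composition = to_percentages([1.2, -0.3, 0.5, 2.0])
print(composition, sum(composition))
```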

artificial intelligence, composition, machine learning, (17 more...)

2306.01391

Country:

Europe > Germany (0.14)
Asia > South Korea (0.14)

Genre: Research Report > New Finding (0.34)

Industry:

Materials > Chemicals > Commodity Chemicals > Petrochemicals (1.00)
Energy > Oil & Gas > Downstream (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

MLP-Mixer as a Wide and Sparse MLP

Hayase, Tomohiro, Karakida, Ryo

arXiv.org Artificial Intelligence, Jun-2-2023

Multi-layer perceptron (MLP) is a fundamental component of deep learning that has been extensively employed for various problems. However, recent empirical successes in MLP-based architectures, particularly the progress of the MLP-Mixer, have revealed that there is still hidden potential in improving MLPs to achieve better performance. In this study, we reveal that the MLP-Mixer works effectively as a wide MLP with certain sparse weights. Initially, we clarify that the mixing layer of the Mixer has an effective expression as a wider MLP whose weights are sparse and represented by the Kronecker product. This expression naturally defines a permuted-Kronecker (PK) family, which can be regarded as a general class of mixing layers and is also regarded as an approximation of Monarch matrices. Subsequently, because the PK family effectively constitutes a wide MLP with sparse weights, one can apply the hypothesis proposed by Golubeva, Neyshabur and Gur-Ari (2021) that the prediction performance improves as the width (sparsity) increases when the number of weights is fixed. We empirically verify this hypothesis by maximizing the effective width of the MLP-Mixer, which enables us to determine the appropriate size of the mixing layers quantitatively.
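
The claim that a mixing layer is equivalent to a wider MLP layer with sparse, Kronecker-structured weights can be checked numerically for the linear part of the layer. The sketch below relies on the standard identity vec(AXB) = (Bᵀ ⊗ A) vec(X) with column-major vectorization. The sizes `S` and `C` and the variable names are assumptions for the example, activation functions are ignored, and the notation is not necessarily the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)
S, C = 4, 3                       # tokens, channels (arbitrary small sizes)
X = rng.normal(size=(S, C))       # input feature matrix
W = rng.normal(size=(S, S))       # token-mixing weights, applied as W @ X
V = rng.normal(size=(C, C))       # channel-mixing weights, applied as X @ V

def vec(m):
    """Column-major (Fortran-order) vectorization."""
    return m.reshape(-1, order="F")

# Wide, sparse weight matrices acting on the flattened input of size S * C.
token_wide = np.kron(np.eye(C), W)      # block diagonal with C copies of W
channel_wide = np.kron(V.T, np.eye(S))

token_ok = np.allclose(token_wide @ vec(X), vec(W @ X))
channel_ok = np.allclose(channel_wide @ vec(X), vec(X @ V))

# Fraction of nonzero entries in each wide matrix.
token_density = np.count_nonzero(token_wide) / token_wide.size
channel_density = np.count_nonzero(channel_wide) / channel_wide.size
print(token_ok, channel_ok, token_density, channel_density)
```

The wide matrices have (S·C)² entries, but only a fraction 1/C (token mixing) or 1/S (channel mixing) are nonzero, which is the sense in which the layer is a wide MLP with sparse weights.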

artificial intelligence, machine learning, mlp-mixer, (17 more...)

2306.0147

Country: Asia > Japan (0.04)

Genre: Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)
