AITopics | learning architecture

Collaborating Authors

learning architecture

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

38db3aed920cf82ab059bfccbd02be6a-Reviews.html

Neural Information Processing SystemsOct-3-2025, 09:08:34 GMT

It is know that adding an additive gaussian noise to the feature is equivalent to an l_2 regularization in a least square problem (Bishop). This paper studies multiplicative Bernoulli feature noising, in a shallow learning architecture, with a general loss function and shows that it has the effect of adapting the geometry through an l_2 regularizer that rescales the feature (beta^{\top} D(beta,X) beta). The Matrix D(beta,X) is a estimate of the inverse diagonal fisher information. It is worth noting that D does not depend on the labels. The equivalent regularizer of dropout is non convex in general.

delta, dropout, regularizer, (14 more...)

Neural Information Processing Systems

Country: North America > United States > Nevada (0.04)

Genre:

Summary/Review (0.48)
Research Report > New Finding (0.30)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.49)

Add feedback

Leveraging Self-Supervised Learning Methods for Remote Screening of Subjects with Paroxysmal Atrial Fibrillation

Atienza, Adrian, Manimaran, Gouthamaan, Puthusserypady, Sadasivan, Dominguez, Helena, Jacobsen, Peter K., Bardram, Jakob E.

arXiv.org Artificial IntelligenceMar-4-2025

The integration of Artificial Intelligence (AI) into clinical research has great potential to reveal patterns that are difficult for humans to detect, creating impactful connections between inputs and clinical outcomes. However, these methods often require large amounts of labeled data, which can be difficult to obtain in healthcare due to strict privacy laws and the need for experts to annotate data. This requirement creates a bottleneck when investigating unexplored clinical questions. This study explores the application of Self-Supervised Learning (SSL) as a way to obtain preliminary results from clinical studies with limited sized cohorts. To assess our approach, we focus on an underexplored clinical task: screening subjects for Paroxysmal Atrial Fibrillation (P-AF) using remote monitoring, single-lead ECG signals captured during normal sinus rhythm. We evaluate state-of-the-art SSL methods alongside supervised learning approaches, where SSL outperforms supervised learning in this task of interest. More importantly, it prevents misleading conclusions that may arise from poor performance in the latter paradigm when dealing with limited cohort settings.

clinical question, learning, representation, (13 more...)

arXiv.org Artificial Intelligence

2503.02621

Country:

Europe > Denmark > Capital Region > Copenhagen (0.04)
Europe > Denmark > Capital Region > Bispebjerg (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

A Unified Multi-Task Learning Architecture for Hate Detection Leveraging User-Based Information

Kapil, Prashant, Ekbal, Asif

arXiv.org Artificial IntelligenceNov-12-2024

Hate speech, offensive language, aggression, racism, sexism, and other abusive language are common phenomena in social media. There is a need for Artificial Intelligence(AI)based intervention which can filter hate content at scale. Most existing hate speech detection solutions have utilized the features by treating each post as an isolated input instance for the classification. This paper addresses this issue by introducing a unique model that improves hate speech identification for the English language by utilising intra-user and inter-user-based information. The experiment is conducted over single-task learning (STL) and multi-task learning (MTL) paradigms that use deep neural networks, such as convolutional neural networks (CNN), gated recurrent unit (GRU), bidirectional encoder representations from the transformer (BERT), and A Lite BERT (ALBERT). We use three benchmark datasets and conclude that combining certain user features with textual features gives significant improvements in macro-F1 and weighted-F1.

detection, information, tweet, (13 more...)

arXiv.org Artificial Intelligence

2411.06855

Country: Asia > India (0.04)

Genre: Research Report (0.50)

Industry: Information Technology > Services (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Permutative redundancy and uncertainty of the objective in deep learning

Glukhov, Vacslav

arXiv.org Artificial IntelligenceNov-11-2024

Implications of uncertain objective functions and permutative symmetry of traditional deep learning architectures are discussed. It is shown that traditional architectures are polluted by an astronomical number of equivalent global and local optima. Uncertainty of the objective makes local optima unattainable, and, as the size of the network grows, the global optimization landscape likely becomes a tangled web of valleys and ridges. Some remedies which reduce or eliminate ghost optima are discussed including forced pre-pruning, re-ordering, ortho-polynomial activations, and modular bio-inspired architectures.

artificial intelligence, deep learning, machine learning, (15 more...)

arXiv.org Artificial Intelligence

2411.07008

Country:

North America > United States > Massachusetts > Suffolk County > Boston (0.04)
North America > United States > Illinois (0.04)
Europe > Ukraine (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Understanding Deep Learning via Notions of Rank

Razin, Noam

arXiv.org Machine LearningAug-4-2024

Despite the extreme popularity of deep learning in science and industry, its formal understanding is limited. This thesis puts forth notions of rank as key for developing a theory of deep learning, focusing on the fundamental aspects of generalization and expressiveness. In particular, we establish that gradient-based training can induce an implicit regularization towards low rank for several neural network architectures, and demonstrate empirically that this phenomenon may facilitate an explanation of generalization over natural data (e.g., audio, images, and text). Then, we characterize the ability of graph neural networks to model interactions via a notion of rank, which is commonly used for quantifying entanglement in quantum physics. A central tool underlying these results is a connection between neural networks and tensor factorizations. Practical implications of our theory for designing explicit regularization schemes and data preprocessing algorithms are presented.

regularization coefficient, standard normal distribution, tensor network corresponding, (16 more...)

arXiv.org Machine Learning

2408.02111

Country:

North America > United States (0.13)
Africa > Senegal > Kolda Region > Kolda (0.04)
Asia (0.04)

Genre: Research Report > New Finding (0.45)

Industry: Information Technology (0.92)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Deep learning architectures for data-driven damage detection in nonlinear dynamic systems

Joseph, Harrish, Quaranta, Giuseppe, Carboni, Biagio, Lacarbonara, Walter

arXiv.org Artificial IntelligenceJul-4-2024

The primary goal of structural health monitoring is to detect damage at its onset before it reaches a critical level. The in-depth investigation in the present work addresses deep learning applied to data-driven damage detection in nonlinear dynamic systems. In particular, autoencoders (AEs) and generative adversarial networks (GANs) are implemented leveraging on 1D convolutional neural networks. The onset of damage is detected in the investigated nonlinear dynamic systems by exciting random vibrations of varying intensity, without prior knowledge of the system or the excitation and in unsupervised manner. The comprehensive numerical study is conducted on dynamic systems exhibiting different types of nonlinear behavior. An experimental application related to a magneto-elastic nonlinear system is also presented to corroborate the conclusions.

architecture, damage detection, learning architecture, (13 more...)

arXiv.org Artificial Intelligence

2407.037

Country:

North America > United States > Florida (0.14)
South America > Brazil > São Paulo (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(4 more...)

Genre: Research Report (0.81)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

MUSTAN: Multi-scale Temporal Context as Attention for Robust Video Foreground Segmentation

Pokala, Praveen Kumar, Patibandla, Jaya Sai Kiran, Pandey, Naveen Kumar, Pailla, Balakrishna Reddy

arXiv.org Artificial IntelligenceFeb-1-2024

Video foreground segmentation (VFS) is an important computer vision task wherein one aims to segment the objects under motion from the background. Most of the current methods are image-based, i.e., rely only on spatial cues while ignoring motion cues. Therefore, they tend to overfit the training data and don't generalize well to out-of-domain (OOD) distribution. To solve the above problem, prior works exploited several cues such as optical flow, background subtraction mask, etc. However, having a video data with annotations like optical flow is a challenging task. In this paper, we utilize the temporal information and the spatial cues from the video data to improve OOD performance. However, the challenge lies in how we model the temporal information given the video data in an interpretable way creates a very noticeable difference. We therefore devise a strategy that integrates the temporal context of the video in the development of VFS. Our approach give rise to deep learning architectures, namely MUSTAN1 and MUSTAN2 and they are based on the idea of multi-scale temporal context as an attention, i.e., aids our models to learn better representations that are beneficial for VFS. Further, we introduce a new video dataset, namely Indoor Surveillance Dataset (ISD) for VFS. It has multiple annotations on a frame level such as foreground binary mask, depth map, and instance semantic annotations. Therefore, ISD can benefit other computer vision tasks. We validate the efficacy of our architectures and compare the performance with baselines. We demonstrate that proposed methods significantly outperform the benchmark methods on OOD. In addition, the performance of MUSTAN2 is significantly improved on certain video categories on OOD data due to ISD.

dataset, foreground segmentation, segmentation, (14 more...)

arXiv.org Artificial Intelligence

2402.00918

Country: Asia > India (0.05)

Genre:

Research Report (0.50)
Overview (0.46)

Industry: Health & Medicine (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

An appointment with Reproducing Kernel Hilbert Space generated by Generalized Gaussian RBF as $L^2-$measure

Singh, Himanshu

arXiv.org Artificial IntelligenceDec-17-2023

Gaussian Radial Basis Function (RBF) Kernels are the most-often-employed kernels in artificial intelligence and machine learning routines for providing optimally-best results in contrast to their respective counter-parts. However, a little is known about the application of the Generalized Gaussian Radial Basis Function on various machine learning algorithms namely, kernel regression, support vector machine (SVM) and pattern-recognition via neural networks. The results that are yielded by Generalized Gaussian RBF in the kernel sense outperforms in stark contrast to Gaussian RBF Kernel, Sigmoid Function and ReLU Function. This manuscript demonstrates the application of the Generalized Gaussian RBF in the kernel sense on the aforementioned machine learning routines along with the comparisons against the aforementioned functions as well.

kernel, learning architecture, novel kernel function, (9 more...)

arXiv.org Artificial Intelligence

2312.10693

Country:

North America > United States > Florida (0.04)
North America > United States > Texas > Smith County > Tyler (0.04)
North America > United States > Pennsylvania (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Support Vector Machines (0.70)

Add feedback

A Blockchain-empowered Multi-Aggregator Federated Learning Architecture in Edge Computing with Deep Reinforcement Learning Optimization

Li, Xiao, Wu, Weili

arXiv.org Artificial IntelligenceOct-14-2023

Federated learning (FL) is emerging as a sought-after distributed machine learning architecture, offering the advantage of model training without direct exposure of raw data. With advancements in network infrastructure, FL has been seamlessly integrated into edge computing. However, the limited resources on edge devices introduce security vulnerabilities to FL in the context. While blockchain technology promises to bolster security, practical deployment on resource-constrained edge devices remains a challenge. Moreover, the exploration of FL with multiple aggregators in edge computing is still new in the literature. Addressing these gaps, we introduce the Blockchain-empowered Heterogeneous Multi-Aggregator Federated Learning Architecture (BMA-FL). We design a novel light-weight Byzantine consensus mechanism, namely PBCM, to enable secure and fast model aggregation and synchronization in BMA-FL. We also dive into the heterogeneity problem in BMA-FL that the aggregators are associated with varied number of connected trainers with Non-IID data distributions and diverse training speed. We proposed a multi-agent deep reinforcement learning algorithm to help aggregators decide the best training strategies. The experiments on real-word datasets demonstrate the efficiency of BMA-FL to achieve better models faster than baselines, showing the efficacy of PBCM and proposed deep reinforcement learning algorithm.

aggregation, edge device, edge server, (14 more...)

arXiv.org Artificial Intelligence

2310.09665

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)
North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > Washington > King County > Seattle (0.04)
(3 more...)

Genre: Research Report (0.50)

Industry:

Information Technology > Security & Privacy (1.00)
Energy (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Active Learning with Statistical Models

Neural Information Processing SystemsApr-6-2023, 18:38:30 GMT

An active learning problem is one where the learner has the ability or need to influence or select its own training data. Many problems of great practical interest allow active learning, and many even require it. We consider the problem of actively learning a mapping X - Y based on a set of training examples {(Xi,Yi)} l' where Xi E X and Yi E Y. The learner is allowed to iteratively select new inputs x (possibly from a constrained set), observe the resulting output y, and incorporate the new examples (x, y) into its training set. The primary question of active learning is how to choose which x to try next. There are many heuristics for choosing x based on intuition, including choosing places where we don't have data, where we perform poorly [Linden and Weber, 1993], where we have low confidence [Thrun and Moller, 1992], where we expect it

learner, learning architecture, statistical model, (10 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.57)

Add feedback