
Collaborating Authors

 Nguyen, Duong


United We Stand: Decentralized Multi-Agent Planning With Attrition

arXiv.org Artificial Intelligence

Decentralized planning is a key element of cooperative multi-agent systems for information-gathering tasks. However, despite the high frequency of agent failures in realistic large-scale deployments, current approaches perform poorly in the presence of failures: they may fail to converge and/or make very inefficient use of resources (e.g., energy). In this work, we propose Attritable MCTS (A-MCTS), a decentralized MCTS algorithm capable of timely and efficient adaptation to changes in the set of active agents. It relies on a global reward function to estimate each agent's local contribution and on regret matching for coordination. We evaluate its effectiveness on realistic data-harvesting problems under different scenarios. We show both theoretically and experimentally that A-MCTS enables efficient adaptation even under high failure rates. Results suggest that, in the presence of frequent failures, our solution improves substantially over the best existing approaches in terms of global utility and scalability.
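
As a concrete illustration of the regret-matching coordination mentioned above, here is a minimal sketch (not the authors' A-MCTS implementation; the action utilities and toy usage are assumptions): each agent keeps cumulative regrets over its candidate plans and samples proportionally to positive regret.

```python
# Minimal regret-matching sketch (illustrative, not the A-MCTS code):
# an agent tracks cumulative regrets over candidate actions and samples
# actions in proportion to positive regret.
import numpy as np

class RegretMatcher:
    def __init__(self, n_actions, rng=None):
        self.regrets = np.zeros(n_actions)
        self.rng = rng or np.random.default_rng(0)

    def strategy(self):
        pos = np.maximum(self.regrets, 0.0)
        total = pos.sum()
        if total <= 0.0:                      # no positive regret yet: play uniformly
            return np.full(len(self.regrets), 1.0 / len(self.regrets))
        return pos / total

    def sample_action(self):
        return self.rng.choice(len(self.regrets), p=self.strategy())

    def update(self, action_utilities, played_action):
        # regret of not having played each alternative action
        self.regrets += action_utilities - action_utilities[played_action]

# toy usage: two candidate plans whose (hypothetical) utilities favour plan 1
matcher = RegretMatcher(n_actions=2)
for _ in range(200):
    a = matcher.sample_action()
    matcher.update(np.array([0.2, 1.0]), a)
print(matcher.strategy())   # converges towards always choosing plan 1
```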


PAT: Pixel-wise Adaptive Training for Long-tailed Segmentation

arXiv.org Artificial Intelligence

Beyond class frequency, we recognize the impact of class-wise relationships among various class-specific predictions and of the imbalance in label masks on long-tailed segmentation learning. To address these challenges, we propose an innovative Pixel-wise Adaptive Training (PAT) technique tailored for long-tailed segmentation. PAT has two key features: 1) class-wise gradient magnitude homogenization, and 2) pixel-wise class-specific loss adaptation (PCLA). First, class-wise gradient magnitude homogenization helps alleviate the imbalance among label masks by ensuring equal consideration of the class-wise impact on model updates. Second, PCLA tackles the detrimental impact of both rare classes within the long-tailed distribution and inaccurate predictions from previous training stages by encouraging learning of classes with low prediction confidence and guarding against forgetting classes with high confidence. This combined approach fosters robust learning while preventing the model from forgetting previously learned knowledge. PAT exhibits significant performance improvements, surpassing the current state-of-the-art by 2.2% on the NYU dataset. Moreover, it enhances overall pixel-wise accuracy by 2.85% and intersection-over-union by 2.07%, with a decline of only 0.39% in detecting rare classes compared to Balance Logits Variation, as demonstrated on three popular datasets, i.e., OxfordPetIII, CityScape, and NYU.
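
To make the pixel-wise loss adaptation concrete, here is a hedged numpy sketch; it is not the paper's exact PAT/PCLA formulation, and the inverse-frequency class weights and confidence-based pixel weights are illustrative surrogates.

```python
# Illustrative sketch only (not the paper's exact PAT formulation): a pixel-wise
# cross-entropy whose per-pixel weight grows when the predicted confidence for
# the true class is low, with per-class weights compensating for label-mask
# imbalance via inverse pixel frequency.
import numpy as np

def pixelwise_adaptive_ce(probs, labels, eps=1e-8):
    """probs: (H, W, C) softmax outputs; labels: (H, W) integer class ids."""
    h, w, c = probs.shape
    p_true = probs[np.arange(h)[:, None], np.arange(w)[None, :], labels]
    # class weights: inverse frequency of each class in the label mask
    counts = np.bincount(labels.ravel(), minlength=c).astype(float)
    class_w = counts.sum() / (c * np.maximum(counts, 1.0))
    # pixel weights: emphasise low-confidence pixels (assumed surrogate for PCLA)
    pixel_w = 1.0 - p_true
    return -(class_w[labels] * pixel_w * np.log(p_true + eps)).mean()

# toy usage on a random 4x4 "image" with 3 classes
rng = np.random.default_rng(0)
logits = rng.normal(size=(4, 4, 3))
probs = np.exp(logits) / np.exp(logits).sum(-1, keepdims=True)
labels = rng.integers(0, 3, size=(4, 4))
print(pixelwise_adaptive_ce(probs, labels))
```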


Revisiting LARS for Large Batch Training Generalization of Neural Networks

arXiv.org Artificial Intelligence

This paper explores large-batch training techniques using layer-wise adaptive rate scaling (LARS) across diverse settings, uncovering new insights. LARS algorithms with warm-up tend to be trapped in sharp minimizers early on due to redundant ratio scaling. Additionally, a fixed steep decline in the latter phase restricts deep neural networks from effectively navigating early-phase sharp minimizers. Building on these findings, we propose Time-Varying LARS (TVLARS), a novel algorithm that replaces warm-up with a configurable sigmoid-like function for robust training in the initial phase. TVLARS promotes gradient exploration early on, helping escape sharp minimizers, and gradually transitions to LARS for robustness in later phases. Extensive experiments demonstrate that TVLARS consistently outperforms LARS and LAMB in most cases, with up to 2% improvement in classification scenarios. Notably, in all self-supervised learning cases, TVLARS dominates LARS and LAMB with performance improvements of up to 10%.
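
A hedged sketch of the idea behind TVLARS as described above; the parameter names, the exact sigmoid shape, and the way the time-varying factor multiplies the trust ratio are assumptions, not the paper's definition.

```python
# Illustrative sketch (not the paper's exact TVLARS rule): instead of a linear
# warm-up, the layer-wise trust ratio is multiplied by a configurable
# sigmoid-like factor that starts large (gradient exploration) and decays
# towards 1, recovering plain LARS in later phases.
import numpy as np

def tv_coefficient(step, gamma=10.0, delay=200.0, sharpness=0.05):
    """Sigmoid-like factor: ~(1 + gamma) at the start, -> 1 after `delay` steps."""
    return 1.0 + gamma / (1.0 + np.exp(sharpness * (step - delay)))

def tvlars_like_update(w, grad, lr, step, trust_coef=1e-3, eps=1e-9):
    # classic LARS layer-wise trust ratio
    ratio = trust_coef * np.linalg.norm(w) / (np.linalg.norm(grad) + eps)
    local_lr = lr * ratio * tv_coefficient(step)   # larger early -> exploration
    return w - local_lr * grad

# toy usage: the effective step shrinks as training proceeds
w, g = np.ones(4), 0.1 * np.ones(4)
print(tvlars_like_update(w, g, lr=0.1, step=0))
print(tvlars_like_update(w, g, lr=0.1, step=1000))
```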


TrAISformer-A generative transformer for AIS trajectory prediction

arXiv.org Artificial Intelligence

Modelling trajectories in general, and vessel trajectories in particular, is a difficult task because of the multimodal and complex nature of motion data. In this paper, we present TrAISformer, a novel deep learning architecture that can forecast vessel positions using AIS (Automatic Identification System) observations. We address the multimodality by introducing a discrete representation of AIS data and re-framing the prediction task, originally a regression problem, as a classification problem. The model encodes complex movement patterns in AIS data as high-dimensional vectors, then applies a transformer to extract useful long-term correlations from sequences of those embeddings in order to sample future vessel positions. Experimental results on real, public AIS data demonstrate that TrAISformer significantly outperforms state-of-the-art methods.
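
The discretisation step can be illustrated with a short sketch; the bin ranges and resolutions below are assumptions for a hypothetical area of interest, not the paper's configuration.

```python
# Minimal sketch of the discrete AIS representation (bin sizes and attribute
# ranges are assumptions): each attribute is quantised into bins so that
# position forecasting becomes classification over bin indices, not regression.
import numpy as np

LAT_BINS = np.linspace(47.0, 50.0, 200)    # assumed latitude range of interest
LON_BINS = np.linspace(-7.0, -4.0, 200)    # assumed longitude range of interest
SOG_BINS = np.linspace(0.0, 30.0, 30)      # speed over ground (knots)
COG_BINS = np.linspace(0.0, 360.0, 72)     # course over ground (degrees)

def discretize(lat, lon, sog, cog):
    """Map one AIS message to a tuple of bin indices ('four-hot' style)."""
    return (np.digitize(lat, LAT_BINS),
            np.digitize(lon, LON_BINS),
            np.digitize(sog, SOG_BINS),
            np.digitize(cog, COG_BINS))

def bin_center(index, bins):
    """Recover a continuous value from a predicted (or sampled) bin index."""
    index = np.clip(index, 1, len(bins) - 1)
    return 0.5 * (bins[index - 1] + bins[index])

print(discretize(48.12, -5.3, 12.4, 271.0))
```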


Improving Bayesian Inference in Deep Neural Networks with Variational Structured Dropout

arXiv.org Machine Learning

Bayesian Neural Networks (BNNs) [37, 47] offer a probabilistic interpretation of deep learning models by imposing a prior distribution on the weight parameters, aiming to obtain a posterior distribution instead of only point estimates. By marginalizing over this posterior for prediction, BNNs perform a form of ensemble learning. These principles help the model improve generalization and robustness, and allow for uncertainty quantification. However, computing the exact posterior of non-linear Bayesian networks is infeasible, so approximate inference methods have been devised. The core challenge is how to construct an expressive approximation to the true posterior while maintaining computational efficiency and scalability, especially for modern deep learning architectures. Variational inference is a popular deterministic approximation approach to this challenge. The first practical methods were proposed in [15, 5, 28], in which the approximate posterior is assumed to be a fully factorized distribution, an approach also called mean-field variational inference. The mean-field approximation family generally offers advantages for inference, including computational tractability and effective optimization with stochastic gradient-based methods. However, it ignores strong statistical dependencies among the random weights of the neural network, which leads to an inability to capture the complicated structure of the true posterior and to estimate true model uncertainty.
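
To ground the mean-field discussion, a minimal sketch of a fully factorized Gaussian posterior over a single linear layer follows; it illustrates the mean-field baseline the paragraph criticises, not the structured-dropout posterior the paper proposes.

```python
# Mean-field sketch (illustrative): each weight has an independent Gaussian
# posterior q(w) = N(mu, sigma^2), weights are drawn via the reparameterisation
# trick, and the KL term against a standard-normal prior enters the loss.
import numpy as np

rng = np.random.default_rng(0)

def sample_weights(mu, rho):
    sigma = np.log1p(np.exp(rho))             # softplus keeps sigma positive
    return mu + sigma * rng.standard_normal(mu.shape)

def kl_to_standard_normal(mu, rho):
    sigma = np.log1p(np.exp(rho))
    return np.sum(np.log(1.0 / sigma) + (sigma ** 2 + mu ** 2) / 2.0 - 0.5)

# one stochastic forward pass of a tiny linear layer
mu, rho = np.zeros((3, 2)), -3.0 * np.ones((3, 2))
x = rng.standard_normal((5, 3))
y = x @ sample_weights(mu, rho)
print(y.shape, kl_to_standard_normal(mu, rho))
```

Note that this factorization is exactly what discards the cross-weight dependencies mentioned above; richer (structured) posteriors aim to recover them.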


Variational Deep Learning for the Identification and Reconstruction of Chaotic and Stochastic Dynamical Systems from Noisy and Partial Observations

arXiv.org Machine Learning

The data-driven recovery of the unknown governing equations of dynamical systems has recently received increasing interest. However, the identification of governing equations remains challenging when dealing with noisy and partial observations. Here, we address this challenge and investigate variational deep learning schemes. Within the proposed framework, we jointly learn an inference model, which reconstructs the true states of the system from series of noisy and partial observations, and the governing equations of these states. In doing so, this framework bridges classical data assimilation and state-of-the-art machine learning techniques, and we show that it generalizes state-of-the-art methods. Importantly, both the inference model and the governing equations embed stochastic components to account for stochastic variability, model errors, and reconstruction uncertainties. Various experiments on chaotic and stochastic dynamical systems support the relevance of our scheme with respect to state-of-the-art approaches.
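
A schematic sketch of the kind of joint objective described above; it is purely illustrative, using direct state variables and a linear parametric ODE in place of the paper's learned inference model and stochastic components.

```python
# Illustrative joint objective: reconstructed states must both match the
# noisy/partial observations and be consistent with a parametric ODE that
# stands in for the learned governing equations.
import numpy as np

def f_theta(x, theta):
    """Assumed parametric dynamics, e.g. a linear model x' = theta @ x."""
    return x @ theta.T

def rk4_step(x, theta, dt):
    k1 = f_theta(x, theta)
    k2 = f_theta(x + 0.5 * dt * k1, theta)
    k3 = f_theta(x + 0.5 * dt * k2, theta)
    k4 = f_theta(x + dt * k3, theta)
    return x + dt / 6.0 * (k1 + 2 * k2 + 2 * k3 + k4)

def joint_loss(states, obs, obs_mask, theta, dt, lam=1.0):
    # observation term: only the observed components are penalised
    obs_err = np.mean((obs_mask * (states - obs)) ** 2)
    # dynamical prior term: consecutive states must follow the candidate ODE
    dyn_err = np.mean((states[1:] - rk4_step(states[:-1], theta, dt)) ** 2)
    return obs_err + lam * dyn_err

# toy usage: 2-D linear system, only the first component observed
rng = np.random.default_rng(0)
theta = np.array([[0.0, 1.0], [-1.0, 0.0]])          # a simple rotation field
states, obs = rng.normal(size=(20, 2)), rng.normal(size=(20, 2))
print(joint_loss(states, obs, np.array([1.0, 0.0]), theta, dt=0.1))
```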


Learning Latent Dynamics for Partially-Observed Chaotic Systems

arXiv.org Machine Learning

This paper addresses the data-driven identification of latent dynamical representations of partially-observed systems, i.e., dynamical systems for which some components are never observed, with an emphasis on forecasting applications, including long-term asymptotic patterns. Whereas state-of-the-art data-driven approaches rely on delay embeddings and linear decompositions of the underlying operators, we introduce a framework based on the data-driven identification of an augmented state-space model using a neural-network-based representation. For a given training dataset, this amounts to jointly learning an ODE (Ordinary Differential Equation) representation in the latent space and reconstructing the latent states. Through numerical experiments, we demonstrate the relevance of the proposed framework with respect to state-of-the-art approaches in terms of short-term forecasting performance and long-term behaviour. We further discuss how the proposed framework relates to Koopman operator theory and Takens' embedding theorem.
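
A toy sketch of the augmented-state idea follows; the latent dimension, the linear parameterisation, and the Euler roll-out are assumptions for illustration, not the paper's neural-network ODE.

```python
# Illustrative augmented-state model: the observed variable lives inside a
# higher-dimensional latent state z, an ODE is posited on z, and the
# observation is read back as the first latent component.
import numpy as np

D_LATENT = 3          # assumed latent (augmented) dimension; only z[0] is observed

def latent_ode(z, A):
    return z @ A.T     # stand-in for a neural-network parameterisation

def forecast(z0, A, dt, n_steps):
    traj = [z0]
    for _ in range(n_steps):
        traj.append(traj[-1] + dt * latent_ode(traj[-1], A))   # Euler roll-out
    return np.stack(traj)

def observation(traj):
    return traj[..., 0]    # only the first latent component is observed

rng = np.random.default_rng(0)
A = rng.normal(scale=0.1, size=(D_LATENT, D_LATENT))
z0 = rng.normal(size=(D_LATENT,))
print(observation(forecast(z0, A, dt=0.01, n_steps=5)))
```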


EM-like Learning Chaotic Dynamics from Noisy and Partial Observations

arXiv.org Machine Learning

The identification of the governing equations of chaotic dynamical systems from data has recently emerged as a hot topic. While the seminal work by Brunton et al. reported proofs of concept for idealized observation settings of fully-observed systems, i.e., large signal-to-noise ratios and high-frequency sampling of all system variables, we here address the learning of data-driven representations of chaotic dynamics for partially-observed systems, including significant noise patterns and possibly lower and irregular sampling settings. Instead of considering training losses based on short-term prediction error, as state-of-the-art learning-based schemes do, we adopt a Bayesian formulation and state this issue as a data assimilation problem with unknown model parameters. To solve the joint inference of the hidden dynamics and of the model parameters, we combine neural-network representations with state-of-the-art assimilation schemes. Using iterative Expectation-Maximization (EM)-like procedures, the key feature of the proposed inference schemes is the derivation of the posterior of the hidden dynamics. Using a neural-network-based Ordinary Differential Equation (ODE) representation of these dynamics, we investigate two strategies: combining it with Ensemble Kalman Smoothers, and Long Short-Term Memory (LSTM)-based variational approximations of the posterior. Through numerical experiments on the Lorenz-63 system with different noise and time-sampling settings, we demonstrate the ability of the proposed schemes to recover and reproduce the hidden chaotic dynamics, including their Lyapunov characteristic exponents, when classic machine learning approaches fail.
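
The E-step of such an EM-like scheme relies on an ensemble assimilation update; below is a compact sketch of a stochastic Ensemble Kalman analysis step (illustrative, not the authors' full smoother), with an observation operator that exposes only the first state variable, as in a partially-observed Lorenz-63 setting.

```python
# Stochastic Ensemble Kalman analysis step (illustrative): the forecast
# ensemble is corrected towards the noisy observation using the sample
# covariance of the ensemble.
import numpy as np

def enkf_analysis(ensemble, y_obs, H, R, rng):
    """ensemble: (N, d) forecast members; y_obs: (p,) observation;
    H: (p, d) observation operator; R: (p, p) observation noise covariance."""
    N = ensemble.shape[0]
    X = ensemble - ensemble.mean(axis=0)
    P = X.T @ X / (N - 1)                          # sample state covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R)   # Kalman gain
    # perturbed observations (stochastic EnKF variant)
    Y = y_obs + rng.multivariate_normal(np.zeros(len(y_obs)), R, size=N)
    return ensemble + (Y - ensemble @ H.T) @ K.T

rng = np.random.default_rng(0)
ens = rng.normal(size=(50, 3))                     # e.g. Lorenz-63 state dim 3
H, R = np.eye(1, 3), 0.1 * np.eye(1)               # only x1 observed, noisy
print(enkf_analysis(ens, np.array([1.0]), H, R, rng).mean(axis=0))
```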


BiasedWalk: Biased Sampling for Representation Learning on Graphs

arXiv.org Machine Learning

Network embedding algorithms learn latent feature representations of nodes, transforming networks into lower-dimensional vector representations. Typical key applications, which have been effectively addressed using network embeddings, include link prediction, multi-label classification, and community detection. In this paper, we propose BiasedWalk, a scalable, unsupervised feature learning algorithm based on biased random walks that sample context information about each node in the network. Our random-walk-based sampling can behave like Breadth-First Search (BFS) and Depth-First Search (DFS) sampling, with the goal of capturing homophily and role equivalence between the nodes in the network. We have performed a detailed experimental evaluation comparing the performance of the proposed algorithm against various baseline methods, on several datasets and learning tasks. The experimental results show that the proposed method outperforms the baselines in most of the tasks and datasets.
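
A hedged sketch of a distance-biased random walk follows; the exact bias used by BiasedWalk differs, and the parameter name beta is an assumption, but it illustrates how one knob can push a walk towards BFS-like or DFS-like behaviour.

```python
# Illustrative distance-biased walk: neighbours are weighted by their BFS
# distance to the start node, so beta < 1 keeps the walk near the start
# (BFS-like) and beta > 1 pushes it outward (DFS-like).
import random
from collections import deque

def bfs_distances(adj, start):
    dist, queue = {start: 0}, deque([start])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist

def biased_walk(adj, start, length, beta=2.0, seed=0):
    rng, dist, walk = random.Random(seed), bfs_distances(adj, start), [start]
    for _ in range(length):
        nbrs = adj[walk[-1]]
        weights = [beta ** dist[v] for v in nbrs]   # bias by distance to start
        walk.append(rng.choices(nbrs, weights=weights, k=1)[0])
    return walk

graph = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2, 4], 4: [3]}
print(biased_walk(graph, 0, length=8, beta=0.5))   # biased towards the start
print(biased_walk(graph, 0, length=8, beta=3.0))   # biased away from the start
```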


Multi-task Learning for Maritime Traffic Surveillance from AIS Data Streams

arXiv.org Machine Learning

In a world of global trading, maritime safety, security, and efficiency are crucial issues. We propose a multi-task deep learning framework for vessel monitoring using Automatic Identification System (AIS) data streams. We combine recurrent neural networks with latent variable modeling and an embedding of AIS messages into a new representation space to jointly address the key issues raised by AIS data streams: the massive amount of streaming data, noisy data, and irregular time sampling. We demonstrate the relevance of the proposed deep learning framework on real AIS datasets for a three-task setting, namely trajectory reconstruction, anomaly detection, and vessel type identification.
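
A schematic multi-task sketch in PyTorch follows; the layer sizes, the GRU encoder, and the three heads are assumptions meant only to illustrate the shared-encoder, task-specific-head structure, not the paper's architecture.

```python
# Schematic multi-task model (illustrative): discrete AIS attributes are
# embedded, a shared recurrent layer encodes the stream, and separate heads
# handle reconstruction, anomaly scoring, and vessel-type classification.
import torch
import torch.nn as nn

class MultiTaskAIS(nn.Module):
    def __init__(self, n_bins=400, emb=64, hidden=128, n_vessel_types=5):
        super().__init__()
        self.embed = nn.Embedding(n_bins, emb)               # discrete AIS tokens
        self.rnn = nn.GRU(emb, hidden, batch_first=True)     # shared encoder
        self.recon_head = nn.Linear(hidden, n_bins)          # trajectory reconstruction
        self.anomaly_head = nn.Linear(hidden, 1)             # anomaly score
        self.type_head = nn.Linear(hidden, n_vessel_types)   # vessel type

    def forward(self, tokens):
        h, _ = self.rnn(self.embed(tokens))
        return (self.recon_head(h),
                self.anomaly_head(h[:, -1]),
                self.type_head(h[:, -1]))

model = MultiTaskAIS()
tokens = torch.randint(0, 400, (2, 10))          # batch of 2 short AIS streams
recon, anomaly, vtype = model(tokens)
print(recon.shape, anomaly.shape, vtype.shape)
```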