AITopics | Choi, Hayoung

Collaborating Authors

Choi, Hayoung

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Robust Weight Initialization for Tanh Neural Networks with Fixed Point Analysis

Lee, Hyunwoo, Choi, Hayoung, Kim, Hyunju

arXiv.org Artificial IntelligenceOct-3-2024

As a neural network's depth increases, it can achieve strong generalization performance. Training, however, becomes challenging due to gradient issues. Theoretical research and various methods have been introduced to address this issues. However, research on weight initialization methods that can be effectively applied to tanh neural networks of varying sizes still needs to be completed. This paper presents a novel weight initialization method for Feedforward Neural Networks with tanh activation function. Based on an analysis of the fixed points of the function $\tanh(ax)$, our proposed method aims to determine values of $a$ that prevent the saturation of activations. A series of experiments on various classification datasets demonstrate that the proposed method is more robust to network size variations than the existing method. Furthermore, when applied to Physics-Informed Neural Networks, the method exhibits faster convergence and robustness to variations of the network size compared to Xavier initialization in problems of Partial Differential Equations.

artificial intelligence, machine learning, neural network, (16 more...)

arXiv.org Artificial Intelligence

2410.02242

Genre: Research Report (0.40)

Industry: Energy > Oil & Gas > Upstream (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Improved weight initialization for deep and narrow feedforward neural network

Lee, Hyunwoo, Kim, Yunho, Yang, Seungyeop, Choi, Hayoung

arXiv.org Artificial IntelligenceNov-7-2023

Appropriate weight initialization settings, along with the ReLU activation function, have been a cornerstone of modern deep learning, making it possible to train and deploy highly effective and efficient neural network models across diverse artificial intelligence. The problem of dying ReLU, where ReLU neurons become inactive and yield zero output, presents a significant challenge in the training of deep neural networks with ReLU activation function. Theoretical research and various methods have been introduced to address the problem. However, even with these methods and research, training remains challenging for extremely deep and narrow feedforward networks with ReLU activation function. In this paper, we propose a new weight initialization method to address this issue. We prove the properties of the proposed initial weight matrix and demonstrate how these properties facilitate the effective propagation of signal vectors. Through a series of experiments and comparisons with existing methods, we demonstrate the effectiveness of the new initialization method.

artificial intelligence, machine learning, neural network, (18 more...)

arXiv.org Artificial Intelligence

2311.03733

Country: Asia > South Korea (0.14)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Add feedback

An Algebraic-Geometric Approach to Shuffled Linear Regression

Tsakiris, Manolis C., Peng, Liangzu, Conca, Aldo, Kneip, Laurent, Shi, Yuanming, Choi, Hayoung

arXiv.org Machine LearningOct-12-2018

Shuffled linear regression is the problem of performing a linear regression fit to a dataset for which the correspondences between the independent samples and the observations are unknown. Such a problem arises in diverse domains such as computer vision, communications and biology. In its simplest form, it is tantamount to solving a linear system of equations, for which the entries of the right hand side vector have been permuted. This type of data corruption renders the linear regression task considerably harder, even in the absence of other corruptions, such as noise, outliers or missing entries. Existing methods are either applicable only to noiseless data or they are very sensitive to initialization and work only for partially shuffled data. In this paper we address both of these issues via an algebraic geometric approach, which uses symmetric polynomials to extract permutation-invariant constraints that the parameters $\boldsymbol{x} \in \mathbb{R}^n$ of the linear regression model must satisfy. This naturally leads to a polynomial system of $n$ equations in $n$ unknowns, which contains $\boldsymbol{x}$ in its root locus. Using the machinery of algebraic geometry we prove that as long as the independent samples are generic, this polynomial system is always consistent with at most $n!$ complex roots, regardless of any type of corruption inflicted on the observations. The algorithmic implication of this fact is that one can always solve this polynomial system and use its most suitable root as initialization to the Expectation Maximization algorithm. To the best of our knowledge, the resulting method is the first working solution for small values of $n$ able to handle thousands of fully shuffled noisy observations in milliseconds.

artificial intelligence, machine learning, polynomial, (17 more...)

arXiv.org Machine Learning

1810.0544

Country: Asia > China (0.14)

Genre: Research Report (0.50)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Add feedback