AITopics | Cichocki, Andrzej

Cichocki, Andrzej

Mirror Descent and Novel Exponentiated Gradient Algorithms Using Trace-Form Entropies and Deformed Logarithms

Cichocki, Andrzej, Tanaka, Toshihisa, Cruces, Sergio

arXiv.org Artificial Intelligence Mar-20-2025

In this paper we propose and investigate a wide class of Mirror Descent (MD) updates and associated novel Generalized Exponentiated Gradient (GEG) algorithms by exploiting various trace-form entropies and associated deformed logarithms and their inverses, the deformed (generalized) exponential functions. The proposed algorithms can be considered an extension of entropic MD and a generalization of multiplicative updates. More than fifty mathematically well-defined generalized entropies now exist in the literature, so it is impossible to exploit all of them in one research paper. We therefore focus on a few of the most popular entropies and associated logarithms, such as the Tsallis, Kaniadakis and Sharma-Taneja-Mittal entropies, and some of their extensions, such as the Tempesta and Kaniadakis-Scarfone entropies. The shape and properties of the deformed logarithms and their inverses are tuned by one or more hyperparameters. By learning these hyperparameters, we can adapt to the distribution of the training data and tailor the updates to the specific geometry of the optimization problem, potentially leading to faster convergence and better performance. Using generalized entropies and associated deformed logarithms in the Bregman divergence, which serves as a regularization term, provides new insight into exponentiated gradient descent updates.
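
To make the idea in this abstract concrete, here is a minimal sketch of one mirror-descent step that uses the Tsallis deformed logarithm as link function. It is an illustration under stated assumptions, not the authors' algorithm: the function names `log_q`, `exp_q` and `geg_step` are hypothetical, the sketch assumes `0 <= q <= 1`, and the final renormalization onto the unit simplex is one simple choice among several. For `q = 1` the step reduces to the standard exponentiated gradient update.

```python
import math

def log_q(x, q):
    """Tsallis deformed logarithm; equals math.log(x) when q = 1."""
    if abs(q - 1.0) < 1e-12:
        return math.log(x)
    return (x ** (1.0 - q) - 1.0) / (1.0 - q)

def exp_q(y, q):
    """Tsallis deformed exponential, the inverse of log_q (sketch assumes q <= 1)."""
    if abs(q - 1.0) < 1e-12:
        return math.exp(y)
    # Clamp at zero so the power below is always well defined for q < 1.
    base = max(1.0 + (1.0 - q) * y, 0.0)
    return base ** (1.0 / (1.0 - q))

def geg_step(w, grad, eta, q):
    """One generalized EG step: move in the log_q domain, map back, renormalize.

    w    : positive weights summing to one
    grad : gradient of the loss at w
    eta  : learning rate
    q    : deformation hyperparameter (assumed in [0, 1])
    """
    raw = [exp_q(log_q(wi, q) - eta * gi, q) for wi, gi in zip(w, grad)]
    total = sum(raw)  # assumed to be nonzero for the chosen eta
    return [r / total for r in raw]

if __name__ == "__main__":
    print(geg_step([0.2, 0.3, 0.5], [0.1, -0.2, 0.3], eta=0.5, q=0.5))
```

Varying `q` changes how aggressively small weights are shrunk, which is the kind of hyperparameter-controlled behavior the abstract describes.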

artificial intelligence, entropy, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2503.08748

Country:

Asia > Japan (0.14)
Europe > Spain (0.14)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.36)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.34)

Generalized Exponentiated Gradient Algorithms Using the Euler Two-Parameter Logarithm

Cichocki, Andrzej

arXiv.org Artificial Intelligence Feb-21-2025

In this paper we propose and investigate a new class of Generalized Exponentiated Gradient (GEG) algorithms using Mirror Descent (MD) approaches, applying as a regularization function the Bregman divergence with a two-parameter deformation of the logarithm as a link function. This link function (referred to as the Euler logarithm) is associated with a wide class of generalized entropies. In order to derive novel GEG/MD updates, we estimate a generalized exponential function that closely approximates the inverse of the Euler two-parameter logarithm. The shape and properties of the Euler logarithm and its inverse, the deformed exponential function, are tuned by two or more hyperparameters. By learning these hyperparameters, we can adapt to the distribution of the training data and adjust them to achieve desired properties of gradient descent algorithms. The concept of generalized entropies and associated deformed logarithms provides deeper insight into novel gradient descent updates. More than fifty mathematically well-defined entropic functionals and associated deformed logarithms now exist in the literature, so it is impossible to investigate all of them in one research paper. Therefore, we focus here on a wide class of trace-form entropies and the associated generalized logarithm. We applied the developed algorithms to Online Portfolio Selection (OLPS) in order to improve its performance and robustness.
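
The following sketch illustrates why the inverse of a two-parameter logarithm has to be estimated. It assumes the commonly stated form `(x**a - x**b) / (a - b)` for the two-parameter logarithm; the abstract itself does not give the formula, so treat that form as an assumption. The inverse here is found by plain bisection, which is a stand-in technique and not the analytic approximation the abstract refers to. The names `euler_log` and `euler_exp` are hypothetical.

```python
import math

def euler_log(x, a, b):
    """Two-parameter deformed logarithm (assumed form), defined for x > 0 and a != b."""
    return (x ** a - x ** b) / (a - b)

def euler_exp(y, a, b, lo=1e-12, hi=1e12, iters=200):
    """Numerical inverse of euler_log by bisection on a logarithmic scale.

    Assumes a > 0 > b, so that euler_log is strictly increasing in x,
    and that the solution lies inside [lo, hi].
    """
    for _ in range(iters):
        mid = math.sqrt(lo * hi)  # geometric midpoint
        if euler_log(mid, a, b) < y:
            lo = mid
        else:
            hi = mid
    return math.sqrt(lo * hi)

if __name__ == "__main__":
    a, b = 0.5, -0.5
    x = euler_exp(1.0, a, b)
    print(x, euler_log(x, a, b))
```

With `a = -b` the assumed form coincides with the Kaniadakis logarithm, for which a closed-form inverse is known; that special case is a convenient check for any approximate inverse.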

artificial intelligence, logarithm, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2502.175

Country:

Europe (0.46)
Asia > Japan (0.14)

Genre: Research Report (0.40)

Industry: Banking & Finance > Trading (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Generalized Exponentiated Gradient Algorithms and Their Application to On-Line Portfolio Selection

Cichocki, Andrzej, Cruces, Sergio, Sarmiento, Auxiliadora, Tanaka, Toshihisa

arXiv.org Artificial Intelligence Jun-2-2024

This paper introduces a novel family of generalized exponentiated gradient (EG) updates derived from an Alpha-Beta divergence regularization function. Collectively referred to as EGAB, the proposed updates belong to the category of multiplicative gradient algorithms for positive data and demonstrate considerable flexibility by controlling iteration behavior and performance through three hyperparameters: $\alpha$, $\beta$, and the learning rate $\eta$. To enforce a unit $l_1$ norm constraint for nonnegative weight vectors within generalized EGAB algorithms, we develop two slightly distinct approaches. One method exploits scale-invariant loss functions, while the other relies on gradient projections onto the feasible domain. As an illustration of their applicability, we evaluate the proposed updates in addressing the online portfolio selection problem (OLPS) using gradient-based methods. Here, they not only offer a unified perspective on the search directions of various OLPS algorithms (including the standard exponentiated gradient and diverse mean-reversion strategies), but also facilitate smooth interpolation and extension of these updates due to the flexibility in hyperparameter selection. Simulation results confirm that the adaptability of these generalized gradient updates can effectively enhance the performance for some portfolios, particularly in scenarios involving transaction costs.
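
For orientation, the standard exponentiated gradient update for portfolio selection, which this abstract names as one of the special cases unified by EGAB, can be sketched as follows. This is the classical baseline only, not the EGAB update; the function name `eg_portfolio_step` and the example numbers are hypothetical.

```python
import math

def eg_portfolio_step(w, x, eta):
    """One standard EG update for online portfolio selection.

    w   : current portfolio weights (nonnegative, summing to one)
    x   : price relatives for the period (closing price / opening price)
    eta : learning rate
    """
    # Wealth growth factor achieved by the current portfolio in this period.
    ret = sum(wi * xi for wi, xi in zip(w, x))
    # Multiplicative update: assets that outperformed the portfolio gain weight.
    raw = [wi * math.exp(eta * xi / ret) for wi, xi in zip(w, x)]
    # Renormalize to enforce the unit l1-norm constraint.
    total = sum(raw)
    return [r / total for r in raw]

if __name__ == "__main__":
    print(eg_portfolio_step([0.5, 0.5], [1.2, 0.9], eta=0.05))
```

The division by `total` is the simplest way to enforce the unit $l_1$ norm constraint; the abstract describes two more refined approaches for the generalized updates.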

algorithm, artificial intelligence, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2406.00655

Country:

Europe (1.00)
North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (0.65)

Industry:

Banking & Finance > Trading (1.00)
Health & Medicine (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.88)

AnoOnly: Semi-Supervised Anomaly Detection with the Only Loss on Anomalies

Zhou, Yixuan, Yang, Peiyu, Qu, Yi, Xu, Xing, Sun, Zhe, Cichocki, Andrzej

arXiv.org Artificial Intelligence Sep-6-2023

Semi-supervised anomaly detection (SSAD) methods have demonstrated their effectiveness in enhancing unsupervised anomaly detection (UAD) by leveraging few-shot but instructive abnormal instances. However, the dominance of homogeneous normal data over anomalies biases the SSAD models against effectively perceiving anomalies. To address this issue and achieve balanced supervision between heavily imbalanced normal and abnormal data, we develop a novel framework called AnoOnly (Anomaly Only). Unlike existing SSAD methods that resort to strict loss supervision, AnoOnly suspends it and introduces a form of weak supervision for normal data. This weak supervision is instantiated through the utilization of batch normalization, which implicitly performs cluster learning on normal data. When integrated into existing SSAD methods, the proposed AnoOnly demonstrates remarkable performance enhancements across various models and datasets, achieving new state-of-the-art performance. Additionally, our AnoOnly is natively robust to label noise when suffering from data contamination. Our code is publicly available at https://github.com/cool-xuan/AnoOnly.

data mining, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2305.18798

Country: Asia (0.14)

Genre: Research Report (0.64)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Data Science > Data Mining > Anomaly Detection (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Quantization Aware Factorization for Deep Neural Network Compression

Cherniuk, Daria, Abukhovich, Stanislav, Phan, Anh-Huy, Oseledets, Ivan, Cichocki, Andrzej, Gusak, Julia

arXiv.org Artificial Intelligence Aug-8-2023

Tensor decomposition of convolutional and fully-connected layers is an effective way to reduce parameters and FLOPs in neural networks. Due to memory and power consumption limitations of mobile or embedded devices, the quantization step is usually necessary when pre-trained models are deployed. A conventional post-training quantization approach applied to networks with decomposed weights yields a drop in accuracy. This motivated us to develop an algorithm that finds a tensor approximation directly with quantized factors and thus benefits from both compression techniques while keeping the prediction quality of the model. Namely, we propose to use the Alternating Direction Method of Multipliers (ADMM) for Canonical Polyadic (CP) decomposition with factors whose elements lie on a specified quantization grid. We compress neural network weights with the devised algorithm and evaluate its prediction quality and performance. We compare our approach to state-of-the-art post-training quantization methods and demonstrate competitive results and high flexibility in achieving a desirable quality-performance tradeoff.
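
In ADMM with a set constraint, the auxiliary variable is typically updated by projecting onto the constraint set. A minimal sketch of such a projection onto a quantization grid is shown below. The grid shape (symmetric, uniform, signed integers scaled by `scale`) and the name `project_to_grid` are assumptions for illustration; the abstract only says that factor elements lie on a specified grid.

```python
def project_to_grid(values, scale, bits):
    """Project each value onto the grid {scale * k : k_min <= k <= k_max}.

    values : iterable of floats (for example, entries of a CP factor matrix)
    scale  : spacing between neighboring grid points
    bits   : bit width of the signed integer code
    """
    k_min = -(2 ** (bits - 1))
    k_max = 2 ** (bits - 1) - 1
    projected = []
    for v in values:
        k = round(v / scale)            # nearest integer code (ties go to even)
        k = min(max(k, k_min), k_max)   # clip to the representable range
        projected.append(scale * k)
    return projected

if __name__ == "__main__":
    print(project_to_grid([0.26, -0.9, 10.0], scale=0.25, bits=4))
```

Values outside the representable range are clipped to the nearest end of the grid, which is what makes the result a projection rather than plain rounding.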

artificial intelligence, machine learning, quantization, (17 more...)

arXiv.org Artificial Intelligence

2308.04595

Country: Europe > France (0.14)

Genre:

Research Report (0.50)
Overview (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.83)

Tensor Networks Meet Neural Networks: A Survey and Future Perspectives

Wang, Maolin, Pan, Yu, Xu, Zenglin, Yang, Xiangli, Li, Guangxi, Cichocki, Andrzej

arXiv.org Artificial Intelligence May-8-2023

Tensor networks (TNs) and neural networks (NNs) are two fundamental data modeling approaches. TNs were introduced to solve the curse of dimensionality in large-scale tensors by converting an exponential number of dimensions to polynomial complexity. As a result, they have attracted significant attention in the fields of quantum physics and machine learning. Meanwhile, NNs have displayed exceptional performance in various applications, e.g., computer vision, natural language processing, and robotics research. Interestingly, although these two types of networks originate from different observations, they are inherently linked through the common multilinearity structure underlying both TNs and NNs, thereby motivating a significant number of intellectual developments regarding combinations of TNs and NNs. In this paper, we refer to these combinations as tensorial neural networks (TNNs), and present an introduction to TNNs in three primary aspects: network compression, information fusion, and quantum circuit simulation. Furthermore, this survey also explores methods for improving TNNs, examines flexible toolboxes for implementing TNNs, and documents TNN development while highlighting potential future directions. To the best of our knowledge, this is the first comprehensive survey that bridges the connections among NNs, TNs, and quantum circuits. We provide a curated list of TNNs at https://github.com/tnbar/awesome-tensorial-neural-networks.
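
The reduction from exponential to polynomial complexity mentioned in this abstract can be illustrated by counting parameters. The sketch below compares a full tensor with a tensor-train representation, which is one common tensor network format chosen here as an example; the abstract does not single it out. The function names and the uniform internal rank are assumptions.

```python
def full_tensor_params(mode_sizes):
    """Number of entries in a dense tensor with the given mode sizes."""
    count = 1
    for n in mode_sizes:
        count *= n
    return count

def tensor_train_params(mode_sizes, rank):
    """Number of entries in a tensor-train with uniform internal rank.

    Core k has shape (r_left, n_k, r_right); the two boundary ranks are 1.
    """
    d = len(mode_sizes)
    total = 0
    for k, n in enumerate(mode_sizes):
        r_left = 1 if k == 0 else rank
        r_right = 1 if k == d - 1 else rank
        total += r_left * n * r_right
    return total

if __name__ == "__main__":
    sizes = [10] * 6
    print(full_tensor_params(sizes), tensor_train_params(sizes, rank=4))
```

The dense count grows exponentially with the number of modes, while the tensor-train count grows linearly in the number of modes and quadratically in the rank.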

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2302.09019

Country: Asia > China (0.68)

Genre:

Overview (1.00)
Research Report (0.63)

Industry:

Media > Television (1.00)
Leisure & Entertainment (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

On Multiple Intelligences and Learning Styles for Artificial Intelligence Systems: Future Research Trends in AI with a Human Face?

Cichocki, Andrzej

arXiv.org Artificial Intelligence Aug-30-2020

This article discusses recent trends and concepts in developing new kinds of artificial intelligence (AI) systems which relate to complex facets and different types of human intelligence, especially social, emotional, attentional and ethical intelligence, which to date have been under-discussed. We describe various aspects of multiple human intelligence and learning styles, which may impact on a variety of AI problem domains. Using the concept of multiple intelligence rather than a single type of intelligence, we categorize and provide working definitions of various AI depending on their cognitive skills or capacities. Future AI systems will be able not only to communicate with human actors and each other, but also to efficiently exchange knowledge with abilities of cooperation, collaboration and even co-creating something new and valuable and have meta-learning capacities. Multi-agent systems such as these can be used to solve problems that would be difficult to solve by any individual intelligent agent.

deep learning, intelligence, neural network, (19 more...)

arXiv.org Artificial Intelligence

2008.04793

Country:

North America > United States (0.46)
Europe (0.46)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Therapeutic Area > Psychiatry/Psychology (0.95)
Leisure & Entertainment (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.46)

Reduced-Order Modeling of Deep Neural Networks

Daulbaev, Talgat, Gusak, Julia, Ponomarev, Evgeny, Cichocki, Andrzej, Oseledets, Ivan

arXiv.org Machine Learning Oct-15-2019

We introduce a new method for speeding up the inference of deep neural networks. It is somewhat inspired by the reduced-order modeling techniques for dynamical systems. The cornerstone of the proposed method is the maximum volume algorithm. We demonstrate efficiency on VGG and ResNet architectures pre-trained on different datasets. We show that in many practical cases it is possible to replace convolutional layers with much smaller fully-connected layers with a relatively small drop in accuracy.

artificial intelligence, deep learning, machine learning, (17 more...)

arXiv.org Machine Learning

1910.06995

Country: Europe > Russia (0.28)

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

Multi-Kernel Capsule Network for Schizophrenia Identification

Wang, Tian, Bezerianos, Anastasios, Cichocki, Andrzej, Li, Junhua

arXiv.org Machine Learning Jul-30-2019

Objective: Schizophrenia seriously affects the quality of life. To date, both simple (linear discriminant analysis) and complex (deep neural network) machine learning methods have been utilized to identify schizophrenia based on functional connectivity features. The existing simple methods need two separate steps (i.e., feature extraction and classification) to achieve the identification, which prevents simultaneous tuning for the best feature extraction and classifier training. The complex methods integrate the two steps and can be simultaneously tuned to achieve optimal performance, but these methods require a much larger amount of data for model training. Methods: To overcome the aforementioned drawbacks, we proposed a multi-kernel capsule network (MKCapsnet), which was developed by considering the brain anatomical structure. Kernels were set to match the partition sizes of the brain anatomical structure in order to capture interregional connectivities at varying scales. Inspired by the widely used dropout strategy in deep learning, we developed vector dropout in the capsule layer to prevent overfitting of the model. Results: The comparison results showed that the proposed method outperformed the state-of-the-art methods. Besides, we compared performances using different parameters and illustrated the routing process to reveal characteristics of the proposed method. Conclusion: MKCapsnet is promising for schizophrenia identification. Significance: Our study not only proposed a multi-kernel capsule network but also provided useful information on the parameter setting, which is informative for further studies using a capsule network for neurophysiological signal classification.
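
One plausible reading of "vector dropout" is that whole capsule output vectors are dropped together, rather than individual scalar entries as in ordinary dropout. The sketch below illustrates that reading only; it is an assumption, not the paper's implementation, and it deliberately omits any rescaling of the surviving vectors. The name `vector_dropout` is hypothetical.

```python
import random

def vector_dropout(capsules, p, rng):
    """Zero out entire capsule vectors, each independently with probability p.

    capsules : list of capsule output vectors (lists of floats)
    p        : drop probability in [0, 1]
    rng      : a random.Random instance, passed in for reproducibility
    """
    dropped = []
    for vec in capsules:
        if rng.random() < p:
            dropped.append([0.0] * len(vec))  # drop the whole vector
        else:
            dropped.append(list(vec))         # keep the vector unchanged
    return dropped

if __name__ == "__main__":
    caps = [[0.1, 0.2], [0.3, 0.4], [0.5, 0.6]]
    print(vector_dropout(caps, p=0.5, rng=random.Random(0)))
```

Dropping at the vector level keeps each surviving capsule's direction and length intact, which matters when vector length carries meaning.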

capsule network, deep learning, neural network, (21 more...)

arXiv.org Machine Learning

1907.12827

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)

One time is not enough: iterative tensor decomposition for neural network compression

Gusak, Julia, Kholyavchenko, Maksym, Ponomarev, Evgeny, Markeeva, Larisa, Oseledets, Ivan, Cichocki, Andrzej

arXiv.org Machine Learning Mar-24-2019

Low-rank tensor approximation is very promising for the compression of deep neural networks. We propose a new simple and efficient iterative approach, which alternates low-rank factorization with smart rank selection and fine-tuning. We demonstrate the efficiency of our method compared to non-iterative ones. Our approach improves the compression rate while maintaining the accuracy for a variety of tasks.

compression, deep learning, neural network, (20 more...)

arXiv.org Machine Learning

1903.09973

Genre: Research Report (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
