AITopics | cht

cht

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Brain-inspired sparse training enables Transformers and LLMs to perform as fully connected

Zhang, Yingtao, Zhao, Jialin, Wu, Wenjing, Liao, Ziheng, Michieli, Umberto, Cannistraci, Carlo Vittorio

arXiv.org Artificial Intelligence, Jan-31-2025

This study aims to extend current knowledge on applying brain-inspired network science principles to the training of artificial neural networks (ANNs) with sparse connectivity. Dynamic sparse training (DST) can reduce the computational demands of ANNs, but it struggles to maintain peak performance at high sparsity levels. Cannistraci-Hebb training (CHT) is a brain-inspired method for growing connectivity in DST. CHT leverages gradient-free, topology-driven link regrowth, which has shown an ultra-sparse (1% connectivity or lower) advantage across various tasks compared to fully connected networks. Yet CHT suffers from two main drawbacks: (i) its time complexity is O(Nd^3), where N is the network size in nodes and d is the node degree, so it can be applied only to ultra-sparse networks; (ii) it selects the top link prediction scores, which is inappropriate for the early training epochs, when the network's connections are unreliable. We propose a GPU-friendly approximation of the CH link predictor, which reduces the computational complexity to O(N^3), enabling a fast implementation of CHT in large-scale models. We introduce the Cannistraci-Hebb training soft rule (CHTs), which adopts a strategy for sampling connections in both link removal and regrowth, balancing the exploration and exploitation of network topology. To improve performance, we integrate CHTs with a sigmoid gradual density decay (CHTss). Empirical results show that, using 1% of connections, CHTs outperforms fully connected networks for MLPs on visual classification tasks, compressing some networks to fewer than 30% of their nodes. Using 5% of the connections, CHTss outperforms fully connected networks in two Transformer-based machine translation tasks. Using 30% of the connections, CHTss achieves superior performance compared to other dynamic sparse training methods in language modeling, and it surpasses the fully connected counterpart in zero-shot evaluations.
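To make the "sigmoid gradual density decay" idea concrete, here is a minimal sketch of a density schedule that follows a logistic curve from a starting density down to a target density. The abstract does not give the formula, so the function name `sigmoid_density`, the `steepness` parameter, the midpoint placement, and the endpoint rescaling are all assumptions for illustration, not the authors' implementation.

```python
import math


def sigmoid_density(step, total_steps, start_density, end_density, steepness=10.0):
    """Connection density at `step`, decaying along a logistic curve.

    Hypothetical schedule: the curve is centred on the training midpoint and
    rescaled so that it returns exactly `start_density` at step 0 and
    `end_density` at `total_steps`.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    # Training progress clamped to [0, 1].
    progress = min(max(step / total_steps, 0.0), 1.0)
    # Raw logistic value at the current progress and at both endpoints.
    raw = 1.0 / (1.0 + math.exp(-steepness * (progress - 0.5)))
    raw_start = 1.0 / (1.0 + math.exp(steepness * 0.5))
    raw_end = 1.0 / (1.0 + math.exp(-steepness * 0.5))
    # Fraction of the decay completed so far, in [0, 1].
    fraction = (raw - raw_start) / (raw_end - raw_start)
    return start_density + (end_density - start_density) * fraction
```

Under these assumptions the density changes slowly at the beginning and end of training and fastest around the midpoint; a larger `steepness` concentrates the pruning into a narrower window.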

large language model, machine learning, natural language, (15 more...)

2501.19107

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > California (0.04)
North America > Puerto Rico > San Juan > San Juan (0.04)
(3 more...)

Genre: Research Report > New Finding (0.48)

Industry: Health & Medicine > Therapeutic Area (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Continual Few-Shot Learning Using HyperTransformers

Vladymyrov, Max, Zhmoginov, Andrey, Sandler, Mark

arXiv.org Artificial Intelligence, Jan-12-2023

We focus on the problem of learning without forgetting from multiple tasks arriving sequentially, where each task is defined using a few-shot episode of novel or already seen classes. We approach this problem using the recently published HyperTransformer (HT), a Transformer-based hypernetwork that generates specialized task-specific CNN weights directly from the support set. In order to learn from a continual sequence of tasks, we propose to recursively re-use the generated weights as input to the HT for the next task. This way, the generated CNN weights themselves act as a representation of previously learned tasks, and the HT is trained to update these weights so that the new task can be learned without forgetting past tasks. This approach is different from most continual learning algorithms that typically rely on using replay buffers, weight regularization or task-dependent architectural changes. We demonstrate that our proposed Continual HyperTransformer method equipped with a prototypical loss is capable of learning and retaining knowledge about past tasks for a variety of scenarios, including learning from mini-batches, and task-incremental and class-incremental learning scenarios.
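The recursive re-use of generated weights can be pictured as a simple fold over the task sequence: the weights produced for task t-1 are fed back in, together with the support set of task t. The sketch below shows only that control flow. `toy_hypernetwork` is a hypothetical stand-in (per-class running means) and is not the HyperTransformer; the type aliases and function names are likewise assumptions.

```python
from typing import Callable, Dict, List, Optional, Sequence, Tuple

Weights = Dict[str, float]              # toy stand-in for generated CNN weights
Episode = Sequence[Tuple[str, float]]   # (class label, 1-D feature) support pairs


def toy_hypernetwork(support: Episode, previous: Weights) -> Weights:
    """Hypothetical stand-in for the HT: maps (support set, old weights) to new weights.

    Entries for classes absent from this episode are carried over unchanged;
    classes present in the episode are set to (or blended with) the mean of
    their support features.
    """
    updated = dict(previous)
    by_class: Dict[str, List[float]] = {}
    for label, feature in support:
        by_class.setdefault(label, []).append(feature)
    for label, features in by_class.items():
        episode_mean = sum(features) / len(features)
        if label in previous:
            updated[label] = 0.5 * (previous[label] + episode_mean)
        else:
            updated[label] = episode_mean
    return updated


def run_continual(
    episodes: Sequence[Episode],
    hypernetwork: Callable[[Episode, Weights], Weights],
    initial: Optional[Weights] = None,
) -> List[Weights]:
    """Feed each task's generated weights back in as input for the next task."""
    weights: Weights = dict(initial or {})
    history: List[Weights] = []
    for support in episodes:
        weights = hypernetwork(support, weights)
        history.append(weights)
    return history
```

In this toy version, knowledge about earlier classes survives only because the previous weights are an explicit input to the update, which mirrors the role the abstract assigns to the generated CNN weights.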

artificial intelligence, deep learning, machine learning, (18 more...)

2301.04584

Country:

North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
North America > United States > Utah > Salt Lake County > Salt Lake City (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(2 more...)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Energy reconstruction for large liquid scintillator detectors with machine learning techniques: aggregated features approach

Gavrikov, Arsenii, Malyshkin, Yury, Ratnikov, Fedor

arXiv.org Artificial Intelligence, Nov-14-2022

Large-scale detectors consisting of a liquid scintillator target surrounded by an array of photo-multiplier tubes (PMTs) are widely used in modern neutrino experiments: Borexino, KamLAND, Daya Bay, Double Chooz, RENO, and the upcoming JUNO with its satellite detector TAO. Such apparatuses can measure neutrino energy, which is derived from the amount of light and its spatial and temporal distribution over PMT channels. However, achieving a fine energy resolution in large-scale detectors is challenging. In this work, we present machine learning methods for energy reconstruction in the JUNO detector, the most advanced of its type. We focus on positron events in the energy range of 0-10 MeV, which corresponds to the main signal in JUNO -- neutrinos originating from nuclear reactor cores and detected via the inverse beta decay channel. We consider the following models: Boosted Decision Trees and a Fully Connected Deep Neural Network, both trained on aggregated features calculated from the information collected by PMTs. We describe the details of our feature engineering procedure and show that machine learning models can provide an energy resolution of $\sigma = 3\%$ at 1 MeV using subsets of engineered features. The dataset for model training and testing is generated by the Monte Carlo method with the official JUNO software.
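As an illustration of what "aggregated features" over PMT channels might look like, the sketch below reduces per-PMT charges, positions, and first-hit times to a handful of event-level quantities that a regressor could consume. The specific features and the name `aggregate_features` are assumptions chosen for illustration; the abstract does not list the features the authors actually engineered.

```python
from statistics import mean
from typing import Dict, Sequence, Tuple


def aggregate_features(
    charges: Sequence[float],
    positions: Sequence[Tuple[float, float, float]],
    first_hit_times: Sequence[float],
) -> Dict[str, object]:
    """Collapse per-PMT readings into event-level features (hypothetical set)."""
    if not (len(charges) == len(positions) == len(first_hit_times)):
        raise ValueError("all inputs must have one entry per PMT")
    total_charge = sum(charges)
    if total_charge <= 0:
        raise ValueError("event has no collected charge")
    # Charge-weighted mean PMT position: a rough proxy for where the light came from.
    center_of_charge = tuple(
        sum(q * p[axis] for q, p in zip(charges, positions)) / total_charge
        for axis in range(3)
    )
    # Timing summary over PMTs that actually registered charge.
    fired_times = [t for q, t in zip(charges, first_hit_times) if q > 0]
    return {
        "total_charge": total_charge,
        "center_of_charge": center_of_charge,
        "n_fired": len(fired_times),
        "mean_first_hit_time": mean(fired_times),
    }
```

A model trained on such a fixed-length feature vector does not need to see every PMT channel, which is the practical appeal of the aggregated approach for detectors with many channels.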

artificial intelligence, detector, machine learning, (19 more...)

doi: 10.1140/epjc/s10052-022-11004-6

2206.0904

Country:

Asia > Russia (0.14)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Italy (0.04)
Asia > China (0.04)

Genre: Research Report (1.00)

Industry: Energy > Power Industry > Utilities > Nuclear (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Exploring Constraint Handling Techniques in Real-world Problems on MOEA/D with Limited Budget of Evaluations

Vaz, Felipe, Lavinas, Yuri, Aranha, Claus, Ladeira, Marcelo

arXiv.org Artificial Intelligence, Nov-19-2020

Finding good solutions for Multi-objective Optimization Problems (MOPs) is considered hard, especially for MOPs with constraints. Even so, most work on MOPs does not explore in depth how different constraints affect the performance of MOP solvers. Here, we focus on exploring the effects of different Constraint Handling Techniques (CHTs) on MOEA/D, a commonly used MOP solver, when solving complex real-world MOPs. Moreover, we introduce a simple and effective CHT focusing on the exploration of the decision space, the Three Stage Penalty. We explore each of these CHTs in MOEA/D on two simulated MOPs and six analytic MOPs (eight in total). The results of this work indicate that while the best CHT is problem-dependent, our newly proposed Three Stage Penalty achieves competitive results and remarkable performance in terms of hypervolume values in the hard simulated car design MOP.
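For readers unfamiliar with penalty-based constraint handling, the sketch below adds a static penalty to the weighted Tchebycheff scalarization commonly used in MOEA/D subproblems. This is a generic static penalty, not the Three Stage Penalty, whose details are not given in the abstract. The function names, the `penalty` coefficient, and the convention that constraints are written as g(x) <= 0 are assumptions for illustration.

```python
from typing import Sequence


def total_violation(constraint_values: Sequence[float]) -> float:
    """Sum of violations for inequality constraints written as g(x) <= 0."""
    return sum(max(0.0, g) for g in constraint_values)


def tchebycheff(
    objectives: Sequence[float],
    weights: Sequence[float],
    ideal_point: Sequence[float],
) -> float:
    """Weighted Tchebycheff scalarization of one solution for one subproblem."""
    return max(w * abs(f - z) for f, w, z in zip(objectives, weights, ideal_point))


def penalized_fitness(
    objectives: Sequence[float],
    constraint_values: Sequence[float],
    weights: Sequence[float],
    ideal_point: Sequence[float],
    penalty: float = 100.0,
) -> float:
    """Scalarized fitness plus a static penalty proportional to the violation (lower is better)."""
    return tchebycheff(objectives, weights, ideal_point) + penalty * total_violation(
        constraint_values
    )
```

With a fixed coefficient, the balance between objective quality and feasibility is the same throughout the search; staged or adaptive penalties change that balance over time, which is one of the axes along which CHTs differ.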

cht, constraint handling technique, penalty, (15 more...)

2011.09722

Country:

Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.05)
South America > Brazil > Federal District > Brasília (0.04)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.97)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.90)

CHT to recruit talent for AI, IoT, big data

#artificialintelligence, Dec-27-2018, 17:50:25 GMT

Chunghwa Telecom (CHT) plans to launch a large-scale recruitment drive in 2019, as it expects an unprecedented wave of up to 5,000 employees applying for retirement over the next five years. As many as 1,600 jobs will be available at the Taiwan-based telecom carrier in 2019, according to company chairman David Cheng, who added that the company will hire more than 1,000 new employees a year for a few years after 2019. To cope with industry changes, including the forthcoming 5G era and increasing competition, the company plans to hire more talent with expertise in AI, big data analysis, IoT, mobile payment, 5G and information security, Cheng said. Including its subsidiaries, CHT currently has about 33,500 employees.

artificial intelligence, data mining, recruit talent, (4 more...)

Country: Asia > Taiwan (0.32)

Industry: Information Technology > Security & Privacy (0.69)

Technology:

Information Technology > Artificial Intelligence (0.85)
Information Technology > Security & Privacy (0.69)
Information Technology > Data Science > Data Mining > Big Data (0.68)