AITopics | February 14

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Impact of Batch Normalization on Convolutional Network Representations

Potgieter, Hermanus L., Mouton, Coenraad, Davel, Marelie H.

arXiv.org Artificial Intelligence, Feb-13-2025

Deep learning has become a particularly important set of machine learning techniques and is widely applied to solve real-world tasks. At the same time, many open questions remain about why these deep neural networks (DNNs) generalize so well, that is, why they perform well on unseen data. Although there is not yet a theoretical framework to assist us in reasoning about these models [2], the generalization ability of DNNs has been studied from many perspectives, such as the geometry of the loss landscape [3], statistical measures of stability and robustness [4], size of margins (distance to the decision boundary between classes) [5], and information-theoretic techniques [6], among others. A promising research direction is to study the characteristics of the internal data representations formed by DNNs, where each representation is the vector of activation values from a specific layer for a given sample. Aspects of these representations that have been studied include the size of margins in the representation space [7, 8, 9]; the 'quality' of representations, evaluated using the consistency of class-specific representations and their robustness when combined [9]; and representation sparsity, that is, the number of non-zero elements in a data representation [10]. In this work, we also study the characteristics of the internal representations of DNNs, but focus on the effect that a very specific technique - Batch Normalization (BatchNorm) - has on internal representation quality. BatchNorm [11] is a popular technique used to normalize hidden activations when training DNNs. Networks trained with BatchNorm show desirable properties such as faster convergence and better generalization ability [12, 13]. Despite the success and widespread adoption of BatchNorm, the exact mechanisms by which BatchNorm achieves its performance remain unclear.
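To make the two notions in this abstract concrete, here is a minimal pure-Python sketch of what "normalizing hidden activations" and "representation sparsity" can look like. It is an illustration only, not the authors' code: the function names `batch_norm` and `nonzero_count` are hypothetical, and the sketch covers training-mode statistics only (no running averages, no learned per-feature parameters).

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize each feature across the batch, then scale and shift.

    `batch` is a list of activation vectors (one per sample). A single
    scalar gamma/beta is an assumption made for brevity; real layers
    usually learn one pair per feature.
    """
    n = len(batch)
    dims = len(batch[0])
    out = [[0.0] * dims for _ in range(n)]
    for j in range(dims):
        column = [row[j] for row in batch]
        mean = sum(column) / n
        var = sum((x - mean) ** 2 for x in column) / n  # biased batch variance
        scale = gamma / math.sqrt(var + eps)
        for i in range(n):
            out[i][j] = scale * (column[i] - mean) + beta
    return out

def nonzero_count(representation, tol=0.0):
    """Sparsity as described above: the number of non-zero elements."""
    return sum(1 for x in representation if abs(x) > tol)

activations = [[1.0, 2.0], [3.0, 2.0]]
normalized = batch_norm(activations)
```

After normalization each feature has approximately zero mean across the batch, and a feature that is constant over the batch maps to `beta`.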

artificial intelligence, machine learning, representation, (18 more...)

arXiv.org Artificial Intelligence

doi: 10.1007/978-3-031-78255-8_14

2501.14441

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.95)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.87)

Self-Supervised Graph Contrastive Pretraining for Device-level Integrated Circuits

Lee, Sungyoung, Wang, Ziyi, Kim, Seunggeun, Lee, Taekyun, Pan, David Z.

arXiv.org Artificial Intelligence, Feb-12-2025

Self-supervised graph representation learning has driven significant advancements in domains such as social network analysis, molecular design, and electronics design automation (EDA). However, prior works in EDA have mainly focused on the representation of gate-level digital circuits, failing to capture analog and mixed-signal circuits. To address this gap, we introduce DICE: Device-level Integrated Circuits Encoder, the first self-supervised pretrained graph neural network (GNN) model for any circuit expressed at the device level. DICE is a message-passing neural network (MPNN) trained through graph contrastive learning, and its pretraining process is simulation-free, incorporating two novel data augmentation techniques. Experimental results demonstrate that DICE achieves substantial performance gains across three downstream tasks, underscoring its effectiveness for both analog and digital circuits.

artificial intelligence, machine learning, representation, (15 more...)

arXiv.org Artificial Intelligence

2502.08949

Country:

North America > United States > Texas > Travis County > Austin (0.05)
Asia > Middle East > Iran > Alborz Province > Karaj (0.04)
Europe (0.04)
(2 more...)

Genre: Research Report > New Finding (0.48)

Industry: Semiconductors & Electronics (0.71)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)

Causal-Informed Contrastive Learning: Towards Bias-Resilient Pre-training under Concept Drift

Yang, Xiaoyu, Lu, Jie, Yu, En

arXiv.org Artificial Intelligence, Feb-11-2025

Contrastive learning has proven highly effective for pre-training large-scale models, especially large vision models, as exemplified by frameworks such as SimCLR [1, 2], the MoCo series [3, 4], and the DINO series [5, 6]. However, as large models continue to scale, the data hunger of contrastive learning is drawing growing attention in the community to the problem of pre-training effectively from drifting data. Such drift can be caused by long-tailed data, noise, and domain shift; the term concept drift [7, 8] is used to summarize these unpredictable distribution changes during contrastive pre-training. Hence, a pertinent question emerges: beyond the existing contrastive learning methods, can the contrastive paradigm learn from pre-training data that is subject to drift? In this work, we aim to bridge this gap by providing a systematic analysis of this question. Our findings highlight critical vulnerabilities of the current contrastive pre-training paradigm in adapting to these challenges, underscoring the need for novel strategies to enhance its robustness on drifting data streams. More related works are provided in Appendix A. Current contrastive pre-training methods predominantly adhere to the paradigm of comparing two distinct views of the same object, typically derived from different encoders.
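The "two views of the same object" paradigm mentioned at the end of this abstract is commonly trained with an InfoNCE-style objective. The sketch below is a simplified, one-directional version written in pure Python under stated assumptions: the names `cosine` and `info_nce` and the temperature value are illustrative, and this is not the method proposed in the paper nor the exact loss of SimCLR, MoCo, or DINO.

```python
import math

def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(view_a, view_b, temperature=0.5):
    """Mean InfoNCE-style loss over a batch of paired embeddings.

    view_a[i] and view_b[i] are embeddings of two views of object i
    (the positive pair); view_b[j] for j != i act as negatives.
    """
    n = len(view_a)
    total = 0.0
    for i in range(n):
        logits = [cosine(view_a[i], view_b[j]) / temperature for j in range(n)]
        top = max(logits)  # subtract the max for a stable log-sum-exp
        log_denominator = top + math.log(sum(math.exp(l - top) for l in logits))
        total += log_denominator - logits[i]
    return total / n

# Matching views give a low loss; swapping the pairing gives a high one.
aligned = info_nce([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
shuffled = info_nce([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

The loss only sees which embeddings are paired, which is one way to see why noisy or shifting pairings in the pre-training stream matter for this family of objectives.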

artificial intelligence, concept drift, machine learning, (13 more...)

arXiv.org Artificial Intelligence

2502.0762

Country:

Oceania > Australia > New South Wales > Sydney (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Switzerland (0.04)

Genre: Research Report > New Finding (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

HuDEx: Integrating Hallucination Detection and Explainability for Enhancing the Reliability of LLM responses

Lee, Sujeong, Lee, Hayoung, Heo, Seongsoo, Choi, Wonik

arXiv.org Artificial Intelligence, Feb-11-2025

Recent advances in large language models (LLMs) have shown promising improvements, often surpassing existing methods across a wide range of downstream tasks in natural language processing. However, these models still face challenges, which may hinder their practical applicability. For example, the phenomenon of hallucination is known to compromise the reliability of LLMs, especially in fields that demand high factual precision. Current benchmarks primarily focus on hallucination detection and factuality evaluation but do not extend beyond identification. This paper proposes an explanation-enhanced hallucination-detection model, coined HuDEx, aimed at enhancing the reliability of LLM-generated responses by both detecting hallucinations and providing detailed explanations. The proposed model provides a novel approach to integrating detection with explanations, and enables both users and the LLM itself to understand and reduce errors. Our measurement results demonstrate that the proposed model surpasses larger LLMs, such as Llama3 70B and GPT-4, in hallucination detection accuracy, while maintaining reliable explanations. Furthermore, the proposed model performs well in both zero-shot and other test environments, showcasing its adaptability across diverse benchmark datasets. The proposed approach advances hallucination detection research by integrating interpretability with detection, which improves both the performance and the reliability of evaluating hallucinations in language models.

large language model, machine learning, natural language, (20 more...)

arXiv.org Artificial Intelligence

2502.08109

Country:

Asia > South Korea > Incheon > Incheon (0.05)
Asia > Indonesia > Bali (0.04)
South America > Colombia > Meta Department > Villavicencio (0.04)
(2 more...)

Genre: Research Report > New Finding (0.88)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Approximating Families of Sharp Solutions to Fisher's Equation with Physics-Informed Neural Networks

Rohrhofer, Franz M., Posch, Stefan, Gößnitzer, Clemens, Geiger, Bernhard C.

arXiv.org Artificial Intelligence, Feb-13-2024

This paper employs physics-informed neural networks (PINNs) to solve Fisher's equation, a fundamental representation of a reaction-diffusion system with both simplicity and significance. The focus lies on Fisher's equation under conditions of large reaction rate coefficients, where solutions take the form of traveling waves and challenge numerical methods because of the steepness of the wave front. To address optimization challenges associated with the standard PINN approach, a residual weighting scheme is introduced. This scheme is designed to enhance the tracking of propagating wave fronts by considering the reaction term in the reaction-diffusion equation. Furthermore, a specific network architecture is studied which is tailored for solutions in the form of traveling waves. Lastly, the capacity of PINNs to approximate an entire family of solutions is assessed by incorporating the reaction rate coefficient as an additional input to the network architecture. This modification enables the approximation of the solution across a broad and continuous range of reaction rate coefficients, thus solving a class of reaction-diffusion systems using a single PINN instance.
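For readers unfamiliar with the setup, the sketch below shows the quantity a PINN would drive toward zero for Fisher's equation, written here in the common form u_t = D u_xx + rho u (1 - u). Several things are assumptions for illustration: the function names, the use of central finite differences in place of automatic differentiation, and the choice of the known closed-form traveling wave with speed 5/sqrt(6) (in scaled units) as a stand-in for a trained network. The paper's residual weighting scheme is not reproduced.

```python
import math

def traveling_wave(x, t, rho, D=1.0):
    """Closed-form traveling-wave solution of u_t = D u_xx + rho u (1 - u).

    Takes rho as an input, mirroring the idea of one model covering a
    family of reaction rate coefficients.
    """
    z = math.sqrt(rho / (6.0 * D)) * x - (5.0 / 6.0) * rho * t
    if z > 700.0:  # avoid overflow in exp; the solution is ~0 out here
        return 0.0
    return 1.0 / (1.0 + math.exp(z)) ** 2

def fisher_residual(u, x, t, rho, D=1.0, h=1e-3):
    """PDE residual u_t - D u_xx - rho u (1 - u) via central differences."""
    u_t = (u(x, t + h, rho, D) - u(x, t - h, rho, D)) / (2.0 * h)
    u_xx = (u(x + h, t, rho, D) - 2.0 * u(x, t, rho, D) + u(x - h, t, rho, D)) / h ** 2
    value = u(x, t, rho, D)
    return u_t - D * u_xx - rho * value * (1.0 - value)

def front_slope(rho, D=1.0, h=1e-4):
    """Magnitude of u_x at x = 0, t = 0, as a rough measure of front steepness."""
    return abs(traveling_wave(h, 0.0, rho, D) - traveling_wave(-h, 0.0, rho, D)) / (2.0 * h)

residual = fisher_residual(traveling_wave, x=0.5, t=0.2, rho=1.0)
slopes = [front_slope(rho) for rho in (1.0, 100.0, 10000.0)]
```

The residual of the exact solution is zero up to discretization error, while the front slope grows with the square root of rho, which is the steepness the abstract refers to.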

architecture, equation, physics-informed neural network, (12 more...)

arXiv.org Artificial Intelligence

2402.08313

Country:

Europe > Austria > Styria > Graz (0.04)
Europe > Russia > Central Federal District > Moscow Oblast > Moscow (0.04)
Europe > Austria > Vienna (0.04)
(2 more...)

Genre: Research Report (0.64)

Industry: Energy (0.48)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Generative Sampling in Bundle Tractography using Autoencoders (GESTA)

Legarreta, Jon Haitz, Petit, Laurent, Jodoin, Pierre-Marc, Descoteaux, Maxime

arXiv.org Artificial Intelligence, Feb-10-2023

Current tractography methods use local orientation information to propagate streamlines from seed locations. Many such seeds provide streamlines that stop prematurely or fail to map the true white matter pathways because some bundles are "harder-to-track" than others. This results in tractography reconstructions with poor white and gray matter spatial coverage. In this work, we propose a generative, autoencoder-based method, named GESTA (Generative Sampling in Bundle Tractography using Autoencoders), that produces streamlines achieving better spatial coverage. Compared to other deep learning methods, our autoencoder-based framework uses a single model to generate streamlines in a bundle-wise fashion, and does not require propagating local orientations. GESTA produces new and complete streamlines for any given white matter bundle, including hard-to-track bundles. Applied on top of a given tractogram, GESTA is shown to be effective in improving the white matter volume coverage in poorly populated bundles, both on synthetic and human brain in vivo data. Our streamline evaluation framework ensures that the streamlines produced by GESTA are anatomically plausible and fit well to the local diffusion signal. The streamline evaluation criteria assess anatomy (white matter coverage), local orientation alignment (direction), and geometric features of streamlines, and optionally, gray matter connectivity. GESTA is thus a novel deep generative bundle tractography method that can be used to improve the tractography reconstruction of the white matter.

artificial intelligence, machine learning, streamline, (19 more...)

arXiv.org Artificial Intelligence

2204.10891

Country:

North America > Canada > Quebec > Estrie Region > Sherbrooke (0.14)
South America > Peru > Lima Department > Lima Province > Lima (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(5 more...)

Genre: Research Report (1.00)

Industry:

Health & Medicine > Therapeutic Area > Neurology (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Health & Medicine > Health Care Technology (0.93)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

PEg TRAnsfer Workflow recognition challenge report: Does multi-modal data improve recognition?

Huaulmé, Arnaud, Harada, Kanako, Nguyen, Quang-Minh, Park, Bogyu, Hong, Seungbum, Choi, Min-Kook, Peven, Michael, Li, Yunshuang, Long, Yonghao, Dou, Qi, Kumar, Satyadwyoom, Lalithkumar, Seenivasan, Hongliang, Ren, Matsuzaki, Hiroki, Ishikawa, Yuto, Harai, Yuriko, Kondo, Satoshi, Mitsuishi, Mamoru, Jannin, Pierre

arXiv.org Artificial Intelligence, Feb-11-2022

This paper presents the design and results of the "PEg TRAnsfer Workflow recognition" (PETRAW) challenge, whose objective was to develop surgical workflow recognition methods based on one or several modalities, among video, kinematic, and segmentation data, in order to study their added value. The PETRAW challenge provided a data set of 150 peg transfer sequences performed on a virtual simulator. This data set was composed of videos, kinematics, semantic segmentation, and workflow annotations which described the sequences at three different granularity levels: phase, step, and activity. Five tasks were proposed to the participants: three of them were related to the recognition of all granularities with one of the available modalities, while the others addressed the recognition with a combination of modalities. Average application-dependent balanced accuracy (AD-Accuracy) was used as the evaluation metric to take unbalanced classes into account and because it is more clinically relevant than a frame-by-frame score. Seven teams participated in at least one task and four of them in all tasks. The best results were obtained with the video and kinematic data, with an AD-Accuracy between 90% and 93% for the four teams who participated in all tasks. The improvement of video/kinematic-based methods over the uni-modality ones was significant for all of the teams. However, the difference in testing execution time between the video/kinematic-based and the kinematic-based methods has to be taken into consideration. Is it relevant to spend 20 to 200 times more computing time for less than 3% of improvement? The PETRAW data set is publicly available at www.synapse.org/PETRAW to encourage further research in surgical workflow recognition.
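The reason a balanced metric is preferred for unbalanced classes can be seen in a few lines. The sketch below computes plain balanced accuracy (the mean of per-class recalls) next to ordinary frame-wise accuracy. It does not implement the application-dependent variant (AD-Accuracy) used in the challenge, whose details are not given here; the function names and the toy labels are hypothetical.

```python
from collections import defaultdict

def accuracy(y_true, y_pred):
    """Fraction of frames whose predicted label matches the annotation."""
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return hits / len(y_true)

def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recalls, so every class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        if t == p:
            correct[t] += 1
    recalls = [correct[label] / total[label] for label in total]
    return sum(recalls) / len(recalls)

# A degenerate recognizer that always predicts the majority class.
y_true = ["idle"] * 8 + ["grasp"] * 2
y_pred = ["idle"] * 10
plain = accuracy(y_true, y_pred)
balanced = balanced_accuracy(y_true, y_pred)
```

Here the always-"idle" recognizer looks good on frame-wise accuracy but is penalized by the balanced score because it never recognizes the rare class.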

February 14, preprint, recognition, (16 more...)

arXiv.org Artificial Intelligence

2202.05821

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > China > Hong Kong (0.04)
Europe > France > Brittany > Ille-et-Vilaine > Rennes (0.04)
(7 more...)

Genre:

Workflow (1.00)
Research Report (1.00)

Industry:

Health & Medicine > Surgery (1.00)
Health & Medicine > Therapeutic Area (0.67)
Health & Medicine > Health Care Technology (0.67)
Health & Medicine > Diagnostic Medicine > Imaging (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
(2 more...)

An Optimized Recurrent Unit for Ultra-Low-Power Keyword Spotting

Amoh, Justice, Odame, Kofi

arXiv.org Machine Learning, Feb-13-2019

There is growing interest in being able to run neural networks on sensors, wearables and internet-of-things (IoT) devices. However, the computational demands of neural networks make them difficult to deploy on resource-constrained edge devices. To meet this need, our work introduces a new recurrent unit architecture that is specifically adapted for on-device low-power acoustic event detection (AED). The proposed architecture is based on the gated recurrent unit (GRU) but features optimizations that make it implementable on ultra-low-power micro-controllers such as the Arm Cortex M0+. Our new architecture, the Embedded Gated Recurrent Unit (eGRU), is demonstrated to be highly efficient and suitable for short-duration AED and keyword spotting tasks. A single eGRU cell is 60x faster and 10x smaller than a GRU cell. Despite its optimizations, eGRU compares well with GRU across tasks of varying complexities. The practicality of eGRU is investigated in a wearable acoustic event detection application. An eGRU model is implemented and tested on the Arm Cortex M0-based Atmel ATSAMD21E18 processor. The Arm M0+ implementation of the eGRU model compares favorably with a full precision GRU that is running on a workstation. The embedded eGRU model achieves a classification accuracy of 95.3%, which is only 2% less than the full precision GRU.
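As background for the baseline this abstract compares against, here is a minimal pure-Python sketch of one step of a standard GRU cell. It is not the eGRU: the paper's optimizations are not described in this listing and are not reproduced. The parameter dictionary layout and function names are assumptions, and the update convention (the update gate weights the candidate state) is one of two in common use.

```python
import math

def sigmoid(a):
    """Numerically stable logistic function."""
    if a >= 0.0:
        return 1.0 / (1.0 + math.exp(-a))
    e = math.exp(a)
    return e / (1.0 + e)

def affine(W, x, U, h, b):
    """Compute W x + U h + b for list-of-lists matrices."""
    return [
        sum(w * xi for w, xi in zip(W[k], x))
        + sum(u * hi for u, hi in zip(U[k], h))
        + b[k]
        for k in range(len(b))
    ]

def gru_step(x, h, p):
    """One step of a standard GRU cell; `p` maps names to weights."""
    z = [sigmoid(a) for a in affine(p["Wz"], x, p["Uz"], h, p["bz"])]  # update gate
    r = [sigmoid(a) for a in affine(p["Wr"], x, p["Ur"], h, p["br"])]  # reset gate
    gated_h = [ri * hi for ri, hi in zip(r, h)]
    candidate = [math.tanh(a) for a in affine(p["Wh"], x, p["Uh"], gated_h, p["bh"])]
    # Convention assumed here: new state = (1 - z) * old state + z * candidate.
    return [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]

# Toy example: input size 1, hidden size 2, all-zero weights.
W0, U0, b0 = [[0.0], [0.0]], [[0.0, 0.0], [0.0, 0.0]], [0.0, 0.0]
params = {"Wz": W0, "Uz": U0, "bz": b0,
          "Wr": W0, "Ur": U0, "br": b0,
          "Wh": W0, "Uh": U0, "bh": b0}
next_h = gru_step([1.0], [0.4, -0.2], params)
```

Each step needs three matrix-vector products plus sigmoid and tanh evaluations, which gives a sense of the per-step cost a micro-controller implementation has to fit into.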

architecture, egru, neural network, (16 more...)

arXiv.org Machine Learning

1902.05026

Country:

North America > United States > New Hampshire > Grafton County > Hanover (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Texas (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)

Genre: Research Report (0.51)

Industry: Information Technology (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
