Efficient Implementation of Convolution Functions on FPGA Using Parameterizable Blocks and Polynomial Approximations
Magalhães, Philippe, Fresse, Virginie, Suffran, Benoît, Alata, Olivier
Implementing convolutional neural networks (CNNs) on field-programmable gate arrays (FPGAs) has emerged as a promising alternative to GPUs, offering lower latency, greater power efficiency, and greater flexibility. However, this development remains complex due to the hardware knowledge required and the long synthesis, placement, and routing stages, which slow down design cycles and prevent rapid exploration of network configurations, making resource optimization under severe constraints particularly challenging. This paper proposes a library of configurable convolution blocks designed to optimize FPGA implementation and adapt to available resources. It also presents a methodological framework for developing mathematical models that predict FPGA resource utilization. The approach is validated by analyzing the correlation between the parameters, followed by error metrics. The results show that the designed blocks enable adaptation of convolution layers to hardware constraints, and that the models accurately predict resource consumption, providing a useful tool for FPGA selection and optimized CNN deployment.
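The idea of predicting FPGA resource utilization from block parameters can be sketched with a simple polynomial fit. The numbers below are illustrative, not from the paper, and the degree-2 model is only one plausible form of the mathematical models the abstract describes.

```python
import numpy as np

# Hypothetical measurements: LUTs consumed by a convolution block for
# several kernel sizes k (illustrative values, not taken from the paper).
k = np.array([1, 3, 5, 7, 9])
luts = np.array([120, 950, 2600, 5100, 8400])

# Fit a degree-2 polynomial model luts ~ a*k^2 + b*k + c, mirroring the
# idea of predicting resources from a block parameter before synthesis.
coeffs = np.polyfit(k, luts, deg=2)
model = np.poly1d(coeffs)

# Predict resource usage for an unseen kernel size, and report relative
# error on the known points as a simple validation metric.
pred = model(k)
rel_err = np.abs(pred - luts) / luts
print(model(11))       # predicted LUTs for k = 11
print(rel_err.max())   # worst-case relative error on measured points
```

Such a model lets a designer reject an FPGA whose LUT budget would be exceeded without running synthesis, placement, and routing at all.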
Convolutions Through the Lens of Tensor Networks
Despite their simple intuition, convolutions are more tedious to analyze than dense layers, which complicates the generalization of theoretical and algorithmic ideas. We provide a new perspective on convolutions through tensor networks (TNs), which allow reasoning about the underlying tensor multiplications by drawing diagrams and manipulating them to perform function transformations, sub-tensor access, and fusion. We demonstrate this expressive power by deriving the diagrams of various autodiff operations and popular approximations of second-order information with full hyper-parameter support, batching, channel groups, and generalization to arbitrary convolution dimensions. Further, we provide convolution-specific transformations based on the connectivity pattern, which allow re-wiring and simplifying diagrams before evaluation. Finally, we probe computational performance, relying on established machinery for efficient TN contraction. Our TN implementation speeds up a recently proposed KFAC variant by up to 4.5× and enables new hardware-efficient tensor dropout for approximate backpropagation.

Despite the success of transformers [68], CNNs continue to be widely used and show competitive performance when incorporating architecture modernizations [41; 40] and attention [30; 70; 11; 39]. While the intuition behind convolution is simple to understand with graphical illustrations such as in Dumoulin & Visin [20], convolutions are more challenging to analyze than fully-connected layers in multi-layer perceptrons (MLPs). One reason is that the operation is hard to express in matrix expressions, and, even when switching to index notation, compact expressions that are convenient to work with only exist for special hyper-parameter choices. The many hyper-parameters of convolution and additional features like channel groups [35] introduce further complexity, and related objects such as (higher-order) derivatives and the associated autodiff routines inherit it.
TNs express tensor multiplications as diagrams (Figure 1).
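The core idea of viewing a convolution as a tensor multiplication can be shown concretely. The sketch below (my own illustration, not the paper's implementation) encodes the connectivity pattern of a 1-D convolution as a 0/1 "index pattern" tensor and contracts it with `einsum`; the name `Pi` and the stride-1, no-padding setting are simplifying assumptions.

```python
import numpy as np

# A 1-D convolution written as a tensor-network contraction. The index
# pattern tensor Pi encodes the connectivity between input positions i,
# output positions o, and kernel offsets k:
#   Pi[i, o, k] = 1  iff  i == o + k   (stride 1, no padding).
I, K = 8, 3            # input length, kernel size
O = I - K + 1          # output length
Pi = np.zeros((I, O, K))
for o in range(O):
    for k in range(K):
        Pi[o + k, o, k] = 1.0

x = np.arange(I, dtype=float)     # input signal
w = np.array([1.0, 0.0, -1.0])    # kernel

# Contract input, pattern, and kernel: y[o] = sum_{i,k} x[i] Pi[i,o,k] w[k]
y = np.einsum("i,iok,k->o", x, Pi, w)

# Cross-check against a direct sliding-window computation.
y_ref = np.array([x[o:o + K] @ w for o in range(O)])
print(np.allclose(y, y_ref))  # True
```

Because the convolution is now an explicit multi-tensor contraction, standard TN machinery (contraction-order optimization, diagram manipulation) applies to it directly.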
Sharp Eyes: A Salient Object Detector Working The Same Way as Human Visual Characteristics
Zhu, Ge, Li, Jinbao, Guo, Yahong
Current methods aggregate multi-level features or introduce edges and skeletons to obtain more refined saliency maps. However, little attention is paid to how to obtain the complete salient object in a cluttered background, where the targets are usually similar in color and texture to the background. To handle this complex scene, we propose a sharp eyes network (SENet) that first separates the object from the scene and then finely segments it, which is in line with human visual characteristics, i.e., to look first and then focus. Different from previous methods, which directly integrate edge or skeleton information to compensate for defects in the objects, the proposed method aims to utilize the expanded objects to guide the network to obtain a complete prediction. Specifically, SENet mainly consists of a target separation (TS) branch and an object segmentation (OS) branch trained by minimizing a new hierarchical difference aware (HDA) loss. In the TS branch, we construct a fractal structure to produce saliency features with expanded boundaries via the supervision of expanded ground truth, which can enlarge the detail difference between foreground and background. In the OS branch, we first aggregate multi-level features to adaptively select complementary components, and then feed the saliency features with expanded boundaries into the aggregated features to guide the network to obtain a complete prediction. Moreover, we propose the HDA loss to further improve the structural integrity and local details of the salient objects; it assigns a weight to each pixel according to its distance from the boundary hierarchically. Hard pixels with similar appearance in the border region are given more attention hierarchically to emphasize their importance in completeness prediction. Comprehensive experimental results on five datasets demonstrate that the proposed approach outperforms the state-of-the-art methods both quantitatively and qualitatively.
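The distance-to-boundary weighting behind the HDA loss can be sketched in a simplified form. This is only an illustration of the weighting idea; the actual hierarchical formulation is defined in the paper, and the specific weight function below (`1/dist`) and the toy mask are my assumptions.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt

# Simplified illustration of distance-based pixel weighting in the spirit
# of the hierarchical difference-aware (HDA) loss: pixels near the object
# boundary receive larger weights. The exact HDA formulation is in the
# paper; only the distance-to-boundary weighting idea is shown here.
gt = np.zeros((16, 16))
gt[4:12, 4:12] = 1.0                  # toy ground-truth saliency mask

# Distance of every pixel to the object boundary: EDT inside the object
# plus EDT inside the background (>= 1 everywhere on a pixel grid).
dist = distance_transform_edt(gt) + distance_transform_edt(1 - gt)

# Weight decays with distance from the boundary (boundary pixels -> ~1).
weights = 1.0 / dist

# Weighted binary cross-entropy against a (here: random) prediction map.
rng = np.random.default_rng(0)
pred = np.clip(rng.random((16, 16)), 1e-6, 1 - 1e-6)
bce = -(gt * np.log(pred) + (1 - gt) * np.log(1 - pred))
loss = (weights * bce).sum() / weights.sum()
print(loss)
```

The effect is that hard pixels near the foreground/background border dominate the loss, matching the abstract's goal of emphasizing them for completeness prediction.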
Deep Multimodal Subspace Clustering Networks
Abavisani, Mahdi, Patel, Vishal M.
We present convolutional neural network (CNN) based approaches for unsupervised multimodal subspace clustering. The proposed framework consists of three main stages: a multimodal encoder, a self-expressive layer, and a multimodal decoder. The encoder takes multimodal data as input and fuses them into a latent space representation. We investigate early, late, and intermediate fusion techniques and propose three corresponding encoders for spatial fusion. The self-expressive layers and multimodal decoders are essentially the same across the different spatial fusion-based approaches. In addition to the various spatial fusion-based methods, an affinity fusion-based network is also proposed, in which the self-expressiveness layers corresponding to different modalities are enforced to be the same. Extensive experiments on three datasets show that the proposed methods significantly outperform the state-of-the-art multimodal subspace clustering methods.

Many practical applications in image processing, computer vision, and speech processing require one to process very high-dimensional data. However, these data often lie in a low-dimensional subspace. For instance, facial images with variation in illumination [1], handwritten digits [2], and trajectories of a rigidly moving object in a video [3] are examples where the high-dimensional data can be represented by low-dimensional subspaces. Subspace clustering algorithms essentially use this fact to find clusters in different subspaces within a dataset [4].
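The self-expressiveness property underlying the framework says that each data point can be written as a linear combination of the other points from the same union of subspaces, X ≈ XC with diag(C) = 0. The paper learns C as the weights of a network layer; the sketch below solves a plain least-squares version on toy data to make the idea concrete.

```python
import numpy as np

# Minimal sketch of self-expressiveness for subspace clustering:
# represent each point as a combination of the others, X ~ X C with
# diag(C) = 0. (The paper learns C as a network layer; this is a
# closed-form least-squares illustration on toy data.)
rng = np.random.default_rng(0)

# Toy data: two 1-D subspaces (lines) in R^3, 5 points on each.
b1, b2 = rng.normal(size=(3, 1)), rng.normal(size=(3, 1))
X = np.hstack([b1 @ rng.normal(size=(1, 5)), b2 @ rng.normal(size=(1, 5))])

N = X.shape[1]
C = np.zeros((N, N))
for i in range(N):
    others = [j for j in range(N) if j != i]
    # Represent point i using all other points; diag(C) = 0 by construction.
    c, *_ = np.linalg.lstsq(X[:, others], X[:, i], rcond=None)
    C[others, i] = c

# Affinity matrix that spectral clustering would consume downstream.
A = np.abs(C) + np.abs(C.T)
print(np.allclose(X @ C, X, atol=1e-6))  # True: exact self-expression
```

In the full method a sparsity or low-rank penalty on C is what makes the coefficients concentrate within each subspace; the unregularized solve above only demonstrates the reconstruction constraint itself.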
Do not Choose Representation just Change: An Experimental Study in States based EA
Bercachi, Maroun, Collard, Philippe, Clergue, Manuel, Verel, Sebastien
Our aim in this paper is to analyse the phenotypic effects (evolvability) of diverse coding conversion operators in an instance of the states-based evolutionary algorithm (SEA). Since the representation of solutions, or the selection of the best encoding during the optimization process, has been proved to be very important for the efficiency of evolutionary algorithms (EAs), we discuss a strategy of coupling more than one representation with different procedures for converting from one coding to another during the search. Elsewhere, some EAs try to use multiple representations (SM-GA, SEA, etc.) with the intention of benefiting from the characteristics of each of them. Despite those results, this paper shows that changing the representation is also a crucial approach to consider when attempting to increase the performance of such EAs. As a demonstrative example, we use a two-states SEA (2-SEA), which has two identical search spaces but different coding conversion operators. The results show that the way of changing from one coding to another, and not only the choice of the best representation or the representation itself, is very advantageous and must be taken into account in order to properly design EAs and improve their execution.
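A coding conversion operator of the kind studied here maps a genome from one encoding to another without changing the phenotype. A classic pair in EA representation work is standard binary versus Gray code; the sketch below shows that conversion as an illustrative example (the 2-SEA's actual operators are defined in the paper).

```python
# Illustrative coding conversion between two classic genotype encodings:
# standard binary and Gray code. The decoded phenotype (the integer) is
# preserved by a round trip; this is an example pair, not the paper's
# specific 2-SEA operators.

def binary_to_gray(bits):
    """Convert a binary-coded genome (list of 0/1, MSB first) to Gray code."""
    return [bits[0]] + [a ^ b for a, b in zip(bits, bits[1:])]

def gray_to_binary(gray):
    """Invert the conversion: Gray-coded genome back to standard binary."""
    bits = [gray[0]]
    for g in gray[1:]:
        bits.append(bits[-1] ^ g)
    return bits

genome = [1, 0, 1, 1]                  # binary encoding of 11
gray = binary_to_gray(genome)          # [1, 1, 1, 0]
assert gray_to_binary(gray) == genome  # round trip preserves the phenotype
```

The point of such operators in a multi-representation EA is that a bit-flip mutation has different phenotypic effects in each coding (Gray code makes neighboring integers one flip apart), so switching codings mid-search changes the landscape without changing the solution.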