weight data
Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning
Existing semi-supervised learning (SSL) algorithms use a single weight to balance the loss of labeled and unlabeled examples, i.e., all unlabeled examples are equally weighted. But not all unlabeled data are equal. In this paper we study how to use a different weight for "every" unlabeled example. Manual tuning of all those weights -- as done in prior work -- is no longer possible. Instead, we adjust those weights via an algorithm based on the influence function, a measure of a model's dependency on one training example. To make the approach efficient, we propose a fast and effective approximation of the influence function. We demonstrate that this technique outperforms state-of-the-art methods on semi-supervised image and language classification tasks.
HH-PIM: Dynamic Optimization of Power and Performance with Heterogeneous-Hybrid PIM for Edge AI Devices
Jeon, Sangmin, Lee, Kangju, Lee, Kyeongwon, Lee, Woojoo
--Processing-in-Memory (PIM) architectures offer promising solutions for efficiently handling AI applications in energy-constrained edge environments. While traditional PIM designs enhance performance and energy efficiency by reducing data movement between memory and processing units, they are limited in edge devices due to continuous power demands and the storage requirements of large neural network weights in SRAM and DRAM. Hybrid PIM architectures, incorporating nonvolatile memories like MRAM and ReRAM, mitigate these limitations but struggle with a mismatch between fixed computing resources and dynamically changing inference workloads. T o address these challenges, this study introduces a Heterogeneous-Hybrid PIM ( HH-PIM) architecture, comprising high-performance MRAM-SRAM PIM modules and low-power MRAM-SRAM PIM modules. We further propose a data placement optimization algorithm that dynamically allocates data based on computational demand, maximizing energy efficiency. FPGA prototyping and power simulations with processors featuring HH-PIM and other PIM types demonstrate that the proposed HH-PIM achieves up to 60.43% average energy savings over conventional PIMs while meeting application latency requirements. These results confirm HH-PIM's suitability for adaptive, energy-efficient AI processing in edge devices. With the advent of artificial intelligence (AI), real-world applications are rapidly expanding, fueling a trend to embed AI capabilities into IoT devices across diverse fields. However, traditional server-centric data processing, such as cloud computing, faces significant energy and latency challenges due to processing and communication overloads.
SpikeX: Exploring Accelerator Architecture and Network-Hardware Co-Optimization for Sparse Spiking Neural Networks
Xu, Boxun, Boone, Richard, Li, Peng
Spiking Neural Networks (SNNs) are promising biologically plausible models of computation which utilize a spiking binary activation function similar to that of biological neurons. SNNs are well positioned to process spatiotemporal data, and are advantageous in ultra-low power and real-time processing. Despite a large body of work on conventional artificial neural network accelerators, much less attention has been given to efficient SNN hardware accelerator design. In particular, SNNs exhibit inherent unstructured spatial and temporal firing sparsity, an opportunity yet to be fully explored for great hardware processing efficiency. In this work, we propose a novel systolic-array SNN accelerator architecture, called SpikeX, to take on the challenges and opportunities stemming from unstructured sparsity while taking into account the unique characteristics of spike-based computation. By developing an efficient dataflow targeting expensive multi-bit weight data movements, SpikeX reduces memory access and increases data sharing and hardware utilization for computations spanning across both time and space, thereby significantly improving energy efficiency and inference latency. Furthermore, recognizing the importance of SNN network and hardware co-design, we develop a co-optimization methodology facilitating not only hardware-aware SNN training but also hardware accelerator architecture search, allowing joint network weight parameter optimization and accelerator architectural reconfiguration. This end-to-end network/accelerator co-design approach offers a significant reduction of 15.1x-150.87x in energy-delay-product(EDP) without comprising model accuracy.
Review for NeurIPS paper: Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning
Strengths: 1) This paper considers an ignorable case in SSL where not all unlabeled data should be treated equally during training. Though automatic weight tuning for different samples is already studies in supervised learning setting, it is new in SSL context to my best knowledge. Therefore, the motivation is clear and valid. The proposed algorithm is simple and practical and demonstrates benefits compared to different baselines, so the paper worked towards its motivation. Influence function is a tool of measuring models' dependency on samples in train set.
Review for NeurIPS paper: Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning
After rebuttal, 2 reviewers initially inclined to reject the paper raise their grades, leading to a consensus among reviewers for weak acceptance. Although their remains room for improvements, the AC agrees that the contributions are solid and interesting for the community, and therefore recommends acceptance. The authors are highly encouraged to update the final version of the paper based on reviewers' comments.
Not All Unlabeled Data are Equal: Learning to Weight Data in Semi-supervised Learning
Existing semi-supervised learning (SSL) algorithms use a single weight to balance the loss of labeled and unlabeled examples, i.e., all unlabeled examples are equally weighted. But not all unlabeled data are equal. In this paper we study how to use a different weight for "every" unlabeled example. Manual tuning of all those weights -- as done in prior work -- is no longer possible. Instead, we adjust those weights via an algorithm based on the influence function, a measure of a model's dependency on one training example.