compression level
Scientific Data Compression and Super-Resolution Sampling
Modern scientific simulations, observations, and large-scale experiments generate data at volumes that often exceed the limits of storage, processing, and analysis. This challenge drives the development of data reduction methods that efficiently manage massive datasets while preserving essential physical features and quantities of interest. In many scientific workflows, it is also crucial to enable data recovery from compressed representations - a task known as super-resolution - with guarantees on the preservation of key physical characteristics. A notable example is checkpointing and restarting, which is essential for long-running simulations to recover from failures, resume after interruptions, or examine intermediate results. In this work, we introduce a novel framework for scientific data compression and super-resolution, grounded in recent advances in learning exponential families. Our method preserves and quantifies uncertainty in physical quantities of interest and supports flexible trade-offs between compression ratio and reconstruction fidelity.
- North America > United States > New Mexico > Los Alamos County > Los Alamos (0.05)
- North America > United States > Tennessee > Anderson County > Oak Ridge (0.04)
- North America > United States > District of Columbia > Washington (0.04)
- (3 more...)
- Research Report (1.00)
- Workflow (0.66)
- Energy (1.00)
- Government > Regional Government > North America Government > United States Government (0.46)
- Information Technology > Data Science (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)
- (2 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (0.98)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.33)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.31)
Easz: An Agile Transformer-based Image Compression Framework for Resource-constrained IoTs
Mao, Yu, Li, Jingzong, Wang, Jun, Xu, Hong, Kuo, Tei-Wei, Guan, Nan, Xue, Chun Jason
Neural image compression, necessary in various machine-to-machine communication scenarios, suffers from its heavy encode-decode structures and inflexibility in switching between different compression levels. Consequently, it raises significant challenges in applying the neural image compression to edge devices that are developed for powerful servers with high computational and storage capacities. We take a step to solve the challenges by proposing a new transformer-based edge-compute-free image coding framework called Easz. Easz shifts the computational overhead to the server, and hence avoids the heavy encoding and model switching overhead on the edge. Easz utilizes a patch-erase algorithm to selectively remove image contents using a conditional uniform-based sampler. The erased pixels are reconstructed on the receiver side through a transformer-based framework. To further reduce the computational overhead on the receiver, we then introduce a lightweight transformer-based reconstruction structure to reduce the reconstruction load on the receiver side. Extensive evaluations conducted on a real-world testbed demonstrate multiple advantages of Easz over existing compression approaches, in terms of adaptability to different compression levels, computational efficiency, and image reconstruction quality.
- Asia > China > Hong Kong (0.05)
- Asia > Taiwan (0.04)
- North America > Canada (0.04)
- Asia > Middle East > UAE (0.04)
A Brief Review for Compression and Transfer Learning Techniques in DeepFake Detection
Karathanasis, Andreas, Violos, John, Kompatsiaris, Ioannis, Papadopoulos, Symeon
Training and deploying deepfake detection models on edge devices offers the advantage of maintaining data privacy and confidentiality by processing it close to its source. However, this approach is constrained by the limited computational and memory resources available at the edge. To address this challenge, we explore compression techniques to reduce computational demands and inference time, alongside transfer learning methods to minimize training overhead. Using the Synthbuster, RAISE, and ForenSynths datasets, we evaluate the effectiveness of pruning, knowledge distillation (KD), quantization, fine-tuning, and adapter-based techniques. Our experimental results demonstrate that both compression and transfer learning can be effectively achieved, even with a high compression level of 90%, remaining at the same performance level when the training and validation data originate from the same DeepFake model. However, when the testing dataset is generated by DeepFake models not present in the training set, a domain generalization issue becomes evident.
Systematic Weight Evaluation for Pruning Large Language Models: Enhancing Performance and Sustainability
Islam, Ashhadul, Belhaouari, Samir Brahim, Bermak, Amine
Impact of Compression on Model Performance: Through comprehensive experiments, the study demonstrates that moderate pruning can enhance model efficiency, but excessive compression leads to substantial performance degradation in both language and multimodal models. Sustainable AI Development: The findings emphasize the need for optimized AI models to reduce the environmental impact, addressing critical issues like carbon footprint, electricity, and water consumption associated with training and deploying large-scale AI systems. Systematic Weight Evaluation for Pruning Large Language Models: Enhancing Performance and Sustainability Ashhadul Islam a, Samir Brahim Belhaouari a, Amine Bermak a a College Of Science & Engineering, Hamad Bin Khalifa University, Education City, Doha, 34110, QatarAbstract The exponential growth of large language models (LLMs) like ChatGPT has revolutionized artificial intelligence, offering unprecedented capabilities in natural language processing. However, the extensive computational resources required for training these models have significant environmental implications, including high carbon emissions, energy consumption, and water usage. This research presents a novel approach to LLM pruning, focusing on the systematic evaluation of individual weight importance throughout the training process. By monitoring parameter evolution over time, we propose a method that effectively reduces model size without compromising performance. Extensive experiments with both a scaled-down LLM and a large multimodal model reveal that moderate pruning enhances efficiency and reduces loss, while excessive pruning drastically deteriorates model performance. These findings highlight the critical need for optimized AI models to ensure sustainable development, balancing technological advancement with environmental responsibility.
A General Error-Theoretical Analysis Framework for Constructing Compression Strategies
Zhang, Boyang, Cheng, Daning, Zhang, Yunquan, Tu, Meiqi, Liu, Fangmin, Tian, Jiake
The exponential growth in parameter size and computational complexity of deep models poses significant challenges for efficient deployment. The core problem of existing compression methods is that different layers of the model have significant differences in their tolerance to compression levels. For instance, the first layer of a model can typically sustain a higher compression level compared to the last layer without compromising performance. Thus, the key challenge lies in how to allocate compression levels across layers in a way that minimizes performance loss while maximizing parameter reduction. To address this challenge, we propose a Compression Error Theory (CET) framework, designed to determine the optimal compression level for each layer. Taking quantization as an example, CET leverages differential expansion and algebraic geometry to reconstruct the quadratic form of quantization error as ellipsoids and hyperbolic paraboloids, and utilizes their geometric structures to define an error subspace. To identify the error subspace with minimal performance loss, by performing orthogonal decomposition of the geometric space, CET transforms the optimization process of the error subspace into a complementary problem. The final theoretical analysis shows that constructing the quantization subspace along the major axis results in minimal performance degradation. Through experimental verification of the theory, CET can greatly retain performance while compressing. Specifically, on the ResNet-34 model, CET achieves nearly 11$\times$ parameter compression while even surpassing performance comparable to the original model.
Real-Time LiDAR Point Cloud Compression and Transmission for Resource-constrained Robots
Cao, Yuhao, Wang, Yu, Chen, Haoyao
LiDARs are widely used in autonomous robots due to their ability to provide accurate environment structural information. However, the large size of point clouds poses challenges in terms of data storage and transmission. In this paper, we propose a novel point cloud compression and transmission framework for resource-constrained robotic applications, called RCPCC. We iteratively fit the surface of point clouds with a similar range value and eliminate redundancy through their spatial relationships. Then, we use Shape-adaptive DCT (SA-DCT) to transform the unfit points and reduce the data volume by quantizing the transformed coefficients. We design an adaptive bitrate control strategy based on QoE as the optimization goal to control the quality of the transmitted point cloud. Experiments show that our framework achieves compression rates of 40$\times$ to 80$\times$ while maintaining high accuracy for downstream applications. our method significantly outperforms other baselines in terms of accuracy when the compression rate exceeds 70$\times$. Furthermore, in situations of reduced communication bandwidth, our adaptive bitrate control strategy demonstrates significant QoE improvements. The code will be available at https://github.com/HITSZ-NRSL/RCPCC.git.
- Europe > Switzerland > Zürich > Zürich (0.14)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- Asia > China > Heilongjiang Province > Harbin (0.04)
BYON: Bring Your Own Networks for Digital Agriculture Applications
Sie, Emerson, Tao, Bill, Mihigo, Aganze, Karmehan, Parithimaal, Zhang, Max, Sivakumar, Arun N., Chowdhary, Girish, Vasisht, Deepak
Digital agriculture technologies rely on sensors, drones, robots, and autonomous farm equipment to improve farm yields and incorporate sustainability practices. However, the adoption of such technologies is severely limited by the lack of broadband connectivity in rural areas. We argue that farming applications do not require permanent always-on connectivity. Instead, farming activity and digital agriculture applications follow seasonal rhythms of agriculture. Therefore, the need for connectivity is highly localized in time and space. We introduce BYON, a new connectivity model for high bandwidth agricultural applications that relies on emerging connectivity solutions like citizens broadband radio service (CBRS) and satellite networks. BYON creates an agile connectivity solution that can be moved along a farm to create spatio-temporal connectivity bubbles. BYON incorporates a new gateway design that reacts to the presence of crops and optimizes coverage in agricultural settings. We evaluate BYON in a production farm and demonstrate its benefits.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
- North America > United States > Illinois > Champaign County > Urbana (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Information Technology > Communications > Networks (1.00)
- Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles > Drones (0.67)
EvoPress: Towards Optimal Dynamic Model Compression via Evolutionary Search
Sieberling, Oliver, Kuznedelev, Denis, Kurtic, Eldar, Alistarh, Dan
The high computational costs of large language models (LLMs) have led to a flurry of research on LLM compression, via methods such as quantization, sparsification, or structured pruning. A new frontier in this area is given by \emph{dynamic, non-uniform} compression methods, which adjust the compression levels (e.g., sparsity) per-block or even per-layer in order to minimize accuracy loss, while guaranteeing a global compression threshold. Yet, current methods rely on heuristics for identifying the "importance" of a given layer towards the loss, based on assumptions such as \emph{error monotonicity}, i.e. that the end-to-end model compression error is proportional to the sum of layer-wise errors. In this paper, we revisit this area, and propose a new and general approach for dynamic compression that is provably optimal in a given input range. We begin from the motivating observation that, in general, \emph{error monotonicity does not hold for LLMs}: compressed models with lower sum of per-layer errors can perform \emph{worse} than models with higher error sums. To address this, we propose a new general evolutionary framework for dynamic LLM compression called EvoPress, which has provable convergence, and low sample and evaluation complexity. We show that these theoretical guarantees lead to highly competitive practical performance for dynamic compression of Llama, Mistral and Phi models. Via EvoPress, we set new state-of-the-art results across all compression approaches: structural pruning (block/layer dropping), unstructured sparsity, as well as quantization with dynamic bitwidths. Our code is available at https://github.com/IST-DASLab/EvoPress.
- North America > United States > New York > New York County > New York City (0.04)
- Europe > Austria (0.04)
- Europe > Switzerland > Zürich > Zürich (0.04)
- (2 more...)