 macroblock


Digital Video Manipulation Detection Technique Based on Compression Algorithms

arXiv.org Artificial Intelligence

Digital images and videos play a very important role in everyday life. Nowadays, people have access to affordable mobile devices equipped with advanced integrated cameras and powerful image processing applications. Technological development facilitates not only the generation of multimedia content, but also its intentional modification, for either recreational or malicious purposes. This is where forensic techniques to detect manipulation of images and videos become essential. This paper proposes a forensic technique based on analysing the compression algorithms used by H.264 coding. Recompression is detected using information from macroblocks, a characteristic of the H.264/MPEG-4 AVC standard, and from motion vectors. A Support Vector Machine is used to create a model that accurately detects whether a video has been recompressed.
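The abstract does not say which macroblock and motion-vector statistics are fed to the classifier. As a rough illustration only, the sketch below assumes a hypothetical per-frame feature vector: the fraction of macroblocks of each type plus the mean motion-vector magnitude. The label set `MB_TYPES` and the function `frame_features` are invented for this example, and extracting the macroblock types from an actual H.264 bitstream is not shown.

```python
from collections import Counter
from math import hypot

# Hypothetical macroblock labels; the paper's actual feature set is not given.
MB_TYPES = ("I", "P", "SKIP")


def frame_features(mb_types, motion_vectors):
    """Build one feature vector for a frame.

    mb_types: list of macroblock type labels for the frame.
    motion_vectors: list of (dx, dy) tuples for the frame.
    Returns the fraction of each macroblock type followed by the
    mean motion-vector magnitude.
    """
    if not mb_types:
        raise ValueError("frame has no macroblocks")
    counts = Counter(mb_types)
    fractions = [counts[t] / len(mb_types) for t in MB_TYPES]
    if motion_vectors:
        mean_mv = sum(hypot(dx, dy) for dx, dy in motion_vectors) / len(motion_vectors)
    else:
        mean_mv = 0.0
    return fractions + [mean_mv]


# Toy frame: four macroblocks and two motion vectors.
features = frame_features(["I", "P", "P", "SKIP"], [(3, 4), (0, 0)])
print(features)  # [0.25, 0.5, 0.25, 2.5]
```

Vectors of this kind, labelled as single-compressed or recompressed, are the sort of input a Support Vector Machine could be trained on.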


A Video Codec Designed for AI Analysis

#artificialintelligence

Though the techno-thriller The Circle (2017) is more a comment on the ethical implications of social networks than on the practicalities of external video analytics, the improbably tiny 'SeeChange' camera at the center of the plot is what truly pushes the movie into the 'science-fiction' category. A wireless and free-roaming device about the size of a large marble, it's not the lack of solar panels or the inefficiency of drawing power from other ambient sources (such as radio waves) that makes SeeChange an unlikely prospect, but the fact that it's going to have to compress video 24/7, on whatever scant charge it's able to maintain. Powering cheap sensors of this type is a core area of research in computer vision (CV) and video analytics, particularly in non-urban environments where the sensor will have to eke out the maximum performance from very limited power resources (batteries, solar, etc.). In cases where an edge IoT/CV device of this type must send image content to a central server (often through conventional cell coverage networks), the choices are hard: either the device needs to run some kind of lightweight neural network locally in order to send only optimized segments of relevant data for server-side processing; or it has to send 'dumb' video for the plugged-in cloud resources to evaluate. Though motion-activation through event-based Smart Vision Sensors (SVS) can cut down this overhead, that activation monitoring also costs energy.


BRIEF: Backward Reduction of CNNs with Information Flow Analysis

arXiv.org Machine Learning

This paper proposes BRIEF, a backward reduction algorithm that explores compact CNN-model designs from the information flow perspective. This algorithm can remove substantial nonzero weighting parameters (redundant neural channels) of a network by considering its dynamic behavior, which traditional model-compaction techniques cannot achieve. With the aid of our proposed algorithm, we achieve significant model reduction on ResNet-34 at ImageNet scale (32.3% reduction), which is 3× better than the previous result (10.8%). Even for highly optimized models such as SqueezeNet and MobileNet, we can achieve additional 10.81% and 37.56% reduction, respectively, with negligible performance degradation. Since the breakthrough performance demonstrated by convolutional neural networks (CNNs) on ImageNet, deep architectures have been successfully applied to a number of areas such as speech recognition, object tracking, and image classification. As the width and depth of a CNN are increased to improve prediction accuracy, the model complexity and training time increase as well. Whereas model training can be sped up by employing a large number of GPUs, inference on mobile and wearable devices (e.g., mobile VR) faces the resource limitations of memory, power and computation. In this work, we utilize information flow analysis to perform CNN model reduction while preserving prediction accuracy. Traditionally, a complex CNN is simplified for embedded systems by using the teacher-student model [1], [2].


MBS: Macroblock Scaling for CNN Model Reduction

arXiv.org Machine Learning

We estimate the proper channel (width) scaling of Convolutional Neural Networks (CNNs) for model reduction. Unlike the traditional scaling method, which reduces every CNN channel width by the same scaling factor, we address each CNN macroblock adaptively depending on its information redundancy, measured by our proposed effective flops. Our proposed macroblock scaling (MBS) algorithm can be applied to various CNN architectures to reduce their model size. These applicable models range from compact CNN models such as MobileNet (25.53% reduction, ImageNet) and ShuffleNet (20.74% reduction, ImageNet) to ultra-deep ones such as ResNet-101 (51.67% reduction, ImageNet) and ResNet-1202 (72.71% reduction, CIFAR-10), with negligible accuracy degradation. MBS also achieves better reduction at a much lower cost than the state-of-the-art optimization-based method. MBS's simplicity and efficiency, its flexibility to work with any CNN model, and its scalability to work with models of any depth make it an attractive choice for CNN model size reduction.
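To make the contrast between uniform and per-macroblock width scaling concrete, the sketch below counts convolution weights for a toy three-block network under both schemes. It is not the MBS algorithm itself: the scaling factors are hand-picked, hypothetical values rather than ones derived from the paper's effective-flops measure, and the helper names (`conv_params`, `model_params`, `scale_widths`) are invented for this example.

```python
def conv_params(c_in, c_out, k=3):
    """Weight count of one k x k convolution (bias terms ignored)."""
    return c_in * c_out * k * k


def model_params(widths, c_in=3, k=3):
    """Total weights of a plain stack of convolutions, one per block."""
    total = 0
    prev = c_in
    for w in widths:
        total += conv_params(prev, w, k)
        prev = w
    return total


def scale_widths(widths, factors):
    """Scale each block's channel width by its own factor (at least 1 channel)."""
    return [max(1, round(w * f)) for w, f in zip(widths, factors)]


base = [64, 128, 256]  # toy block widths, not taken from any real model

# Uniform scaling: every block is shrunk by the same factor.
uniform = scale_widths(base, [0.75, 0.75, 0.75])  # [48, 96, 192]

# Per-block scaling: hypothetical factors that shrink later blocks more.
adaptive = scale_widths(base, [1.0, 0.75, 0.5])   # [64, 96, 128]

for name, widths in [("base", base), ("uniform", uniform), ("adaptive", adaptive)]:
    params = model_params(widths)
    reduction = 1 - params / model_params(base)
    print(f"{name:8s} widths={widths} params={params} reduction={reduction:.1%}")
```

In this toy setting the per-block factors remove more weights than the uniform factor while leaving the first block untouched; whether accuracy is preserved depends on how the factors are chosen, which is the part the paper's redundancy measure addresses.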