bpp
Switchable Token-Specific Codebook Quantization For Face Image Compression
With the ever-increasing volume of visual data, efficient and lossless transmission, along with subsequent interpretation and understanding, has become a critical bottleneck in modern information systems. Emerging codebook-based solutions use a globally shared codebook to quantize and dequantize each token, controlling the bits per pixel (bpp) by adjusting the number of tokens or the codebook size. However, for facial images, which are rich in attributes, such global codebook strategies overlook both the category-specific correlations within images and the semantic differences among tokens, resulting in suboptimal performance, especially at low bpp. Motivated by these observations, we propose a Switchable Token-Specific Codebook Quantization for face image compression, which learns distinct codebook groups for different image categories and assigns an independent codebook to each token. By recording the codebook group to which each token belongs with a small number of bits, our method reduces the loss incurred when the size of each codebook group is decreased. This enables a larger total number of codebooks at a lower overall bpp, thereby enhancing expressive capability and improving reconstruction performance. Owing to its generalizable design, our method can be integrated into any existing codebook-based representation learning approach, and it has demonstrated its effectiveness on face recognition datasets, achieving an average accuracy of 93.51% for reconstructed images at 0.05 bpp.
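The trade-off between codebook size, side information, and bpp can be illustrated with a little arithmetic. The sketch below is not the paper's implementation: it assumes fixed-length coding of token indices, a single group index stored per image, and purely hypothetical sizes (256 tokens, a 256x256 image, 4096 versus 1024 codebook entries, 256 groups). The function name `bits_per_pixel` is likewise made up for illustration.

```python
import math


def bits_per_pixel(num_tokens, codebook_size, height, width, num_groups=1):
    """Estimate bpp for a token-based codec using fixed-length index codes.

    Assumption (not from the paper): each token index costs
    log2(codebook_size) bits, and one group index costing
    log2(num_groups) bits is stored per image.
    """
    token_bits = num_tokens * math.log2(codebook_size)
    group_bits = math.log2(num_groups)  # 0 bits when there is a single group
    return (token_bits + group_bits) / (height * width)


# Hypothetical baseline: one global codebook with 4096 entries.
global_bpp = bits_per_pixel(256, 4096, 256, 256)

# Hypothetical switchable setup: 256 groups, 1024 entries per codebook.
switch_bpp = bits_per_pixel(256, 1024, 256, 256, num_groups=256)

print(f"global codebook:      {global_bpp:.6f} bpp")
print(f"switchable codebooks: {switch_bpp:.6f} bpp")
```

Under these assumed numbers, shrinking each codebook saves two bits per token, while the group index adds only eight bits per image, so the overall bpp drops even though many more codebooks are available in total.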
Appendix - Compression with Bayesian Implicit Neural Representations
In addition to the four appendix sections mentioned in our main paper, we would like to draw attention to two additional experiments: one evaluating the practical training and coding time, and the other investigating the impact of the number of training samples. These two experiments, especially the latter one, offer crucial insights and are detailed in Appendix E1 and Appendix E2, respectively.
Algorithm 1 A* encoding. Require: Proposal distribution pw and target distribution qw.
In our experiments, we used global-bound depth-limited A* coding to achieve this [1]. We describe the encoding procedure in Algorithm 1 and the decoding procedure in Algorithm 2. For brevity, we refer to this particular variant of the algorithm as A* coding for the rest of the appendix.
Baxter Permutation Process
In this paper, a Bayesian nonparametric (BNP) model for Baxter permutations (BPs), termed the BP process (BPP), is proposed and applied to relational data analysis. BPs are a well-studied class of permutations, and it has been demonstrated that there is a one-to-one correspondence between BPs and several interesting objects, including floorplan partitioning (FP), which constitutes a subset of rectangular partitioning (RP). Accordingly, the BPP can be used as an FP model. We combine the BPP with a multi-dimensional extension of the stick-breaking process, called the block-breaking process, to fill the gap between FP and RP and obtain a stochastic process on arbitrary RPs. Compared with conventional BNP models for arbitrary RPs, the proposed model is simpler and has a high affinity with Bayesian inference.
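For readers unfamiliar with Baxter permutations, the following sketch checks the standard combinatorial definition: a permutation is Baxter if it avoids the vincular patterns 2-41-3 and 3-14-2, where the middle two entries must be adjacent. This is only a brute-force illustration of the object being modeled, not the BPP itself; the function name `is_baxter` is chosen here for illustration.

```python
from itertools import permutations


def is_baxter(perm):
    """Return True if perm avoids the vincular patterns 2-41-3 and 3-14-2."""
    n = len(perm)
    for j in range(n - 1):
        a, b = perm[j], perm[j + 1]  # the adjacent middle pair
        for i in range(j):
            for k in range(j + 2, n):
                x, y = perm[i], perm[k]
                if b < x < y < a:  # pattern 2-41-3
                    return False
                if a < y < x < b:  # pattern 3-14-2
                    return False
    return True


# Count Baxter permutations of length 1..5 by exhaustive enumeration.
counts = [
    sum(is_baxter(p) for p in permutations(range(1, n + 1)))
    for n in range(1, 6)
]
print(counts)  # [1, 2, 6, 22, 92]
```

For length 4, the only excluded permutations are 2413 and 3142, which gives 22 of the 24 permutations.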