- Europe > Bosnia and Herzegovina > Federation of Bosnia and Herzegovina > Sarajevo Canton > Sarajevo (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Asia > South Korea > Gangwon-do > Pyeongchang (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.94)
- Information Technology (0.93)
- Leisure & Entertainment > Sports (0.67)
- Law Enforcement & Public Safety (0.67)
Precoder Design in Multi-User FDD Systems with VQ-VAE and GNN
Allaparapu, Srikar, Baur, Michael, Böck, Benedikt, Joham, Michael, Utschick, Wolfgang
ABSTRACT Robust precoding is efficiently feasible in frequency division duplex (FDD) systems by incorporating the learnt statistics of the propagation environment through a generative model. We build on previous work that successfully designed site-specific precoders based on a combination of Gaussian mixture models (GMMs) and graph neural networks (GNNs). In this paper, by utilizing a vector quantized variational autoencoder (VQ-VAE), we circumvent one of the key drawbacks of GMMs: the number of GMM components scales exponentially with the number of feedback bits. In addition, the deep learning architecture of the VQ-VAE allows us to jointly train the GNN together with the VQ-VAE, along with pilot optimization, forming an end-to-end (E2E) model and resulting in considerable sum-rate gains for multi-user wireless systems. Simulations demonstrate the superiority of the proposed frameworks over conventional methods involving a sub-discrete Fourier transform (DFT) pilot matrix and iterative precoder algorithms, enabling the deployment of systems with fewer pilots or feedback bits.
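The feedback advantage over GMMs comes from the VQ bottleneck: only a codeword index is fed back, so B feedback bits address a codebook of 2^B entries. A minimal sketch of that quantization step (not the paper's trained model; the codebook here is random rather than learned):

```python
import numpy as np

rng = np.random.default_rng(0)
feedback_bits = 4                    # B bits of feedback per user
K = 2 ** feedback_bits               # codebook size K = 2^B
latent_dim = 8

codebook = rng.standard_normal((K, latent_dim))   # stand-in for a learned codebook
z_e = rng.standard_normal(latent_dim)             # encoder output for one user

# nearest-codeword quantization: argmin_k ||z_e - c_k||
idx = int(np.argmin(np.linalg.norm(codebook - z_e, axis=1)))
z_q = codebook[idx]                  # quantized latent; only idx is fed back
```

The key point the abstract makes is that growing B enlarges the codebook (2^B rows) rather than multiplying mixture components, and the codebook can be trained jointly with the GNN precoder and the pilots in one E2E graph.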
- Europe > Slovenia > Drava > Municipality of Benedikt > Benedikt (0.05)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- North America > United States > California > Santa Clara County > Palo Alto (0.04)
- Research Report > Experimental Study (1.00)
- Research Report > New Finding (0.93)
- Banking & Finance (0.93)
- Information Technology > Security & Privacy (0.67)
- Law (0.67)
- Research Report > New Finding (1.00)
- Research Report > Experimental Study (1.00)
- Health & Medicine > Therapeutic Area > Vaccines (0.50)
- Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (0.48)
- Health & Medicine > Therapeutic Area > Immunology (0.40)
- Materials > Chemicals > Industrial Gases > Liquified Gas (0.31)
- Materials > Chemicals > Commodity Chemicals > Petrochemicals > LNG (0.31)
- Energy > Oil & Gas > Midstream (0.31)
Multimodal Medical Image Classification via Synergistic Learning Pre-training
Lin, Qinghua, Liu, Guang-Hai, Li, Zuoyong, Li, Yang, Jiang, Yuting, Wu, Xiang
Multimodal pathological images are commonly used in clinical diagnosis, but computer vision-based multimodal image-assisted diagnosis faces challenges with modality fusion, especially in the absence of expert-annotated data. To achieve modality fusion in multimodal images with label scarcity, we propose a novel "pretraining + fine-tuning" framework for multimodal semi-supervised medical image classification. Specifically, we propose a synergistic learning pretraining framework combining consistency, reconstructive, and aligned learning. By treating one modality as an augmented sample of another, we implement self-supervised pre-training that enhances the baseline model's feature representation capability. Then, we design a fine-tuning method for multimodal fusion. During the fine-tuning stage, we set different encoders to extract features from the original modalities and provide a multimodal fusion encoder for the fused modality. In addition, we propose a distribution shift method for multimodal fusion features, which alleviates the prediction uncertainty and overfitting risks caused by the lack of labeled samples. We conduct extensive experiments on the publicly available gastroscopy image datasets Kvasir and Kvasirv2. Quantitative and qualitative results demonstrate that the proposed method outperforms current state-of-the-art classification methods. The code will be released at: https://github.com/LQH89757/MICS.
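The core pre-training idea, treating one modality as an augmented view of the other and pulling paired features together, can be sketched with a simple cosine-consistency loss. This is an illustrative sketch, not the authors' released code; the feature extractors and the loss form are assumptions:

```python
import numpy as np

def cosine_consistency_loss(f_a, f_b):
    # 1 - mean cosine similarity between paired features of the two modalities;
    # zero when paired features point in the same direction.
    a = f_a / np.linalg.norm(f_a, axis=1, keepdims=True)
    b = f_b / np.linalg.norm(f_b, axis=1, keepdims=True)
    return float(np.mean(1.0 - np.sum(a * b, axis=1)))

rng = np.random.default_rng(0)
f_a = rng.standard_normal((4, 16))               # features from the modality-A encoder
f_b = f_a + 0.1 * rng.standard_normal((4, 16))   # modality B acting as a mild "augmentation"

loss = cosine_consistency_loss(f_a, f_b)
```

Minimizing such a loss over paired modalities needs no labels, which is what lets the pre-training stage run before the supervised fine-tuning described above.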
- Research Report (0.84)
- Instructional Material > Online (0.62)
- Instructional Material > Course Syllabus & Notes (0.62)
- Health & Medicine > Therapeutic Area (1.00)
- Health & Medicine > Diagnostic Medicine > Imaging (1.00)
Appendix A: An Example for Scenario 2. We give an example of G(A)
Below is a detailed explanation of the comparative methods covered in the paper.
- The network architecture of PI-DeepONet used for Burgers' equation is such that both …
- Fig.6 shows model predictions of MAD-L and MAD-LM compared with the reference solutions under …
- Fig.7(a) shows that the accuracy of MAD-L after convergence increases with …
- Fig.7(b) shows that the accuracy and convergence speed of MAD-LM do not change …
- For Burgers' equation, we also consider the scenario when the viscosity coefficients …
- Fig.8 compares the convergence curves of mean … MAD-LM has obvious speed and accuracy improvement over From-Scratch and Transfer-Learning.
- We investigated the effect of the dimension of the latent vector (latent size) in Burgers' equation on performance. As can be seen from Fig.9(a), for MAD-L, different latent sizes have different performances, and the best performance is achieved when it is equal to 128.
Appendix A Patch based Negative Data Augmentation Reduces Texture Bias
Figure 5: ViTs trained only on our patch-based transformations exhibit stronger texture bias. Each bar is the texture accuracy (%) on Conflict Stimuli (Geirhos et al., 2018); higher texture accuracy indicates a stronger bias towards texture. "Texture accuracy" is defined as the percentage of images classified as the "texture" label, given that the image is classified as either the "texture" or the "shape" label. The baseline model is a ViT-B/16 (Dosovitskiy et al., 2021) trained on original images. Other models are trained on patch-based transformed images; e.g., "P-Shuffle" stands for a ViT-B/16 model trained on patch-based shuffled images.
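The caption's metric is easy to restate as code. A sketch (not the authors' evaluation script): among cue-conflict images whose prediction matches either the texture or the shape label, report the fraction classified as the texture label.

```python
def texture_accuracy(preds, texture_labels, shape_labels):
    # Count only "decided" predictions: those equal to the texture or shape label.
    texture_hits = 0
    decided = 0
    for p, t, s in zip(preds, texture_labels, shape_labels):
        if p == t or p == s:
            decided += 1
            if p == t:
                texture_hits += 1
    return 100.0 * texture_hits / decided if decided else 0.0

# toy example: 4 decided predictions, 3 follow the texture cue
preds          = ["cat", "dog", "elephant", "cat", "car"]
texture_labels = ["cat", "dog", "elephant", "dog", "boat"]
shape_labels   = ["dog", "cat", "clock",    "cat", "plane"]
print(texture_accuracy(preds, texture_labels, shape_labels))  # → 75.0
```

Note that predictions matching neither label (like "car" above) are excluded from the denominator, which is why the metric conditions on the texture-or-shape outcome.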