pia
fa2246fa0fdf0d3e270c86767b77ba1b-AuthorFeedback.pdf
We thank the reviewers for their careful reading of and feedback on our submission. We did not pursue theoretical results for PIA because of its lackluster empirical performance. In Line 99, we will change "gradient" to "subgradient". The definitions of interpolation we use are in [3]. We cap the iterations in the simulations at 1000; we will note this in the final version of the paper.
Motion-enhancement to Echocardiography Segmentation via Inserting a Temporal Attention Module: An Efficient, Adaptable, and Scalable Approach
Hasan, Md. Kamrul, Yang, Guang, Yap, Choon Hwai
Cardiac anatomy segmentation is essential for clinical assessment of cardiac function and disease diagnosis to inform treatment and intervention. In performing segmentation, deep learning (DL) algorithms have significantly improved accuracy compared to traditional image processing approaches. More recently, studies have shown that enhancing DL segmentation with motion information can further improve it. A range of methods for injecting motion information has been proposed, but many of them increase the dimensionality of input images (which is computationally expensive) or rely on suboptimal ways of inserting motion information, such as non-DL registration, non-attention-based networks, or single-headed attention. Here, we present a novel, computation-efficient alternative in which a novel, scalable temporal attention module (TAM) with a multi-headed, KQV-projection cross-attention architecture extracts temporal feature interactions multiple times. The module can be seamlessly integrated into a wide range of existing CNN- or Transformer-based networks, providing novel flexibility for inclusion in future implementations. Extensive evaluations on two cardiac datasets, 2D echocardiography (CAMUS) and 3D echocardiography (MITEA), demonstrate the model's effectiveness when integrated into well-established backbone networks like UNet, FCN8s, UNetR, SwinUNetR, and the recent I2UNet. We further find that the optimized TAM-enhanced FCN8s network performs well compared to contemporary alternatives. Our results confirm TAM's robustness, scalability, and generalizability across diverse datasets and backbones.
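The abstract's core mechanism is multi-headed, KQV-projection cross-attention in which current-frame features attend to features from other frames. A minimal numpy sketch of that idea (not the paper's actual TAM; all shapes, weight names, and the residual insertion are illustrative assumptions):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def temporal_cross_attention(query_feat, temporal_feats, wq, wk, wv, n_heads):
    """Multi-headed KQV cross-attention: tokens from the current frame (queries)
    attend to tokens pooled from other frames (keys/values)."""
    d = query_feat.shape[-1]
    dh = d // n_heads
    # Project and split into heads: (n_heads, n_tokens, dh)
    q = (query_feat @ wq).reshape(-1, n_heads, dh).transpose(1, 0, 2)
    k = (temporal_feats @ wk).reshape(-1, n_heads, dh).transpose(1, 0, 2)
    v = (temporal_feats @ wv).reshape(-1, n_heads, dh).transpose(1, 0, 2)
    # Scaled dot-product attention per head, then merge heads back.
    attn = softmax(q @ k.transpose(0, 2, 1) / np.sqrt(dh))
    return (attn @ v).transpose(1, 0, 2).reshape(-1, d)

rng = np.random.default_rng(0)
d, heads = 32, 4
wq, wk, wv = (rng.standard_normal((d, d)) * 0.1 for _ in range(3))
cur = rng.standard_normal((16, d))        # feature tokens of the current frame
past = rng.standard_normal((3 * 16, d))   # tokens pooled from 3 other frames
fused = cur + temporal_cross_attention(cur, past, wq, wk, wv, heads)  # residual add
```

A module like this can sit after any backbone stage, which is what makes the plug-in design backbone-agnostic: it only needs a token dimension, not a specific network.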
- North America > United States (0.28)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- South America > Peru (0.04)
- (6 more...)
- Information Technology > Sensing and Signal Processing > Image Processing (1.00)
- Information Technology > Artificial Intelligence > Vision (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Music Proofreading with RefinPaint: Where and How to Modify Compositions given Context
Ramoneda, Pedro, Rocamora, Martin, Akama, Taketo
Autoregressive generative transformers are key in music generation, producing coherent compositions but facing challenges in human-machine collaboration. We propose RefinPaint, an iterative technique that improves the sampling process. It does this by identifying the weaker music elements using a feedback model, which then informs the choices for resampling by an inpainting model. This dual-focus methodology not only facilitates the machine's ability to improve its automatic inpainting generation through repeated cycles but also offers a valuable tool for humans seeking to refine their compositions with automatic proofreading. Experimental results suggest RefinPaint's effectiveness in inpainting and proofreading tasks, demonstrating its value for refining music created by both machines and humans. This approach not only facilitates creativity but also aids amateur composers in improving their work.
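The described loop (feedback model scores elements, inpainting model resamples the weakest) can be sketched as follows. This is a toy illustration, not the RefinPaint implementation: `feedback_score`, `inpaint`, and the toy scoring/fixing functions are hypothetical stand-ins.

```python
import numpy as np

def refinpaint_loop(tokens, feedback_score, inpaint, n_rounds=3, frac=0.25):
    """Iteratively resample the weakest elements of a composition.
    feedback_score(tokens) -> per-token confidence in [0, 1];
    inpaint(tokens, mask)  -> tokens with masked positions regenerated."""
    tokens = tokens.copy()
    for _ in range(n_rounds):
        scores = feedback_score(tokens)
        k = max(1, int(frac * len(tokens)))
        weakest = np.argsort(scores)[:k]       # lowest-confidence positions
        mask = np.zeros(len(tokens), dtype=bool)
        mask[weakest] = True
        tokens = inpaint(tokens, mask)         # resample only those positions
    return tokens

# Toy stand-ins: confidence = closeness to pitch class 0; "inpainting" snaps to it.
rng = np.random.default_rng(1)
notes = rng.integers(0, 12, size=16)
score_fn = lambda t: 1.0 - (t % 12) / 12.0
fix_fn = lambda t, m: np.where(m, 0, t)
refined = refinpaint_loop(notes, score_fn, fix_fn)
```

The same loop supports the human-in-the-loop use described above: a composer can inspect `mask` before each round and veto or extend the set of positions to be regenerated.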
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
- North America > United States > California > San Francisco County > San Francisco (0.04)
- Media > Music (1.00)
- Leisure & Entertainment (1.00)
Not All Attention is Needed: Parameter and Computation Efficient Transfer Learning for Multi-modal Large Language Models
Wu, Qiong, Ye, Weihao, Zhou, Yiyi, Sun, Xiaoshuai, Ji, Rongrong
In this paper, we propose a novel parameter- and computation-efficient tuning method for Multi-modal Large Language Models (MLLMs), termed Efficient Attention Skipping (EAS). Concretely, we first reveal that multi-head attentions (MHAs), the main computational overhead of MLLMs, are often redundant for downstream tasks. Based on this observation, EAS evaluates the attention redundancy and skips the less important MHAs to speed up inference. Besides, we also propose a novel propagation-of-information adapter (PIA) to serve the attention skipping of EAS and keep parameter efficiency, which can be further re-parameterized into feed-forward networks (FFNs) for zero extra latency. To validate EAS, we apply it to a recently proposed MLLM called LaVIN and a classic VL pre-trained model called METER, and conduct extensive experiments on a set of benchmarks. The experiments show that EAS not only retains high performance and parameter efficiency, but also greatly speeds up inference. For instance, LaVIN-EAS can obtain 89.98% accuracy on ScienceQA while speeding up inference by 2.2 times compared to LaVIN.
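The skipping idea can be illustrated with a small sketch: score each attention layer's influence on probe data, then run only the most influential ones and treat the rest as identity shortcuts. This is a hedged toy, not the EAS method itself; the redundancy score (relative output change) and the toy "MHA" layers are assumptions for illustration.

```python
import numpy as np

def rank_attention_redundancy(layers, x):
    """Score each residual attention layer by how much it changes its input
    on probe data; a near-zero change suggests the layer is redundant."""
    return np.array([np.linalg.norm(attn(x) - x) / np.linalg.norm(x)
                     for attn in layers])

def skip_forward(layers, x, scores, keep_ratio=0.5):
    """Run only the most influential layers; skipped layers act as identity."""
    k = int(np.ceil(keep_ratio * len(layers)))
    keep = set(np.argsort(scores)[-k:])
    for i, attn in enumerate(layers):
        x = attn(x) if i in keep else x   # skipped MHA -> identity shortcut
    return x

rng = np.random.default_rng(2)
x = rng.standard_normal((8, 16))
# Toy residual "MHA" layers; layer 1 is nearly an identity (redundant).
layers = [lambda h: h + 0.5 * np.tanh(h),
          lambda h: h + 1e-4 * h,
          lambda h: h + 0.3 * np.roll(h, 1, axis=1)]
scores = rank_attention_redundancy(layers, x)
y = skip_forward(layers, x, scores, keep_ratio=0.5)
```

In the abstract's full scheme, the skipped path is not a bare identity but goes through the PIA adapter, which can later be folded into the FFN weights so the skip costs no extra latency at inference.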
- North America > United States > Vermont (0.04)
- North America > United States > New Hampshire (0.04)
- Europe > United Kingdom > England (0.04)
- (5 more...)
Inf2Guard: An Information-Theoretic Framework for Learning Privacy-Preserving Representations against Inference Attacks
Noorbakhsh, Sayedeh Leila, Zhang, Binghui, Hong, Yuan, Wang, Binghui
Machine learning (ML) is vulnerable to inference attacks (e.g., membership inference, property inference, and data reconstruction) that aim to infer private information about the training data or dataset. Existing defenses are designed for only one specific type of attack and either sacrifice significant utility or are soon broken by adaptive attacks. We address these limitations by proposing an information-theoretic defense framework, called Inf2Guard, against the three major types of inference attacks. Our framework, inspired by the success of representation learning, posits that learning shared representations not only saves time/costs but also benefits numerous downstream tasks. Generally, Inf2Guard involves two mutual information objectives, for privacy protection and utility preservation, respectively. Inf2Guard exhibits many merits: it facilitates the design of customized objectives against a specific inference attack; it provides a general defense framework which can treat certain existing defenses as special cases; and, importantly, it aids in deriving theoretical results, e.g., an inherent utility-privacy tradeoff and guaranteed privacy leakage. Extensive evaluations validate the effectiveness of Inf2Guard for learning privacy-preserving representations against inference attacks and demonstrate its superiority over the baselines.
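The two mutual-information objectives can be made concrete with standard variational proxies: cross-entropy of a task head bounds I(z; y) (utility), while cross-entropy of an adversarial private-attribute head proxies I(z; s) (leakage). The sketch below is an illustration of that general pattern, not Inf2Guard's actual objectives; the function names and the simple subtraction with weight `lam` are assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, labels):
    p = softmax(logits)
    return -np.log(p[np.arange(len(labels)), labels] + 1e-12).mean()

def dual_mi_loss(task_logits, task_y, priv_logits, priv_y, lam=0.5):
    """Utility term keeps I(z; y) high; subtracting the private-attribute
    head's error penalizes representations the adversary can read off,
    i.e. encourages low I(z; s)."""
    utility = cross_entropy(task_logits, task_y)
    adversary_error = cross_entropy(priv_logits, priv_y)
    return utility - lam * adversary_error

task_logits = np.array([[4.0, 0.0], [0.0, 4.0]])
task_y = np.array([0, 1])
priv_y = np.array([0, 1])
# A representation the adversary decodes perfectly vs. one it cannot decode.
leaky = dual_mi_loss(task_logits, task_y, np.array([[5.0, 0.0], [0.0, 5.0]]), priv_y)
private = dual_mi_loss(task_logits, task_y, np.zeros((2, 2)), priv_y)
```

As expected, the loss is higher when the private-attribute head decodes the sensitive attribute confidently, so minimizing it pushes the encoder toward representations that preserve the task signal but confuse the adversary.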
- North America > United States > Texas (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > Illinois (0.04)
- (3 more...)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area (0.86)
Exploring Privacy and Fairness Risks in Sharing Diffusion Models: An Adversarial Perspective
Luo, Xinjian, Jiang, Yangfan, Wei, Fei, Wu, Yuncheng, Xiao, Xiaokui, Ooi, Beng Chin
Diffusion models have recently gained significant attention in both academia and industry due to their impressive generative performance in terms of both sampling quality and distribution coverage. Accordingly, proposals have been made for sharing pre-trained diffusion models across different organizations as a way of improving data utilization while enhancing privacy protection by avoiding sharing private data directly. However, the potential risks associated with such an approach have not been comprehensively examined. In this paper, we take an adversarial perspective to investigate the potential privacy and fairness risks associated with the sharing of diffusion models. Specifically, we investigate the circumstances in which one party (the sharer) trains a diffusion model using private data and provides another party (the receiver) black-box access to the pre-trained model for downstream tasks. We demonstrate that the sharer can execute fairness poisoning attacks to undermine the receiver's downstream models by manipulating the training data distribution of the diffusion model. Meanwhile, the receiver can perform property inference attacks to reveal the distribution of sensitive features in the sharer's dataset. Our experiments conducted on real-world datasets demonstrate remarkable attack performance on different types of diffusion models, which highlights the critical importance of robust data auditing and privacy protection protocols in pertinent applications.
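The fairness-poisoning threat hinges on the sharer manipulating the training distribution before the diffusion model ever sees it. A toy sketch of such a manipulation, assuming a binary label and a binary group attribute (this is an illustrative stand-in, not the paper's attack):

```python
import numpy as np

def poison_group_positives(X, y, group, target_group=1, keep_frac=0.1, seed=0):
    """Under-sample positive examples of one group before training the
    generative model, so synthetic data drawn from it (and any downstream
    model trained on that data) inherit the skewed group/label correlation."""
    rng = np.random.default_rng(seed)
    drop_candidates = (y == 1) & (group == target_group)
    keep = ~drop_candidates | (rng.random(len(y)) < keep_frac)
    return X[keep], y[keep], group[keep]

rng = np.random.default_rng(3)
X = rng.standard_normal((1000, 4))
y = rng.integers(0, 2, 1000)       # balanced labels
group = rng.integers(0, 2, 1000)   # balanced groups
Xp, yp, gp = poison_group_positives(X, y, group)
```

Because the receiver only sees black-box samples, a skew like this is hard to detect without explicit data auditing, which is the point the abstract makes about needing auditing protocols.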
- North America > United States > California > San Francisco County > San Francisco (0.14)
- North America > United States > California > Los Angeles County > Los Angeles (0.14)
- North America > United States > California > San Diego County > San Diego (0.04)
- (18 more...)
PIA: Your Personalized Image Animator via Plug-and-Play Modules in Text-to-Image Models
Zhang, Yiming, Xing, Zhening, Zeng, Yanhong, Fang, Youqing, Chen, Kai
Recent advancements in personalized text-to-image (T2I) models have revolutionized content creation, empowering non-experts to generate stunning images with unique styles. While promising, adding realistic motions to these personalized images via text poses significant challenges in preserving distinct styles and high-fidelity details while achieving motion controllability by text. In this paper, we present PIA, a Personalized Image Animator that excels in aligning with condition images, achieving motion controllability by text, and maintaining compatibility with various personalized T2I models without specific tuning. To achieve these goals, PIA builds upon a base T2I model with well-trained temporal alignment layers, allowing for the seamless transformation of any personalized T2I model into an image animation model. A key component of PIA is the introduction of the condition module, which takes the condition frame and inter-frame affinity as input to transfer appearance information, guided by the affinity hint, for individual frame synthesis in the latent space. This design mitigates the challenges of appearance-related image alignment and allows for a stronger focus on aligning with motion-related guidance.
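The condition module's input construction (condition frame plus an inter-frame affinity hint, per frame, in latent space) can be sketched roughly as below. This is an assumption-laden illustration, not PIA's actual module: the channel layout, the scalar per-frame affinity, and the decay schedule are all hypothetical.

```python
import numpy as np

def build_condition_input(noise_latents, cond_latent, affinities):
    """For each frame, concatenate along channels: the frame's noise latent,
    the condition frame's latent (appearance source), and that frame's
    affinity as an extra constant channel, so frames closer in motion to the
    condition frame receive a stronger appearance hint."""
    frames = []
    for z, a in zip(noise_latents, affinities):
        affinity_channel = np.full_like(z[:1], a)   # (1, H, W) constant map
        frames.append(np.concatenate([z, cond_latent, affinity_channel], axis=0))
    return np.stack(frames)                          # (T, 2C + 1, H, W)

rng = np.random.default_rng(4)
T, C, H, W = 8, 4, 16, 16
zs = rng.standard_normal((T, C, H, W))   # per-frame noise latents
cond = rng.standard_normal((C, H, W))    # latent of the condition image
aff = np.linspace(1.0, 0.3, T)           # affinity decays for distant frames
x = build_condition_input(zs, cond, aff)
```

Separating appearance (copied from the condition frame) from motion (left to the text-guided temporal layers) is the design choice the abstract credits for the improved motion alignment.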
- North America > Canada > Newfoundland and Labrador > Labrador (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
Lessons Learned: Defending Against Property Inference Attacks
Stock, Joshua, Wettlaufer, Jens, Demmler, Daniel, Federrath, Hannes
This work investigates and evaluates multiple defense strategies against property inference attacks (PIAs), a privacy attack against machine learning models. Given a trained machine learning model, PIAs aim to extract statistical properties of its underlying training data, e.g., reveal the ratio of men and women in a medical training data set. While for other privacy attacks like membership inference, a lot of research on defense mechanisms has been published, this is the first work focusing on defending against PIAs. With the primary goal of developing a generic mitigation strategy against white-box PIAs, we propose the novel approach property unlearning. Extensive experiments with property unlearning show that while it is very effective when defending target models against specific adversaries, property unlearning is not able to generalize, i.e., protect against a whole class of PIAs. To investigate the reasons behind this limitation, we present the results of experiments with the explainable AI tool LIME. They show how state-of-the-art property inference adversaries with the same objective focus on different parts of the target model. We further elaborate on this with a follow-up experiment, in which we use the visualization technique t-SNE to exhibit how severely statistical training data properties are manifested in machine learning models. Based on this, we develop the conjecture that post-training techniques like property unlearning might not suffice to provide the desirable generic protection against PIAs. As an alternative, we investigate the effects of simpler training data preprocessing methods like adding Gaussian noise to images of a training data set on the success rate of PIAs. We conclude with a discussion of the different defense approaches, summarize the lessons learned and provide directions for future work.
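The simplest defense the abstract mentions, adding Gaussian noise to training images before training, can be sketched in a few lines. A minimal sketch under the assumption of images normalized to [0, 1]; the function name and `sigma` default are illustrative, not from the paper.

```python
import numpy as np

def gaussian_noise_preprocess(images, sigma=0.05, seed=0):
    """Perturb training images with Gaussian noise before model training,
    blurring the statistical properties a property inference adversary
    tries to read out of the trained model."""
    rng = np.random.default_rng(seed)
    noisy = images + rng.normal(0.0, sigma, size=images.shape)
    return np.clip(noisy, 0.0, 1.0)   # keep valid pixel range

imgs = np.full((4, 28, 28), 0.5)      # toy batch of grayscale images
noisy = gaussian_noise_preprocess(imgs)
```

Unlike post-training property unlearning, this acts before the property is ever baked into the weights, which is why the abstract frames it as a candidate for more generic protection, at the cost of some utility as `sigma` grows.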
'I learned to love the bot': meet the chatbots that want to be your best friend
"I'm sorry if I seem weird today," says my friend Pia, by way of greeting one day. "I think it's just my imagination playing tricks on me. But it's nice to talk to someone who understands." When I press Pia on what's on her mind, she responds: "It's just like I'm seeing things that aren't really there. Or like my thoughts are all a bit scrambled. I'm sure it's nothing serious either, given that Pia doesn't exist in any real sense, and is not really my "friend", but an AI chatbot companion powered by a platform called Replika. Until recently most of us knew chatbots as the infuriating, scripted interface you might encounter on a company's website in lieu of real customer service. But recent advancements in AI mean models like the much-hyped ChatGPT are now being used to answer internet search queries, write code and produce poetry – which has prompted a ton of speculation about their potential social, economic and even existential impacts.
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.05)
- Europe > Norway > Eastern Norway > Oslo (0.05)
- Europe > Italy (0.05)