atm
Split Happens: Combating Advanced Threats with Split Learning and Function Secret Sharing
Khan, Tanveer, Budzys, Mindaugas, Michalas, Antonis
Split Learning (SL) splits a model into two distinct parts to help protect client data while enhancing Machine Learning (ML) processes. Though promising, SL has proven vulnerable to different attacks, raising concerns about how effective it may be in terms of data privacy. Recent works have shown promising results for securing SL through a novel paradigm named Function Secret Sharing (FSS), in which servers obtain shares of a function they compute and operate on a public input hidden with a random mask. However, these works fall short of addressing the rising number of attacks on SL. In SplitHappens, we extend the combination of FSS and SL to U-shaped SL. Similarly to other works, we exploit the benefits of SL to reduce the communication and computational costs of FSS. However, U-shaped SL provides a stronger security guarantee than previous works, allowing a client to keep the labels of the training data secret without having to share them with the server. Through this, we generalize the security analysis of previous works and extend it to different attack vectors, such as modern model inversion attacks as well as label inference attacks. We tested our approach with two different convolutional neural networks on different datasets. These experiments show the effectiveness of our approach in reducing training time as well as communication costs compared to simply using FSS, while matching prior accuracy.
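A minimal sketch of the U-shaped split described above: the client keeps the first and last layers (and therefore the labels), while the server only computes the middle of the network. Layer names, shapes, and the plain-tensor exchange are illustrative assumptions; the paper additionally hides the server's input behind Function Secret Sharing, which is omitted here.

```python
# Illustrative U-shaped Split Learning step (FSS masking omitted; shapes assumed).
import torch
import torch.nn as nn

client_head = nn.Sequential(nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.Flatten())
server_body = nn.Sequential(nn.Linear(16 * 28 * 28, 128), nn.ReLU())   # runs on the server
client_tail = nn.Sequential(nn.Linear(128, 10))                        # runs on the client

def u_shaped_step(x, y, loss_fn=nn.CrossEntropyLoss()):
    smashed = client_head(x)          # client -> server: intermediate activations only
    hidden = server_body(smashed)     # server never sees raw inputs or labels
    logits = client_tail(hidden)      # server -> client: activations come back
    return loss_fn(logits, y)         # labels stay on the client side

loss = u_shaped_step(torch.randn(8, 1, 28, 28), torch.randint(0, 10, (8,)))
loss.backward()                       # gradients flow back through both hops
```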
- Europe > Finland > Pirkanmaa > Tampere (0.04)
- North America > United States (0.04)
- Asia > China > Hong Kong (0.04)
- (2 more...)
Stable Diffusion is Unstable
Recently, text-to-image models have been thriving. Despite their powerful generative capacity, our research has uncovered a lack of robustness in this generation process. Specifically, the introduction of small perturbations to the text prompts can result in the blending of primary subjects with other categories or their complete disappearance in the generated images. In this paper, we propose Auto-attack on Text-to-image Models (ATM), a gradient-based approach, to effectively and efficiently generate such perturbations. By learning a Gumbel Softmax distribution, we can make the discrete process of word replacement or extension continuous, thus ensuring the differentiability of the perturbation generation.
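The Gumbel-Softmax relaxation mentioned above is a standard trick for making a discrete choice (here, which word to substitute) differentiable. The sketch below shows only that generic relaxation, not the paper's full ATM attack; the vocabulary size, temperature, embedding table, and stand-in objective are assumptions.

```python
# Generic Gumbel-Softmax sketch: a discrete token choice becomes a soft,
# differentiable mixture over the vocabulary (sizes and objective are illustrative).
import torch
import torch.nn.functional as F

vocab_size, embed_dim, tau = 1000, 32, 0.5
logits = torch.zeros(vocab_size, requires_grad=True)      # learnable per-position logits
embeddings = torch.randn(vocab_size, embed_dim)           # frozen token embeddings

def sample_soft_token(logits, tau):
    gumbel = -torch.log(-torch.log(torch.rand_like(logits) + 1e-20) + 1e-20)
    probs = F.softmax((logits + gumbel) / tau, dim=-1)     # soft one-hot over tokens
    return probs @ embeddings                              # differentiable "word" embedding

soft_embedding = sample_soft_token(logits, tau)
loss = soft_embedding.norm()        # stand-in for the attack objective
loss.backward()                     # gradients reach the discrete choice via the relaxation
print(logits.grad.shape)            # torch.Size([1000])
```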
Enhancing Precision of Automated Teller Machines Network Quality Assessment: Machine Learning and Multi Classifier Fusion Approaches
Safarzadeh, Alireza, Jamali, Mohammad Reza, Moshiri, Behzad
The performance of Automated Teller Machines (ATMs) is vital not only to ensuring customer satisfaction but also to maintaining operational efficiency for financial institutions. The Key Performance Indicators (KPIs) involved include availability, reliability, Mean Time to Failure (MTTF), and Mean Time to Repair (MTTR), which together determine the quality and reliability of the entire ATM network and assist banking managers. Availability is the proportion of time an ATM network remains 'in-service' rather than 'out-of-service', indicating its operational uptime. Reliability, given by the expression exp(-t/MTTF), is the likelihood that an ATM will operate without failure for a period t. MTTF is the total in-service time divided by the number of out-of-service occurrences. MTTR is the mean time needed to repair an ATM and return it to service. Banking managers base decisions on how well these KPIs are forecast. However, errors in the measurement of ATM status can introduce significant decisional limitations: an ATM may be out of service without the system detecting it, or a false alarm may flag a properly functioning machine as out of order. Such errors can lead to unnecessary maintenance interventions, higher operational costs, and reduced machine availability, affecting customer trust and financial performance.
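A small sketch of the KPIs as defined above, computed from hypothetical event-log totals. The numbers and variable names are assumptions for illustration, not values from the paper.

```python
# KPIs from the abstract: availability, MTTF, MTTR, and reliability exp(-t/MTTF).
import math

in_service_hours = 9_500.0        # total time the ATM was operational (assumed)
out_of_service_hours = 500.0      # total downtime (assumed)
failures = 20                     # number of out-of-service occurrences
total_repair_hours = 500.0        # time spent restoring service

availability = in_service_hours / (in_service_hours + out_of_service_hours)
mttf = in_service_hours / failures                 # mean time to failure
mttr = total_repair_hours / failures               # mean time to repair
reliability_72h = math.exp(-72.0 / mttf)           # P(no failure within t = 72 hours)

print(f"availability={availability:.3f}, MTTF={mttf:.0f}h, "
      f"MTTR={mttr:.0f}h, R(72h)={reliability_72h:.3f}")
```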
- Asia > Middle East > Iran > Tehran Province > Tehran (0.04)
- North America > United States > Massachusetts > Suffolk County > Boston (0.04)
- North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
- Banking & Finance (1.00)
- Health & Medicine > Therapeutic Area (0.69)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.94)
- Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.94)
ATM: Improving Model Merging by Alternating Tuning and Merging
Zhou, Luca, Solombrino, Daniele, Crisostomi, Donato, Bucarelli, Maria Sofia, Silvestri, Fabrizio, Rodolà, Emanuele
Model merging has recently emerged as a cost-efficient paradigm for multi-task learning. Among current approaches, task arithmetic stands out for its simplicity and effectiveness. In this paper, we motivate the effectiveness of task vectors by linking them to multi-task gradients. We show that in a single-epoch scenario, task vectors are mathematically equivalent to the gradients obtained via gradient descent in a multi-task setting, and still approximate these gradients in subsequent epochs. Furthermore, we show that task vectors perform optimally when equality is maintained, and their effectiveness is largely driven by the first epoch's gradient. Building on this insight, we propose viewing model merging as a single step in an iterative process that Alternates between Tuning and Merging (ATM). This method acts as a bridge between model merging and multi-task gradient descent, achieving state-of-the-art results with the same data and computational requirements. We extensively evaluate ATM across diverse settings, achieving up to 20% higher accuracy in computer vision and NLP tasks, compared to the best baselines. Finally, we provide both empirical and theoretical support for its effectiveness, demonstrating increased orthogonality between task vectors and proving that ATM minimizes an upper bound on the loss obtained by jointly finetuning all tasks.
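A minimal sketch of task arithmetic and the alternating tune/merge loop described above: each task vector is the difference between finetuned and base weights, and merging adds a scaled average of those vectors back to the base before the next tuning round. The scaling coefficient, number of rounds, and the toy `fake_finetune` stand-in are assumptions, not the paper's setup.

```python
# Task-vector merging plus an alternating tune/merge loop (illustrative only).
import copy
import torch

def merge(base_state, tuned_states, lam=1.0):
    """theta_base + lam * mean of task vectors (theta_task - theta_base)."""
    merged = copy.deepcopy(base_state)
    for key in merged:
        task_vectors = [tuned[key] - base_state[key] for tuned in tuned_states]
        merged[key] = base_state[key] + lam * torch.stack(task_vectors).mean(dim=0)
    return merged

def alternate_tune_and_merge(model, task_loaders, finetune, rounds=3):
    for _ in range(rounds):
        base_state = copy.deepcopy(model.state_dict())
        tuned_states = [finetune(copy.deepcopy(model), loader).state_dict()
                        for loader in task_loaders]             # short tuning per task
        model.load_state_dict(merge(base_state, tuned_states))  # merge, then repeat
    return model

# Tiny demo with a linear model and a placeholder "finetune" that nudges weights.
model = torch.nn.Linear(4, 2)
def fake_finetune(m, loader):
    with torch.no_grad():
        for p in m.parameters():
            p.add_(0.01 * torch.randn_like(p))
    return m

alternate_tune_and_merge(model, task_loaders=[None, None], finetune=fake_finetune)
```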
- North America > United States > Ohio > Franklin County > Columbus (0.04)
- North America > United States > Maryland > Baltimore (0.04)
- North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
- (2 more...)
Optimal Transportation by Orthogonal Coupling Dynamics
Sadr, Mohsen, Esfehani, Peyman Mohajerin, Gorji, Hossein
Many numerical algorithms and learning tasks rest on the solution of the Monge-Kantorovich problem and the corresponding Wasserstein distances. While the natural approach is to treat the problem as an infinite-dimensional linear program, such a methodology severely limits computational performance due to polynomial scaling with respect to the sample size along with intensive memory requirements. We propose a novel alternative framework to address the Monge-Kantorovich problem based on a projection-type gradient descent scheme. The micro-dynamics is built on the notion of conditional expectation, where the connection with opinion dynamics is explored and leveraged to build compact numerical schemes. We demonstrate that the devised dynamics recovers random maps with favourable computational performance. Along with the theoretical insight, the provided dynamics paves the way for innovative approaches to constructing numerical schemes for computing optimal transport maps as well as Wasserstein distances.
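For context, the baseline the abstract refers to is the discrete Monge-Kantorovich problem posed as a linear program: minimize the transport cost over couplings whose row and column sums match the two marginals. The sketch below shows that baseline for two tiny point clouds; the sample sizes, points, and squared-Euclidean cost are assumptions, and this is the approach whose scaling the paper's dynamics aims to avoid, not the paper's method.

```python
# Discrete Monge-Kantorovich problem as a linear program (baseline, illustrative).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
x, y = rng.normal(size=(5, 2)), rng.normal(size=(6, 2))       # two small point clouds
a, b = np.full(5, 1 / 5), np.full(6, 1 / 6)                   # uniform marginals
C = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)            # squared-Euclidean cost

# Equality constraints: row sums of the coupling equal a, column sums equal b.
n, m = C.shape
A_eq = np.zeros((n + m, n * m))
for i in range(n):
    A_eq[i, i * m:(i + 1) * m] = 1.0
for j in range(m):
    A_eq[n + j, j::m] = 1.0

res = linprog(C.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]), bounds=(0, None))
coupling = res.x.reshape(n, m)                 # optimal transport plan between the samples
print("squared 2-Wasserstein distance:", res.fun)
```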
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Switzerland (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- (2 more...)
- Research Report > Promising Solution (0.34)
- Overview > Innovation (0.34)
Distilling Reasoning Ability from Large Language Models with Adaptive Thinking
Chen, Xiaoshu, Zhou, Sihang, Liang, Ke, Liu, Xinwang
Chain-of-thought finetuning (cot-finetuning) aims to endow small language models (SLMs) with reasoning ability so as to improve their performance on specific tasks by letting them imitate the reasoning procedure of large language models (LLMs) rather than simply predicting the answers. Most existing cot-finetuning methods adopt a pre-thinking mechanism, in which the SLM generates a rationale before providing an answer. This mechanism enables the SLM to analyze and think about complex questions, but it also makes answer correctness highly sensitive to minor errors in the rationale. Therefore, we propose a robust post-thinking mechanism that generates the answer before the rationale. Thanks to this answer-first setting, 1) the answer escapes the adverse effects caused by minor errors in the rationale; 2) the rationale serves as an error amplifier for the answer, which makes the SLM focus on learning hard samples; and 3) inference efficiency also benefits, since users can stop generation as soon as the answer is produced. However, although the post-thinking mechanism brings many advantages and improves the overall performance of the SLM on specific tasks, it may lose the ability to think about questions and decompose complex questions into simple sub-questions, compared to the pre-thinking mechanism. Therefore, a plug-and-play adaptive-thinking mechanism is proposed with the aid of soft prompt tuning to integrate the merits of the pre-thinking and post-thinking mechanisms: a perception module is introduced to adaptively prompt the SLM to answer first or think first based on the perceived complexity of the question. Extensive experiments are conducted across 12 reasoning tasks and 2 representative language models to demonstrate the effectiveness of the proposed mechanism.
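To make the pre-thinking / post-thinking distinction concrete, the sketch below assembles the two kinds of finetuning targets and routes between them with a simple heuristic. The tags, separator, and the length-based heuristic standing in for the paper's learned perception module are all illustrative assumptions, not the paper's exact format.

```python
# Illustrative construction of pre-thinking vs. post-thinking targets, with a
# toy heuristic standing in for the paper's learned perception module.
def pre_thinking_target(rationale: str, answer: str) -> str:
    return f"Rationale: {rationale}\nAnswer: {answer}"      # think first, then answer

def post_thinking_target(rationale: str, answer: str) -> str:
    return f"Answer: {answer}\nRationale: {rationale}"      # answer first; rationale follows

def adaptive_target(question: str, rationale: str, answer: str) -> str:
    # Route long or multi-step questions to pre-thinking, simple ones to post-thinking.
    looks_complex = len(question.split()) > 30 or "step" in question.lower()
    build = pre_thinking_target if looks_complex else post_thinking_target
    return build(rationale, answer)

print(adaptive_target("What is 2 + 3?", "2 plus 3 equals 5.", "5"))
```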
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- North America > Canada > Ontario > Toronto (0.05)
- North America > United States > Florida > Palm Beach County (0.04)
- (15 more...)
Detecting Anomalous Network Communication Patterns Using Graph Convolutional Networks
Vaisman, Yizhak, Katz, Gilad, Elovici, Yuval, Shabtai, Asaf
To protect an organization's endpoints from sophisticated cyberattacks, advanced detection methods are required. In this research, we present GCNetOmaly, a graph convolutional network (GCN)-based variational autoencoder (VAE) anomaly detector trained on data that include connection events among internal and external machines. As input, the proposed GCN-based VAE model receives two matrices: (i) the normalized adjacency matrix, which represents the connections among the machines, and (ii) the feature matrix, which includes various features (demographic, statistical, process-related, and Node2vec structural features) used to profile the individual nodes/machines. After training the model on data collected for a predefined time window, the model is applied to the same data; the reconstruction score obtained by the model for a given machine then serves as the machine's anomaly score. GCNetOmaly was evaluated on real, large-scale data logged by Carbon Black EDR from a large financial organization's automated teller machines (ATMs), as well as their communication with Active Directory (AD) servers, in two setups: unsupervised and supervised. The results of our evaluation demonstrate GCNetOmaly's effectiveness in detecting anomalous machine behavior on unsupervised data.
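A minimal sketch of the two inputs described above (symmetrically normalized adjacency plus node features) and of using reconstruction error as a per-machine anomaly score. A plain autoencoder stands in for the paper's GCN-based VAE, and all shapes and the random graph are assumptions.

```python
# Normalized adjacency + feature matrix, scored by reconstruction error (illustrative).
import torch
import torch.nn as nn

def normalize_adjacency(A):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2}, as used by GCNs."""
    A_hat = A + torch.eye(A.shape[0])
    d_inv_sqrt = A_hat.sum(dim=1).pow(-0.5)
    return d_inv_sqrt[:, None] * A_hat * d_inv_sqrt[None, :]

n_machines, n_features = 50, 16
A = (torch.rand(n_machines, n_machines) > 0.9).float()
A = ((A + A.T) > 0).float()                      # undirected connection events (assumed)
X = torch.randn(n_machines, n_features)          # per-machine profile features (assumed)

A_norm = normalize_adjacency(A)
H = A_norm @ X                                   # one graph-convolution-style propagation

autoencoder = nn.Sequential(nn.Linear(n_features, 8), nn.ReLU(), nn.Linear(8, n_features))
reconstruction = autoencoder(H)
anomaly_score = ((reconstruction - H) ** 2).mean(dim=1)   # higher = more anomalous machine
print(anomaly_score.topk(5).indices)             # candidate anomalous machines
```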
- Information Technology > Security & Privacy (1.00)
- Government > Military > Cyberwarfare (0.49)