Eau Claire
Gumbel Counterfactual Generation From Language Models
Ravfogel, Shauli, Svete, Anej, Snรฆbjarnarson, Vรฉsteinn, Cotterell, Ryan
Understanding and manipulating the causal generation mechanisms in language models is essential for controlling their behavior. Previous work has primarily relied on techniques such as representation surgery -- e.g., model ablations or manipulation of linear subspaces tied to specific concepts -- to \emph{intervene} on these models. To understand the impact of interventions precisely, it is useful to examine counterfactuals -- e.g., how a given sentence would have appeared had it been generated by the model following a specific intervention. We highlight that counterfactual reasoning is conceptually distinct from interventions, as articulated in Pearl's causal hierarchy. Based on this observation, we propose a framework for generating true string counterfactuals by reformulating language models as a structural equation model using the Gumbel-max trick, which we called Gumbel counterfactual generation. This reformulation allows us to model the joint distribution over original strings and their counterfactuals resulting from the same instantiation of the sampling noise. We develop an algorithm based on hindsight Gumbel sampling that allows us to infer the latent noise variables and generate counterfactuals of observed strings. Our experiments demonstrate that the approach produces meaningful counterfactuals while at the same time showing that commonly used intervention techniques have considerable undesired side effects.
Exploration and Evaluation of Bias in Cyberbullying Detection with Machine Learning
Root, Andrew, Jakubowski, Liam, Vanamala, Mounika
It is well known that the usefulness of a machine learning model is due to its ability to generalize to unseen data. This study uses three popular cyberbullying datasets to explore the effects of data, how it's collected, and how it's labeled, on the resulting machine learning models. The bias introduced from differing definitions of cyberbullying and from data collection is discussed in detail. An emphasis is made on the impact of dataset expansion methods, which utilize current data points to fetch and label new ones. Furthermore, explicit testing is performed to evaluate the ability of a model to generalize to unseen datasets through cross-dataset evaluation. As hypothesized, the models have a significant drop in the Macro F1 Score, with an average drop of 0.222. As such, this study effectively highlights the importance of dataset curation and cross-dataset testing for creating models with real-world applicability. The experiments and other code can be found at https://github.com/rootdrew27/cyberbullying-ml.
The Role of Emotions in Informational Support Question-Response Pairs in Online Health Communities: A Multimodal Deep Learning Approach
Jozani, Mohsen, Williams, Jason A., Aleroud, Ahmed, Bhagat, Sarbottam
This study explores the relationship between informational support seeking questions, responses, and helpfulness ratings in online health communities. We created a labeled data set of question-response pairs and developed multimodal machine learning and deep learning models to reliably predict informational support questions and responses. We employed explainable AI to reveal the emotions embedded in informational support exchanges, demonstrating the importance of emotion in providing informational support. This complex interplay between emotional and informational support has not been previously researched. The study refines social support theory and lays the groundwork for the development of user decision aids. Further implications are discussed.
VALID: a Validated Algorithm for Learning in Decentralized Networks with Possible Adversarial Presence
Bakshi, Mayank, Ghasvarianjahromi, Sara, Yakimenka, Yauhen, Beemer, Allison, Kosut, Oliver, Kliewer, Joerg
We introduce the paradigm of validated decentralized learning for undirected networks with heterogeneous data and possible adversarial infiltration. We require (a) convergence to a global empirical loss minimizer when adversaries are absent, and (b) either detection of adversarial presence or convergence to an admissible consensus model in their presence. This contrasts sharply with the traditional byzantine-robustness requirement of convergence to an admissible consensus irrespective of the adversarial configuration. A distinctive aspect of our study is a heterogeneity metric based on the norms of individual agents' gradients computed at the global empirical loss minimizer. Machine learning is increasingly reliant on data from a variety of distributed sources. As such, it may be difficult to ensure that the data which originates from these sources is trustworthy. Thus, there is a need to develop distributed and decentralized learning strategies that can respond to bad or even malicious data. However, worst-case or Byzantine resilience is an extremely strong requirement, that performance be maintained if a malicious adversary controls a subset of the processing nodes and takes any conceivable action. In practice, an adversary launching such an attack against a learning process requires tremendous resources which may not be worth the cost to influence the learned model. Thus, even though malicious adversaries are a threat, for the vast majority of the time, they are not present. An algorithm that maintains Byzantine robustness necessarily sacrifices performance when no adversaries are present.
Recent Advancements In The Field Of Deepfake Detection
Krueger, Natalie, Vanamala, Dr. Mounika, Dave, Dr. Rushit
A deepfake is a photo or video of a person whose image has been digitally altered or partially replaced with an image of someone else. Deepfakes have the potential to cause a variety of problems and are often used maliciously. A common usage is altering videos of prominent political figures and celebrities. These deepfakes can portray them making offensive, problematic, and/or untrue statements. Current deepfakes can be very realistic, and when used in this way, can spread panic and even influence elections and political opinions. There are many deepfake detection strategies currently in use but finding the most comprehensive and universal method is critical. So, in this survey we will address the problems of malicious deepfake creation and the lack of universal deepfake detection methods. Our objective is to survey and analyze a variety of current methods and advances in the field of deepfake detection.
RELDEC: Reinforcement Learning-Based Decoding of Moderate Length LDPC Codes
Habib, Salman, Beemer, Allison, Kliewer, Joerg
In this work we propose RELDEC, a novel approach for sequential decoding of moderate length low-density parity-check (LDPC) codes. The main idea behind RELDEC is that an optimized decoding policy is subsequently obtained via reinforcement learning based on a Markov decision process (MDP). In contrast to our previous work, where an agent learns to schedule only a single check node (CN) within a group (cluster) of CNs per iteration, in this work we train the agent to schedule all CNs in a cluster, and all clusters in every iteration. That is, in each learning step of RELDEC an agent learns to schedule CN clusters sequentially depending on a reward associated with the outcome of scheduling a particular cluster. We also modify the state space representation of the MDP, enabling RELDEC to be suitable for larger block length LDPC codes than those studied in our previous work. Furthermore, to address decoding under varying channel conditions, we propose agile meta-RELDEC (AM-RELDEC) that employs meta-reinforcement learning. The proposed RELDEC scheme significantly outperforms standard flooding and random sequential decoding for a variety of LDPC codes, including codes designed for 5G new radio.
Hybrid Deepfake Detection Utilizing MLP and LSTM
Mallet, Jacob, Krueger, Natalie, Vanamala, Mounika, Dave, Rushit
The growing reliance of society on social media for authentic information has done nothing but increase over the past years. This has only raised the potential consequences of the spread of misinformation. One of the growing methods in popularity is to deceive users using a deepfake. A deepfake is an invention that has come with the latest technological advancements, which enables nefarious online users to replace their face with a computer generated, synthetic face of numerous powerful members of society. Deepfake images and videos now provide the means to mimic important political and cultural figures to spread massive amounts of false information. Models that can detect these deepfakes to prevent the spread of misinformation are now of tremendous necessity. In this paper, we propose a new deepfake detection schema utilizing two deep learning algorithms: long short term memory and multilayer perceptron. We evaluate our model using a publicly available dataset named 140k Real and Fake Faces to detect images altered by a deepfake with accuracies achieved as high as 74.7%
Leveraging Deep Learning Approaches for Deepfake Detection: A Review
Tiwari, Aniruddha, Dave, Rushit, Vanamala, Mounika
Conspicuous progression in the field of machine learning and deep learning have led the jump of highly realistic fake media, these media oftentimes referred as deepfakes. Deepfakes are fabricated media which are generated by sophisticated AI that are at times very difficult to set apart from the real media. So far, this media can be uploaded to the various social media platforms, hence advertising it to the world got easy, calling for an efficacious countermeasure. Thus, one of the optimistic counter steps against deepfake would be deepfake detection. To undertake this threat, researchers in the past have created models to detect deepfakes based on ML/DL techniques like Convolutional Neural Networks. This paper aims to explore different methodologies with an intention to achieve a cost-effective model with a higher accuracy with different types of the datasets, which is to address the generalizability of the dataset.
Machine Learning Based Approach to Recommend MITRE ATT&CK Framework for Software Requirements and Design Specifications
Lasky, Nicholas, Hallis, Benjamin, Vanamala, Mounika, Dave, Rushit, Seliya, Jim
Engineering more secure software has become a critical challenge in the cyber world. It is very important to develop methodologies, techniques, and tools for developing secure software. To develop secure software, software developers need to think like an attacker through mining software repositories. These aim to analyze and understand the data repositories related to software development. The main goal is to use these software repositories to support the decision-making process of software development. There are different vulnerability databases like Common Weakness Enumeration (CWE), Common Vulnerabilities and Exposures database (CVE), and CAPEC. We utilized a database called MITRE. MITRE ATT&CK tactics and techniques have been used in various ways and methods, but tools for utilizing these tactics and techniques in the early stages of the software development life cycle (SDLC) are lacking. In this paper, we use machine learning algorithms to map requirements to the MITRE ATT&CK database and determine the accuracy of each mapping depending on the data split.