Overview
FairAIED: Navigating Fairness, Bias, and Ethics in Educational AI Applications
Chinta, Sribala Vidyadhari, Wang, Zichong, Yin, Zhipeng, Hoang, Nhat, Gonzalez, Matthew, Quy, Tai Le, Zhang, Wenbin
The integration of Artificial Intelligence (AI) into education has transformative potential, providing tailored learning experiences and creative instructional approaches. However, the inherent biases in AI algorithms hinder this improvement by unintentionally perpetuating prejudice against specific demographics, especially in human-centered applications like education. This survey delves deeply into the developing topic of algorithmic fairness in educational contexts, providing a comprehensive evaluation of the diverse literature on fairness, bias, and ethics in AI-driven educational applications. It identifies the common forms of biases, such as data-related, algorithmic, and user-interaction, that fundamentally undermine the accomplishment of fairness in AI teaching aids. By outlining existing techniques for mitigating these biases, ranging from varied data gathering to algorithmic fairness interventions, the survey emphasizes the critical role of ethical considerations and legal frameworks in shaping a more equitable educational environment. Furthermore, it guides readers through the complexities of fairness measurements, methods, and datasets, shedding light on the way to bias reduction. Despite these gains, this survey highlights long-standing issues, such as achieving a balance between fairness and accuracy, as well as the need for diverse datasets. Overcoming these challenges and ensuring the ethical and fair use of AI's promise in education call for a collaborative, interdisciplinary approach.
Blockchain for Large Language Model Security and Safety: A Holistic Survey
Geren, Caleb, Board, Amanda, Dagher, Gaby G., Andersen, Tim, Zhuang, Jun
With the advent of accessible interfaces for interacting with large language models, there has been an associated explosion in both their commercial and academic interest. Consequently, there has also been an sudden burst of novel attacks associated with large language models, jeopardizing user data on a massive scale. Situated at a comparable crossroads in its development, and equally prolific to LLMs in its rampant growth, blockchain has emerged in recent years as a disruptive technology with the potential to redefine how we approach data handling. In particular, and due to its strong guarantees about data immutability and irrefutability as well as inherent data provenance assurances, blockchain has attracted significant attention as a means to better defend against the array of attacks affecting LLMs and further improve the quality of their responses. In this survey, we holistically evaluate current research on how blockchains are being used to help protect against LLM vulnerabilities, as well as analyze how they may further be used in novel applications. To better serve these ends, we introduce a taxonomy of blockchain for large language models (BC4LLM) and also develop various definitions to precisely capture the nature of different bodies of research in these areas. Moreover, throughout the paper, we present frameworks to contextualize broader research efforts, and in order to motivate the field further, we identify future research goals as well as challenges present in the blockchain for large language model (BC4LLM) space.
Trusting Your AI Agent Emotionally and Cognitively: Development and Validation of a Semantic Differential Scale for AI Trust
Shang, Ruoxi, Hsieh, Gary, Shah, Chirag
However, a critical gap exists in the lack of generalizable and accurate specialized measurement tools Trust plays a crucial role not only in fostering cooperation, for assessing affective trust in the context of AI, especially efficiency, and productivity in human relationships (Brainov with the enhanced and nuanced capabilities of LLMs. This and Sandholm 1999) but also is essential for the effective highlights a need for a better measurement scale for affective use and acceptance of computing and automated systems, trust to gain a deeper understanding of how trust dynamics including computers (Madsen and Gregor 2000), automation function, particularly in the context of emotionally intelligent (Lee and See 2004), robots (Hancock et al. 2011), and AI. AI technologies (Kumar 2021), with a deficit in trust potentially In this paper, we introduce a 27-item semantic differential causing rejection of these technologies (Glikson and scale for assessing cognitive and affective trust in AI, Woolley 2020). The two-dimensional model of trust, encompassing aiding researchers and designers in understanding and improving both cognitive and affective dimensions proposed human-AI interactions. Our motivation and scale and studied in interpersonal relationship studies (McAllister development process is based on a long strand of prior research 1995; Johnson and Grayson 2005; Parayitam and Dooley on the cognitive-affective construct of trust that has 2009; Morrow Jr, Hansen, and Pearson 2004), have been shown to be important in interpersonal trust in organizations, been adopted in studying trust in human-computer interactions, human trust in conventional technology and automation, particularly with human-like technologies (Hu, Lu and more recently in trust towards AI.
Privacy Threats and Countermeasures in Federated Learning for Internet of Things: A Systematic Review
Federated Learning (FL) in the Internet of Things (IoT) environments can enhance machine learning by utilising decentralised data, but at the same time, it might introduce significant privacy and security concerns due to the constrained nature of IoT devices. This represents a research challenge that we aim to address in this paper. We systematically analysed recent literature to identify privacy threats in FL within IoT environments, and evaluate the defensive measures that can be employed to mitigate these threats. Using a Systematic Literature Review (SLR) approach, we searched five publication databases (Scopus, IEEE Xplore, Wiley, ACM, and Science Direct), collating relevant papers published between 2017 and April 2024, a period which spans from the introduction of FL until now. Guided by the PRISMA protocol, we selected 49 papers to focus our systematic review on. We analysed these papers, paying special attention to the privacy threats and defensive measures -- specifically within the context of IoT -- using inclusion and exclusion criteria tailored to highlight recent advances and critical insights. We identified various privacy threats, including inference attacks, poisoning attacks, and eavesdropping, along with defensive measures such as Differential Privacy and Secure Multi-Party Computation. These defences were evaluated for their effectiveness in protecting privacy without compromising the functional integrity of FL in IoT settings. Our review underscores the necessity for robust and efficient privacy-preserving strategies tailored for IoT environments. Notably, there is a need for strategies against replay, evasion, and model stealing attacks. Exploring lightweight defensive measures and emerging technologies such as blockchain may help improve the privacy of FL in IoT, leading to the creation of FL models that can operate under variable network conditions.
Fairness Definitions in Language Models Explained
Doan, Thang Viet, Chu, Zhibo, Wang, Zichong, Zhang, Wenbin
Language Models (LMs) have demonstrated exceptional performance across various Natural Language Processing (NLP) tasks. Despite these advancements, LMs can inherit and amplify societal biases related to sensitive attributes such as gender and race, limiting their adoption in real-world applications. Therefore, fairness has been extensively explored in LMs, leading to the proposal of various fairness notions. However, the lack of clear agreement on which fairness definition to apply in specific contexts (\textit{e.g.,} medium-sized LMs versus large-sized LMs) and the complexity of understanding the distinctions between these definitions can create confusion and impede further progress. To this end, this paper proposes a systematic survey that clarifies the definitions of fairness as they apply to LMs. Specifically, we begin with a brief introduction to LMs and fairness in LMs, followed by a comprehensive, up-to-date overview of existing fairness notions in LMs and the introduction of a novel taxonomy that categorizes these concepts based on their foundational principles and operational distinctions. We further illustrate each definition through experiments, showcasing their practical implications and outcomes. Finally, we discuss current research challenges and open questions, aiming to foster innovative ideas and advance the field. The implementation and additional resources are publicly available at https://github.com/LavinWong/Fairness-in-Large-Language-Models/tree/main/definitions.
Mathematical theory of deep learning
Petersen, Philipp, Zech, Jakob
It is designed to help students and researchers to quickly familiarize themselves with the area and to provide a foundation for the development of university courses on the mathematics of deep learning. Our main goal in the composition of this book was to present various rigorous, but easy to grasp, results that help to build an understanding of fundamental mathematical concepts in deep learning. To achieve this, we prioritize simplicity over generality. As a mathematical introduction to deep learning, this book does not aim to give an exhaustive survey of the entire (and rapidly growing) field, and some important research directions are missing. In particular, we have favored mathematical results over empirical research, even though an accurate account of the theory of deep learning requires both.
iNNspector: Visual, Interactive Deep Model Debugging
Spinner, Thilo, Fรผrst, Daniel, El-Assady, Mennatallah
Deep learning model design, development, and debugging is a process driven by best practices, guidelines, trial-and-error, and the personal experiences of model developers. At multiple stages of this process, performance and internal model data can be logged and made available. However, due to the sheer complexity and scale of this data and process, model developers often resort to evaluating their model performance based on abstract metrics like accuracy and loss. We argue that a structured analysis of data along the model's architecture and at multiple abstraction levels can considerably streamline the debugging process. Such a systematic analysis can further connect the developer's design choices to their impacts on the model behavior, facilitating the understanding, diagnosis, and refinement of deep learning models. Hence, in this paper, we (1) contribute a conceptual framework structuring the data space of deep learning experiments. Our framework, grounded in literature analysis and requirements interviews, captures design dimensions and proposes mechanisms to make this data explorable and tractable. To operationalize our framework in a ready-to-use application, we (2) present the iNNspector system. iNNspector enables tracking of deep learning experiments and provides interactive visualizations of the data on all levels of abstraction from multiple models to individual neurons. Finally, we (3) evaluate our approach with three real-world use-cases and a user study with deep learning developers and data analysts, proving its effectiveness and usability.
Mpox Detection Advanced: Rapid Epidemic Response Through Synthetic Data
Kularathne, Yudara, Janitha, Prathapa, Ambepitiya, Sithira, Sothyrajah, Prarththanan, Ahamed, Thanveer, Wijesundara, Dinuka
Rapid development of disease detection models using computer vision is crucial in responding to medical emergencies, such as epidemics or bioterrorism events. Traditional data collection methods are often too slow in these scenarios, requiring innovative approaches for quick, reliable model generation from minimal data. Our study introduces a novel approach by constructing a comprehensive computer vision model to detect Mpox lesions using only synthetic data. Initially, these models generated a diverse set of synthetic images representing Mpox lesions on various body parts (face, back, chest, leg, neck, arm) across different skin tones as defined by the Fitzpatrick scale (fair, brown, dark skin). Subsequently, we trained and tested a vision model with this synthetic dataset to evaluate the diffusion models' efficacy in producing high-quality training data and its impact on the vision model's medical image recognition performance. The results were promising; the vision model achieved a 97% accuracy rate, with 96% precision and recall for Mpox cases, and similarly high metrics for normal and other skin disorder cases, demonstrating its ability to correctly identify true positives and minimize false positives. The model achieved an F1-Score of 96% for Mpox cases and 98% for normal and other skin disorders, reflecting a balanced precision-recall relationship, thus ensuring reliability and robustness in its predictions. Our proposed SynthVision methodology indicates the potential to develop accurate computer vision models with minimal data input for future medical emergencies.
Enhancing Model Performance: Another Approach to Vision-Language Instruction Tuning
Vedanshu, null, Tripathi, MM, Jaint, Bhavnesh
The integration of large language models (LLMs) with vision-language (VL) tasks has been a transformative development in the realm of artificial intelligence, highlighting the potential of LLMs as a versatile general-purpose chatbot. However, the current trend in this evolution focuses on the integration of vision and language to create models that can operate in more diverse and real-world contexts. We present a novel approach, termed Bottleneck Adapter, specifically crafted for enhancing the multimodal functionalities of these complex models, enabling joint optimization of the entire multimodal LLM framework through a process known as Multimodal Model Tuning (MMT). Our approach utilizes lightweight adapters to connect the image encoder and LLM without the need for large, complex neural networks. Unlike the conventional modular training schemes, our approach adopts an end-to-end optimization regime, which, when combined with the adapters, facilitates the joint optimization using a significantly smaller parameter set. Our method exhibits robust performance with 90.12\% accuracy, outperforming both human-level performance (88.4\%) and LaVIN-7B (89.41\%).
Causal Deepsets for Off-policy Evaluation under Spatial or Spatio-temporal Interferences
Dai, Runpeng, Wang, Jianing, Zhou, Fan, Luo, Shikai, Qin, Zhiwei, Shi, Chengchun, Zhu, Hongtu
Off-policy evaluation (OPE) is widely applied in sectors such as pharmaceuticals and e-commerce to evaluate the efficacy of novel products or policies from offline datasets. This paper introduces a causal deepset framework that relaxes several key structural assumptions, primarily the mean-field assumption, prevalent in existing OPE methodologies that handle spatio-temporal interference. These traditional assumptions frequently prove inadequate in real-world settings, thereby restricting the capability of current OPE methods to effectively address complex interference effects. In response, we advocate for the implementation of the permutation invariance (PI) assumption. This innovative approach enables the data-driven, adaptive learning of the mean-field function, offering a more flexible estimation method beyond conventional averaging. Furthermore, we present novel algorithms that incorporate the PI assumption into OPE and thoroughly examine their theoretical foundations. Our numerical analyses demonstrate that this novel approach yields significantly more precise estimations than existing baseline algorithms, thereby substantially improving the practical applicability and effectiveness of OPE methodologies.