FairFLRep: Fairness aware fault localization and repair of Deep Neural Networks
Openja, Moses, Arcaini, Paolo, Khomh, Foutse, Ishikawa, Fuyuki
Deep neural networks (DNNs) are being utilized in various aspects of our daily lives, including high-stakes decision-making applications that impact individuals. However, these systems reflect and amplify bias from the data used during training and testing, potentially resulting in biased behavior and inaccurate decisions; for instance, white and black sub-populations may experience different misclassification rates. Effectively and efficiently identifying and correcting such biased behavior in DNNs remains a challenge. This paper introduces FairFLRep, an automated fairness-aware fault localization and repair technique that identifies and corrects potentially bias-inducing neurons in DNN classifiers. FairFLRep focuses on adjusting neuron weights associated with sensitive attributes, such as race or gender, that contribute to unfair decisions. By analyzing the input-output relationships within the network, FairFLRep corrects neurons responsible for disparities in predictive quality parity. We evaluate FairFLRep on four image classification datasets using two DNN classifiers, and four tabular datasets with a DNN model. The results show that FairFLRep consistently outperforms existing methods in improving fairness while preserving accuracy. An ablation study confirms the importance of considering fairness during both fault localization and repair stages. Our findings also show that FairFLRep is more efficient than the baseline approaches in repairing the network.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Oceania > Australia > Victoria > Melbourne (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- (17 more...)
- Law (1.00)
- Health & Medicine (1.00)
- Education (0.67)
- Government > Regional Government (0.45)
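As an illustration of the kind of disparity FairFLRep targets, predictive quality parity can be checked by comparing per-group misclassification rates. The sketch below is not from the paper; the function and variable names are our own.

```python
def misclassification_disparity(preds, labels, groups):
    """Largest gap in misclassification rate across sensitive groups."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        errors = sum(preds[i] != labels[i] for i in idx)
        rates[g] = errors / len(idx)
    # 0.0 means parity; larger values mean one group is misclassified more often
    return max(rates.values()) - min(rates.values())
```

A repair method like FairFLRep would aim to drive this gap toward zero while keeping overall accuracy intact.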
ForeCal: Random Forest-based Calibration for DNNs
Deep neural network (DNN) based classifiers do extremely well in discriminating between observations, resulting in higher ROC AUC and accuracy metrics, but their outputs are often miscalibrated with respect to true event likelihoods. Post-hoc calibration algorithms are often used to calibrate the outputs of these classifiers. Methods like Isotonic regression, Platt scaling, and Temperature scaling have been shown to be effective in some cases but are limited by their parametric assumptions and/or their inability to capture complex non-linear relationships. We propose ForeCal - a novel post-hoc calibration algorithm based on Random forests. ForeCal exploits two unique properties of Random forests: the ability to enforce weak monotonicity and range-preservation. It is more powerful in achieving calibration than current state-of-the-art methods, is non-parametric, and can incorporate exogenous information as features to learn a better calibration function. Through experiments on 43 diverse datasets from the UCI ML repository, we show that ForeCal outperforms existing methods in terms of Expected Calibration Error (ECE) with minimal impact on the discriminative power of the base DNN as measured by AUC.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
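Expected Calibration Error, the metric used to evaluate ForeCal above, bins predictions by confidence and averages the per-bin gap between mean confidence and observed accuracy. A minimal stdlib sketch with equal-width bins (this is the metric only, not the ForeCal algorithm itself):

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE for binary predictions: sum over bins of (bin size / N) * |accuracy - confidence|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)   # mean predicted probability
        acc = sum(y for _, y in b) / len(b)    # observed positive rate
        ece += (len(b) / n) * abs(acc - conf)
    return ece
```

A perfectly calibrated model scores 0; a calibrator such as ForeCal is judged by how much it lowers this number without hurting AUC.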
Open Set Recognition for Random Forest
Feng, Guanchao, Desai, Dhruv, Pasquali, Stefano, Mehta, Dhagash
In the open-set settings, classifiers are required to not only accurately classify new instances of known classes (whose samples are observed during training) but also effectively recognize the samples from unknown classes. In a nutshell, open-set classifiers are capable of making the "none of the above" decision with respect to known classes. This is known as open-set recognition (OSR) [38] and has received significant attention in recent years [11, 47]. Since many learning tasks in finance are naturally classification tasks, for instance, company classifications using Global Industry Classification Standard (GICS), fund categorization, risk profiling, economic scenario classifications, etc., where often a new company, fund or economic scenario may not belong to any of the existing categories, casting these recognition tasks as OSR instead of traditional closed-set classification tasks is more appropriate. In many real-world classification or recognition tasks, it is often difficult to collect training examples that exhaust all possible classes due to, for example, incomplete knowledge during training or ever-changing regimes. Therefore, samples from unknown/novel classes may be encountered in testing/deployment. In such scenarios, the classifiers should be able to i) perform classification on known classes, and at the same time, ii) identify samples from unknown classes. Although random forest has been an extremely successful framework as a general-purpose classification (and regression) method, in practice, it usually operates under the closed-set assumption and is not able to identify samples from new classes when run out of the box.
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
- (2 more...)
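A common baseline for open-set behavior in a random forest is to threshold the ensemble's vote fraction and emit "none of the above" when no known class is sufficiently supported. A toy sketch (the threshold value and names are illustrative, not the paper's method):

```python
def open_set_predict(class_votes, threshold=0.5):
    """class_votes: dict mapping known class -> fraction of trees voting for it.

    Returns the top class if its support clears the threshold,
    otherwise the "none of the above" decision for open-set recognition.
    """
    best_class, support = max(class_votes.items(), key=lambda kv: kv[1])
    return best_class if support >= threshold else "unknown"
```

When votes are spread thinly across known classes, the sample is flagged as coming from an unknown class rather than being forced into the closed set.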
A3Rank: Augmentation Alignment Analysis for Prioritizing Overconfident Failing Samples for Deep Learning Models
Wei, Zhengyuan, Wang, Haipeng, Zhou, Qilin, Chan, W. K.
Wrong predictions can lead to various problems in different application domains, e.g., improper medical diagnosis [25] and traffic accidents [16]. Enhancing the DL application systems by reducing wrong predictions of DL models in producing outputs is desirable. Studies [9, 51, 52] have shown that DL models are vulnerable to operational input samples that can lead them to produce incorrect predictions in natural scenarios [52], and the prediction confidences of many such failing samples exceed those well-intended guarding confidence levels [54]. For example, strong sunshine may cause the camera of a self-driving car to capture an image full of white pixels, resulting in a prediction failure with high confidence. A major bottleneck in developing DL applications is detecting these overconfident failures from their deployed DL application systems. To reduce unreliable predictions, many real-world machine-learning-based application systems are equipped with rejectors to discard uncertain decisions [17]. In DL application systems, many existing techniques [6, 17, 45] construct their rejectors for DL models to address the incorrect prediction problem. For example, many recent studies [2, 8, 42, 49] have been conducted to enhance the defense ability of DL models against out-of-distribution (OOD) samples from unknown classes or artificial examples that are very likely to guide DL models to yield failures.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Hong Kong > Kowloon (0.04)
- (10 more...)
- Transportation > Ground > Road (0.48)
- Transportation > Passenger (0.34)
- Information Technology > Robotics & Automation (0.34)
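The rejectors discussed above typically withhold a prediction when the model's confidence falls below a guard level; overconfident failures are exactly the samples that slip past this check. A minimal softmax-thresholding sketch (the threshold tau is illustrative, not from the paper):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reject_or_predict(logits, tau=0.9):
    """Return (class_index, confidence) if the top softmax probability
    clears tau, else (None, confidence) to signal rejection."""
    probs = softmax(logits)
    conf = max(probs)
    return (probs.index(conf) if conf >= tau else None, conf)
```

An overconfident failing sample is one where `conf` clears `tau` yet the predicted class is wrong, which is why A3Rank prioritizes such samples beyond what a confidence rejector alone can catch.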
Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs
Bakumenko, Alexander, Hlaváčková-Schindler, Kateřina, Plant, Claudia, Hubig, Nina C.
Detecting anomalies in general ledger data is of utmost importance to ensure trustworthiness of financial records. Financial audits increasingly rely on machine learning (ML) algorithms to identify irregular or potentially fraudulent journal entries, each characterized by a varying number of transactions. In machine learning, heterogeneity in feature dimensions adds significant complexity to data analysis. In this paper, we introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings. To encode non-semantic categorical data from real-world financial records, we tested 3 pre-trained general purpose sentence-transformer models. For the downstream classification task, we implemented and evaluated 5 optimized ML models including Logistic Regression, Random Forest, Gradient Boosting Machines, Support Vector Machines, and Neural Networks. Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines, in selected settings even by a large margin. The findings further underscore the effectiveness of LLMs in enhancing anomaly detection in financial journal entries, particularly by tackling feature sparsity. We discuss a promising perspective on using LLM embeddings for non-semantic data in the financial context and beyond.
- Europe > Austria > Vienna (0.14)
- North America > United States > South Carolina > Charleston County > Charleston (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Research Report > Promising Solution (0.49)
- Research Report > New Finding (0.48)
- Overview > Innovation (0.34)
- Banking & Finance (1.00)
- Information Technology > Security & Privacy (0.68)
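Once journal entries are embedded, one simple downstream anomaly score is distance from the centroid of the embedding cloud. The sketch below assumes embeddings are already computed and is a generic baseline, not the paper's evaluated pipeline:

```python
import math

def anomaly_scores(embeddings):
    """Score each embedding vector by its Euclidean distance
    from the centroid; larger scores suggest anomalous entries."""
    dim = len(embeddings[0])
    n = len(embeddings)
    centroid = [sum(v[i] for v in embeddings) / n for i in range(dim)]
    return [math.dist(v, centroid) for v in embeddings]
```

In practice one would feed the embeddings to the classifiers named in the abstract (Logistic Regression, Random Forest, etc.); the centroid distance just illustrates how dense fixed-size embeddings sidestep the feature-sparsity problem of raw categorical fields.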
Predicting challenge moments from students' discourse: A comparison of GPT-4 to two traditional natural language processing approaches
Suraworachet, Wannapon, Seon, Jennifer, Cukurova, Mutlu
Effective collaboration requires groups to strategically regulate themselves to overcome challenges. Research has shown that groups may fail to regulate due to differences in members' perceptions of challenges, which may benefit from external support. In this study, we investigated the potential of three distinct natural language processing models for challenge detection and challenge dimension identification (cognitive, metacognitive, emotional and technical/other challenges) from student discourse: an expert knowledge rule-based model, a supervised machine learning (ML) model and a Large Language Model (LLM). The results show that the supervised ML and the LLM approaches performed considerably well in both tasks, in contrast to the rule-based approach, whose efficacy heavily relies on the features engineered by experts. The paper provides an extensive discussion of the three approaches' performance for automated detection and support of students' challenge moments in collaborative learning activities. It argues that, although LLMs provide many advantages, they are unlikely to be the panacea to issues of the detection and feedback provision of socially shared regulation of learning due to their lack of reliability, as well as issues of validity evaluation, privacy and confabulation. We conclude the paper with a discussion on additional considerations, including model transparency, to explore feasible and meaningful analytical feedback for students and educators using LLMs.
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Instructional Material (1.00)
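The expert rule-based approach compared in the study can be approximated by keyword rules mapped to challenge dimensions. The keyword lists below are invented for illustration; they are not the study's engineered features:

```python
# Hypothetical keyword rules per challenge dimension (illustrative only).
RULES = {
    "cognitive": ["don't understand", "confusing"],
    "emotional": ["frustrated", "stressed"],
    "technical": ["crash", "won't load"],
}

def detect_challenges(utterance):
    """Return the sorted list of challenge dimensions whose
    keywords appear (as substrings) in the utterance."""
    u = utterance.lower()
    return sorted({dim for dim, kws in RULES.items()
                   for kw in kws if kw in u})
```

The brittleness is visible immediately: any challenge phrased without the listed keywords is missed, which is the dependence on expert feature engineering the abstract highlights relative to the ML and LLM approaches.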
Learning to Find Pictures of People
Finding articulated objects, like people, in pictures presents a particularly difficult object recognition problem. We show how to find people by first finding putative body segments and then constructing assemblies of those segments. Since a reasonable model of a person requires assembling many segments, presenting every candidate group to a classifier is impractical. Instead, the search can be pruned by using projected versions of a classifier that accepts groups corresponding to people. We describe an efficient projection algorithm for one popular classifier, and demonstrate that our approach can be used to determine whether images of real scenes contain people.
Boosted Dyadic Kernel Discriminants
We introduce a novel learning algorithm for binary classification with hyperplane discriminants based on pairs of training points from opposite classes (dyadic hypercuts). This algorithm is further extended to nonlinear discriminants using kernel functions satisfying Mercer's conditions. An ensemble of simple dyadic hypercuts is learned incrementally by means of a confidence-rated version of AdaBoost, which provides a sound strategy for searching through the finite set of hypercut hypotheses. In experiments with real-world datasets from the UCI repository, the generalization performance of the hypercut classifiers was found to be comparable to that of SVMs and k-NN classifiers. Furthermore, the computational cost of classification (at run time) was found to be similar to, or better than, that of SVMs.
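AdaBoost, used here to assemble the hypercut ensemble, reweights training samples each round so that misclassified points gain weight and the next weak hypothesis focuses on them. A minimal single-round sketch for binary labels in {-1, +1} (this is the discrete AdaBoost update, a simplification of the confidence-rated variant named in the abstract):

```python
import math

def adaboost_round(weights, preds, labels):
    """One boosting round: weighted error eps, hypothesis weight
    alpha = 0.5 * ln((1 - eps) / eps), and renormalized sample weights."""
    eps = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Correctly classified samples shrink, misclassified ones grow.
    new = [w * math.exp(-alpha if p == y else alpha)
           for w, p, y in zip(weights, preds, labels)]
    s = sum(new)
    return alpha, [w / s for w in new]
```

A classic property of this update is that after renormalization, the misclassified samples jointly carry exactly half the total weight, forcing the next hypothesis (here, the next hypercut) to attend to them.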
Semi-supervised Learning on Directed Graphs
Given a directed graph in which some of the nodes are labeled, we investigate the question of how to exploit the link structure of the graph to infer the labels of the remaining unlabeled nodes. To that end we propose a regularization framework for functions defined over nodes of a directed graph that forces the classification function to change slowly on densely linked subgraphs. A powerful, yet computationally simple classification algorithm is derived within the proposed framework. The experimental evaluation on real-world Web classification problems demonstrates encouraging results that validate our approach.
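The regularization idea, a classification function that varies slowly over densely linked regions, can be approximated by iterative neighborhood averaging with the labeled nodes clamped. The sketch below symmetrizes the edges for simplicity, unlike the paper's genuinely directed regularizer:

```python
def propagate_labels(edges, seeds, n_nodes, n_iter=50):
    """Label propagation sketch: seeds maps node -> +1.0/-1.0;
    unlabeled nodes repeatedly take the mean of their neighbors' values."""
    nbrs = [[] for _ in range(n_nodes)]
    for u, v in edges:
        nbrs[u].append(v)   # symmetrized for this sketch;
        nbrs[v].append(u)   # the paper keeps edge direction
    f = [seeds.get(i, 0.0) for i in range(n_nodes)]
    for _ in range(n_iter):
        g = list(f)
        for i in range(n_nodes):
            if i in seeds or not nbrs[i]:
                continue    # clamp labeled seeds, skip isolated nodes
            g[i] = sum(f[j] for j in nbrs[i]) / len(nbrs[i])
        f = g
    return [1 if x >= 0 else -1 for x in f]
```

On two disjoint chains seeded with opposite labels, each chain settles to its seed's label, the "change slowly on densely linked subgraphs" behavior in miniature.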
Learning Monotonic Transformations for Classification
A discriminative method is proposed for learning monotonic transformations of the training data while jointly estimating a large-margin classifier. In many domains such as document classification, image histogram classification and gene microarray experiments, fixed monotonic transformations can be useful as a preprocessing step. However, most classifiers only explore these transformations through manual trial and error or via prior domain knowledge. The proposed method learns monotonic transformations automatically while training a large-margin classifier without any prior knowledge of the domain. A monotonic piecewise linear function is learned which transforms data for subsequent processing by a linear hyperplane classifier.
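A monotonic piecewise-linear transform of the kind learned here can be represented by an increasing sequence of knots. The sketch below only evaluates such a transform; the joint large-margin learning of the knot values is the paper's contribution and is omitted:

```python
def piecewise_linear_monotone(knots_x, knots_y):
    """Build a monotone transform from strictly increasing knot positions
    knots_x and non-decreasing knot values knots_y; inputs outside the
    knot range are clamped to the end values."""
    assert all(a < b for a, b in zip(knots_x, knots_x[1:]))
    assert all(a <= b for a, b in zip(knots_y, knots_y[1:]))  # monotone
    def f(x):
        if x <= knots_x[0]:
            return knots_y[0]
        if x >= knots_x[-1]:
            return knots_y[-1]
        for x0, x1, y0, y1 in zip(knots_x, knots_x[1:], knots_y, knots_y[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)      # linear interpolation
                return y0 + t * (y1 - y0)     # within the segment
    return f
```

In the paper's setting, the classifier sees `f(x)` instead of `x`, so steep segments stretch informative value ranges and flat segments compress uninformative ones.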