FairFLRep: Fairness aware fault localization and repair of Deep Neural Networks
Openja, Moses, Arcaini, Paolo, Khomh, Foutse, Ishikawa, Fuyuki
Deep neural networks (DNNs) are being utilized in various aspects of our daily lives, including high-stakes decision-making applications that impact individuals. However, these systems reflect and amplify bias from the data used during training and testing, potentially resulting in biased behavior and inaccurate decisions; for instance, white and black sub-populations may experience different misclassification rates. Effectively and efficiently identifying and correcting such biased behavior in DNNs remains a challenge. This paper introduces FairFLRep, an automated fairness-aware fault localization and repair technique that identifies and corrects potentially bias-inducing neurons in DNN classifiers. FairFLRep focuses on adjusting neuron weights associated with sensitive attributes, such as race or gender, that contribute to unfair decisions. By analyzing the input-output relationships within the network, FairFLRep corrects neurons responsible for disparities in predictive quality parity. We evaluate FairFLRep on four image classification datasets using two DNN classifiers, and four tabular datasets with a DNN model. The results show that FairFLRep consistently outperforms existing methods in improving fairness while preserving accuracy. An ablation study confirms the importance of considering fairness during both fault localization and repair stages. Our findings also show that FairFLRep is more efficient than the baseline approaches in repairing the network.
- Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
- Oceania > Australia > Victoria > Melbourne (0.14)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.14)
- (17 more...)
- Law (1.00)
- Health & Medicine (1.00)
- Education (0.67)
- Government > Regional Government (0.45)
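As an illustration of the kind of disparity FairFLRep targets, predictive quality parity can be checked by comparing per-group misclassification rates. The sketch below is not from the paper; the function and variable names are our own.

```python
def misclassification_disparity(preds, labels, groups):
    """Largest gap in misclassification rate across sensitive groups."""
    rates = {}
    for g in set(groups):
        idx = [i for i, gg in enumerate(groups) if gg == g]
        errors = sum(preds[i] != labels[i] for i in idx)
        rates[g] = errors / len(idx)
    # 0.0 means parity; larger values mean one group is misclassified more often
    return max(rates.values()) - min(rates.values())
```

A repair method like FairFLRep would aim to drive this gap toward zero while keeping overall accuracy intact.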
ForeCal: Random Forest-based Calibration for DNNs
Deep neural network (DNN) based classifiers do extremely well in discriminating between observations, resulting in higher ROC AUC and accuracy metrics, but their outputs are often miscalibrated with respect to true event likelihoods. Post-hoc calibration algorithms are often used to calibrate the outputs of these classifiers. Methods like Isotonic regression, Platt scaling, and Temperature scaling have been shown to be effective in some cases but are limited by their parametric assumptions and/or their inability to capture complex non-linear relationships. We propose ForeCal - a novel post-hoc calibration algorithm based on Random forests. ForeCal exploits two unique properties of Random forests: the ability to enforce weak monotonicity and range-preservation. It is more powerful in achieving calibration than current state-of-the-art methods, is non-parametric, and can incorporate exogenous information as features to learn a better calibration function. Through experiments on 43 diverse datasets from the UCI ML repository, we show that ForeCal outperforms existing methods in terms of Expected Calibration Error (ECE) with minimal impact on the discriminative power of the base DNN as measured by AUC.
- North America > United States > New York > New York County > New York City (0.04)
- Asia > India > Maharashtra > Mumbai (0.04)
- Oceania > Australia > New South Wales > Sydney (0.04)
- Europe > Finland > Uusimaa > Helsinki (0.04)
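Expected Calibration Error, the metric used to evaluate ForeCal above, bins predictions by confidence and averages the per-bin gap between mean confidence and observed accuracy. A minimal stdlib sketch with equal-width bins (this is the metric only, not the ForeCal algorithm itself):

```python
def expected_calibration_error(probs, labels, n_bins=10):
    """ECE for binary predictions: sum over bins of (bin size / N) * |accuracy - confidence|."""
    bins = [[] for _ in range(n_bins)]
    for p, y in zip(probs, labels):
        idx = min(int(p * n_bins), n_bins - 1)  # clamp p == 1.0 into the last bin
        bins[idx].append((p, y))
    n = len(probs)
    ece = 0.0
    for b in bins:
        if not b:
            continue
        conf = sum(p for p, _ in b) / len(b)   # mean predicted probability
        acc = sum(y for _, y in b) / len(b)    # observed positive rate
        ece += (len(b) / n) * abs(acc - conf)
    return ece
```

A perfectly calibrated model scores 0; a calibrator such as ForeCal is judged by how much it lowers this number without hurting AUC.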
Open Set Recognition for Random Forest
Feng, Guanchao, Desai, Dhruv, Pasquali, Stefano, Mehta, Dhagash
In the open-set settings, classifiers are required to not only accurately classify new instances of known classes (whose samples are observed during training) but also effectively recognize the samples from unknown classes. In a nutshell, open-set classifiers are capable of making the "none of the above" decision with respect to known classes. This is known as open-set recognition (OSR) [38] and has received significant attention in recent years [11, 47]. Since many learning tasks in finance are naturally classification tasks, for instance, company classifications using Global Industry Classification Standard (GICS), fund categorization, risk profiling, economic scenario classifications, etc., where often a new company, fund or economic scenario may not belong to any of the existing categories, casting these recognition tasks as OSR instead of traditional closed-set classification tasks is more appropriate. In many real-world classification or recognition tasks, it is often difficult to collect training examples that exhaust all possible classes due to, for example, incomplete knowledge during training or ever-changing regimes. Therefore, samples from unknown/novel classes may be encountered in testing/deployment. In such scenarios, the classifiers should be able to i) perform classification on known classes, and at the same time, ii) identify samples from unknown classes. Although random forest has been an extremely successful framework as a general-purpose classification (and regression) method, in practice, it usually operates under the closed-set assumption and is not able to identify samples from new classes when run out of the box.
- North America > United States > New York > New York County > New York City (0.05)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Decision Tree Learning (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.66)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
- (2 more...)
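A common baseline for open-set behavior in a random forest is to threshold the ensemble's vote fraction and emit "none of the above" when no known class is sufficiently supported. A toy sketch (the threshold value and names are illustrative, not the paper's method):

```python
def open_set_predict(class_votes, threshold=0.5):
    """class_votes: dict mapping known class -> fraction of trees voting for it.

    Returns the top class if its support clears the threshold,
    otherwise the "none of the above" decision for open-set recognition.
    """
    best_class, support = max(class_votes.items(), key=lambda kv: kv[1])
    return best_class if support >= threshold else "unknown"
```

When votes are spread thinly across known classes, the sample is flagged as coming from an unknown class rather than being forced into the closed set.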
A3Rank: Augmentation Alignment Analysis for Prioritizing Overconfident Failing Samples for Deep Learning Models
Wei, Zhengyuan, Wang, Haipeng, Zhou, Qilin, Chan, W. K.
Wrong predictions can lead to various problems in different application domains, e.g., improper medical diagnosis [25] and traffic accidents [16]. Enhancing the DL application systems by reducing wrong predictions of DL models in producing outputs is desirable. Studies [9, 51, 52] have shown that DL models are vulnerable to operational input samples that can lead them to produce incorrect predictions in natural scenarios [52], and the prediction confidences of many such failing samples exceed those well-intended guarding confidence levels [54]. For example, strong sunshine may cause the camera of a self-driving car to capture an image full of white pixels, resulting in a prediction failure with high confidence. A major bottleneck in developing DL applications is detecting these overconfident failures from their deployed DL application systems. To reduce unreliable predictions, many real-world machine-learning-based application systems are equipped with rejectors to discard uncertain decisions [17]. In DL application systems, many existing techniques [6, 17, 45] construct their rejectors for DL models to address the incorrect prediction problem. For example, many recent studies [2, 8, 42, 49] have been conducted to enhance the defense ability of DL models against out-of-distribution (OOD) samples from unknown classes or artificial examples that are very likely to guide DL models to yield failures.
- North America > Canada > Ontario > Toronto (0.14)
- North America > United States > New York > New York County > New York City (0.05)
- Asia > China > Hong Kong > Kowloon (0.04)
- (10 more...)
- Transportation > Ground > Road (0.48)
- Transportation > Passenger (0.34)
- Information Technology > Robotics & Automation (0.34)
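The rejectors discussed above typically withhold a prediction when the model's confidence falls below a guard level; overconfident failures are exactly the samples that slip past this check. A minimal softmax-thresholding sketch (the threshold tau is illustrative, not from the paper):

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

def reject_or_predict(logits, tau=0.9):
    """Return (class_index, confidence) if the top softmax probability
    clears tau, else (None, confidence) to signal rejection."""
    probs = softmax(logits)
    conf = max(probs)
    return (probs.index(conf) if conf >= tau else None, conf)
```

An overconfident failing sample is one where `conf` clears `tau` yet the predicted class is wrong, which is why A3Rank prioritizes such samples beyond what a confidence rejector alone can catch.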
Advancing Anomaly Detection: Non-Semantic Financial Data Encoding with LLMs
Bakumenko, Alexander, Hlaváčková-Schindler, Kateřina, Plant, Claudia, Hubig, Nina C.
Detecting anomalies in general ledger data is of utmost importance to ensure trustworthiness of financial records. Financial audits increasingly rely on machine learning (ML) algorithms to identify irregular or potentially fraudulent journal entries, each characterized by a varying number of transactions. In machine learning, heterogeneity in feature dimensions adds significant complexity to data analysis. In this paper, we introduce a novel approach to anomaly detection in financial data using Large Language Models (LLMs) embeddings. To encode non-semantic categorical data from real-world financial records, we tested 3 pre-trained general purpose sentence-transformer models. For the downstream classification task, we implemented and evaluated 5 optimized ML models including Logistic Regression, Random Forest, Gradient Boosting Machines, Support Vector Machines, and Neural Networks. Our experiments demonstrate that LLMs contribute valuable information to anomaly detection as our models outperform the baselines, in selected settings even by a large margin. The findings further underscore the effectiveness of LLMs in enhancing anomaly detection in financial journal entries, particularly by tackling feature sparsity. We discuss a promising perspective on using LLM embeddings for non-semantic data in the financial context and beyond.
- Europe > Austria > Vienna (0.14)
- North America > United States > South Carolina > Charleston County > Charleston (0.04)
- North America > United States > New York (0.04)
- (4 more...)
- Research Report > Promising Solution (0.49)
- Research Report > New Finding (0.48)
- Overview > Innovation (0.34)
- Banking & Finance (1.00)
- Information Technology > Security & Privacy (0.68)
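Once journal entries are embedded, one simple downstream anomaly score is distance from the centroid of the embedding cloud. The sketch below assumes embeddings are already computed and is a generic baseline, not the paper's evaluated pipeline:

```python
import math

def anomaly_scores(embeddings):
    """Score each embedding vector by its Euclidean distance
    from the centroid; larger scores suggest anomalous entries."""
    dim = len(embeddings[0])
    n = len(embeddings)
    centroid = [sum(v[i] for v in embeddings) / n for i in range(dim)]
    return [math.dist(v, centroid) for v in embeddings]
```

In practice one would feed the embeddings to the classifiers named in the abstract (Logistic Regression, Random Forest, etc.); the centroid distance just illustrates how dense fixed-size embeddings sidestep the feature-sparsity problem of raw categorical fields.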
Predicting challenge moments from students' discourse: A comparison of GPT-4 to two traditional natural language processing approaches
Suraworachet, Wannapon, Seon, Jennifer, Cukurova, Mutlu
Effective collaboration requires groups to strategically regulate themselves to overcome challenges. Research has shown that groups may fail to regulate due to differences in members' perceptions of challenges, which may benefit from external support. In this study, we investigated the potential of three distinct natural language processing models for challenge detection and challenge dimension identification (cognitive, metacognitive, emotional and technical/other challenges) from student discourse: an expert knowledge rule-based model, a supervised machine learning (ML) model and a Large Language Model (LLM). The results show that the supervised ML and the LLM approaches performed considerably well in both tasks, in contrast to the rule-based approach, whose efficacy heavily relies on the features engineered by experts. The paper provides an extensive discussion of the three approaches' performance for automated detection and support of students' challenge moments in collaborative learning activities. It argues that, although LLMs provide many advantages, they are unlikely to be the panacea to issues of the detection and feedback provision of socially shared regulation of learning due to their lack of reliability, as well as issues of validity evaluation, privacy and confabulation. We conclude the paper with a discussion on additional considerations, including model transparency, to explore feasible and meaningful analytical feedback for students and educators using LLMs.
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
- South America > Uruguay > Maldonado > Maldonado (0.04)
- North America > United States > Texas > Travis County > Austin (0.04)
- (4 more...)
- Research Report > New Finding (1.00)
- Instructional Material (1.00)
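The expert rule-based approach compared in the study can be approximated by keyword rules mapped to challenge dimensions. The keyword lists below are invented for illustration; they are not the study's engineered features:

```python
# Hypothetical keyword rules per challenge dimension (illustrative only).
RULES = {
    "cognitive": ["don't understand", "confusing"],
    "emotional": ["frustrated", "stressed"],
    "technical": ["crash", "won't load"],
}

def detect_challenges(utterance):
    """Return the sorted list of challenge dimensions whose
    keywords appear (as substrings) in the utterance."""
    u = utterance.lower()
    return sorted({dim for dim, kws in RULES.items()
                   for kw in kws if kw in u})
```

The brittleness is visible immediately: any challenge phrased without the listed keywords is missed, which is the dependence on expert feature engineering the abstract highlights relative to the ML and LLM approaches.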
Learning to Find Pictures of People
Finding articulated objects, like people, in pictures presents a particularly difficult object recognition problem. We show how to find people by first finding putative body segments and then constructing assemblies of those segments. Since a reasonable model of a person requires assembling many segments, presenting every candidate group to a classifier is impractical. Instead, the search can be pruned by using projected versions of a classifier that accepts groups corresponding to people. We describe an efficient projection algorithm for one popular classifier, and demonstrate that our approach can be used to determine whether images of real scenes contain people.
Boosted Dyadic Kernel Discriminants
We introduce a novel learning algorithm for binary classification with hyperplane discriminants based on pairs of training points from opposite classes (dyadic hypercuts). This algorithm is further extended to nonlinear discriminants using kernel functions satisfying Mercer's conditions. An ensemble of simple dyadic hypercuts is learned incrementally by means of a confidence-rated version of AdaBoost, which provides a sound strategy for searching through the finite set of hypercut hypotheses. In experiments with real-world datasets from the UCI repository, the generalization performance of the hypercut classifiers was found to be comparable to that of SVMs and k-NN classifiers. Furthermore, the computational cost of classification (at run time) was found to be similar to, or better than, that of SVMs.
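AdaBoost, used here to assemble the hypercut ensemble, reweights training samples each round so that misclassified points gain weight and the next weak hypothesis focuses on them. A minimal single-round sketch for binary labels in {-1, +1} (this is the discrete AdaBoost update, a simplification of the confidence-rated variant named in the abstract):

```python
import math

def adaboost_round(weights, preds, labels):
    """One boosting round: weighted error eps, hypothesis weight
    alpha = 0.5 * ln((1 - eps) / eps), and renormalized sample weights."""
    eps = sum(w for w, p, y in zip(weights, preds, labels) if p != y)
    alpha = 0.5 * math.log((1 - eps) / eps)
    # Correctly classified samples shrink, misclassified ones grow.
    new = [w * math.exp(-alpha if p == y else alpha)
           for w, p, y in zip(weights, preds, labels)]
    s = sum(new)
    return alpha, [w / s for w in new]
```

A classic property of this update is that after renormalization, the misclassified samples jointly carry exactly half the total weight, forcing the next hypothesis (here, the next hypercut) to attend to them.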
Semi-supervised Learning on Directed Graphs
Given a directed graph in which some of the nodes are labeled, we investigate the question of how to exploit the link structure of the graph to infer the labels of the remaining unlabeled nodes. To that end we propose a regularization framework for functions defined over nodes of a directed graph that forces the classification function to change slowly on densely linked subgraphs. A powerful, yet computationally simple classification algorithm is derived within the proposed framework. The experimental evaluation on real-world Web classification problems demonstrates encouraging results that validate our approach.
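The regularization idea, a classification function that varies slowly over densely linked regions, can be approximated by iterative neighborhood averaging with the labeled nodes clamped. The sketch below symmetrizes the edges for simplicity, unlike the paper's genuinely directed regularizer:

```python
def propagate_labels(edges, seeds, n_nodes, n_iter=50):
    """Label propagation sketch: seeds maps node -> +1.0/-1.0;
    unlabeled nodes repeatedly take the mean of their neighbors' values."""
    nbrs = [[] for _ in range(n_nodes)]
    for u, v in edges:
        nbrs[u].append(v)   # symmetrized for this sketch;
        nbrs[v].append(u)   # the paper keeps edge direction
    f = [seeds.get(i, 0.0) for i in range(n_nodes)]
    for _ in range(n_iter):
        g = list(f)
        for i in range(n_nodes):
            if i in seeds or not nbrs[i]:
                continue    # clamp labeled seeds, skip isolated nodes
            g[i] = sum(f[j] for j in nbrs[i]) / len(nbrs[i])
        f = g
    return [1 if x >= 0 else -1 for x in f]
```

On two disjoint chains seeded with opposite labels, each chain settles to its seed's label, the "change slowly on densely linked subgraphs" behavior in miniature.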
Learning Monotonic Transformations for Classification
A discriminative method is proposed for learning monotonic transformations of the training data while jointly estimating a large-margin classifier. In many domains such as document classification, image histogram classification and gene microarray experiments, fixed monotonic transformations can be useful as a preprocessing step. However, most classifiers only explore these transformations through manual trial and error or via prior domain knowledge. The proposed method learns monotonic transformations automatically while training a large-margin classifier without any prior knowledge of the domain. A monotonic piecewise linear function is learned which transforms data for subsequent processing by a linear hyperplane classifier.
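A monotonic piecewise-linear transform of the kind learned here can be represented by an increasing sequence of knots. The sketch below only evaluates such a transform; the joint large-margin learning of the knot values is the paper's contribution and is omitted:

```python
def piecewise_linear_monotone(knots_x, knots_y):
    """Build a monotone transform from strictly increasing knot positions
    knots_x and non-decreasing knot values knots_y; inputs outside the
    knot range are clamped to the end values."""
    assert all(a < b for a, b in zip(knots_x, knots_x[1:]))
    assert all(a <= b for a, b in zip(knots_y, knots_y[1:]))  # monotone
    def f(x):
        if x <= knots_x[0]:
            return knots_y[0]
        if x >= knots_x[-1]:
            return knots_y[-1]
        for x0, x1, y0, y1 in zip(knots_x, knots_x[1:], knots_y, knots_y[1:]):
            if x0 <= x <= x1:
                t = (x - x0) / (x1 - x0)      # linear interpolation
                return y0 + t * (y1 - y0)     # within the segment
    return f
```

In the paper's setting, the classifier sees `f(x)` instead of `x`, so steep segments stretch informative value ranges and flat segments compress uninformative ones.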