AITopics

2412.07051

Country:

Asia > Middle East > UAE (0.14)
North America > United States > Vermont > Chittenden County > Burlington (0.14)
Europe > Czechia (0.14)
(78 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Therapeutic Area > Pulmonary/Respiratory Diseases (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Biomedical Informatics > Translational Bioinformatics (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Wang, Tingfang, Boden, Joseph M., Biswas, Swati, Choudhary, Pankaj K.

Absolute Risk Prediction for Cannabis Use Disorder Using Bayesian Machine Learning

arXiv.org Machine LearningJan-15-2025

Introduction: Substance use disorders (SUDs) have emerged as a pressing public health crisis in the United States, with adolescent substance use often leading to SUDs in adulthood. Effective strategies are needed to prevent this progression. To help in filling this need, we develop a novel and the first-ever absolute risk prediction model for cannabis use disorder (CUD) for adolescent or young adult cannabis users. Methods: We train a Bayesian machine learning model that provides a personalized CUD absolute risk for adolescent or young adult cannabis users using data from the National Longitudinal Study of Adolescent to Adult Health. Model performance is assessed using 5-fold cross-validation (CV) with area under the curve (AUC) and ratio of the expected to observed number of cases (E/O). External validation of the final model is conducted using two independent datasets. Results: The proposed model has five risk factors: biological sex, delinquency, and scores on personality traits of conscientiousness, neuroticism, and openness. For predicting CUD risk within five years of first cannabis use, AUC and E/O, computed via 5-fold CV, were 0.68 and 0.95, respectively. For the same type of prediction in external validation, AUC values were 0.64 and 0.75, with E/O values of 0.98 and 1, indicating good discrimination and calibration performances of the model. Discussion and Conclusion: The proposed model is the first absolute risk prediction model for an SUD. It can aid clinicians in identifying adolescent/youth substance users at a high risk of developing CUD in future for clinically appropriate interventions.

cannabis, prediction, risk factor, (15 more...)

arXiv.org Machine Learning

2501.09156

Country:

Oceania > New Zealand > South Island > Canterbury Region > Christchurch (0.04)
North America > United States > Texas > Dallas County > Richardson (0.04)
North America > United States > District of Columbia > Washington (0.04)
(3 more...)

Genre: Research Report > Experimental Study (0.93)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology > Addiction Disorder (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis (0.34)

Ryckewaert, Maxime, Marcos, Diego, Botella, Christophe, Servajean, Maximilien, Bonnet, Pierre, Joly, Alexis

Applying the maximum entropy principle to neural networks enhances multi-species distribution models

arXiv.org Machine LearningJan-15-2025

The rapid expansion of citizen science initiatives has led to a significant growth of biodiversity databases, and particularly presence-only (PO) observations. PO data are invaluable for understanding species distributions and their dynamics, but their use in a Species Distribution Model (SDM) is curtailed by sampling biases and the lack of information on absences. Poisson point processes are widely used for SDMs, with Maxent being one of the most popular methods. Maxent maximises the entropy of a probability distribution across sites as a function of predefined transformations of variables, called features. In contrast, neural networks and deep learning have emerged as a promising technique for automatic feature extraction from complex input variables. Arbitrarily complex transformations of input variables can be learned from the data efficiently through backpropagation and stochastic gradient descent (SGD). In this paper, we propose DeepMaxent, which harnesses neural networks to automatically learn shared features among species, using the maximum entropy principle. To do so, it employs a normalised Poisson loss where for each species, presence probabilities across sites are modelled by a neural network. We evaluate DeepMaxent on a benchmark dataset known for its spatial sampling biases, using PO data for calibration and presence-absence (PA) data for validation across six regions with different biological groups and covariates. Our results indicate that DeepMaxent performs better than Maxent and other leading SDMs across all regions and taxonomic groups. The method performs particularly well in regions of uneven sampling, demonstrating substantial potential to increase SDM performances. In particular, our approach yields more accurate predictions than traditional single-species models, which opens up new possibilities for methodological enhancement.

batch size, correction, deepmaxent, (14 more...)

arXiv.org Machine Learning

2412.19217

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > France > Occitanie > Hérault > Montpellier (0.04)
Oceania > Australia > New South Wales (0.04)
(6 more...)

Genre:

Research Report > New Finding (0.48)
Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceJan-15-2025

Free-Knots Kolmogorov-Arnold Network: On the Analysis of Spline Knots and Advancing Stability

Zheng, Liangwewi Nathan, Zhang, Wei Emma, Yue, Lin, Xu, Miao, Maennel, Olaf, Chen, Weitong

Kolmogorov-Arnold Neural Networks (KANs) have gained significant attention in the machine learning community. However, their implementation often suffers from poor training stability and heavy trainable parameter. Furthermore, there is limited understanding of the behavior of the learned activation functions derived from B-splines. In this work, we analyze the behavior of KANs through the lens of spline knots and derive the lower and upper bound for the number of knots in B-spline-based KANs. To address existing limitations, we propose a novel Free Knots KAN that enhances the performance of the original KAN while reducing the number of trainable parameters to match the trainable parameter scale of standard Multi-Layer Perceptrons (MLPs). Additionally, we introduce new a training strategy to ensure $C^2$ continuity of the learnable spline, resulting in smoother activation compared to the original KAN and improve the training stability by range expansion. The proposed method is comprehensively evaluated on 8 datasets spanning various domains, including image, text, time series, multimodal, and function approximation tasks. The promising results demonstrates the feasibility of KAN-based network and the effectiveness of proposed method.

artificial intelligence, knot, machine learning, (15 more...)

2501.09283

Country: Oceania > Australia (0.46)

Genre: Research Report > New Finding (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Perceptrons (0.54)

arXiv.org Artificial IntelligenceJan-15-2025

AI-based Identity Fraud Detection: A Systematic Review

Zhang, Chuo Jun, Gill, Asif Q., Liu, Bo, Anwar, Memoona J.

With the rapid development of digital services, a large volume of personally identifiable information (PII) is stored online and is subject to cyberattacks such as Identity fraud. Most recently, the use of Artificial Intelligence (AI) enabled deep fake technologies has significantly increased the complexity of identity fraud. Fraudsters may use these technologies to create highly sophisticated counterfeit personal identification documents, photos and videos. These advancements in the identity fraud landscape pose challenges for identity fraud detection and society at large. There is a pressing need to review and understand identity fraud detection methods, their limitations and potential solutions. This research aims to address this important need by using the well-known systematic literature review method. This paper reviewed a selected set of 43 papers across 4 major academic literature databases. In particular, the review results highlight the two types of identity fraud prevention and detection methods, in-depth and open challenges. The results were also consolidated into a taxonomy of AI-based identity fraud detection and prevention methods including key insights and trends. Overall, this paper provides a foundational knowledge base to researchers and practitioners for further research and development in this important area of digital identity fraud.

data mining, identity fraud, machine learning, (18 more...)

2501.09239

Country:

Asia (0.67)
Oceania > Australia (0.47)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Law Enforcement & Public Safety > Fraud (1.00)
Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.34)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(5 more...)

Neural Information Processing SystemsJan-14-2025, 22:25:47 GMT

Data-Driven Network Neuroscience: On Data Collection and Benchmark

This paper presents a comprehensive and quality collection of functional human brain network data for potential research in the intersection of neuroscience, machine learning, and graph analytics. Anatomical and functional MRI images have been used to understand the functional connectivity of the human brain and are particularly important in identifying underlying neurodegenerative conditions such as Alzheimer's, Parkinson's, and Autism. Recently, the study of the brain in the form of brain networks using machine learning and graph analytics has become increasingly popular, especially to predict the early onset of these conditions. A brain network, represented as a graph, retains rich structural and positional information that traditional examination methods are unable to capture. However, the lack of publicly accessible brain network data prevents researchers from data-driven explorations.

data collection and benchmark, data-driven network neuroscience, mri image, (4 more...)

Neural Information Processing Systems

Country: Oceania > New Zealand > North Island > Auckland Region > Auckland (0.07)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (0.83)
Information Technology > Artificial Intelligence > Cognitive Science (0.66)
Information Technology > Data Science > Data Mining (0.63)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.63)

Efficient Traffic Prediction Through Spatio-Temporal Distillation

Zhang, Qianru, Gao, Xinyi, Wang, Haixin, Yiu, Siu-Ming, Yin, Hongzhi

Graph neural networks (GNNs) have gained considerable attention in recent years for traffic flow prediction due to their ability to learn spatio-temporal pattern representations through a graph-based message-passing framework. Although GNNs have shown great promise in handling traffic datasets, their deployment in real-life applications has been hindered by scalability constraints arising from high-order message passing. Additionally, the over-smoothing problem of GNNs may lead to indistinguishable region representations as the number of layers increases, resulting in performance degradation. To address these challenges, we propose a new knowledge distillation paradigm termed LightST that transfers spatial and temporal knowledge from a high-capacity teacher to a lightweight student. Specifically, we introduce a spatio-temporal knowledge distillation framework that helps student MLPs capture graph-structured global spatio-temporal patterns while alleviating the over-smoothing effect with adaptive knowledge distillation. Extensive experiments verify that LightST significantly speeds up traffic flow predictions by 5X to 40X compared to state-of-the-art spatio-temporal GNNs, all while maintaining superior accuracy.

data mining, distillation, machine learning, (17 more...)

2501.10459

Country:

North America > United States > California (0.04)
Asia > China > Hong Kong (0.04)
Oceania > Australia > Queensland (0.04)
(2 more...)

Genre: Research Report > New Finding (0.46)

Industry:

Transportation (0.47)
Information Technology (0.46)
Education (0.32)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Krishnamoorthy, Mahesh Vaijainthymala, Palavesam, Kuppusamy Vellamadam, Arcot, Siva Venkatesh, Kuppuswami, Rajarajeswari Chinniah

DNN-Powered MLOps Pipeline Optimization for Large Language Models: A Framework for Automated Deployment and Resource Management

The exponential growth in the size and complexity of Large Language Models (LLMs) has introduced unprecedented challenges in their deployment and operational management. Traditional MLOps approaches often fail to efficiently handle the scale, resource requirements, and dynamic nature of these models. This research presents a novel framework that leverages Deep Neural Networks (DNNs) to optimize MLOps pipelines specifically for LLMs. Our approach introduces an intelligent system that automates deployment decisions, resource allocation, and pipeline optimization while maintaining optimal performance and cost efficiency. Through extensive experimentation across multiple cloud environments and deployment scenarios, we demonstrate significant improvements: 40% enhancement in resource utilization, 35% reduction in deployment latency, and 30% decrease in operational costs compared to traditional MLOps approaches. The framework's ability to adapt to varying workloads and automatically optimize deployment strategies represents a significant advancement in automated MLOps management for large-scale language models. Our framework introduces several novel components including a multi-stream neural architecture for processing heterogeneous operational metrics, an adaptive resource allocation system that continuously learns from deployment patterns, and a sophisticated deployment orchestration mechanism that automatically selects optimal strategies based on model characteristics and environmental conditions. The system demonstrates robust performance across various deployment scenarios, including multi-cloud environments, high-throughput production systems, and cost-sensitive deployments. Through rigorous evaluation using production workloads from multiple organizations, we validate our approach's effectiveness in reducing operational complexity while improving system reliability and cost efficiency.

large language model, machine learning, natural language, (17 more...)

2501.14802

Country:

South America (0.04)
Oceania > Australia (0.04)
North America > United States > Texas > Tarrant County > Fort Worth (0.04)
(2 more...)

Genre: Research Report (1.00)

Industry: Information Technology > Services (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Enhancing the De-identification of Personally Identifiable Information in Educational Data

Shen, Y., Ji, Z., Lin, J., Koedginer, K. R.

Protecting Personally Identifiable Information (PII), such as names, is a critical requirement in learning technologies to safeguard student and teacher privacy and maintain trust. Accurate PII detection is an essential step toward anonymizing sensitive information while preserving the utility of educational data. Motivated by recent advancements in artificial intelligence, our study investigates the GPT-4o-mini model as a cost-effective and efficient solution for PII detection tasks. We explore both prompting and fine-tuning approaches and compare GPT-4o-mini's performance against established frameworks, including Microsoft Presidio and Azure AI Language. Our evaluation on two public datasets, CRAPII and TSCC, demonstrates that the fine-tuned GPT-4o-mini model achieves superior performance, with a recall of 0.9589 on CRAPII. Additionally, fine-tuned GPT-4o-mini significantly improves precision scores (a threefold increase) while reducing computational costs to nearly one-tenth of those associated with Azure AI Language. Furthermore, our bias analysis reveals that the fine-tuned GPT-4o-mini model consistently delivers accurate results across diverse cultural backgrounds and genders. The generalizability analysis using the TSCC dataset further highlights its robustness, achieving a recall of 0.9895 with minimal additional training data from TSCC. These results emphasize the potential of fine-tuned GPT-4o-mini as an accurate and cost-effective tool for PII detection in educational data. It offers robust privacy protection while preserving the data's utility for research and pedagogical analysis. Our code is available on GitHub: https://github.com/AnonJD/PrivacyAI

large language model, machine learning, natural language, (21 more...)

2501.09765

Country:

North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
Asia > Singapore (0.04)
Asia > China > Hong Kong (0.04)
(7 more...)

Genre:

Instructional Material (1.00)
Research Report > New Finding (0.92)
Research Report > Experimental Study (0.67)

Industry:

Information Technology > Security & Privacy (1.00)
Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (0.93)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Andronic, Marta, Li, Jiawen, Constantinides, George A.

PolyLUT: Ultra-low Latency Polynomial Inference with Hardware-Aware Structured Pruning

Standard deep neural network inference involves the computation of interleaved linear maps and nonlinear activation functions. Prior work for ultra-low latency implementations has hardcoded these operations inside FPGA lookup tables (LUTs). However, FPGA LUTs can implement a much greater variety of functions. In this paper, we propose a novel approach to training DNNs for FPGA deployment using multivariate polynomials as the basic building block. Our method takes advantage of the flexibility offered by the soft logic, hiding the polynomial evaluation inside the LUTs with minimal overhead. By using polynomial building blocks, we achieve the same accuracy using considerably fewer layers of soft logic than by using linear functions, leading to significant latency and area improvements. LUT-based implementations also face a significant challenge: the LUT size grows exponentially with the number of inputs. Prior work relies on a priori fixed sparsity, with results heavily dependent on seed selection. To address this, we propose a structured pruning strategy using a bespoke hardware-aware group regularizer that encourages a particular sparsity pattern that leads to a small number of inputs per neuron. We demonstrate the effectiveness of PolyLUT on three tasks: network intrusion detection, jet identification at the CERN Large Hadron Collider, and MNIST.

accuracy, neural network, polylut, (16 more...)

2501.08043

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(9 more...)

Genre: Research Report > New Finding (0.68)

Industry: Information Technology > Security & Privacy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)