AITopics

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.36)

Neural Information Processing SystemsFeb-16-2026, 06:38:07 GMT

83e77607638c4fb17fba4a9b7844800c-Paper-Conference.pdf

artificial intelligence, machine learning, natural language, (20 more...)

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)
(3 more...)

Neural Information Processing SystemsOct-10-2025, 08:04:57 GMT

Great Minds Think Alike: The Universal Convergence Trend of Input Salience

Uncertainty is introduced in optimized DNNs through stochastic algorithms, forming specific distributions.

Country:

North America > United States > Indiana > Tippecanoe County > West Lafayette (0.04)
North America > United States > Indiana > Tippecanoe County > Lafayette (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.67)

Industry: Government > Regional Government > North America Government > United States Government (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.95)
(3 more...)

Yang, Mingchuan, Huang, Ziyuan

Enhancing Clinical Text Classification via Fine-Tuned DRAGON Longformer Models

arXiv.org Artificial IntelligenceJul-15-2025

This study explores the optimization of the DRAGON Longformer base model for clinical text classification, specifically targeting the binary classification of medical case descriptions. A dataset of 500 clinical cases containing structured medical observations was used, with 400 cases for training and 100 for validation. Enhancements to the pre - trained joeranbosma/dragon - longformer - base - mixed - domain model included hyperparameter tuning, domain - specific preprocessing, and architectural adjustments. Key modifications involved increasing sequence length from 512 to 1024 tokens, adjusting learning rates from 1e - 05 to 5e - 06, extending training epochs from 5 to 8, and incorporating specialized medical terminology. The optimized model achieved notable performance gains: accuracy improved from 72.0% to 85.2%, precision from 68.0% to 84.1%, recall from 75.0% to 86.3%, and F1 - score from 71.0% to 85.2%. Statistical analysis confirmed the significance of these improvements (p < .001). The model demonstrated enhanced capability in interpreting medical terminology, anatomical measurements, and clinical observations. These findings contribute to domain - specific language model research and offer practical implications for clinical natural language processing applications. The optimized model ' s strong performance across diverse medical conditions underscores its potential for broad use in healthcare settings. Enhancing Clinical Text Classification via Fine - Tuned DRAGON Longformer Models Introduction Natural language processing (NLP) in healthcare has continued to advance rapidly, revolutionizing the ability to analyze clinical texts and automate the extraction of valuable insights from massive amounts of medical documentation (Khurana, Koli, Khatter, & Singh, 2023). Over the past few years, large language models (LLMs) have emerged as powerful tools for gaining insight from and processing clinical narratives, creating capabilities that have never been seen before in medical text classification, entity recognition, and clinical decision support (Wang et al., 2018). The DRAGON (Deep Representation Analysis for General - domain Ontology Networks) framework was a specialized version of medical text processing out of all these models (Bosma et al., 2025). Beltagy, Peters, and Cohan (2020) state that the DRAGON longformer model, built on top of the Longformer architecture, addresses the quadratic computational complexity issue of traditional transformer models by processing long sequences.

large language model, machine learning, natural language, (21 more...)

2507.0947

Country: North America > United States (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Information Technology > Security & Privacy (0.68)
Health & Medicine > Health Care Technology (0.46)
Health & Medicine > Diagnostic Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Classification (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
(3 more...)

Neural Information Processing SystemsMay-27-2025, 07:12:59 GMT

Great Minds Think Alike: The Universal Convergence Trend of Input Salience

input salience, optimized model, universal convergence trend, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

arXiv.org Artificial IntelligenceMar-19-2025

Enhancing High-Quality Code Generation in Large Language Models with Comparative Prefix-Tuning

Jiang, Yuan, Zhang, Yujian, Lu, Liang, Treude, Christoph, Su, Xiaohong, Huang, Shan, Wang, Tiantian

Large Language Models (LLMs) have been widely adopted in commercial code completion engines, significantly enhancing coding efficiency and productivity. However, LLMs may generate code with quality issues that violate coding standards and best practices, such as poor code style and maintainability, even when the code is functionally correct. This necessitates additional effort from developers to improve the code, potentially negating the efficiency gains provided by LLMs. To address this problem, we propose a novel comparative prefix-tuning method for controllable high-quality code generation. Our method introduces a single, property-specific prefix that is prepended to the activations of the LLM, serving as a lightweight alternative to fine-tuning. Unlike existing methods that require training multiple prefixes, our approach trains only one prefix and leverages pairs of high-quality and low-quality code samples, introducing a sequence-level ranking loss to guide the model's training. This comparative approach enables the model to better understand the differences between high-quality and low-quality code, focusing on aspects that impact code quality. Additionally, we design a data construction pipeline to collect and annotate pairs of high-quality and low-quality code, facilitating effective training. Extensive experiments on the Code Llama 7B model demonstrate that our method improves code quality by over 100% in certain task categories, while maintaining functional correctness. We also conduct ablation studies and generalization experiments, confirming the effectiveness of our method's components and its strong generalization capability.

large language model, machine learning, natural language, (19 more...)

2503.0902

Country:

Asia > Singapore (0.04)
Asia > China > Heilongjiang Province > Harbin (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report > New Finding (0.92)

Industry: Education (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Afroosheh, Sajjad, Askari, Mohammadreza

Fusion of Deep Learning and GIS for Advanced Remote Sensing Image Analysis

arXiv.org Artificial IntelligenceDec-25-2024

This paper presents an innovative framework for remote sensing image analysis by fusing deep learning techniques, specifically Convolutional Neural Networks (CNNs) and Long Short-Term Memory (LSTM) networks, with Geographic Information Systems (GIS). The primary objective is to enhance the accuracy and efficiency of spatial data analysis by overcoming challenges associated with high dimensionality, complex patterns, and temporal data processing. We implemented optimization algorithms, namely Particle Swarm Optimization (PSO) and Genetic Algorithms (GA), to fine-tune model parameters, resulting in improved performance metrics. Our findings reveal a significant increase in classification accuracy from 78% to 92% and a reduction in prediction error from 12% to 6% after optimization. Additionally, the temporal accuracy of the models improved from 75% to 88%, showcasing the frameworks capability to monitor dynamic changes effectively. The integration of GIS not only enriched the spatial analysis but also facilitated a deeper understanding of the relationships between geographical features. This research demonstrates that combining advanced deep learning methods with GIS and optimization strategies can significantly advance remote sensing applications, paving the way for future developments in environmental monitoring, urban planning, and resource management.

accuracy, artificial intelligence, machine learning, (15 more...)

2412.19856

Country: North America > United States > Ohio (0.28)

Genre: Research Report (1.00)

Industry:

Energy > Renewable > Geothermal > Geothermal Energy Exploration and Development > Geophysical Analysis & Survey (0.97)
Transportation > Ground > Road (0.70)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Gopali, Saroj, Siami-Namini, Sima, Abri, Faranak, Namin, Akbar Siami

The Performance of the LSTM-based Code Generated by Large Language Models (LLMs) in Forecasting Time Series Data

arXiv.org Artificial IntelligenceNov-27-2024

As an intriguing case is the goodness of the machine and deep learning models generated by these LLMs in conducting automated scientific data analysis, where a data analyst may not have enough expertise in manually coding and optimizing complex deep learning models and codes and thus may opt to leverage LLMs to generate the required models. This paper investigates and compares the performance of the mainstream LLMs, such as ChatGPT, PaLM, LLama, and Falcon, in generating deep learning models for analyzing time series data, an important and popular data type with its prevalent applications in many application domains including financial and stock market. This research conducts a set of controlled experiments where the prompts for generating deep learning-based models are controlled with respect to sensitivity levels of four criteria including 1) Clarify and Specificity, 2) Objective and Intent, 3) Contextual Information, and 4) Format and Style. While the results are relatively mix, we observe some distinct patterns. We notice that using LLMs, we are able to generate deep learning-based models with executable codes for each dataset seperatly whose performance are comparable with the manually crafted and optimized LSTM models for predicting the whole time series dataset. We also noticed that ChatGPT outperforms the other LLMs in generating more accurate models. Furthermore, we observed that the goodness of the generated models vary with respect to the ``temperature'' parameter used in configuring LLMS. The results can be beneficial for data analysts and practitioners who would like to leverage generative AIs to produce good prediction models with acceptable goodness.

large language model, machine learning, natural language, (17 more...)

2411.18731

Country:

North America > United States > Texas (0.04)
North America > United States > New York > New York County > New York City (0.04)
Asia > Japan (0.04)
(3 more...)

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.93)

Industry:

Information Technology (1.00)
Banking & Finance > Trading (1.00)
Education (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Camelia, Tanjina Sultana, Fahim, Faizur Rahman, Anwar, Md. Musfique

A Regularized LSTM Method for Detecting Fake News Articles

arXiv.org Artificial IntelligenceNov-16-2024

Nowadays, the rapid diffusion of fake news poses a significant problem, as it can spread misinformation and confusion. This paper aims to develop an advanced machine learning solution for detecting fake news articles. Leveraging a comprehensive dataset of news articles, including 23,502 fake news articles and 21,417 accurate news articles, we implemented and evaluated three machine-learning models. Our dataset, curated from diverse sources, provides rich textual content categorized into title, text, subject, and Date features. These features are essential for training robust classification models to distinguish between fake and authentic news articles. The initial model employed a Long Short-Term Memory (LSTM) network, achieving an accuracy of 94%. The second model improved upon this by incorporating additional regularization techniques and fine-tuning hyperparameters, resulting in a 97% accuracy. The final model combined the strengths of previous architectures with advanced optimization strategies, achieving a peak accuracy of 98%. These results demonstrate the effectiveness of our approach in identifying fake news with high precision. Implementing these models showcases significant advancements in natural language processing and machine learning techniques, contributing valuable tools for combating misinformation. Our work highlights the potential for deploying such models in real-world applications, providing a reliable method for automated fake news detection and enhancing the credibility of news dissemination.

artificial intelligence, machine learning, news article, (15 more...)

2411.10713

Country:

Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.05)
North America > United States > New York (0.05)

Genre: Research Report > New Finding (0.34)

Industry: Media > News (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hasanpoor, Yasin, Rostami, Amin, Tarvirdizadeh, Bahram, Alipour, Khalil, Ghamari, Mohammad

Real-Time Stress Detection via Photoplethysmogram Signals: Implementation of a Combined Continuous Wavelet Transform and Convolutional Neural Network on Resource-Constrained Microcontrollers

arXiv.org Artificial IntelligenceOct-14-2024

This paper introduces a robust stress detection system utilizing a Convolutional Neural Network (CNN) designed for the analysis of Photoplethysmogram (PPG) signals. Employing the WESAD dataset, we applied Continuous Wavelet Transform (CWT) to extract informative features from wrist PPG signals, demonstrating enhanced stress detection and learning compared to conventional techniques. Notably, the CNN achieved an impressive accuracy of 93.7% after five epochs, post-implementation on a resource-constrained microcontroller. The optimization process, including pruning and Post-Train Quantization, was crucial to reduce the model size to 1.6 megabytes, overcoming the microcontroller's limited resources of 2 megabytes of Flash memory and 512 kilobytes of RAM. This optimized model not only addresses resource constraints but also outperforms traditional signal processing methods, positioning it as a promising solution for real-time stress monitoring on wearable devices.

artificial intelligence, machine learning, microcontroller, (15 more...)

doi: 10.1109/ICEE63041.2024.10668302

2410.19776

Country:

Asia > Middle East > Iran > Tehran Province > Tehran (0.08)
North America > United States > Michigan > Genesee County > Flint (0.04)

Genre: Research Report > Promising Solution (0.67)

Industry:

Health & Medicine > Therapeutic Area > Hematology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)