AITopics

From Idea to Implementation: Evaluating the Influence of Large Language Models in Software Development -- An Opinion Paper

Yadav, Sargam, Qureshi, Asifa Mehmood, Kaushik, Abhishek, Sharma, Shubham, Loughran, Roisin, Kazhuparambil, Subramaniam, Shaw, Andrew, Sabry, Mohammed, Lynch, Niamh St John, Singh, . Nikhil, O'Hara, Padraic, Jaiswal, Pranay, Chandru, Roshan, Lillis, David

The introduction of transformer architecture was a turning point in Natural Language Processing (NLP). Models based on the transformer architecture such as Bidirectional Encoder Representations from Transformers (BERT) and Generative Pre-Trained Transformer (GPT) have gained widespread popularity in various applications such as software development and education. The availability of Large Language Models (LLMs) such as ChatGPT and Bard to the general public has showcased the tremendous potential of these models and encouraged their integration into various domains such as software development for tasks such as code generation, debugging, and documentation generation. In this study, opinions from 11 experts regarding their experience with LLMs for software development have been gathered and analysed to draw insights that can guide successful and responsible integration. The overall opinion of the experts is positive, with the experts identifying advantages such as increase in productivity and reduced coding time. Potential concerns and challenges such as risk of over-dependence and ethical considerations have also been highlighted.

large language model, machine learning, natural language, (17 more...)

2503.0745

Country:

Europe > Ireland (0.15)
North America > United States (0.14)

Genre:

Overview (0.93)
Research Report > New Finding (0.66)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)
Energy > Oil & Gas (0.68)
Education > Curriculum > Subject-Specific Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

A practical guide to machine learning interatomic potentials -- Status and future

Jacobs, Ryan, Morgan, Dane, Attarian, Siamak, Meng, Jun, Shen, Chen, Wu, Zhenghao, Xie, Clare Yijia, Yang, Julia H., Artrith, Nongnuch, Blaiszik, Ben, Ceder, Gerbrand, Choudhary, Kamal, Csanyi, Gabor, Cubuk, Ekin Dogus, Deng, Bowen, Drautz, Ralf, Fu, Xiang, Godwin, Jonathan, Honavar, Vasant, Isayev, Olexandr, Johansson, Anders, Kozinsky, Boris, Martiniani, Stefano, Ong, Shyue Ping, Poltavsky, Igor, Schmidt, KJ, Takamoto, So, Thompson, Aidan, Westermayr, Julia, Wood, Brandon M.

The rapid development and large body of literature on machine learning interatomic potentials (MLIPs) can make it difficult to know how to proceed for researchers who are not experts but wish to use these tools. The spirit of this review is to help such researchers by serving as a practical, accessible guide to the state-of-the-art in MLIPs. This review paper covers a broad range of topics related to MLIPs, including (i) central aspects of how and why MLIPs are enablers of many exciting advancements in molecular modeling, (ii) the main underpinnings of different types of MLIPs, including their basic structure and formalism, (iii) the potentially transformative impact of universal MLIPs for both organic and inorganic systems, including an overview of the most recent advances, capabilities, downsides, and potential applications of this nascent class of MLIPs, (iv) a practical guide for estimating and understanding the execution speed of MLIPs, including guidance for users based on hardware availability, type of MLIP used, and prospective simulation size and time, (v) a manual for what MLIP a user should choose for a given application by considering hardware resources, speed requirements, energy and force accuracy requirements, as well as guidance for choosing pre-trained potentials or fitting a new potential from scratch, (vi) discussion around MLIP infrastructure, including sources of training data, pre-trained potentials, and hardware resources for training, (vii) summary of some key limitations of present MLIPs and current approaches to mitigate such limitations, including methods of including long-range interactions, handling magnetic systems, and treatment of excited states, and finally (viii) we finish with some more speculative thoughts on what the future holds for the development and application of MLIPs over the next 3-10+ years.

large language model, machine learning, natural language, (21 more...)

doi: 10.1016/j.cossms.2025.101214

2503.09814

Country:

North America > United States > Pennsylvania (0.28)
Europe > United Kingdom > England (0.27)
North America > United States > New York (0.14)
(6 more...)

Genre:

Overview (1.00)
Research Report > New Finding (0.45)

Industry:

Materials > Chemicals (1.00)
Education (1.00)
Government > Regional Government > North America Government > United States Government (0.67)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.67)

Elnoor, Mohamed, Weerakoon, Kasun, Seneviratne, Gershom, Liang, Jing, Rajagopal, Vignesh, Manocha, Dinesh

Vi-LAD: Vision-Language Attention Distillation for Socially-Aware Robot Navigation in Dynamic Environments

-- We introduce Vision-Language Attention Distillation (Vi-LAD), a novel approach for distilling socially compliant navigation knowledge from a large Vision-Language Model (VLM) into a lightweight transformer model for real-time robotic navigation. Unlike traditional methods that rely on expert demonstrations or human-annotated datasets, Vi-LAD performs knowledge distillation and fine-tuning at the intermediate layer representation level (i.e., attention maps) by leveraging the backbone of a pre-trained vision-action model. These attention maps highlight key navigational regions in a given scene, which serve as implicit guidance for socially aware motion planning. Vi-LAD fine-tunes a transformer-based model using intermediate attention maps extracted from the pre-trained vision-action model, combined with attention-like semantic maps constructed from a large VLM. T o achieve this, we introduce a novel attention-level distillation loss that fuses knowledge from both sources, generating augmented attention maps with enhanced social awareness. These refined attention maps are then utilized as a traversability costmap within a socially aware model predictive controller (MPC) for navigation. We validate our approach through real-world experiments on a Husky wheeled robot, demonstrating significant improvements over state-of-the-art (SOT A) navigation methods. Our results show up to 14.2% - 50% improvement in success rate, which highlights the effectiveness of Vi-LAD in enabling socially compliant and efficient robot navigation. I NTRODUCTION As autonomous robots become increasingly integrated into human-centered environments, ensuring safe, efficient, and socially compliant navigation is a critical challenge [1].

large language model, machine learning, natural language, (18 more...)

2503.0982

Country: North America > United States > Maryland (0.29)

Genre:

Research Report > New Finding (0.54)
Research Report > Promising Solution (0.34)
Overview > Innovation (0.34)

Industry: Energy > Oil & Gas (0.36)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Taxonomy, Opportunities, and Challenges of Representation Engineering for Large Language Models

Wehner, Jan, Abdelnabi, Sahar, Tan, Daniel, Krueger, David, Fritz, Mario

Representation Engineering (RepE) is a novel paradigm for controlling the behavior of LLMs. Unlike traditional approaches that modify inputs or fine-tune the model, RepE directly manipulates the model's internal representations. As a result, it may offer more effective, interpretable, data-efficient, and flexible control over models' behavior. We present the first comprehensive survey of RepE for LLMs, reviewing the rapidly growing literature to address key questions: What RepE methods exist and how do they differ? For what concepts and problems has RepE been applied? What are the strengths and weaknesses of RepE compared to other methods? To answer these, we propose a unified framework describing RepE as a pipeline comprising representation identification, operationalization, and control. We posit that while RepE methods offer significant potential, challenges remain, including managing multiple concepts, ensuring reliability, and preserving models' performance. Towards improving RepE, we identify opportunities for experimental and methodological improvements and construct a guide for best practices.

large language model, machine learning, natural language, (18 more...)

2502.19649

Country:

South America > Colombia > Meta Department > Villavicencio (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Quebec > Montreal (0.04)
(3 more...)

Genre:

Overview (1.00)
Research Report > Promising Solution (0.45)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Foundation Models for Spatio-Temporal Data Science: A Tutorial and Survey

Liang, Yuxuan, Wen, Haomin, Xia, Yutong, Jin, Ming, Yang, Bin, Salim, Flora, Wen, Qingsong, Pan, Shirui, Cong, Gao

Spatio-Temporal (ST) data science, which includes sensing, managing, and mining large-scale data across space and time, is fundamental to understanding complex systems in domains such as urban computing, climate science, and intelligent transportation. Traditional deep learning approaches have significantly advanced this field, particularly in the stage of ST data mining. However, these models remain task-specific and often require extensive labeled data. Inspired by the success of Foundation Models (FM), especially large language models, researchers have begun exploring the concept of Spatio-Temporal Foundation Models (STFMs) to enhance adaptability and generalization across diverse ST tasks. Unlike prior architectures, STFMs empower the entire workflow of ST data science, ranging from data sensing, management, to mining, thereby offering a more holistic and scalable approach. Despite rapid progress, a systematic study of STFMs for ST data science remains lacking. This survey aims to provide a comprehensive review of STFMs, categorizing existing methodologies and identifying key research directions to advance ST general intelligence.

arxiv preprint arxiv, foundation model, zhang, (13 more...)

2503.13502

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > District of Columbia > Washington (0.05)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
(4 more...)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.50)

Industry:

Information Technology (0.93)
Transportation (0.68)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(3 more...)

A Survey of Direct Preference Optimization

Liu, Shunyu, Fang, Wenkai, Hu, Zetian, Zhang, Junjie, Zhou, Yang, Zhang, Kongcheng, Tu, Rongcheng, Lin, Ting-En, Huang, Fei, Song, Mingli, Li, Yongbin, Tao, Dacheng

Large Language Models (LLMs) have demonstrated unprecedented generative capabilities, yet their alignment with human values remains critical for ensuring helpful and harmless deployments. While Reinforcement Learning from Human Feedback (RLHF) has emerged as a powerful paradigm for aligning LLMs with human preferences, its reliance on complex reward modeling introduces inherent trade-offs in computational efficiency and training stability. In this context, Direct Preference Optimization (DPO) has recently gained prominence as a streamlined alternative that directly optimizes LLMs using human preferences, thereby circumventing the need for explicit reward modeling. Owing to its theoretical elegance and computational efficiency, DPO has rapidly attracted substantial research efforts exploring its various implementations and applications. However, this field currently lacks systematic organization and comparative analysis. In this survey, we conduct a comprehensive overview of DPO and introduce a novel taxonomy, categorizing previous works into four key dimensions: data strategy, learning framework, constraint mechanism, and model property. We further present a rigorous empirical analysis of DPO variants across standardized benchmarks. Additionally, we discuss real-world applications, open challenges, and future directions for DPO. This work delivers both a conceptual framework for understanding DPO and practical guidance for practitioners, aiming to advance robust and generalizable alignment paradigms. All collected resources are available and will be continuously updated at https://github.com/liushunyu/awesome-direct-preference-optimization.

arxiv, optimization, preference optimization, (17 more...)

2503.11701

Country:

Asia > China (0.04)
Asia > Singapore (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Education (0.67)
Media (0.46)
Health & Medicine (0.45)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Das, Bikram, Sutradhar, Rupchand, Dalal, D C

Exploration of Hepatitis B Virus Infection Dynamics through Virology-Informed Neural Network: A Novel Artificial Intelligence Approach

In this work, we introduce Virology-Informed Neural Networks (VINNs), a powerful tool for capturing the intricate dynamics of viral infection when data of some compartments of the model are not available. VINNs, an extension of the widely known Physics-Informed Neural Networks (PINNs), offer an alternative approach to traditional numerical methods for solving system of differential equations. We apply this VINN technique on a recently proposed hepatitis B virus (HBV) infection dynamics model to predict the transmission of the infection within the liver more accurately. This model consists of four compartments, namely uninfected and infected hepatocytes, rcDNA-containing capsids, and free viruses, along with the consideration of capsid recycling. Leveraging the power of VINNs, we study the impacts of variations in parameter range, experimental noise, data variability, network architecture, and learning rate in this work. In order to demonstrate the robustness and effectiveness of VINNs, we employ this approach on the data collected from nine HBV-infceted chimpanzees, and it is observed that VINNs can effectively estimate the model parameters. VINNs reliably capture the dynamics of infection spread and accurately predict their future progression using real-world data. Furthermore, VINNs efficiently identify the most influential parameters in HBV dynamics based solely on experimental data from the capsid component. It is also expected that this framework can be extended beyond viral dynamics, providing a powerful tool for uncovering hidden patterns and complex interactions across various scientific and engineering domains.

concentration, infection, neural network, (13 more...)

2503.10708

Country:

Asia > India > Assam > Guwahati (0.14)
North America > United States > Rocky Mountains (0.04)
North America > Canada > Rocky Mountains (0.04)
(2 more...)

Genre:

Research Report (1.00)
Overview (0.92)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Therapeutic Area > Immunology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Saif, Sarwar, Islam, Md Jahirul, Jahangir, Md. Zihad Bin, Biswas, Parag, Rashid, Abdur, Nasim, MD Abdullah Al, Gupta, Kishor Datta

A Comprehensive Review on Understanding the Decentralized and Collaborative Approach in Machine Learning

The arrival of Machine Learning (ML) completely changed how we can unlock valuable information from data. Traditional methods, where everything was stored in one place, had big problems with keeping information private, handling large amounts of data, and avoiding unfair advantages. Machine Learning has become a powerful tool that uses Artificial Intelligence (AI) to overcome these challenges. We started by learning the basics of Machine Learning, including the different types like supervised, unsupervised, and reinforcement learning. We also explored the important steps involved, such as preparing the data, choosing the right model, training it, and then checking its performance. Next, we examined some key challenges in Machine Learning, such as models learning too much from specific examples (overfitting), not learning enough (underfitting), and reflecting biases in the data used. Moving beyond centralized systems, we looked at decentralized Machine Learning and its benefits, like keeping data private, getting answers faster, and using a wider variety of data sources. We then focused on a specific type called federated learning, where models are trained without directly sharing sensitive information. Real-world examples from healthcare and finance were used to show how collaborative Machine Learning can solve important problems while still protecting information security. Finally, we discussed challenges like communication efficiency, dealing with different types of data, and security. We also explored using a Zero Trust framework, which provides an extra layer of protection for collaborative Machine Learning systems. This approach is paving the way for a bright future for this groundbreaking technology.

federated learning, learning, machine learning, (14 more...)

2503.09833

Country:

Asia > Japan > Honshū > Kantō > Ibaraki Prefecture > Tsukuba (0.04)
North America > United States > California (0.04)
Asia > Middle East > Jordan (0.04)
Asia > Bangladesh > Dhaka Division > Dhaka District > Dhaka (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Loh, Johnson, Dudchenko, Lyubov, Viga, Justus, Gemmeke, Tobias

Towards Hardware Supported Domain Generalization in DNN-Based Edge Computing Devices for Health Monitoring

Deep neural network (DNN) models have shown remarkable success in many real-world scenarios, such as object detection and classification. Unfortunately, these models are not yet widely adopted in health monitoring due to exceptionally high requirements for model robustness and deployment in highly resource-constrained devices. In particular, the acquisition of biosignals, such as electrocardiogram (ECG), is subject to large variations between training and deployment, necessitating domain generalization (DG) for robust classification quality across sensors and patients. The continuous monitoring of ECG also requires the execution of DNN models in convenient wearable devices, which is achieved by specialized ECG accelerators with small form factor and ultra-low power consumption. However, combining DG capabilities with ECG accelerators remains a challenge. This article provides a comprehensive overview of ECG accelerators and DG methods and discusses the implication of the combination of both domains, such that multi-domain ECG monitoring is enabled with emerging algorithm-hardware co-optimized systems. Within this context, an approach based on correction layers is proposed to deploy DG capabilities on the edge. Here, the DNN fine-tuning for unknown domains is limited to a single layer, while the remaining DNN model remains unmodified. Thus, computational complexity (CC) for DG is reduced with minimal memory overhead compared to conventional fine-tuning of the whole DNN model. The DNN model-dependent CC is reduced by more than 2.5x compared to DNN fine-tuning at an average increase of F1 score by more than 20% on the generalized target domain. In summary, this article provides a novel perspective on robust DNN classification on the edge for health monitoring applications.

classification, domain shift, generalization, (15 more...)

doi: 10.1109/TBCAS.2024.3418085

2503.09661

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > Germany > North Rhine-Westphalia > Cologne Region > Aachen (0.04)

Genre:

Research Report (1.00)
Overview (0.86)

Industry:

Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Diagnostic Medicine (1.00)
Health & Medicine > Consumer Health (1.00)

Technology:

Information Technology > Hardware (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)