AITopics

This submission to the binary AI detection task is based on a modular stylometric pipeline in which public spaCy models are used for text preprocessing (including tokenisation, named entity recognition, dependency parsing, part-of-speech tagging, and morphology annotation) and for extracting several thousand features (frequencies of n-grams of the above linguistic annotations), and light gradient-boosting machines are used as the classifier. We collect a large corpus of more than 500 000 machine-generated texts for training the classifier. We explore several parameter options to increase the classifier's capacity and take advantage of that training set. Our approach follows the non-neural, computationally inexpensive but explainable approach found effective previously.
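As a rough illustration of the feature-extraction step described in this abstract, the sketch below computes relative frequencies of annotation n-grams for a single text. It is not the authors' code: the function name `ngram_frequencies`, the `n_max` parameter, and the example tag sequence are assumptions made for illustration. In a real pipeline the tags would presumably come from a spaCy model (for example via `token.pos_`), and the resulting feature vectors would be passed to a gradient-boosting classifier such as LightGBM.

```python
from collections import Counter


def ngram_frequencies(tags, n_max=3):
    """Relative frequencies of tag n-grams (n = 1..n_max) for one text."""
    features = {}
    for n in range(1, n_max + 1):
        # All contiguous n-grams of the annotation sequence.
        grams = [tuple(tags[i:i + n]) for i in range(len(tags) - n + 1)]
        total = len(grams)
        # Normalise counts so texts of different lengths are comparable.
        for gram, count in Counter(grams).items():
            features["_".join(gram)] = count / total
    return features


# Hypothetical part-of-speech tags for "The cat sat on the mat ."
pos_tags = ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN", "PUNCT"]
features = ngram_frequencies(pos_tags, n_max=2)
```

The same function could be applied to other annotation layers (dependency labels, entity types, morphological features) and the resulting dictionaries merged into one feature vector per text.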

large language model, machine learning, natural language, (21 more...)

2507.12064

Country:

Europe > Poland (0.29)
Europe > Spain (0.28)

Genre:

Research Report (0.50)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Kagrecha, Anmol, Marklund, Henrik, Manakul, Potsawee, Zeckhauser, Richard, Van Roy, Benjamin

Granular feedback merits sophisticated aggregation

Human feedback is increasingly used across diverse applications like training AI models, developing recommender systems, and measuring public opinion, with granular feedback often preferred over binary feedback for its greater informativeness. While it is easy to accurately estimate a population's distribution of feedback given feedback from a large number of individuals, cost constraints typically necessitate using smaller groups. A simple method to approximate the population distribution is regularized averaging: compute the empirical distribution and regularize it toward a prior. Can we do better? As we will discuss, the answer to this question depends on feedback granularity. Suppose one wants to predict a population's distribution of feedback using feedback from a limited number of individuals. We show that, as feedback granularity increases, one can substantially improve upon the predictions of regularized averaging by combining individuals' feedback in ways more sophisticated than regularized averaging. Our empirical analysis using questions on social attitudes confirms this pattern. In particular, with binary feedback, sophistication barely reduces the number of individuals required to attain a fixed level of performance. By contrast, with five-point feedback, sophisticated methods match the performance of regularized averaging with about half as many individuals.
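The regularized-averaging baseline mentioned in this abstract can be sketched as follows. This is one common form, assumed here for illustration: the empirical counts are blended with a prior using a pseudo-count weight. The function name, the `strength` parameter, the uniform prior, and the sample responses are all hypothetical, and the paper may use a different regularizer.

```python
from collections import Counter


def regularized_average(responses, categories, prior, strength):
    """Shrink the empirical distribution of `responses` toward `prior`.

    `strength` acts as a pseudo-count: 0 returns the plain empirical
    distribution, and very large values approach the prior.
    """
    counts = Counter(responses)
    n = len(responses)
    return {
        c: (counts[c] + strength * prior[c]) / (n + strength)
        for c in categories
    }


# Hypothetical five-point feedback from eight individuals.
categories = [1, 2, 3, 4, 5]
uniform_prior = {c: 1 / len(categories) for c in categories}
sample = [4, 5, 4, 3, 4, 2, 5, 4]
estimate = regularized_average(sample, categories, uniform_prior, strength=2.0)
```

With a small sample, the prior keeps unobserved categories (here, category 1) from receiving zero probability.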

large language model, machine learning, natural language, (20 more...)

2507.12041

Country: North America > United States (0.46)

Genre:

Questionnaire & Opinion Survey (1.00)
Overview (1.00)
Research Report > Experimental Study (0.93)
Research Report > New Finding (0.68)

Industry:

Health & Medicine > Therapeutic Area (1.00)
Banking & Finance (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Improving Data and Parameter Efficiency of Neural Language Models Using Representation Analysis

Jukić, Josip

This thesis addresses challenges related to data and parameter efficiency in neural language models, with a focus on representation analysis and the introduction of new optimization techniques. The first part examines the properties and dynamics of language representations within neural models, emphasizing their significance in enhancing robustness and generalization. It proposes innovative approaches based on representation smoothness, including regularization strategies that utilize Jacobian and Hessian matrices to stabilize training and mitigate sensitivity to input perturbations. The second part focuses on methods to significantly enhance data and parameter efficiency by integrating active learning strategies with parameter-efficient fine-tuning, guided by insights from representation smoothness analysis. It presents smoothness-informed early-stopping techniques designed to eliminate the need for labeled validation sets and proposes innovative combinations of active learning and parameter-efficient fine-tuning to reduce labeling efforts and computational resources. Extensive experimental evaluations across various NLP tasks demonstrate that these combined approaches substantially outperform traditional methods in terms of performance, stability, and efficiency. The third part explores weak supervision techniques enhanced by in-context learning to effectively utilize unlabeled data, further reducing dependence on extensive labeling. It shows that using in-context learning as a mechanism for weak supervision enables models to better generalize from limited labeled data by leveraging unlabeled examples more effectively during training. Comprehensive empirical evaluations confirm significant gains in model accuracy, adaptability, and robustness, especially in low-resource settings and dynamic data environments.

large language model, machine learning, natural language, (27 more...)

2507.12004

Country:

Europe (1.00)
Asia (1.00)
North America > United States > Minnesota (0.27)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.92)
Law (0.67)
Education > Curriculum (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(6 more...)

Akram, Waseem, Din, Muhayy Ud, Soud, Lyes Saad, Hussain, Irfan

A Review of Generative AI in Aquaculture: Foundations, Applications, and Future Directions for Smart and Sustainable Farming

Generative Artificial Intelligence (GAI) has rapidly emerged as a transformative force in aquaculture, enabling intelligent synthesis of multimodal data, including text, images, audio, and simulation outputs for smarter, more adaptive decision-making. As the aquaculture industry shifts toward data-driven, automated, and digitally integrated operations under the Aquaculture 4.0 paradigm, GAI models offer novel opportunities across environmental monitoring, robotics, disease diagnostics, infrastructure planning, reporting, and market analysis. This review presents the first comprehensive synthesis of GAI applications in aquaculture, encompassing foundational architectures (e.g., diffusion models, transformers, and retrieval augmented generation), experimental systems, pilot deployments, and real-world use cases. We highlight GAI's growing role in enabling underwater perception, digital twin modeling, and autonomous planning for remotely operated vehicle (ROV) missions. We also provide an updated application taxonomy that spans sensing, control, optimization, communication, and regulatory compliance. Beyond technical capabilities, we analyze key limitations, including limited data availability, real-time performance constraints, trust and explainability, environmental costs, and regulatory uncertainty. This review positions GAI not merely as a tool but as a critical enabler of smart, resilient, and environmentally aligned aquaculture systems.

large language model, machine learning, real time system, (17 more...)

2507.11974

Country:

Europe (0.67)
North America > United States (0.45)
Asia > Middle East > UAE (0.28)

Genre:

Overview (1.00)
Instructional Material (0.92)
Research Report > Promising Solution (0.67)
Research Report > New Finding (0.46)

Industry:

Law (1.00)
Information Technology > Security & Privacy (1.00)
Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
(8 more...)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
(2 more...)

Hassanin, Mohammed, Alsheikh, Mohammad Abu, Kuhn, Carlos C. N., Herath, Damith, Hoang, Dinh Thai, Radwan, Ibrahim

Towards Autonomous Riding: A Review of Perception, Planning, and Control in Intelligent Two-Wheelers

The rapid adoption of micromobility solutions, particularly two-wheeled vehicles like e-scooters and e-bikes, has created an urgent need for reliable autonomous riding (AR) technologies. While autonomous driving (AD) systems have matured significantly, AR presents unique challenges due to the inherent instability of two-wheeled platforms, limited size, limited power, and unpredictable environments, which pose very serious concerns about road users' safety. This review provides a comprehensive analysis of AR systems by systematically examining their core components (perception, planning, and control) through the lens of AD technologies. We identify critical gaps in current AR research, including a lack of comprehensive perception systems for various AR tasks, limited industry and government support for such developments, and insufficient attention from the research community. The review analyses the gaps of AR from the perspective of AD to highlight promising research directions, such as multimodal sensor techniques for lightweight platforms and edge deep learning architectures. By synthesising insights from AD research with the specific requirements of AR, this review aims to accelerate the development of safe, efficient, and scalable autonomous riding systems for future urban mobility.

data mining, machine learning, natural language, (18 more...)

2507.11852

Country:

Europe (1.00)
Oceania > Australia (0.46)

Genre:

Overview (1.00)
Research Report > New Finding (0.67)
Research Report > Promising Solution (0.67)

Industry:

Transportation > Passenger (1.00)
Transportation > Infrastructure & Services (1.00)
Transportation > Ground > Road (1.00)
(3 more...)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(4 more...)

The Evolving Role of Large Language Models in Scientific Innovation: Evaluator, Collaborator, and Scientist

Zhang, Haoxuan, Li, Ruochi, Zhang, Yang, Xiao, Ting, Chen, Jiangping, Ding, Junhua, Chen, Haihua

Scientific innovation is undergoing a paradigm shift driven by the rapid advancement of Large Language Models (LLMs). As science faces mounting challenges including information overload, disciplinary silos, and diminishing returns on conventional research methods, LLMs are emerging as powerful agents capable not only of enhancing scientific workflows but also of participating in and potentially leading the innovation process. Existing surveys mainly focus on different perspectives, phases, and tasks in scientific research and discovery, but they are limited in their account of the transformative potential and role differentiation of LLMs. This survey proposes a comprehensive framework to categorize the evolving roles of LLMs in scientific innovation across three hierarchical levels: Evaluator, Collaborator, and Scientist. We distinguish between LLMs' contributions to structured scientific research processes and open-ended scientific discovery, thereby offering a unified taxonomy that clarifies capability boundaries, evaluation criteria, and human-AI interaction patterns at each level. Through an extensive analysis of current methodologies, benchmarks, systems, and evaluation metrics, this survey delivers an in-depth and systematic synthesis on LLM-driven scientific innovation. We present LLMs not only as tools for automating existing processes, but also as catalysts capable of reshaping the epistemological foundations of science itself. This survey offers conceptual clarity, practical guidance, and theoretical foundations for future research, while also highlighting open challenges and ethical considerations in the pursuit of increasingly autonomous AI-driven science. Resources related to this survey can be accessed on GitHub at: https://github.com/haoxuan-unt2024/llm4innovation.

artificial intelligence, large language model, machine learning, (19 more...)

2507.11810

Country:

North America > United States > Texas (0.46)
North America > United States > Illinois (0.27)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)
Research Report > Promising Solution (0.92)
Research Report > New Finding (0.67)

Industry:

Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Education (0.93)
Government (0.67)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bezerra, Wesley dos Reis, Bezerra, Lais Machado, Westphall, Carlos Becker

Challenges in GenAI and Authentication: a scoping review

Authentication and authenticity have been a security challenge since the beginning of information sharing, especially in the context of digital information. With the advancement of generative artificial intelligence, these challenges have evolved, demanding a more up-to-date analysis of their impacts on society and system security. This work presents a scoping review of 88 documents from the IEEE Xplore, Scopus, and ACM databases and analyzes the resulting portfolio through six guiding questions focusing on the most relevant work, challenges, attack surfaces, threats, proposed solutions, and gaps. Finally, the portfolio articles are analyzed through this guiding research lens and also receive individualized analysis. The results consistently outline the challenges, gaps, and threats related to images, text, audio, and video, thereby supporting new research in the areas of authentication and generative artificial intelligence.

artificial intelligence, machine learning, natural language, (17 more...)

2507.11775

Country: South America > Brazil (0.28)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (1.00)
(2 more...)

ClarifAI: Enhancing AI Interpretability and Transparency through Case-Based Reasoning and Ontology-Driven Approach for Improved Decision-Making

Vemula, Srikanth

This study introduces Clarity and Reasoning Interface for Artificial Intelligence (ClarifAI), a novel approach designed to augment the transparency and interpretability of artificial intelligence (AI) in the realm of improved decision making. Leveraging the Case-Based Reasoning (CBR) methodology and integrating an ontology-driven approach, ClarifAI aims to meet the intricate explanatory demands of various stakeholders involved in AI-powered applications. The paper elaborates on ClarifAI's theoretical foundations, combining CBR and ontologies to furnish exhaustive explanation mechanisms. It further elaborates on the design principles and architectural blueprint, highlighting ClarifAI's potential to enhance AI interpretability across different sectors and its applicability in high-stakes environments.

artificial intelligence, clarifai, machine learning, (16 more...)

2507.11733

Country: North America > United States (0.68)

Genre:

Overview > Innovation (0.48)
Research Report > Promising Solution (0.34)

Industry:

Law (0.68)
Government (0.47)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Case-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Memory-Based Learning (1.00)

A Review of Generative AI in Computer Science Education: Challenges and Opportunities in Accuracy, Authenticity, and Assessment

Reihanian, Iman, Hou, Yunfei, Chen, Yu, Zheng, Yifei

This paper surveys the use of Generative AI tools, such as ChatGPT and Claude, in computer science education, focusing on key aspects of accuracy, authenticity, and assessment. Through a literature review, we highlight both the challenges and opportunities these AI tools present. While Generative AI improves efficiency and supports creative student work, it raises concerns such as AI hallucinations, error propagation, bias, and blurred lines between AI-assisted and student-authored content. Human oversight is crucial for addressing these concerns. Existing literature recommends adopting hybrid assessment models that combine AI with human evaluation, developing bias detection frameworks, and promoting AI literacy for both students and educators. Our findings suggest that the successful integration of AI requires a balanced approach, considering ethical, pedagogical, and technical factors. Future research may explore enhancing AI accuracy, preserving academic integrity, and developing adaptive models that balance creativity with precision.

large language model, machine learning, natural language, (17 more...)

2507.11543

Country: North America > United States > California (0.14)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Education > Educational Setting (1.00)
Education > Curriculum > Subject-Specific Education (0.90)
Education > Assessment & Standards (0.68)
Education > Educational Technology > Educational Software (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (1.00)

Feys, Thomas, Van der Perre, Liesbet, Rottenberg, François

Learning to Quantize and Precode in Massive MIMO Systems for Energy Reduction: a Graph Neural Network Approach

arXiv.org Machine Learning, Jul-16-2025

Massive MIMO systems are moving toward increased numbers of radio frequency chains, higher carrier frequencies and larger bandwidths. As such, digital-to-analog converters (DACs) are becoming a bottleneck in terms of hardware complexity and power consumption. In this work, non-linear precoding for coarsely quantized downlink massive MIMO is studied. Given the NP-hard nature of this problem, a graph neural network (GNN) is proposed that directly outputs the precoded quantized vector based on the channel matrix and the intended transmit symbols. The model is trained in a self-supervised manner, by directly maximizing the achievable rate. To overcome the non-differentiability of the objective function, which is caused by the non-differentiable DAC functions, a straight-through Gumbel-softmax estimation of the gradient is proposed. The proposed method achieves a significant increase in achievable sum rate under coarse quantization. For instance, in the single-user case, the proposed method can achieve the same sum rate as maximum ratio transmission (MRT) using one-bit DACs, compared to 3 bits for MRT. This reduces the DACs' power consumption by a factor of 4-7 and 3 for baseband and RF DACs, respectively. This, however, comes at the cost of increased digital signal processing power consumption. When accounting for this, the reduction in overall power consumption holds for a system bandwidth up to 3.5 MHz for baseband DACs, while the RF DACs can maintain a power reduction by a factor of 2.9 for higher bandwidths. Notably, indirect effects, which further reduce the power consumption, such as a reduced fronthaul consumption and reduction in other components, are not considered in this analysis.
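To make the gradient-estimation idea in this abstract more concrete, here is a minimal sketch of the forward pass of a straight-through Gumbel-softmax sample over a set of quantization levels. It is not the authors' implementation: the function name, the logits, the temperature, and the two-level `levels` list standing in for a one-bit DAC are assumptions for illustration. Plain Python has no automatic differentiation, so the code only produces the hard (quantized) value and the soft (relaxed) value side by side.

```python
import math
import random


def gumbel_softmax_sample(logits, temperature, rng):
    """Return (soft, hard): relaxed probabilities and their one-hot argmax."""
    noisy = []
    for logit in logits:
        # Gumbel(0, 1) noise via inverse transform: g = -log(-log(u)).
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        noisy.append((logit - math.log(-math.log(u))) / temperature)
    # Numerically stable softmax over the perturbed, scaled logits.
    peak = max(noisy)
    exps = [math.exp(v - peak) for v in noisy]
    total = sum(exps)
    soft = [e / total for e in exps]
    # Hard one-hot selection, used as the forward-pass output.
    best = soft.index(max(soft))
    hard = [1.0 if i == best else 0.0 for i in range(len(soft))]
    return soft, hard


levels = [-1.0, 1.0]  # hypothetical one-bit DAC output levels
rng = random.Random(0)
soft, hard = gumbel_softmax_sample([0.3, 1.2], temperature=0.5, rng=rng)

quantized = sum(h * l for h, l in zip(hard, levels))  # discrete forward value
relaxed = sum(s * l for s, l in zip(soft, levels))    # differentiable surrogate
```

In an automatic-differentiation framework, the straight-through output is typically written as `hard + soft - stop_gradient(soft)`, so the forward pass uses the discrete value while gradients flow through the relaxed one.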

artificial intelligence, machine learning, power consumption, (18 more...)

2507.10634

Country: Europe > Belgium > Flanders > Flemish Brabant > Leuven (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)