AITopics

This paper presents a comprehensive overview of the applications of artificial intelligence (AI) in mathematical research, highlighting the transformative role AI has begun to play in this domain. Traditionally, AI advancements have heavily relied on theoretical foundations provided by mathematics and statistics. However, recent developments in AI, particularly in reinforcement learning (RL) and large language models (LLMs), have demonstrated the potential for AI to contribute back to mathematics by offering flexible algorithmic frameworks and powerful inductive reasoning capabilities that support various aspects of mathematical research. This survey aims to establish a bridge between AI and mathematics, providing insights into the mutual benefits and fostering deeper interdisciplinary understanding. In particular, we argue that while current AI and LLMs may struggle with complex deductive reasoning, their "inherent creativity", the ability to generate outputs at high throughput based on recognition of shallow patterns, holds significant potential to support and inspire mathematical research. This creative capability, often overlooked, could be the key to unlocking new perspectives and methodologies in mathematics. Furthermore, we address the lack of cross-disciplinary communication: mathematicians may not fully comprehend the latest advances in AI, while AI researchers frequently prioritize benchmark performance over real-world applications in frontier mathematical research. This paper seeks to close that gap, offering a detailed exploration of AI fundamentals, its strengths, and its emerging applications in the mathematical sciences.

arxiv preprint arxiv, large language model, machine learning, (19 more...)

2412.16543

Country:

North America > United States (0.93)
North America > Canada > Alberta (0.28)

Genre:

Overview (1.00)
Instructional Material > Course Syllabus & Notes (0.46)
Research Report > Promising Solution (0.46)

Industry:

Health & Medicine (1.00)
Education > Educational Setting (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(2 more...)

Baichuan Alignment Technical Report

Lin, Mingan, Yang, Fan, Shen, Yanjun, Sun, Haoze, Li, Tianpeng, Zhang, Tao, Zhu, Chenzheng, Zhang, Tao, Zheng, Miao, Li, Xu, Zhou, Yijie, Chen, Mingyang, Qin, Yanzhao, Li, Youquan, Liang, Hao, Li, Fei, Li, Yadong, Wang, Mang, Dong, Guosheng, Fang, Kun, Xu, Jianhua, Cui, Bin, Zhang, Wentao, Zhou, Zenan, Chen, Weipeng

We introduce Baichuan Alignment, a detailed analysis of the alignment techniques employed in the Baichuan series of models. This represents the industry's first comprehensive account of alignment methodologies, offering valuable insights for advancing AI research. We investigate the critical components that enhance model performance during the alignment process, including optimization methods, data strategies, capability enhancements, and evaluation processes. The process spans three key stages: Prompt Augmentation System (PAS), Supervised Fine-Tuning (SFT), and Preference Alignment. The problems encountered, the solutions applied, and the improvements made are thoroughly recorded. Through comparisons across well-established benchmarks, we highlight the technological advancements enabled by Baichuan Alignment. Baichuan-Instruct is an internal model, while Qwen2-Nova-72B and Llama3-PBM-Nova-70B are instruct versions of the Qwen2-72B and Llama-3-70B base models, optimized through Baichuan Alignment. Baichuan-Instruct demonstrates significant improvements in core capabilities, with user experience gains ranging from 17% to 28%, and performs exceptionally well on specialized benchmarks. In open-source benchmark evaluations, both Qwen2-Nova-72B and Llama3-PBM-Nova-70B consistently outperform their respective official instruct versions across nearly all datasets. This report aims to clarify the key technologies behind the alignment process, fostering a deeper understanding within the community. The Llama3-PBM-Nova-70B model is available at https://huggingface.co/PKU-Baichuan-MLSystemLab/Llama3-PBM-Nova-70B.

arxiv preprint arxiv, large language model, machine learning, (20 more...)

2410.1494

Country: North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Mitrai, Ilias, Daoutidis, Prodromos

Accelerating process control and optimization via machine learning: A review

The design and operation of chemical processes depend on decisions spanning a wide range of scales, from the molecular up to the enterprise-wide, and constrained by multiple physical and chemical phenomena [1, 2, 3, 4]. Process control and optimization methods provide a systematic framework to identify the best possible decisions in designing and operating a process, subject to constraints that emerge from physics or design and operational considerations. Over the last few decades, there have been significant advances in both theory and algorithm development regarding the control of nonlinear and constrained process systems [5, 6, 7, 8, 9, 10], as well as the solution of broad classes of optimization problems [11, 12, 13, 14, 15].

An alternative approach is to accelerate the solution process itself by 1) selecting a solution strategy (algorithm selection) and 2) tuning it (algorithm configuration) such that a desired performance function like solution time is minimized. The acceleration is usually achieved by exploiting some underlying property of the decision-making problem. An example is the case of structured decision-making problems, where the structure can be used as the basis of decomposition-based optimization algorithms, which are usually faster than monolithic algorithms for large-scale problems [24]. Although this approach does not compromise solution quality, selecting and tuning a ...
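The two steps named in this excerpt, algorithm selection and algorithm configuration, can be illustrated with a minimal sketch. It is not taken from the paper: the toy problem (finding the root of x^2 - 2), the two candidate solvers, the `step` tuning parameter, and every function name are assumptions made for illustration, and the iteration count stands in for solution time as the performance function so that the result is deterministic.

```python
def bisection(f, lo, hi, tol=1e-8):
    """Hypothetical solver 1: halve the bracket until it is narrower than tol."""
    iters = 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(lo) * f(mid) <= 0:
            hi = mid
        else:
            lo = mid
        iters += 1
    return (lo + hi) / 2, iters


def damped_newton(f, df, x0, step=1.0, tol=1e-8, max_iter=10_000):
    """Hypothetical solver 2: Newton iteration scaled by a tunable step size."""
    x, iters = x0, 0
    while abs(f(x)) > tol and iters < max_iter:
        x -= step * f(x) / df(x)
        iters += 1
    return x, iters


def f(x):
    return x * x - 2.0


def df(x):
    return 2.0 * x


# 1) Algorithm selection: run each candidate and keep the one with the
#    smallest performance measure (iteration count, a proxy for solution time).
candidates = {
    "bisection": lambda: bisection(f, 1.0, 2.0),
    "damped_newton": lambda: damped_newton(f, df, 1.5),
}
results = {name: run() for name, run in candidates.items()}
selected = min(results, key=lambda name: results[name][1])

# 2) Algorithm configuration: tune the selected solver's parameter over a
#    small grid, again minimizing the same performance measure.
step_grid = [0.25, 0.5, 0.75, 1.0]
tuned = {s: damped_newton(f, df, 1.5, step=s) for s in step_grid}
best_step = min(tuned, key=lambda s: tuned[s][1])
root = tuned[best_step][0]

print(f"selected={selected}, best_step={best_step}, root={root:.6f}")
```

Both steps here are exhaustive comparisons on a single problem instance, which is only practical for a toy case. The excerpt's point is that for large problems these choices are themselves costly, which is what motivates learning them from data.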

data mining, machine learning, natural language, (21 more...)

2412.18529

Country:

North America > United States > Texas (0.28)
North America > United States > Minnesota > Hennepin County > Minneapolis (0.28)

Genre:

Overview (0.93)
Research Report (0.64)

Industry: Energy (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(4 more...)

Atitallah, Safa Ben, Rabah, Chaima Ben, Driss, Maha, Boulila, Wadii, Koubaa, Anis

Exploring Graph Mamba: A Comprehensive Survey on State-Space Models for Graph Learning

Graph Mamba, a powerful graph embedding technique, has emerged as a cornerstone in various domains, including bioinformatics, social networks, and recommendation systems. This survey represents the first comprehensive study devoted to Graph Mamba, to address the critical gaps in understanding its applications, challenges, and future potential. We start by offering a detailed explanation of the original Graph Mamba architecture, highlighting its key components and underlying mechanisms. Subsequently, we explore the most recent modifications and enhancements proposed to improve its performance and applicability. To demonstrate the versatility of Graph Mamba, we examine its applications across diverse domains. A comparative analysis of Graph Mamba and its variants is conducted to shed light on their unique characteristics and potential use cases. Furthermore, we identify potential areas where Graph Mamba can be applied in the future, highlighting its potential to revolutionize data analysis in these fields. Finally, we address the current limitations and open research questions associated with Graph Mamba. By acknowledging these challenges, we aim to stimulate further research and development in this promising area. This survey serves as a valuable resource for both newcomers and experienced researchers seeking to understand and leverage the power of Graph Mamba.

graph mamba, machine learning, real time system, (23 more...)

2412.18322

Country:

North America > United States (0.45)
Africa > Middle East > Tunisia (0.14)
Asia > Middle East > Saudi Arabia (0.14)

Genre:

Research Report > Promising Solution (1.00)
Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Health Care Technology (1.00)
(3 more...)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
(9 more...)

GDM4MMIMO: Generative Diffusion Models for Massive MIMO Communications

Jin, Zhenzhou, You, Li, Zhou, Huibin, Wang, Yuanshuo, Liu, Xiaofeng, Gong, Xinrui, Gao, Xiqi, Ng, Derrick Wing Kwan, Xia, Xiang-Gen

Massive multiple-input multiple-output (MIMO) offers significant advantages in spectral and energy efficiencies, positioning it as a cornerstone technology of fifth-generation (5G) wireless communication systems and a promising solution for the burgeoning data demands anticipated in sixth-generation (6G) networks. In recent years, with the continuous advancement of artificial intelligence (AI), a multitude of task-oriented generative foundation models (GFMs) have emerged, achieving remarkable performance in various fields such as computer vision (CV), natural language processing (NLP), and autonomous driving. As a pioneering force, these models are driving the paradigm shift in AI towards generative AI (GenAI). Among them, the generative diffusion model (GDM), as one of the state-of-the-art families of generative models, demonstrates an exceptional capability to learn implicit prior knowledge and robust generalization capabilities, thereby enhancing its versatility and effectiveness across diverse applications. In this paper, we delve into the potential applications of GDM in massive MIMO communications. Specifically, we first provide an overview of massive MIMO communication, the framework of GFMs, and the working mechanism of GDM. Following this, we discuss recent research advancements in the field and present a case study of near-field channel estimation based on GDM, demonstrating its promising potential for facilitating efficient ultra-dimensional channel state information (CSI) acquisition in the context of massive MIMO communications. Finally, we highlight several pressing challenges in future mobile communications and identify promising research directions surrounding GDM.

communication, machine learning, natural language, (18 more...)

2412.18281

Country: North America > United States (0.28)

Genre:

Research Report (1.00)
Overview (0.86)

Industry:

Telecommunications (1.00)
Information Technology > Networks (0.34)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.67)

GenAI Content Detection Task 2: AI vs. Human -- Academic Essay Authenticity Challenge

Chowdhury, Shammur Absar, Almerekhi, Hind, Kutlu, Mucahid, Keles, Kaan Efe, Ahmad, Fatema, Mohiuddin, Tasnim, Mikros, George, Alam, Firoj

This paper presents a comprehensive overview of the first edition of the Academic Essay Authenticity Challenge, organized as part of the GenAI Content Detection shared tasks co-located with COLING 2025. This challenge focuses on detecting machine-generated vs. human-authored essays for academic purposes. The task is defined as follows: "Given an essay, identify whether it is generated by a machine or authored by a human." The challenge involves two languages: English and Arabic. During the evaluation phase, 25 teams submitted systems for English and 21 teams for Arabic, reflecting substantial interest in the task. Finally, seven teams submitted system description papers. The majority of submissions utilized fine-tuned transformer-based models, with one team employing Large Language Models (LLMs) such as Llama 2 and Llama 3. This paper outlines the task formulation, details the dataset construction process, and explains the evaluation framework. Additionally, we present a summary of the approaches adopted by participating teams. Nearly all submitted systems outperformed the n-gram-based baseline, with the top-performing systems achieving F1 scores exceeding 0.98 for both languages, indicating significant progress in the detection of machine-generated text.

large language model, machine learning, natural language, (19 more...)

2412.18274

Country:

North America (0.68)
Asia > Middle East > UAE > Abu Dhabi Emirate > Abu Dhabi (0.15)

Genre: Overview (1.00)

Industry:

Education (1.00)
Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Molaeian, Hossein, Karamjani, Kaveh, Teimouri, Sina, Roshani, Saeed, Roshani, Sobhan

The Potential of Convolutional Neural Networks for Cancer Detection

ABSTRACT: Early detection of cancer is critical in improving treatment outcomes and increasing survival rates, particularly for common cancers such as lung, breast and prostate which collectively contribute to a significant global mortality burden. With advancements in imaging technologies and data processing, Convolutional Neural Networks (CNNs) have emerged as a powerful tool for analyzing and classifying medical images, enabling more precise cancer detection. This paper provides a comprehensive review of recent studies leveraging CNN models for detecting ten different types of cancer. Each study employs distinct CNN architectures to identify patterns associated with these cancers, utilizing diverse datasets. Key differences and strengths of these architectures are meticulously compared and analyzed, highlighting their efficacy in improving early detection. Beyond reviewing the performance and limitations of CNN-based cancer detection methods, this study explores the feasibility of integrating CNNs into clinical settings as an early detection tool, potentially complementing or replacing traditional methods. Despite significant progress, challenges remain, including data diversity, result interpretation, and ethical considerations. By identifying the best-performing CNN architectures and providing a comparative analysis, this study aims to contribute a comprehensive perspective on the application of CNNs in cancer detection and their role in advancing diagnostic capabilities in healthcare.

I. INTRODUCTION

Cancer is one of the most complex and deadly diseases of the present century, and due to its increasing prevalence, it has become a global crisis. This disease is characterized by the uncontrolled growth of cells, which can spread to other parts of the body, leading to disability and death. The exact causes of cancer are highly diverse and are a combination of genetic, environmental, and lifestyle factors. In this study, we focus on some of the most common types of cancer, including prostate cancer, blood cancers (leukemia and lymphoma), bladder cancer, skin cancer (melanoma and non-melanoma), colorectal cancer, liver cancer, breast cancer, ovarian cancer, thyroid cancer, and lung cancer. These cancers are of particular significance due to their high prevalence and considerable impact on public health. Global data indicate that the cancer burden is increasing annually.

artificial intelligence, machine learning, survey article, (16 more...)

2412.17155

Genre:

Overview (1.00)
Research Report > Experimental Study (0.46)
Research Report > New Finding (0.34)

Industry: Health & Medicine > Therapeutic Area > Oncology > Skin Cancer (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Calimeri, Francesco, Ianni, Giovambattista, Pacenza, Francesco, Perri, Simona, Zangari, Jessica

ASP-based Multi-shot Reasoning via DLV2 with Incremental Grounding

DLV2 is an AI tool for Knowledge Representation and Reasoning which supports Answer Set Programming (ASP) - a logic-based declarative formalism, successfully used in both academic and industrial applications. Given a logic program modelling a computational problem, an execution of DLV2 produces the so-called answer sets that correspond one-to-one to the solutions to the problem at hand. The computational process of DLV2 relies on the typical Ground & Solve approach where the grounding step transforms the input program into a new, equivalent ground program, and the subsequent solving step applies propositional algorithms to search for the answer sets. Recently, emerging applications in contexts such as stream reasoning and event processing created a demand for multi-shot reasoning: here, the system is expected to be reactive while repeatedly executed over rapidly changing data. In this work, we present a new incremental reasoner obtained from the evolution of DLV2 towards iterated reasoning. Rather than restarting the computation from scratch, the system remains alive across repeated shots, and it incrementally handles the internal grounding process. At each shot, the system reuses previous computations for building and maintaining a large, more general ground program, from which a smaller yet equivalent portion is determined and used for computing answer sets. Notably, the incremental process is performed in a completely transparent fashion for the user. We describe the system, its usage, its applicability and performance in some practically relevant domains. Under consideration in Theory and Practice of Logic Programming (TPLP).

artificial intelligence, incremental-dlv 2, logic & formal reasoning, (16 more...)

2412.17143

Country:

Europe (1.00)
North America > United States (0.67)

Genre:

Overview (0.93)
Workflow (0.68)

Industry:

Leisure & Entertainment > Games > Computer Games (0.46)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Logic & Formal Reasoning (1.00)

Explainable AI for Multivariate Time Series Pattern Exploration: Latent Space Visual Analytics with Temporal Fusion Transformer and Variational Autoencoders in Power Grid Event Diagnosis

Xu, Haowen, Boyaci, Ali, Lian, Jianming, Wilson, Aaron

Detecting and analyzing complex patterns in multivariate time-series data is crucial for decision-making in urban and environmental system operations. However, challenges arise from the high dimensionality, intricate complexity, and interconnected nature of complex patterns, which hinder the understanding of their underlying physical processes. Existing AI methods often face limitations in interpretability, computational efficiency, and scalability, reducing their applicability in real-world scenarios. This paper proposes a novel visual analytics framework that integrates two generative AI models, Temporal Fusion Transformer (TFT) and Variational Autoencoders (VAEs), to reduce complex patterns into lower-dimensional latent spaces and visualize them in 2D using dimensionality reduction techniques such as PCA, t-SNE, and UMAP with DBSCAN. These visualizations, presented through coordinated and interactive views and tailored glyphs, enable intuitive exploration of complex multivariate temporal patterns, identifying similarities among patterns and uncovering their potential correlations for better interpretability of the AI outputs. The framework is demonstrated through a case study on power grid signal data, where it identifies multi-label grid event signatures, including faults and anomalies with diverse root causes. Additionally, novel metrics and visualizations are introduced to validate the models and evaluate the performance, efficiency, and consistency of latent maps generated by TFT and VAE under different configurations. These analyses provide actionable insights for model parameter tuning and reliability improvements. Comparative results highlight that TFT achieves shorter run times and superior scalability to diverse time-series data shapes compared to VAE. This work advances fault diagnosis in multivariate time series, fostering explainable AI to support critical system operations.

artificial intelligence, machine learning, natural language, (20 more...)

2412.16098

Country: North America > United States (1.00)

Genre:

Overview (1.00)
Research Report > New Finding (0.68)

Industry:

Government > Regional Government > North America Government > United States Government (1.00)
Energy > Renewable (1.00)
Energy > Power Industry (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.49)

Dani, Asang, Sathe, Shailesh R

A Review of the Marathi Natural Language Processing

Marathi is one of the most widely used languages in the world. One might expect that the latest advances in NLP research for languages like English would reach such a large community. However, NLP advancements in English didn't immediately reach Indian languages like Marathi. There were several reasons for this. They included the diversity of scripts used and the lack of (publicly available) resources such as tokenization strategies, high-quality datasets & benchmarks, and evaluation metrics. In addition, the morphologically rich nature of Marathi made NLP tasks challenging. Advances in Neural Network (NN) based models and tools since the early 2000s helped improve this situation and make NLP research more accessible. In the past 10 years, significant efforts were made to improve language resources for all 22 scheduled languages of India. This paper presents a broad overview of the evolution of NLP research in Indic languages, with a focus on Marathi and the state-of-the-art resources and tools available to the research community. It also provides an overview of tools & techniques associated with Marathi NLP tasks.

computational linguistic, machine learning, natural language, (17 more...)

2412.15471

Country:

North America > United States (1.00)
Europe (1.00)
Asia > Middle East (0.68)

Genre:

Overview (0.86)
Research Report (0.64)

Industry: Education (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Natural Language > Machine Translation (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)