AITopics

2501.18901

Country:

North America > Canada > Ontario > Toronto (0.14)
North America > United States > Texas > Travis County > Austin (0.04)
Asia > Vietnam > Hanoi > Hanoi (0.04)
Asia > Middle East > Israel (0.04)

Genre:

Research Report > New Finding (0.46)
Overview > Innovation (0.34)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Christensen, Sören, Strauch, Claudia, Trottner, Lukas

Beyond Fixed Horizons: A Theoretical Framework for Adaptive Denoising Diffusions

arXiv.org Machine Learning, Jan-31-2025

A key limitation of these models, however, is their reliance on a fixed time horizon, which introduces an artificial time dependency in the drift function of the backward process. As a result, the generative denoising process follows a predefined number of steps, regardless of the actual level of noise present along the generated path. To overcome this limitation, we introduce a novel class of diffusion models that dynamically adapt to the state of the denoising process. By replacing the fixed deterministic time horizon with a random one and conditioning the forward process to terminate at a predefined target distribution, our approach achieves greater flexibility and state awareness. The foundation of our method lies in Doob's h-transforms with respect to underlying exponential times. While the theoretical groundwork for this concept exists, its explicit application and detailed exploration, particularly in comparison to deterministic time horizons, remain underrepresented in the literature. A key feature of our model is its inherent adaptability: the number of denoising steps dynamically adjusts based on the noise level in the data, introducing a stochastic element. This randomness not only enhances the generation process, but also allows denoising to start from partially noisy data, naturally incorporating conditioning.

artificial intelligence, diffusion model, machine learning

2501.19373

Country:

North America > United States > New York (0.05)
North America > United States > Florida > Palm Beach County > Boca Raton (0.04)
Europe > United Kingdom > England > West Midlands > Birmingham (0.04)

Genre:

Overview (0.46)
Research Report (0.40)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial Intelligence, Jan-31-2025

A Comprehensive Review: Applicability of Deep Neural Networks in Business Decision Making and Market Prediction Investment

Trinh, Viet

Big data, in both its structured and unstructured formats, has brought unforeseen challenges to economics and business. How to organize, classify, and then analyze such data to obtain meaningful insights is an ongoing research topic for business leaders and academic researchers. This paper studies recent applications of deep neural networks to decision making in business and investment, especially in risk management, portfolio optimization, and algorithmic trading. Setting aside limitations in data privacy and cross-market analysis, the article establishes that deep neural networks have performed remarkably in financial classification and prediction. Moreover, the study suggests that by composing multiple neural networks spanning different data modalities, a more robust, efficient, and scalable financial prediction framework can be constructed.

artificial intelligence, machine learning, neural network

2502.00151

Country:

Asia > Vietnam (0.05)
Asia > Japan (0.04)
Asia > China > Anhui Province > Hefei (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology > Security & Privacy (1.00)
Banking & Finance > Trading (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Quaedvlieg, Lars C. P. M.

Optimizing Job Allocation using Reinforcement Learning with Graph Neural Networks

arXiv.org Artificial Intelligence, Jan-31-2025

Efficient job allocation in complex scheduling problems poses significant challenges in real-world applications. In this report, we propose a novel approach that leverages the power of Reinforcement Learning (RL) and Graph Neural Networks (GNNs) to tackle the Job Allocation Problem (JAP). The JAP involves allocating a maximum set of jobs to available resources while considering several constraints. Our approach enables learning of adaptive policies through trial-and-error interactions with the environment while exploiting the graph-structured data of the problem. By leveraging RL, we eliminate the need for manual annotation, a major bottleneck in supervised learning approaches. Experimental evaluations on synthetic and real-world data demonstrate the effectiveness and generalizability of our proposed approach, outperforming baseline algorithms and showcasing its potential for optimizing job allocation in complex scheduling problems.

artificial intelligence, machine learning, reinforcement learning

2501.19063

Country: Europe > Switzerland (0.04)

Genre:

Research Report (1.00)
Overview (1.00)

Industry: Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

arXiv.org Artificial Intelligence, Jan-31-2025

Capturing Temporal Dynamics in Large-Scale Canopy Tree Height Estimation

Pauls, Jan, Zimmer, Max, Turan, Berkant, Saatchi, Sassan, Ciais, Philippe, Pokutta, Sebastian, Gieseke, Fabian

With the rise in global greenhouse gas emissions, accurate large-scale tree canopy height maps are essential for understanding forest structure, estimating above-ground biomass, and monitoring ecological disruptions. To this end, we present a novel approach to generate large-scale, high-resolution canopy height maps over time. Our model accurately predicts canopy height over multiple years given Sentinel-2 time series satellite data. Using GEDI LiDAR data as the ground truth for training the model, we present the first 10m resolution temporal canopy height map of the European continent for the period 2019-2022. As part of this product, we also offer a detailed canopy height map for 2020, providing more precise estimates than previous studies. Our pipeline and the resulting temporal height map are publicly available, enabling comprehensive large-scale monitoring of forests and, hence, facilitating future research and ecological analyses. For an interactive viewer, see https://europetreemap.projects.earthengine.app/view/temporalcanopyheight.

artificial intelligence, capturing temporal dynamic, machine learning

2501.19328

Country:

Europe > France (0.04)
North America > United States > California (0.04)
Europe > United Kingdom > England > Tyne and Wear > Sunderland (0.04)

Genre:

Research Report (1.00)
Overview > Innovation (0.34)

Industry:

Government (1.00)
Energy (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.47)

Quarteroni, Alfio, Gervasio, Paola, Regazzoni, Francesco

Combining physics-based and data-driven models: advancing the frontiers of research with Scientific Machine Learning

arXiv.org Artificial Intelligence, Jan-30-2025

Scientific Machine Learning (SciML) is a recently emerged research field which combines physics-based and data-driven models for the numerical approximation of differential problems. Physics-based models rely on the physical understanding of the problem at hand, subsequent mathematical formulation, and numerical approximation. Data-driven models instead aim to extract relations between input and output data without arguing any causality principle underlying the available data distribution. In recent years, data-driven models have been rapidly developed and popularized. This spread has been triggered by the wide availability of data (the so-called big data), increasingly cheap computing power, and the development of powerful machine learning algorithms. SciML leverages the physical awareness of physics-based models and, at the same time, the efficiency of data-driven algorithms. With SciML, we can inject physics and mathematical knowledge into machine learning algorithms. Yet, we can rely on data-driven algorithms' capability to discover complex and non-linear patterns from data and improve the descriptive capacity of physics-based models. After recalling the mathematical foundations of digital modelling and machine learning algorithms, and presenting the most popular machine learning architectures, we discuss the great potential of a broad variety of SciML strategies in solving complex problems governed by partial differential equations. Finally, we illustrate the successful application of SciML to the simulation of the human cardiac function, a field of significant socio-economic importance that poses numerous challenges on both the mathematical and computational fronts. The corresponding mathematical model is a complex system of non-linear ordinary and partial differential equations describing the electromechanics, valve dynamics, blood circulation, perfusion in the coronary tree, and torso potential. Despite the robustness and accuracy of physics-based models, certain aspects, such as unveiling constitutive laws for cardiac cells and myocardial material properties, as well as devising efficient reduced order models to tame the extraordinary computational complexity, have been successfully tackled by leveraging data-driven models.

artificial intelligence, cardiac electromechanical model, machine learning

2501.18708

Country:

North America > United States (1.00)
Europe (1.00)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area > Cardiology/Vascular Diseases (1.00)
Health & Medicine > Health Care Technology (1.00)

Renggli, Cedric, Ilyas, Ihab F., Rekatsinas, Theodoros

Fundamental Challenges in Evaluating Text2SQL Solutions and Detecting Their Limitations

arXiv.org Artificial Intelligence, Jan-30-2025

In this work, we dive into the fundamental challenges of evaluating Text2SQL solutions and highlight potential failure causes and the potential risks of relying on aggregate metrics in existing benchmarks. We identify two largely unaddressed limitations in current open benchmarks: (1) data quality issues in the evaluation data, mainly attributed to the lack of capturing the probabilistic nature of translating a natural language description into a structured query (e.g., NL ambiguity), and (2) the bias introduced by using different match functions as approximations for SQL equivalence. To put both limitations into context, we propose a unified taxonomy of all Text2SQL limitations that can lead to both prediction and evaluation errors. We then motivate the taxonomy by providing a survey of Text2SQL limitations using state-of-the-art Text2SQL solutions and benchmarks. We describe the causes of limitations with real-world examples and propose potential mitigation solutions for each category in the taxonomy. We conclude by highlighting the open challenges encountered when deploying such mitigation strategies or attempting to automatically apply the taxonomy.
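Limitation (2) in the abstract above concerns match functions that stand in for SQL equivalence. The following is a minimal sketch of how two such functions can disagree, assuming a toy in-memory SQLite table. The table `emp`, the three queries, and the helpers `exact_match` and `execution_match` are hypothetical illustrations and are not taken from the paper or from any benchmark.

```python
import sqlite3


def exact_match(pred: str, gold: str) -> bool:
    """String comparison after collapsing whitespace (hypothetical helper)."""
    normalise = lambda q: " ".join(q.split()).rstrip(";")
    return normalise(pred) == normalise(gold)


def execution_match(conn: sqlite3.Connection, pred: str, gold: str) -> bool:
    """Compare the sorted result rows of both queries on ONE database instance."""
    rows_pred = sorted(conn.execute(pred).fetchall())
    rows_gold = sorted(conn.execute(gold).fetchall())
    return rows_pred == rows_gold


# Toy database: every employee earning more than 95 happens to be in 'eng'.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [("ann", "eng", 120), ("bob", "eng", 100), ("cy", "ops", 90)],
)

gold = "SELECT name FROM emp WHERE salary > 95"
equivalent = "SELECT name FROM emp WHERE NOT salary <= 95"  # same meaning, different text
different = "SELECT name FROM emp WHERE dept = 'eng'"       # different meaning

for label, query in [("equivalent", equivalent), ("different", different)]:
    print(label, exact_match(query, gold), execution_match(conn, query, gold))
```

On this single database instance, string matching rejects the semantically equivalent query, while execution matching accepts the non-equivalent one because both happen to return the same rows. Adding a high-earning employee outside `eng` would remove that false positive, which is why the outcome depends on the evaluation data as well as on the match function.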

large language model, machine learning, natural language

2501.18197

Country:

North America > Canada > Alberta > Census Division No. 6 > Calgary Metropolitan Region > Calgary (0.04)
North America > United States > Oklahoma (0.04)
Europe > United Kingdom > England (0.04)
Asia > Middle East > Jordan (0.04)

Genre:

Research Report (1.00)
Overview (0.66)

Industry:

Leisure & Entertainment (1.00)
Media > Film (0.46)

Technology:

Information Technology > Data Science > Data Quality (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.48)

Shen, Macheng, Cheng, Chen

Neural SDEs as a Unified Approach to Continuous-Domain Sequence Modeling

arXiv.org Machine Learning, Jan-30-2025

Inspired by the ubiquitous use of differential equations to model continuous dynamics across diverse scientific and engineering domains, we propose a novel and intuitive approach to continuous sequence modeling. Our method interprets time-series data as discrete samples from an underlying continuous dynamical system, and models its time evolution using a Neural Stochastic Differential Equation (Neural SDE), where both the flow (drift) and diffusion terms are parameterized by neural networks. We derive a principled maximum likelihood objective and a simulation-free scheme for efficient training of our Neural SDE model. We demonstrate the versatility of our approach through experiments on sequence modeling tasks across both embodied and generative AI. Notably, to the best of our knowledge, this is the first work to show that SDE-based continuous-time modeling also excels in such complex scenarios, and we hope that our work opens up new avenues for research of SDE models in high-dimensional and temporally intricate domains.
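One common way to attach a likelihood to discretely sampled SDE data is the Gaussian transition density implied by a single Euler-Maruyama step. The sketch below assumes that discretisation in one dimension and is not necessarily the objective derived in the paper. The names `euler_maruyama_nll`, `drift`, and `diffusion` are hypothetical, and plain Python functions stand in for the neural networks.

```python
import math
from typing import Callable, Sequence


def euler_maruyama_nll(
    xs: Sequence[float],
    dt: float,
    drift: Callable[[float], float],
    diffusion: Callable[[float], float],
) -> float:
    """Negative log-likelihood of a 1-D sequence under Euler-Maruyama transitions.

    Each observed step is modelled as
        x_next ~ Normal(x + drift(x) * dt, diffusion(x) ** 2 * dt),
    so only consecutive observed pairs are evaluated; no trajectory is simulated.
    """
    nll = 0.0
    for x, x_next in zip(xs, xs[1:]):
        mean = x + drift(x) * dt
        var = diffusion(x) ** 2 * dt
        nll += 0.5 * (math.log(2.0 * math.pi * var) + (x_next - mean) ** 2 / var)
    return nll


# Observations that rise by exactly 1 per unit of time.
xs = [0.0, 1.0, 2.0]
unit_noise = lambda x: 1.0

nll_good = euler_maruyama_nll(xs, 1.0, lambda x: 1.0, unit_noise)  # drift matches the data
nll_bad = euler_maruyama_nll(xs, 1.0, lambda x: 0.0, unit_noise)   # drift ignores the trend
print(nll_good, nll_bad)
```

A drift that explains the observed increments yields a lower negative log-likelihood than one that does not, which is the signal a maximum likelihood fit of the drift and diffusion functions would follow.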

artificial intelligence, machine learning, natural language

2501.18871

Country:

Asia > China > Shanghai > Shanghai (0.04)
North America > United States > Pennsylvania (0.04)

Genre:

Research Report (0.64)
Overview (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.34)

arXiv.org Artificial Intelligence, Jan-30-2025

Text Data Augmentation for Large Language Models: A Comprehensive Survey of Methods, Challenges, and Opportunities

Chai, Yaping, Xie, Haoran, Qin, Joe S.

Pre-trained language models of increasing size and complexity have demonstrated superior performance in many applications, but they usually require large training datasets to be adequately trained. Insufficient training sets could unexpectedly make the model overfit and fail to cope with complex tasks. Large language models (LLMs) trained on extensive corpora have prominent text generation capabilities, which improve the quality and quantity of data and play a crucial role in data augmentation. Specifically, distinctive prompt templates are given in personalised tasks to guide LLMs in generating the required content. Recent promising retrieval-based techniques further improve the expressive performance of LLMs in data augmentation by introducing external knowledge to enable them to produce more grounded-truth data. This survey provides an in-depth analysis of data augmentation in LLMs, classifying the techniques into Simple Augmentation, Prompt-based Augmentation, Retrieval-based Augmentation and Hybrid Augmentation. We summarise the post-processing approaches in data augmentation, which contribute significantly to refining the augmented data and enabling the model to filter out unfaithful content. Then, we provide the common tasks and evaluation metrics. Finally, we introduce existing challenges and future opportunities that could bring further improvement to data augmentation.

large language model, machine learning, natural language

2501.18845

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > Singapore (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Overview (1.00)

Industry:

Education (0.67)
Media (0.48)
Health & Medicine (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Machine Learning, Jan-30-2025

Adaptivity and Convergence of Probability Flow ODEs in Diffusion Generative Models

Tang, Jiaqi, Yan, Yuling

Score-based generative models, which transform noise into data by learning to reverse a diffusion process, have become a cornerstone of modern generative AI. This paper contributes to establishing theoretical guarantees for the probability flow ODE, a widely used diffusion-based sampler known for its practical efficiency. While a number of prior works address its general convergence theory, it remains unclear whether the probability flow ODE sampler can adapt to the low-dimensional structures commonly present in natural image data. We demonstrate that, with accurate score function estimation, the probability flow ODE sampler achieves a convergence rate of $O(k/T)$ in total variation distance (ignoring logarithmic factors), where $k$ is the intrinsic dimension of the target distribution and $T$ is the number of iterations. This dimension-free convergence rate improves upon existing results that scale with the typically much larger ambient dimension, highlighting the ability of the probability flow ODE sampler to exploit intrinsic low-dimensional structures in the target distribution for faster sampling.
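To make the sampler in the abstract above concrete, here is a minimal sketch of an Euler-discretised probability flow ODE in a toy setting where the score is known in closed form. It assumes a one-dimensional Gaussian target with variance `s0_sq` and the forward process dX = -X dt + sqrt(2) dW. These choices, and the names `sigma2`, `pf_ode_sample`, `horizon`, and `n_steps`, are illustrative assumptions rather than the paper's setup; `n_steps` plays the role of the iteration count $T$ in the abstract.

```python
import math


def sigma2(t: float, s0_sq: float) -> float:
    """Variance of X_t when X_0 ~ N(0, s0_sq) and dX = -X dt + sqrt(2) dW."""
    return s0_sq * math.exp(-2.0 * t) + 1.0 - math.exp(-2.0 * t)


def pf_ode_sample(x_T: float, horizon: float, n_steps: int, s0_sq: float) -> float:
    """Integrate the probability flow ODE backward from t = horizon to t = 0.

    ODE: dx/dt = f(x) - 0.5 * g^2 * score(x, t), with f(x) = -x and g^2 = 2.
    The exact Gaussian score -x / sigma2(t) stands in for a learned score model.
    """
    dt = horizon / n_steps
    x, t = x_T, horizon
    for _ in range(n_steps):
        score = -x / sigma2(t, s0_sq)
        velocity = -x - score  # f(x) - 0.5 * g^2 * score
        x -= dt * velocity     # explicit Euler step backward in time
        t -= dt
    return x


s0_sq, horizon, x_T = 0.25, 3.0, 1.0
# In this toy case x_t / sigma_t is constant along the exact ODE solution.
exact = x_T * math.sqrt(s0_sq / sigma2(horizon, s0_sq))
for n_steps in (10, 100, 1000):
    approx = pf_ode_sample(x_T, horizon, n_steps, s0_sq)
    print(n_steps, approx, abs(approx - exact))
```

The printed gap to the closed-form value shrinks as `n_steps` grows. This only checks the discretisation in a one-dimensional example with an exact score; it does not reproduce the paper's total variation analysis or its dependence on the intrinsic dimension $k$.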

artificial intelligence, machine learning, natural language

2501.18863

Country:

North America > United States > Wisconsin > Dane County > Madison (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
Europe > Austria (0.04)

Genre:

Overview (0.67)
Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Generation (0.90)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.66)