AITopics

Marine heatwaves (MHWs), an extreme climate phenomenon, pose significant challenges to marine ecosystems and industries, with their frequency and intensity increasing due to climate change. This study introduces an integrated deep learning approach to forecast short-to-long-term MHWs on a global scale. The approach combines graph representation for modeling spatial properties in climate data, imbalanced regression to handle skewed data distributions, and temporal diffusion to enhance forecast accuracy across various lead times. To the best of our knowledge, this is the first study that synthesizes three spatiotemporal anomaly methodologies to predict MHWs. Additionally, we introduce a method for constructing graphs that avoids isolated nodes and provide a new publicly available sea surface temperature anomaly graph dataset. We examine the trade-offs in the selection of loss functions and evaluation metrics for MHWs. We analyze spatial patterns in global MHW predictability by focusing on historical hotspots, and our approach demonstrates better performance compared to traditional numerical models in regions such as the middle south Pacific, equatorial Atlantic near Africa, south Atlantic, and high-latitude Indian Ocean. We highlight the potential of temporal diffusion to replace the conventional sliding window approach for long-term forecasts, achieving improved prediction up to six months in advance. These insights not only establish benchmarks for machine learning applications in MHW forecasting but also enhance understanding of general climate forecasting methodologies.

artificial intelligence, deep learning, machine learning, (15 more...)

2412.04475

Country:

Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
North America > United States > California (0.04)
South America > Peru (0.04)
(16 more...)

Genre: Research Report > New Finding (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Hudson, Dan, Hoogen, Jurgen van den, Atzmueller, Martin

Tailoring the Hyperparameters of a Wide-Kernel Convolutional Neural Network to Fit Different Bearing Fault Vibration Datasets

State-of-the-art algorithms are reported to be almost perfect at distinguishing the vibrations arising from healthy and damaged machine bearings, according to benchmark datasets at least. However, what about their application to new data? In this paper, we are able to confirm that neural networks for bearing fault detection can be crippled by incorrect hyperparameterisation, and also that the correct hyperparameter settings can actually change when transitioning to new data. The paper weaves together multiple methods to explain the behaviour of the hyperparameters of a wide-kernel convolutional neural network and how to set them. Since guidance already exists for generic hyperparameters like minibatch size, we focus on how to set architecture-specific hyperparameters such as the width of the convolutional kernels, a topic which might otherwise be obscure. We reflect different data properties by fusing information from seven different benchmark datasets, and our results show that the kernel size in the first layer in particular is sensitive to changes in the data. Looking deeper, we use manipulated copies of one dataset in an attempt to spot why the kernel size sometimes needs to change. The relevance of sampling rate is studied by using different levels of resampling, and spectral content is studied by increasingly filtering out high frequencies. At the end of our paper we conclude by stating clear guidance on how to set the hyperparameters of our neural network architecture.

artificial intelligence, deep learning, machine learning, (17 more...)

2411.15191

Country:

Europe > Germany > Lower Saxony (0.14)
Asia > China > Shaanxi Province > Xi'an (0.04)
South America > Argentina > Pampas > Buenos Aires F.D. > Buenos Aires (0.04)
(3 more...)

Genre: Research Report > New Finding (1.00)

Industry: Health & Medicine (0.67)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

BetterBench: Assessing AI Benchmarks, Uncovering Issues, and Establishing Best Practices

Reuel, Anka, Hardy, Amelia, Smith, Chandler, Lamparth, Max, Hardy, Malcolm, Kochenderfer, Mykel J.

AI models are increasingly prevalent in high-stakes environments, necessitating thorough assessment of their capabilities and risks. Benchmarks are popular for measuring these attributes and for comparing model performance, tracking progress, and identifying weaknesses in foundation and non-foundation models. They can inform model selection for downstream tasks and influence policy initiatives. However, not all benchmarks are the same: their quality depends on their design and usability. In this paper, we develop an assessment framework considering 46 best practices across an AI benchmark's lifecycle and evaluate 24 AI benchmarks against it. We find that there exist large quality differences and that commonly used benchmarks suffer from significant issues. We further find that most benchmarks do not report statistical significance of their results nor allow for their results to be easily replicated. To support benchmark developers in aligning with best practices, we provide a checklist for minimum quality assurance based on our assessment. We also develop a living repository of benchmark assessments to support benchmark comparability, accessible at betterbench.stanford.edu.

benchmark, large language model, machine learning, (21 more...)

2411.1299

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.24)
South America > Colombia > Meta Department > Villavicencio (0.04)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(4 more...)

Genre:

Research Report > Experimental Study (1.00)
Overview (1.00)

Industry:

Law (1.00)
Government (1.00)
Information Technology > Security & Privacy (0.92)
(2 more...)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.92)
(2 more...)

Garuda, Nikhil, Wu, John F., Nelson, Dylan, Pillepich, Annalisa

Estimating Dark Matter Halo Masses in Simulated Galaxy Clusters with Graph Neural Networks

Galaxies grow and evolve in dark matter halos. Because dark matter is not visible, galaxies' halo masses ($\rm{M}_{\rm{halo}}$) must be inferred indirectly. We present a graph neural network (GNN) model for predicting $\rm{M}_{\rm{halo}}$ from stellar mass ($\rm{M}_{*}$) in simulated galaxy clusters using data from the IllustrisTNG simulation suite. Unlike traditional machine learning models like random forests, our GNN captures the information-rich substructure of galaxy clusters by using spatial and kinematic relationships between galaxy neighbour. A GNN model trained on the TNG-Cluster dataset and independently tested on the TNG300 simulation achieves superior predictive performance compared to other baseline models we tested. Future work will extend this approach to different simulations and real observational datasets to further validate the GNN model's ability to generalise.

artificial intelligence, machine learning, simulation, (15 more...)

2411.12629

Country:

Europe > Germany > Baden-Württemberg > Karlsruhe Region > Heidelberg (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > United States > Maryland > Baltimore (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)

Gallego, Jhair S., Ramirez, Ricardo E.

Multilayer occupancy grid for obstacle avoidance in an autonomous ground vehicle using RGB-D camera

This work describes the process of integrating a depth camera into the navigation system of a self-driving ground vehicle (SDV) and the implementation of a multilayer costmap that enhances the vehicle's obstacle identification process by expanding its two-dimensional field of view, based on 2D LIDAR, to a three-dimensional perception system using an RGB-D camera. This approach lays the foundation for a robust vision-based navigation and obstacle detection system. A theoretical review is presented and implementation results are discussed for future work.

artificial intelligence, planning & scheduling, vehicle, (17 more...)

2411.12535

Country:

South America > Colombia > Bogotá D.C. > Bogotá (0.05)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)

Genre: Research Report (0.40)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Planning & Scheduling (0.51)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.35)

Balancing Accuracy and Efficiency in Multi-Turn Intent Classification for LLM-Powered Dialog Systems in Production

Liu, Junhua, Tan, Yong Keat, Fu, Bin, Lim, Kwan Hui

Accurate multi-turn intent classification is essential for advancing conversational AI systems. However, challenges such as the scarcity of comprehensive datasets and the complexity of contextual dependencies across dialogue turns hinder progress. This paper presents two novel approaches leveraging Large Language Models (LLMs) to enhance scalability and reduce latency in production dialogue systems. First, we introduce Symbol Tuning, which simplifies intent labels to reduce task complexity and improve performance in multi-turn dialogues. Second, we propose C-LARA (Consistency-aware, Linguistics Adaptive Retrieval Augmentation), a framework that employs LLMs for data augmentation and pseudo-labeling to generate synthetic multi-turn dialogues. These enriched datasets are used to fine-tune a small, efficient model suitable for deployment. Experiments conducted on multilingual dialogue datasets demonstrate significant improvements in classification accuracy and resource efficiency. Our methods enhance multi-turn intent classification accuracy by 5.09%, reduce annotation costs by 40%, and enable scalable deployment in low-resource multilingual industrial systems, highlighting their practicality and impact.

classification, large language model, machine learning, (14 more...)

2411.12307

Country:

Asia > Singapore (0.04)
Asia > Indonesia (0.04)
South America > Brazil (0.04)
(8 more...)

Genre: Research Report > Experimental Study (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Multimodal large language model for wheat breeding: a new exploration of smart breeding

Yang, Guofeng, Li, Yu, He, Yong, Zhou, Zhenjiang, Ye, Lingzhen, Fang, Hui, Luo, Yiqi, Feng, Xuping

UAV remote sensing technology has become a key technology in crop breeding, which can achieve high-throughput and non-destructive collection of crop phenotyping data. However, the multidisciplinary nature of breeding has brought technical barriers and efficiency challenges to knowledge mining. Therefore, it is important to develop a smart breeding goal tool to mine cross-domain multimodal data. Based on different pre-trained open-source multimodal large language models (MLLMs) (e.g., Qwen-VL, InternVL, Deepseek-VL), this study used supervised fine-tuning (SFT), retrieval-augmented generation (RAG), and reinforcement learning from human feedback (RLHF) technologies to inject cross-domain knowledge into MLLMs, thereby constructing multiple multimodal large language models for wheat breeding (WBLMs). The above WBLMs were evaluated using the newly created evaluation benchmark in this study. The results showed that the WBLM constructed using SFT, RAG and RLHF technologies and InternVL2-8B has leading performance. Then, subsequent experiments were conducted using the WBLM. Ablation experiments indicated that the combination of SFT, RAG, and RLHF technologies can improve the overall generation performance, enhance the generated quality, balance the timeliness and adaptability of the generated answer, and reduce hallucinations and biases. The WBLM performed best in wheat yield prediction using cross-domain data (remote sensing, phenotyping, weather, germplasm) simultaneously, with R2 and RMSE of 0.821 and 489.254 kg/ha, respectively. Furthermore, the WBLM can generate professional decision support answers for phenotyping estimation, environmental stress assessment, target germplasm screening, cultivation technique recommendation, and seed price query tasks.

large language model, machine learning, natural language, (20 more...)

2411.15203

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
North America > Canada > Saskatchewan (0.04)
(10 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine (1.00)
Food & Agriculture > Agriculture (1.00)
Energy (0.86)
Banking & Finance > Trading (0.67)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Kim, Kyoungmin, Hong, Kijae, Gulcehre, Caglar, Ailamaki, Anastasia

The Effect of Scheduling and Preemption on the Efficiency of LLM Inference Serving

The growing usage of Large Language Models (LLMs) highlights the demands and challenges in scalable LLM inference systems, affecting deployment and development processes. On the deployment side, there is a lack of comprehensive analysis on the conditions under which a particular scheduler performs better or worse, with performance varying substantially across different schedulers, hardware, models, and workloads. Manually testing each configuration on GPUs can be prohibitively expensive. On the development side, unpredictable performance and unknown upper limits can lead to inconclusive trial-and-error processes, consuming resources on ideas that end up ineffective. To address these challenges, we introduce INFERMAX, an analytical framework that uses inference cost models to compare various schedulers, including an optimal scheduler formulated as a constraint satisfaction problem (CSP) to establish an upper bound on performance. Our framework offers in-depth analysis and raises essential questions, challenging assumptions and exploring opportunities for more efficient scheduling. Notably, our findings indicate that preempting requests can reduce GPU costs by 30% compared to avoiding preemptions at all. We believe our methods and insights will facilitate the cost-effective deployment and development of scalable, efficient inference systems and pave the way for cost-based scheduling.

batch, scheduler, scheduling and preemption, (13 more...)

2411.07447

Country:

Europe > Austria > Vienna (0.14)
North America > United States > California > Santa Clara County > Santa Clara (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
(11 more...)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Constraint-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)

Manderscheid, Etienne, Lee, Matthias

Predicting Customer Satisfaction by Replicating the Survey Response Distribution

For many call centers, customer satisfaction (CSAT) is a key performance indicator (KPI). However, only a fraction of customers take the CSAT survey after the call, leading to a biased and inaccurate average CSAT value, and missed opportunities for coaching, follow-up, and rectification. Therefore, call centers can benefit from a model predicting customer satisfaction on calls where the customer did not complete the survey. Given that CSAT is a closely monitored KPI, it is critical to minimize any bias in the average predicted CSAT (pCSAT). In this paper, we introduce a method such that predicted CSAT (pCSAT) scores accurately replicate the distribution of survey CSAT responses for every call center with sufficient data in a live production environment. The method can be applied to many multiclass classification problems to improve the class balance and minimize its changes upon model updates.

call center, csat, threshold, (14 more...)

2411.12539

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
North America > Canada > Ontario > Toronto (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)

Genre:

Research Report (1.00)
Questionnaire & Opinion Survey (1.00)

Industry: Information Technology > Security & Privacy (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.69)

Zhang, Honghua, Wang, Benjie, Arenas, Marcelo, Broeck, Guy Van den

Restructuring Tractable Probabilistic Circuits

Probabilistic circuits (PCs) is a unifying representation for probabilistic models that support tractable inference. Numerous applications of PCs like controllable text generation depend on the ability to efficiently multiply two circuits. Existing multiplication algorithms require that the circuits respect the same structure, i.e. variable scopes decomposes according to the same vtree. In this work, we propose and study the task of restructuring structured(-decomposable) PCs, that is, transforming a structured PC such that it conforms to a target vtree. We propose a generic approach for this problem and show that it leads to novel polynomial-time algorithms for multiplying circuits respecting different vtrees, as well as a practical depth-reduction algorithm that preserves structured decomposibility. Our work opens up new avenues for tractable PC inference, suggesting the possibility of training with less restrictive PC structures while enabling efficient inference by changing their structures at inference time.

algorithm, node, van den broeck, (15 more...)

2411.12256

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
South America > Chile (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)
Africa > South Sudan > Equatoria > Central Equatoria > Juba (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.30)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.30)