AITopics

2408.17053

Country:

Asia > China > Fujian Province > Xiamen (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report > Experimental Study (0.94)

Industry: Health & Medicine (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Satheesh, Athul Rasheeda, Knippertz, Peter, Fink, Andreas H.

Machine learning models for daily rainfall forecasting in Northern Tropical Africa using tropical wave predictors

Numerical weather prediction (NWP) models often underperform compared to simpler climatology-based precipitation forecasts in northern tropical Africa, even after statistical postprocessing. AI-based forecasting models show promise but have avoided precipitation due to its complexity. Synoptic-scale forcings like African easterly waves and other tropical waves (TWs) are important for predictability in tropical Africa, yet their value for predicting daily rainfall remains unexplored. This study uses two machine-learning models, gamma regression and a convolutional neural network (CNN), trained on TW predictors from satellite-based GPM IMERG data to predict daily rainfall during the July-September monsoon season. Predictor variables are derived from the local amplitude and phase information of seven TWs from the target and the upstream and downstream neighboring grids at 1-degree spatial resolution. The ML models are combined with Easy Uncertainty Quantification (EasyUQ) to generate calibrated probabilistic forecasts and are compared with three benchmarks: Extended Probabilistic Climatology (EPC15), ECMWF operational ensemble forecast (ENS), and a probabilistic forecast from the ENS control member using EasyUQ (CTRL EasyUQ). The study finds that downstream predictor variables offer the highest predictability, with downstream tropical depression (TD)-type wave-based predictors being most important. Other waves like mixed-Rossby gravity (MRG), Kelvin, and inertio-gravity waves also contribute significantly but show regional preferences. ENS forecasts exhibit poor skill due to miscalibration. CTRL EasyUQ shows improvement over ENS and marginal enhancement over EPC15. Both gamma regression and CNN forecasts significantly outperform benchmarks in tropical Africa. This study highlights the potential of ML models trained on TW-based predictors to improve daily precipitation forecasts in tropical Africa.

forecast, grid point, predictor, (15 more...)

2408.16349

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Africa > Central Africa (0.05)
Africa > Niger > Niamey > Niamey (0.05)

Genre:

Research Report > New Finding (0.46)
Research Report > Experimental Study (0.46)

Industry: Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Pascu, Octavian, Oneata, Dan, Cucu, Horia, Müller, Nicolas M.

Easy, Interpretable, Effective: openSMILE for voice deepfake detection

In this paper, we demonstrate that attacks in the latest ASVspoof5 dataset, a de facto standard in the field of voice authenticity and deepfake detection, can be identified with surprising accuracy using a small subset of very simplistic features. These are derived from the openSMILE library, and are scalar-valued, easy to compute, and human interpretable. For example, attack A10's unvoiced segments have a mean length of 0.09 ± 0.02, while bona fide instances have a mean length of 0.18 ± 0.07. Using this feature alone, a threshold classifier achieves an Equal Error Rate (EER) of 10.3% for attack A10. Similarly, across all attacks, we achieve up to 0.8% EER, with an overall EER of 15.7 ± 6.0%. We explore the generalization capabilities of these features and find that some of them transfer effectively between attacks, primarily when the attacks originate from similar Text-to-Speech (TTS) architectures. This finding may indicate that voice anti-spoofing is, in part, a problem of identifying and remembering signatures or fingerprints of individual TTS systems. This makes it possible to better understand anti-spoofing models and their challenges in real-world applications.
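As a rough illustration of how a single scalar feature can yield an EER, the sketch below sweeps a threshold over the observed values and reports the point where the false-acceptance and false-rejection rates are closest. It is not the authors' evaluation code: the function name `equal_error_rate`, the decision rule "value below threshold means spoof" (chosen because the abstract reports shorter unvoiced segments for attack A10), and the toy numbers are all assumptions made for illustration.

```python
def equal_error_rate(bona_fide, spoof):
    """Approximate EER of the rule "value < threshold => spoof".

    bona_fide, spoof: lists of scalar feature values (hypothetical data).
    Returns (eer, threshold) at the candidate threshold where the
    false-acceptance and false-rejection rates are closest.
    """
    best = None
    for t in sorted(set(bona_fide) | set(spoof)):
        # False rejection: a bona fide sample is flagged as spoof.
        frr = sum(x < t for x in bona_fide) / len(bona_fide)
        # False acceptance: a spoofed sample is accepted as bona fide.
        far = sum(x >= t for x in spoof) / len(spoof)
        gap = abs(far - frr)
        if best is None or gap < best[0]:
            best = (gap, (far + frr) / 2, t)
    return best[1], best[2]


# Toy values only; they are not taken from ASVspoof5.
eer, threshold = equal_error_rate([1, 2, 3, 4], [0, 0.5, 1.5, 2.5])
print(f"EER = {eer:.2f} at threshold {threshold}")
```

With overlapping toy distributions like these, the rule misclassifies one sample in four on each side, which is what an EER of 25% means.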

detection, opensmile, proc, (17 more...)

2408.15775

Country:

Europe > Romania > București - Ilfov Development Region > Municipality of Bucharest > Bucharest (0.05)
Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)

Genre: Research Report (1.00)

Industry: Information Technology > Security & Privacy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.87)
Information Technology > Artificial Intelligence > Issues > Social & Ethical Issues (0.62)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.34)

Dasgupta, Subhasis, Sen, Jaydip

A Comparative Study of Hyperparameter Tuning Methods

The study emphasizes the challenge of finding the optimal trade-off between bias and variance, especially as hyperparameter optimization increases in complexity. Through empirical analysis, three hyperparameter tuning algorithms, namely Tree-structured Parzen Estimator (TPE), Genetic Search, and Random Search, are evaluated across regression and classification tasks. The results show that nonlinear models, with properly tuned hyperparameters, significantly outperform linear models. Interestingly, Random Search excelled in regression tasks, while TPE was more effective for classification tasks. This suggests that there is no one-size-fits-all solution, as different algorithms perform better depending on the task and model type. The findings underscore the importance of selecting the appropriate tuning method and highlight the computational challenges involved in optimizing machine learning models, particularly as search spaces expand.
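Of the three tuning methods named above, Random Search is the simplest to sketch: sample each hyperparameter independently from its range, evaluate, and keep the best configuration. The code below is a minimal sketch under stated assumptions, not the study's setup: `random_search`, the continuous uniform ranges, and the toy quadratic objective (standing in for a validation loss) are hypothetical.

```python
import random


def random_search(objective, space, n_trials=50, seed=0):
    """Minimise `objective` by sampling each hyperparameter uniformly.

    space: dict mapping a hyperparameter name to a (low, high) range.
    Returns (best_params, best_score).
    """
    rng = random.Random(seed)  # seeded so that runs are reproducible
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = {name: rng.uniform(lo, hi) for name, (lo, hi) in space.items()}
        score = objective(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


# Hypothetical stand-in for a validation loss with its minimum at x = 3.
def toy_loss(params):
    return (params["x"] - 3.0) ** 2


best_params, best_score = random_search(toy_loss, {"x": (0.0, 6.0)}, n_trials=200)
print(best_params, best_score)
```

The cost grows linearly with `n_trials`, which is one reason the search budget matters more as the search space expands.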

algorithm, dataset, hyperparameter, (16 more...)

2408.16425

Country:

South America > Paraguay > Asunción > Asunción (0.04)
Oceania > Guam (0.04)
North America > United States > California > Orange County > Irvine (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Search (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Evolutionary Systems (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Performance Analysis > Accuracy (0.68)

Wang, Mingwei, Liu, Sitong

Machine Learning-Based Research on the Adaptability of Adolescents to Online Education

With the rapid advancement of internet technology, the adaptability of adolescents to online learning has emerged as a focal point of interest within the educational sphere. However, the academic community's efforts to develop predictive models for adolescent online learning adaptability require further refinement and expansion. Utilizing data from the "Chinese Adolescent Online Education Survey" spanning the years 2014 to 2016, this study implements five machine learning algorithms - logistic regression, K-nearest neighbors, random forest, XGBoost, and CatBoost - to analyze the factors influencing adolescent online learning adaptability and to determine the model best suited for prediction. The research reveals that the duration of courses, the financial status of the family, and age are the primary factors affecting students' adaptability in online learning environments. Additionally, age significantly impacts students' adaptive capacities. Among the predictive models, the random forest, XGBoost, and CatBoost algorithms demonstrate superior forecasting capabilities, with the random forest model being particularly adept at capturing the characteristics of students' adaptability.

adaptability, online education, student, (12 more...)

2408.16849

Country:

North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
Asia > China > Hubei Province > Wuhan (0.04)

Genre:

Research Report > New Finding (0.51)
Research Report > Experimental Study (0.36)

Industry:

Education > Educational Setting > Online (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (0.68)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Ensemble Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Nearest Neighbor Methods (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.50)

Rodriguez, Charlotte, Degioanni, Laura, Kameni, Laetitia, Vidal, Richard, Neglia, Giovanni

Evaluating the Energy Consumption of Machine Learning: Systematic Literature Review and Experiments

arXiv.org Artificial Intelligence, Aug-27-2024

Monitoring, understanding, and optimizing the energy consumption of Machine Learning (ML) are among the reasons why it is necessary to evaluate the energy usage of ML. However, there exists no universal tool that can answer this question for all use cases, and there may even be disagreement on how to evaluate energy consumption for a specific use case. Tools and methods are based on different approaches, each with their own advantages and drawbacks, and they need to be mapped out and explained in order to select the most suitable one for a given situation. We address this challenge through two approaches. First, we conduct a systematic literature review of all tools and methods that make it possible to evaluate the energy consumption of ML (both at training and at inference), irrespective of whether they were originally designed for machine learning or general software. Second, we develop and use an experimental protocol to compare a selection of these tools and methods. The comparison is both qualitative and quantitative on a range of ML tasks of different nature (vision, language) and computational complexity. The systematic literature review serves as a comprehensive guide for understanding the array of tools and methods used in evaluating energy consumption of ML, for various use cases going from basic energy monitoring to consumption optimization. Two open-source repositories are provided for further exploration. The first one contains tools that can be used to replicate this work or extend the current review. The second repository houses the experimental protocol, allowing users to augment the protocol with new ML computing tasks and additional energy evaluation tools.

document title, energy consumption, estimation model, (14 more...)

2408.15128

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.68)

Technology:

Information Technology > Software Engineering (1.00)
Information Technology > Software (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.93)

arXiv.org Artificial Intelligence, Aug-27-2024

MiWaves Reinforcement Learning Algorithm

Ghosh, Susobhan, Guo, Yongyi, Hung, Pei-Yao, Coughlin, Lara, Bonar, Erin, Nahum-Shani, Inbal, Walton, Maureen, Murphy, Susan

The escalating prevalence of cannabis use poses a significant public health challenge globally. In the U.S., cannabis use is more prevalent among emerging adults (EAs) (ages 18-25) than any other age group, with legalization in multiple states contributing to a public perception that cannabis is less risky than in prior decades. To address this growing concern, we developed MiWaves, a reinforcement learning (RL) algorithm designed to optimize the delivery of personalized intervention prompts to reduce cannabis use among EAs. MiWaves leverages domain expertise and prior data to tailor the likelihood of delivery of intervention messages. This paper presents a comprehensive overview of the algorithm's design, including key decisions and experimental outcomes. The finalized MiWaves RL algorithm was deployed in a clinical trial from March to May 2024.

algorithm, cannabis, participant, (16 more...)

2408.15076

Country:

North America > United States > Michigan (0.04)
North America > United States > Wisconsin > Dane County > Madison (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)
Overview (1.00)

Industry: Health & Medicine > Therapeutic Area > Psychiatry/Psychology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Wakayama, Tomoya, Sugasawa, Shonosuke

Ensemble Prediction via Covariate-dependent Stacking

arXiv.org Machine Learning, Aug-27-2024

This study proposes a novel approach to ensemble prediction, called "covariate-dependent stacking" (CDST). Unlike traditional stacking methods, CDST allows model weights to vary flexibly as a function of covariates, thereby enhancing predictive performance in complex scenarios. We formulate the covariate-dependent weights through combinations of basis functions, estimate them by optimizing cross-validation, and develop an expectation-maximization algorithm, ensuring computational efficiency. To analyze the theoretical properties, we establish an oracle inequality regarding the expected loss to be minimized for estimating model weights. Through comprehensive simulation studies and an application to large-scale land price prediction, we demonstrate that the CDST consistently outperforms conventional model averaging methods, particularly on datasets where some models fail to capture the underlying complexity. Our findings suggest that the CDST is especially valuable for, but not limited to, spatio-temporal prediction problems, offering a powerful tool for researchers and practitioners in various data analysis fields.
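The core idea, weights that depend on the covariate through a combination of basis functions, can be illustrated with a prediction-only sketch. Everything below is an assumption made for illustration: the name `cdst_predict`, the two constant base models, the linear basis, and the hand-set coefficients. The estimation step described in the abstract (cross-validation and expectation-maximization) is not shown.

```python
def cdst_predict(x, models, basis, coef):
    """Combine base-model predictions with covariate-dependent weights.

    models: list of callables, each mapping x to a prediction.
    basis:  list of callables, each mapping x to a basis value.
    coef:   coef[j][k] is the coefficient of basis function k in the
            weight of model j (assumed given; not estimated here).
    """
    phi = [b(x) for b in basis]
    # Weight of model j at x: a linear combination of basis functions.
    weights = [sum(c * p for c, p in zip(row, phi)) for row in coef]
    return sum(w * m(x) for w, m in zip(weights, models))


# Hypothetical base models and basis; weights are (1 - x) and x.
models = [lambda x: 1.0, lambda x: 3.0]
basis = [lambda x: 1.0, lambda x: x]
coef = [[1.0, -1.0], [0.0, 1.0]]

for x in (0.0, 0.5, 1.0):
    print(x, cdst_predict(x, models, basis, coef))
```

In this toy setting the ensemble relies entirely on the first model at x = 0 and shifts to the second as x approaches 1, which conventional stacking with constant weights cannot express.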

covariate, prediction, regression, (17 more...)

2408.09755

Country:

Europe > Austria > Vienna (0.14)
Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (1.00)

Industry: Banking & Finance (0.46)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.68)

Kim, Taehyeong, Lee, Hyungu, Choi, Hayoung

Improved identification of breakpoints in piecewise regression and its applications

arXiv.org Machine Learning, Aug-27-2024

Identifying breakpoints in piecewise regression is critical in enhancing the reliability and interpretability of data fitting. In this paper, we propose novel algorithms based on the greedy algorithm to accurately and efficiently identify breakpoints in piecewise polynomial regression. The algorithm updates the breakpoints to minimize the error by exploring the neighborhood of each breakpoint. It converges quickly and is stable in finding optimal breakpoints. Moreover, it can determine the optimal number of breakpoints. The computational results for real and synthetic data show that its accuracy is better than that of existing methods. The real-world datasets demonstrate that breakpoints found by the proposed algorithm provide valuable information about the data.
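A greedy neighborhood update of this kind can be sketched as follows. This is not the authors' algorithm: it assumes piecewise linear (degree-1) fits, independent segments that need not join continuously, breakpoints stored as sample indices, a neighborhood of one index in either direction, and a fixed number of breakpoints. The helper names `line_sse`, `total_sse`, and `greedy_breakpoints` are hypothetical.

```python
def line_sse(xs, ys):
    """Sum of squared errors of the least-squares line through (xs, ys)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return sum((y - my - slope * (x - mx)) ** 2 for x, y in zip(xs, ys))


def total_sse(xs, ys, breaks):
    """Error of fitting one line per segment; breaks are sorted split indices."""
    edges = [0] + list(breaks) + [len(xs)]
    return sum(line_sse(xs[a:b], ys[a:b]) for a, b in zip(edges, edges[1:]))


def greedy_breakpoints(xs, ys, breaks, min_len=2):
    """Shift each breakpoint by one index while the total error decreases."""
    breaks = list(breaks)
    best = total_sse(xs, ys, breaks)
    improved = True
    while improved:
        improved = False
        for i in range(len(breaks)):
            for step in (-1, 1):
                trial = breaks[:]
                trial[i] += step
                edges = [0] + trial + [len(xs)]
                # Skip moves that would leave a segment too short to fit.
                if any(b - a < min_len for a, b in zip(edges, edges[1:])):
                    continue
                err = total_sse(xs, ys, trial)
                if err < best - 1e-12:
                    breaks, best, improved = trial, err, True
    return breaks, best


# Synthetic tent-shaped data: rising until x = 10, falling afterwards.
xs = list(range(20))
ys = [x if x < 10 else 20 - x for x in xs]
print(greedy_breakpoints(xs, ys, [5]))
```

Because each accepted move strictly lowers the error, the loop terminates, but like any local search it can stop at a local minimum on less well-behaved data.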

breakpoint, piecewise regression, regression, (13 more...)

2408.13751

Country:

Asia > South Korea > Daegu > Daegu (0.04)
North America > United States > New York (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (1.00)

Industry:

Energy (0.68)
Banking & Finance > Trading (0.46)
Health & Medicine > Therapeutic Area (0.30)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.71)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.70)

arXiv.org Artificial Intelligence, Aug-27-2024

Time Series Analysis for Education: Methods, Applications, and Future Directions

Mao, Shengzhong, Zhang, Chaoli, Song, Yichi, Wang, Jindong, Zeng, Xiao-Jun, Xu, Zenglin, Wen, Qingsong

Recent advancements in the collection and analysis of sequential educational data have brought time series analysis to a pivotal position in educational research, highlighting its essential role in facilitating data-driven decision-making. However, there is a lack of comprehensive summaries that consolidate these advancements. To the best of our knowledge, this paper is the first to provide a comprehensive review of time series analysis techniques specifically within the educational context. We begin by exploring the landscape of educational data analytics, categorizing various data sources and types relevant to education. We then review four prominent time series methods (forecasting, classification, clustering, and anomaly detection), illustrating their specific application points in educational settings. Subsequently, we present a range of educational scenarios and applications, focusing on how these methods are employed to address diverse educational tasks, which highlights the practical integration of multiple time series methods to solve complex educational problems. Finally, we conclude with a discussion on future directions, including personalized learning analytics, multimodal data fusion, and the role of large language models (LLMs) in educational time series. The contributions of this paper include a detailed taxonomy of educational data, a synthesis of time series techniques with specific educational applications, and a forward-looking perspective on emerging trends and future research opportunities in educational analysis. The related papers and resources are available and regularly updated at the project page.

application, prediction, student, (16 more...)

2408.1396

Country:

North America > Trinidad and Tobago > Trinidad > Arima > Arima (0.04)
South America > Uruguay > Maldonado > Maldonado (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)

Genre:

Research Report (1.00)
Overview (1.00)
Instructional Material > Online (1.00)
Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Information Technology (1.00)
Health & Medicine > Therapeutic Area (1.00)
Education > Educational Technology > Educational Software > Computer Based Training (1.00)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)