AITopics

2409.07379

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > California > Santa Clara County > Palo Alto (0.04)
(2 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Government > Regional Government (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

arXiv.org Artificial IntelligenceSep-11-2024

Dividable Configuration Performance Learning

Gong, Jingzhi, Chen, Tao, Bahsoon, Rami

Machine/deep learning models have been widely adopted for predicting the configuration performance of software systems. However, a crucial yet unaddressed challenge is how to cater for the sparsity inherited from the configuration landscape: the influence of configuration options (features) and the distribution of data samples are highly sparse. In this paper, we propose a model-agnostic and sparsity-robust framework for predicting configuration performance, dubbed DaL, based on the new paradigm of dividable learning that builds a model via "divide-and-learn". To handle sample sparsity, the samples from the configuration landscape are divided into distant divisions, for each of which we build a sparse local model, e.g., regularized Hierarchical Interaction Neural Network, to deal with the feature sparsity. A newly given configuration would then be assigned to the right model of division for the final prediction. Further, DaL adaptively determines the optimal number of divisions required for a system and sample size without any extra training or profiling. Experiment results from 12 real-world systems and five sets of training data reveal that, compared with the state-of-the-art approaches, DaL performs no worse than the best counterpart on 44 out of 60 cases with up to 1.61x improvement on accuracy; requires fewer samples to reach the same/better accuracy; and producing acceptable training overhead. In particular, the mechanism that adapted the parameter d can reach the optimal value for 76.43% of the individual runs. The result also confirms that the paradigm of dividable learning is more suitable than other similar paradigms such as ensemble learning for predicting configuration performance. Practically, DaL considerably improves different global models when using them as the underlying local models, which further strengthens its flexibility.

configuration, dal, local model, (16 more...)

2409.07629

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Massachusetts > Hampshire County > Amherst (0.14)
Europe > United Kingdom > England > Greater London > London (0.14)
(37 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Education (0.45)
Information Technology (0.45)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (1.00)
(2 more...)

Chen, Youguang, Wen, Zheyu, Biros, George

A Scalable Algorithm for Active Learning

arXiv.org Machine LearningSep-11-2024

FIRAL is a recently proposed deterministic active learning algorithm for multiclass classification using logistic regression. It was shown to outperform the state-of-the-art in terms of accuracy and robustness and comes with theoretical performance guarantees. However, its scalability suffers when dealing with datasets featuring a large number of points $n$, dimensions $d$, and classes $c$, due to its $\mathcal{O}(c^2d^2+nc^2d)$ storage and $\mathcal{O}(c^3(nd^2 + bd^3 + bn))$ computational complexity where $b$ is the number of points to select in active learning. To address these challenges, we propose an approximate algorithm with storage requirements reduced to $\mathcal{O}(n(d+c) + cd^2)$ and a computational complexity of $\mathcal{O}(bncd^2)$. Additionally, we present a parallel implementation on GPUs. We demonstrate the accuracy and scalability of our approach using MNIST, CIFAR-10, Caltech101, and ImageNet. The accuracy tests reveal no deterioration in accuracy compared to FIRAL. We report strong and weak scaling tests on up to 12 GPUs, for three million point synthetic dataset.

complexity, matrix, ound step, (16 more...)

2409.07392

Country:

North America > United States > Texas > Travis County > Austin (0.04)
North America > United States > Georgia > Fulton County > Atlanta (0.04)
North America > Canada > Ontario > Toronto (0.04)

Genre: Research Report > New Finding (0.34)

Industry:

Energy (0.46)
Government > Regional Government (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.66)

Wu, Kejin, Politis, Dimitris N.

Deep Limit Model-free Prediction in Regression

arXiv.org Artificial IntelligenceSep-11-2024

In this paper, we provide a novel Model-free approach based on Deep Neural Network (DNN) to accomplish point prediction and prediction interval under a general regression setting. Usually, people rely on parametric or non-parametric models to bridge dependent and independent variables (Y and X). However, this classical method relies heavily on the correct model specification. Even for the non-parametric approach, some additive form is often assumed. A newly proposed Model-free prediction principle sheds light on a prediction procedure without any model assumption. Previous work regarding this principle has shown better performance than other standard alternatives. Recently, DNN, one of the machine learning methods, has received increasing attention due to its great performance in practice. Guided by the Model-free prediction idea, we attempt to apply a fully connected forward DNN to map X and some appropriate reference random variable Z to Y. The targeted DNN is trained by minimizing a specially designed loss function so that the randomness of Y conditional on X is outsourced to Z through the trained DNN. Our method is more stable and accurate compared to other DNN-based counterparts, especially for optimal point predictions. With a specific prediction procedure, our prediction interval can capture the estimation variability so that it can render a better coverage rate for finite sample cases. The superior performance of our method is verified by simulation and empirical studies.

point prediction, prediction, probability, (16 more...)

2408.09532

Country: North America > United States > California > San Diego County > San Diego (0.04)

Genre: Research Report (1.00)

Industry: Consumer Products & Services > Food, Beverage, Tobacco & Cannabis > Beverages (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Scheuerer, Michael, Heinrich-Mertsching, Claudio, Bahaga, Titike K., Gudoshava, Masilin, Thorarinsdottir, Thordis L.

Applications of machine learning to predict seasonal precipitation for East Africa

arXiv.org Machine LearningSep-10-2024

Seasonal climate forecasts are commonly based on model runs from fully coupled forecasting systems that use Earth system models to represent interactions between the atmosphere, ocean, land and other Earth-system components. Recently, machine learning (ML) methods are increasingly being investigated for this task where large-scale climate variability is linked to local or regional temperature or precipitation in a linear or non-linear fashion. This paper investigates the use of interpretable ML methods to predict seasonal precipitation for East Africa in an operational setting. Dimension reduction is performed by decomposing the precipitation fields via empirical orthogonal functions (EOFs), such that only the respective factor loadings need to the predicted. Indices of large-scale climate variability--including the rate of change in individual indices as well as interactions between different indices--are then used as potential features to obtain tercile forecasts from an interpretable ML algorithm. Several research questions regarding the use of data and the effect of model complexity are studied. The results are compared against the ECMWF seasonal forecasting system (SEAS5) for three seasons--MAM, JJAS and OND--over the period 1993-2020. Compared to climatology for the same period, the ECMWF forecasts have negative skill in MAM and JJAS and significant positive skill in OND. The ML approach is on par with climatology in MAM and JJAS and a significantly positive skill in OND, if not quite at the level of the OND ECMWF forecast.

forecast, precipitation, predictor, (17 more...)

2409.06238

Country:

Africa > East Africa (0.70)
Indian Ocean (0.05)
North America > United States (0.04)
(20 more...)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study > Negative Result (0.34)

Industry:

Energy (0.46)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Gregório, Fabio, Castro, Rafaela, Belloze, Kele, Lopes, Rui Pedro, Bezerra, Eduardo

GLARE: Guided LexRank for Advanced Retrieval in Legal Analysis

arXiv.org Artificial IntelligenceSep-10-2024

The Brazilian Constitution, known as the Citizen's Charter, provides mechanisms for citizens to petition the Judiciary, including the so-called special appeal. This specific type of appeal aims to standardize the legal interpretation of Brazilian legislation in cases where the decision contradicts federal laws. The handling of special appeals is a daily task in the Judiciary, regularly presenting significant demands in its courts. We propose a new method called GLARE, based on unsupervised machine learning, to help the legal analyst classify a special appeal on a topic from a list made available by the National Court of Brazil (STJ). As part of this method, we propose a modification of the graph-based LexRank algorithm, which we call Guided LexRank. This algorithm generates the summary of a special appeal. The degree of similarity between the generated summary and different topics is evaluated using the BM25 algorithm. As a result, the method presents a ranking of themes most appropriate to the analyzed special appeal. The proposed method does not require prior labeling of the text to be evaluated and eliminates the need for large volumes of data to train a model. We evaluate the effectiveness of the method by applying it to a special appeal corpus previously classified by human experts.

2409.15348

Country:

North America > Canada (0.14)
Asia > Singapore (0.04)
South America > Brazil > Rio de Janeiro > Rio de Janeiro (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Industry:

Law > Statutes (0.88)
Law > Government & the Courts (0.75)

Technology:

Information Technology > Information Management (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(3 more...)

Cas, Davide Dei, Di Camillo, Barbara, Fadini, Gian Paolo, Sparacino, Giovanni, Longato, Enrico

Effect of Clinical History on Predictive Model Performance for Renal Complications of Diabetes

arXiv.org Artificial IntelligenceSep-10-2024

Diabetes is a chronic disease characterised by a high risk of developing diabetic nephropathy, which, in turn, is the leading cause of end-stage chronic kidney disease. The early identification of individuals at heightened risk of such complications or their exacerbation can be of paramount importance to set a correct course of treatment. In the present work, from the data collected in the DARWIN-Renal (DApagliflozin Real-World evIdeNce-Renal) study, a nationwide multicentre retrospective real-world study, we develop an array of logistic regression models to predict, over different prediction horizons, the crossing of clinically relevant glomerular filtration rate (eGFR) thresholds for patients with diabetes by means of variables associated with demographic, anthropometric, laboratory, pathology, and therapeutic data. In doing so, we investigate the impact of information coming from patient's past visits on the model's predictive performance, coupled with an analysis of feature importance through the Boruta algorithm. Our models yield very good performance (AUROC as high as 0.98). We also show that the introduction of information from patient's past visits leads to improved model performance of up to 4%. The usefulness of past information is further corroborated by a feature importance analysis.

diabetes, information, prediction horizon, (12 more...)

2409.13743

Country:

Europe > Italy (0.05)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)

Genre: Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nephrology (1.00)
Health & Medicine > Therapeutic Area > Endocrinology > Diabetes (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.56)

arXiv.org Machine LearningSep-9-2024

Enhancing Preference-based Linear Bandits via Human Response Time

Li, Shen, Zhang, Yuyang, Ren, Zhaolin, Liang, Claire, Li, Na, Shah, Julie A.

Binary human choice feedback is widely used in interactive preference learning for its simplicity, but it provides limited information about preference strength. To overcome this limitation, we leverage human response times, which inversely correlate with preference strength, as complementary information. Our work integrates the EZ-diffusion model, which jointly models human choices and response times, into preference-based linear bandits. We introduce a computationally efficient utility estimator that reformulates the utility estimation problem using both choices and response times as a linear regression problem. Theoretical and empirical comparisons with traditional choice-only estimators reveal that for queries with strong preferences ("easy" queries), choices alone provide limited information, while response times offer valuable complementary information about preference strength. As a result, incorporating response times makes easy queries more useful. We demonstrate this advantage in the fixed-budget best-arm identification problem, with simulations based on three real-world datasets, consistently showing accelerated learning when response times are incorporated.

artificial intelligence, machine learning, simulation of human behavior, (20 more...)

2409.05798

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (1.00)

Industry:

Energy > Oil & Gas > Upstream (0.46)
Retail (0.46)
Health & Medicine (0.46)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science > Simulation of Human Behavior (0.66)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Banerjee, Shuvayan, Srivastava, Radhendushka, Saunderson, James, Rajwade, Ajit

Robust Non-adaptive Group Testing under Errors in Group Membership Specifications

arXiv.org Machine LearningSep-9-2024

Given $p$ samples, each of which may or may not be defective, group testing (GT) aims to determine their defect status by performing tests on $n < p$ `groups', where a group is formed by mixing a subset of the $p$ samples. Assuming that the number of defective samples is very small compared to $p$, GT algorithms have provided excellent recovery of the status of all $p$ samples with even a small number of groups. Most existing methods, however, assume that the group memberships are accurately specified. This assumption may not always be true in all applications, due to various resource constraints. Such errors could occur, eg, when a technician, preparing the groups in a laboratory, unknowingly mixes together an incorrect subset of samples as compared to what was specified. We develop a new GT method, the Debiased Robust Lasso Test Method (DRLT), that handles such group membership specification errors. The proposed DRLT method is based on an approach to debias, or reduce the inherent bias in, estimates produced by Lasso, a popular and effective sparse regression technique. We also provide theoretical upper bounds on the reconstruction error produced by our estimator. Our approach is then combined with two carefully designed hypothesis tests respectively for (i) the identification of defective samples in the presence of errors in group membership specifications, and (ii) the identification of groups with erroneous membership specifications. The DRLT approach extends the literature on bias mitigation of statistical estimators such as the LASSO, to handle the important case when some of the measurements contain outliers, due to factors such as group membership specification errors. We present numerical results which show that our approach outperforms several baselines and robust regression techniques for identification of defective samples as well as erroneously specified groups.

artificial intelligence, machine learning, matrix, (18 more...)

2409.05345

Country:

Europe > Switzerland > Basel-City > Basel (0.04)
North America > United States (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report (0.81)

Industry: Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.86)

Douwes, Constance, Serizel, Romain

Normalizing Energy Consumption for Hardware-Independent Evaluation

arXiv.org Artificial IntelligenceSep-9-2024

The increasing use of machine learning (ML) models in signal processing has raised concerns about their environmental impact, particularly during resource-intensive training phases. In this study, we present a novel methodology for normalizing energy consumption across different hardware platforms to facilitate fair and consistent comparisons. We evaluate different normalization strategies by measuring the energy used to train different ML architectures on different GPUs, focusing on audio tagging tasks. Our approach shows that the number of reference points, the type of regression and the inclusion of computational metrics significantly influences the normalization process. We find that the appropriate selection of two reference points provides robust normalization, while incorporating the number of floating-point operations and parameters improves the accuracy of energy consumption predictions. By supporting more accurate energy consumption evaluation, our methodology promotes the development of environmentally sustainable ML practices.

energy consumption, reference point, regression, (14 more...)

2409.05602

Country:

Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France > Grand Est > Meurthe-et-Moselle > Nancy (0.04)

Genre: Research Report > New Finding (0.66)

Industry: Energy (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.32)