AITopics

2405.01994

Country:

Europe > United Kingdom > England > Oxfordshire > Oxford (0.27)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
South America > Brazil > São Paulo (0.04)

Genre:

Research Report > Strength High (1.00)
Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Health & Medicine > Therapeutic Area > Nutrition and Weight Loss (1.00)
Health & Medicine > Therapeutic Area > Internal Medicine (1.00)
Health & Medicine > Therapeutic Area > Gastroenterology (1.00)

Technology:

Information Technology > Data Science > Data Mining > Big Data (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

arXiv.org Machine Learning May-3-2024

A comparative study of conformal prediction methods for valid uncertainty quantification in machine learning

Dewolf, Nicolas

In the past decades, most work in data analysis and machine learning focused on optimizing predictive models and obtaining better results than existing models allowed. Whether the metrics used to measure such improvements accurately captured the intended goal, whether the numerical differences in the resulting values were significant, and whether uncertainty played a role and should have been taken into account were questions of secondary importance. Whereas probability theory, be it frequentist or Bayesian, used to be the gold standard in science before the advent of the supercomputer, it was quickly abandoned in favor of black-box models and sheer computing power because of their ability to handle large data sets. This evolution sadly happened at the expense of interpretability and trustworthiness. However, while people are still trying to improve the predictive power of their models, the community is starting to realize that for many applications it is not so much the exact prediction that matters, but rather the variability or uncertainty. The work in this dissertation tries to further the quest for a world where everyone is aware of uncertainty, of how important it is, and of how to embrace it instead of fearing it. A specific, though general, framework that allows anyone to obtain accurate uncertainty estimates is singled out and analysed. Certain aspects and applications of the framework, dubbed 'conformal prediction', are studied in detail. Whereas many approaches to uncertainty quantification make strong assumptions about the data, conformal prediction is, at the time of writing, the only framework that deserves the title 'distribution-free'. No parametric assumptions have to be made, and the nonparametric results hold without resorting to the law of large numbers in the asymptotic regime.
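To make the 'distribution-free' claim concrete, the following is a minimal sketch of split (inductive) conformal prediction for regression, one common variant of the framework. It is not taken from the dissertation: the function names, the absolute-residual nonconformity score, and the toy model `predict` are all assumptions chosen for illustration. Under exchangeability of calibration and test points, the resulting interval covers the true value with probability at least 1 - alpha for any finite calibration size.

```python
import math
import random


def conformal_quantile(scores, alpha):
    """Return the ceil((n + 1) * (1 - alpha))-th smallest calibration score."""
    n = len(scores)
    rank = math.ceil((n + 1) * (1 - alpha))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return sorted(scores)[rank - 1]


def split_conformal_interval(predict, calib_x, calib_y, x_new, alpha=0.1):
    """Prediction interval for x_new built from held-out calibration data."""
    # Nonconformity score (an assumption here): absolute residual.
    scores = [abs(y - predict(x)) for x, y in zip(calib_x, calib_y)]
    q = conformal_quantile(scores, alpha)
    center = predict(x_new)
    return center - q, center + q


# Toy usage with a hypothetical, already-fitted model.
rng = random.Random(0)
predict = lambda x: 2.0 * x
calib_x = [rng.uniform(0.0, 1.0) for _ in range(200)]
calib_y = [2.0 * x + rng.gauss(0.0, 0.1) for x in calib_x]
lo, hi = split_conformal_interval(predict, calib_x, calib_y, 0.5, alpha=0.1)
print(f"90% prediction interval at x=0.5: [{lo:.3f}, {hi:.3f}]")
```

No assumption about the noise distribution enters the construction; only the ranks of the calibration scores are used.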

clusterwise average prediction, clusterwise representation complexity, conditional nonconformity distribution

2405.02082

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.13)
Asia > Middle East > Jordan (0.04)
Europe > Belgium > Flanders > East Flanders > Ghent (0.04)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)

Industry:

Transportation (1.00)
Health & Medicine (1.00)
Education > Educational Setting (1.00)

Technology:

Information Technology > Modeling & Simulation (1.00)
Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications (1.00)

Godichon-Baggioni, Antoine, Lu, Wei, Portier, Bruno

A Full Adagrad algorithm with O(Nd) operations

arXiv.org Machine Learning May-3-2024

A novel approach is given to overcome the computational challenges of the full-matrix Adaptive Gradient algorithm (Full AdaGrad) in stochastic optimization. By developing a recursive method that estimates the inverse of the square root of the covariance of the gradient, alongside a streaming variant for parameter updates, the study offers efficient and practical algorithms for large-scale applications. This innovative strategy significantly reduces the complexity and resource demands typically associated with full-matrix methods, enabling more effective optimization processes. Moreover, the convergence rates of the proposed estimators and their asymptotic efficiency are given. Their effectiveness is demonstrated through numerical studies.

algorithm, full adagrad algorithm, operation

2405.01908

Country:

Europe > France > Normandy > Seine-Maritime > Rouen (0.04)
North America > United States > Colorado (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Li, Weizhen, Carvalho, Rui

Automating the Discovery of Partial Differential Equations in Dynamical Systems

arXiv.org Machine Learning May-2-2024

In recent years, scientists have increasingly employed statistical and machine learning methods to uncover the governing equations of dynamical systems, particularly differential equations, from observational data [1-5]. Data-driven methods offer several advantages over traditional approaches that rely on first principles and expert knowledge. These methods can reveal patterns and relationships in the data that may not be apparent from first principles, providing new insights into complex systems [6, 7]. They are also adept at working with noisy or incomplete data commonly encountered in real-world applications, employing techniques from machine learning to enhance the robustness of discoveries [8-11]. Furthermore, by reducing the need for manual intervention and domain expertise, data-driven methods can significantly streamline the discovery process [12]. Data-driven discovery in dynamical systems has evolved from early parameter estimation using spline approximation and system reconstruction [13, 14], to leveraging statistical methods such as least squares [15-17], mixed-effects models [18, 19], and Bayesian approaches [2, 20] for parameter estimation in ordinary and partial differential equations (ODEs and PDEs). The advent of high-performance computing has further propelled symbolic regression, enabling the discovery of governing equations from data in physics and engineering [1, 21-23]. A notable development in this field is the Sparse Identification of Nonlinear Dynamics (SINDy) approach [3, 4], which constructs an extensive library of potential terms and employs the Sequential Threshold Ridge Regression (STRidge) algorithm [4] to select significant terms iteratively.
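The select-by-thresholding idea behind SINDy can be illustrated with a small sketch of sequential thresholded ridge regression. This is a simplified illustration, not the STRidge algorithm exactly as given in [4]: the names `ridge_solve` and `stridge`, the default values of `lam` and `threshold`, and the three-term toy library are assumptions, and refinements such as column normalization or tolerance search are omitted.

```python
import random


def ridge_solve(X, y, lam):
    """Solve (X^T X + lam * I) w = X^T y by Gaussian elimination."""
    p = len(X[0])
    A = [[sum(r[i] * r[j] for r in X) + (lam if i == j else 0.0)
          for j in range(p)] for i in range(p)]
    b = [sum(r[i] * t for r, t in zip(X, y)) for i in range(p)]
    for col in range(p):
        # Partial pivoting for numerical stability.
        piv = max(range(col, p), key=lambda r: abs(A[r][col]))
        A[col], A[piv] = A[piv], A[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, p):
            f = A[r][col] / A[col][col]
            for c in range(col, p):
                A[r][c] -= f * A[col][c]
            b[r] -= f * b[col]
    w = [0.0] * p
    for i in reversed(range(p)):
        w[i] = (b[i] - sum(A[i][j] * w[j] for j in range(i + 1, p))) / A[i][i]
    return w


def stridge(X, y, lam=1e-3, threshold=0.1, max_iter=10):
    """Alternate ridge fits with pruning of small coefficients."""
    p = len(X[0])
    active = list(range(p))
    w = [0.0] * p
    for _ in range(max_iter):
        # Fit only the library terms that are still active.
        sub = [[row[j] for j in active] for row in X]
        w = [0.0] * p
        for j, wj in zip(active, ridge_solve(sub, y, lam)):
            w[j] = wj
        kept = [j for j in active if abs(w[j]) >= threshold]
        if len(kept) == len(active):
            break  # nothing pruned: the support has converged
        for j in active:
            if j not in kept:
                w[j] = 0.0
        active = kept
        if not active:
            break
    return w


# Toy library with three candidate terms; only the first two are in the true model.
rng = random.Random(0)
X = [[rng.uniform(-1.0, 1.0) for _ in range(3)] for _ in range(100)]
y = [0.5 * a - 1.5 * b + rng.gauss(0.0, 0.01) for a, b, _ in X]
coeffs = stridge(X, y)
print(coeffs)
```

In an actual equation-discovery setting the columns of `X` would be candidate terms (for example derivatives and their products) evaluated on the data, and `y` would be the time derivative.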

artificial intelligence, deep learning, machine learning

2404.16444

Country:

Europe > United Kingdom > England > Durham > Durham (0.04)
North America > United States > New York > New York County > New York City (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Iran > Tehran Province > Tehran (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.46)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (0.46)
Health & Medicine > Therapeutic Area > Immunology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Ma, Kevin, Grandi, Daniele, McComb, Christopher, Goucher-Lambert, Kosa

Exploring the Capabilities of Large Language Models for Generating Diverse Design Solutions

arXiv.org Artificial Intelligence May-2-2024

Access to large amounts of diverse design solutions can support designers during the early stage of the design process. In this paper, we explore the efficacy of large language models (LLMs) in producing diverse design solutions, investigating the impact that parameter tuning and various prompt engineering techniques can have on the diversity of LLM-generated design solutions. Specifically, LLMs are used to generate a total of 4,000 design solutions across five distinct design topics, eight combinations of parameters, and eight different types of prompt engineering techniques, and each combination of parameter and prompt engineering method is compared across four different diversity metrics. LLM-generated solutions are compared against 100 human-crowdsourced solutions in each design topic using the same set of diversity metrics. Results indicate that human-generated solutions consistently have greater diversity scores across all design topics. Using a post hoc logistic regression analysis, we investigate whether these differences primarily exist at the semantic level. Results show that there is a divide between human-generated and LLM-generated solutions in some design topics, while others have no clear divide. Taken together, these results contribute to the understanding of LLMs' capabilities in generating a large volume of diverse design solutions and offer insights for future research that leverages LLMs to generate diverse design solutions for a broad range of design tasks (e.g., inspirational stimuli).

design solution, diversity, llm

2405.02345

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)

Wang, Han, Kawasaki, Eiji, Damblin, Guillaume, Daniel, Geoffrey

Multivariate Bayesian Last Layer for Regression: Uncertainty Quantification and Disentanglement

arXiv.org Machine Learning May-2-2024

We present new Bayesian Last Layer models in the setting of multivariate regression under heteroscedastic noise, and propose an optimization algorithm for parameter learning. Bayesian Last Layer combines Bayesian modelling of the predictive distribution with neural networks for parameterization of the prior, and has the attractive property of uncertainty quantification with a single forward pass. The proposed framework is capable of disentangling the aleatoric and epistemic uncertainty, and can be used to transfer a canonically trained deep neural network to new data domains with uncertainty-aware capability.

epistemic uncertainty, matrix, xx 1, (13 more...)

2405.01761

Country:

Europe > France (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)

Simplifying Kinematic Parameter Estimation in sEMG Prosthetic Hands: A Two-Point Approach

Liu, Gang, Wang, Zhenxiang, He, Ziyang, Guo, Shanshan, Zhang, Rui, Yao, Dezhong

Regression-based sEMG prosthetic hands are widely used for their ability to provide continuous kinematic parameters. However, establishing these models traditionally requires complex kinematic sensor systems to collect kinematic data in synchronization with EMG, which is cumbersome and user-unfriendly. This paper presents a simplified approach that uses only two data points to represent kinematic parameters. Finger flexion is recorded as 1, extension as -1, and a near-linear model is employed to interpolate intermediate values, offering a viable alternative to measured kinematic data. We validated the approach with twenty participants through offline analysis and online experiments. The offline analysis confirmed the model's capability to fill in intermediate points, and the online experiments demonstrated that participants could control gestures and adjust force accurately. This study significantly reduces the complexity of collecting dynamic parameters in EMG-based regression prosthetics, thus enhancing the usability of prosthetic hands.

engineering, experiment, semg signal

2407.00014

Country:

Asia > China > Henan Province > Zhengzhou (0.05)
Asia > China > Shanghai > Shanghai (0.05)
Asia > China > Sichuan Province > Chengdu (0.04)

Genre: Research Report > New Finding (1.00)

Industry:

Health & Medicine > Health Care Technology (1.00)
Health & Medicine > Therapeutic Area > Neurology (0.94)
Health & Medicine > Therapeutic Area > Orthopedics/Orthopedic Surgery (0.90)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.46)

Hu, Lunjia, Peale, Charlotte, Shen, Judy Hanwen

Multigroup Robustness

To address the shortcomings of real-world datasets, robust learning algorithms have been designed to overcome arbitrary and indiscriminate data corruption. However, practical processes of gathering data may lead to patterns of data corruption that are localized to specific partitions of the training dataset. Motivated by critical applications where the learned model is deployed to make predictions about people from a rich collection of overlapping subpopulations, we initiate the study of multigroup robust algorithms whose robustness guarantees for each subpopulation only degrade with the amount of data corruption inside that subpopulation. When the data corruption is not distributed uniformly over subpopulations, our algorithms provide more meaningful robustness guarantees than standard guarantees that are oblivious to how the data corruption and the affected subpopulations are related. Our techniques establish a new connection between multigroup fairness and robustness.

algorithm, predictor, robustness

2405.00614

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Louisiana > Orleans Parish > New Orleans (0.04)
Europe > Germany (0.04)

Genre: Research Report > New Finding (0.47)

Industry: Government > Regional Government > North America Government > United States Government (0.45)

Technology:

Information Technology > Data Science (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.69)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Condor, Aubrey, Pardos, Zachary

Explainable Automatic Grading with Neural Additive Models

The use of automatic short answer grading (ASAG) models may help alleviate the time burden of grading while encouraging educators to frequently incorporate open-ended items in their curriculum. However, current state-of-the-art ASAG models are large neural networks (NN) often described as "black box", providing no explanation for which characteristics of an input are important for the produced output. This inexplicable nature can be frustrating to teachers and students when trying to interpret or learn from an automatically generated grade. To create a powerful yet intelligible ASAG model, we experiment with a type of model called a Neural Additive Model (NAM) that combines the performance of an NN with the explainability of an additive model. We use a Knowledge Integration (KI) framework from the learning sciences to guide feature engineering to create inputs that reflect whether a student includes certain ideas in their response. We hypothesize that indicating the inclusion (or exclusion) of predefined ideas as features will be sufficient for the NAM to have good predictive power and interpretability, as this may guide a human scorer using a KI rubric. We compare the performance of the NAM with that of another explainable model, logistic regression, using the same features, and with that of a non-explainable neural model, DeBERTa, that does not require feature engineering.

explainable automatic grading, nam, shape function

2405.00489

Country:

North America > United States > California > Alameda County > Berkeley (0.14)
Europe > United Kingdom > England > Greater London > London (0.04)
Asia > Middle East > Iran > Hamadan Province > Hamadan (0.04)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry:

Education > Curriculum > Subject-Specific Education (0.48)
Education > Educational Technology > Educational Software (0.46)
Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.34)

Zhang, Xunzheng, Moazzeni, Shadi, Parra-Ullauri, Juan Marcelo, Nejabati, Reza, Simeonidou, Dimitra

Federated Transfer Component Analysis Towards Effective VNF Profiling

Growing concerns about knowledge transfer and data privacy challenge the traditional gather-and-analyse paradigm in networks. Specifically, the intelligent orchestration of Virtual Network Functions (VNFs) requires understanding and profiling their resource consumption. However, profiling all kinds of VNFs is time-consuming. It is therefore important to consider transferring knowledge from well-profiled VNFs to insufficiently profiled VNF types while keeping data private. To this end, this paper proposes a Federated Transfer Component Analysis (FTCA) method between the source and target VNFs. FTCA first trains Generative Adversarial Networks (GANs) on the source VNF profiling data, and the trained GAN model is sent to the target VNF domain. Then, FTCA realizes federated domain adaptation by using the generated source VNF data and a smaller amount of target VNF profiling data, while keeping the raw data local. Experiments show that the proposed FTCA can effectively predict the required resources for the target VNF. Specifically, the RMSE index of the regression model decreases by 38.5% and the R-squared metric advances up to 68.6%.
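For readers less familiar with the two regression metrics quoted above, the sketch below computes RMSE and R-squared from their standard definitions. The helper names and the toy numbers are illustrative assumptions; they are not data or code from the paper.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: sqrt(mean((y - y_hat)^2))."""
    return math.sqrt(
        sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    )


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical resource measurements and two sets of predictions.
y_true = [1.0, 2.0, 3.0, 4.0]
baseline = [2.5, 2.5, 2.5, 2.5]   # always predicts the mean
improved = [1.0, 2.0, 3.0, 4.0]   # perfect predictions

print(rmse(y_true, baseline), r_squared(y_true, baseline))
print(rmse(y_true, improved), r_squared(y_true, improved))
```

A lower RMSE and a higher R-squared both indicate a better fit; a predictor that always outputs the mean of the targets has an R-squared of zero.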

ftca, target vnf, vnf

2404.17553

Country:

Europe > United Kingdom (0.04)
Asia > China (0.04)

Genre: Research Report (0.64)

Industry: Information Technology > Security & Privacy (0.86)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (0.35)