AITopics

2407.15525

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
North America > Canada > Ontario > Toronto (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Nishikawa, Naoki, Suzuki, Taiji

State Space Models are Comparable to Transformers in Estimating Functions with Dynamic Smoothness

arXiv.org Machine Learning May-29-2024

Deep neural networks based on state space models (SSMs) are attracting much attention in sequence modeling since their computational cost is significantly smaller than that of Transformers. While the capabilities of SSMs have been primarily investigated through experimental comparisons, theoretical understanding of SSMs is still limited. In particular, there is a lack of statistical and quantitative evaluation of whether SSMs can replace Transformers. In this paper, we theoretically explore in which tasks SSMs can be alternatives to Transformers from the perspective of estimating sequence-to-sequence functions. We consider the setting where the target function has direction-dependent smoothness and prove that SSMs can estimate such functions with the same convergence rate as Transformers. Additionally, we prove that SSMs can estimate the target function as well as Transformers can, even if the smoothness changes depending on the input sequence. Our results show the possibility that SSMs can replace Transformers when estimating the functions in certain classes that appear in practice.

log 2, ssm, transformer, (16 more...)

2405.19036

Country: Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

arXiv.org Artificial Intelligence Feb-29-2024

Adaptive Testing Environment Generation for Connected and Automated Vehicles with Dense Reinforcement Learning

Yang, Jingxuan, Bai, Ruoxuan, Ji, Haoyuan, Zhang, Yi, Hu, Jianming, Feng, Shuo

The assessment of safety performance plays a pivotal role in the development and deployment of connected and automated vehicles (CAVs). A common approach involves designing testing scenarios based on prior knowledge of CAVs (e.g., surrogate models), conducting tests in these scenarios, and subsequently evaluating CAVs' safety performances. However, substantial differences between CAVs and the prior knowledge can significantly diminish the evaluation efficiency. In response to this issue, existing studies predominantly concentrate on the adaptive design of testing scenarios during the CAV testing process. Yet, these methods have limitations in their applicability to high-dimensional scenarios. To overcome this challenge, we develop an adaptive testing environment that bolsters evaluation robustness by incorporating multiple surrogate models and optimizing the combination coefficients of these surrogate models to enhance evaluation efficiency. We formulate the optimization problem as a regression task utilizing quadratic programming. To efficiently obtain the regression target via reinforcement learning, we propose the dense reinforcement learning method and devise a new adaptive policy with high sample efficiency. Essentially, our approach centers on learning the values of critical scenes displaying substantial surrogate-to-real gaps. The effectiveness of our method is validated in high-dimensional overtaking scenarios, demonstrating that our approach achieves notable evaluation efficiency.

combination coefficient, scenario, vehicle, (15 more...)

2402.19275

Country:

Asia > China > Beijing > Beijing (0.05)
North America > United States > Michigan (0.04)
Asia > China > Guangdong Province > Shenzhen (0.04)

Genre: Research Report (1.00)

Industry:

Transportation > Ground > Road (1.00)
Automobiles & Trucks (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Utkin, Lev V., Eremenko, Danila Y., Konstantinov, Andrei V.

SurvBeNIM: The Beran-Based Neural Importance Model for Explaining the Survival Models

arXiv.org Machine Learning Dec-11-2023

One of the important types of data in several applications is censored survival data processed in the framework of survival analysis [1, 2]. This type of data can be found in applications where objects are characterized by times to some events of interest, for example, by times to failure in reliability, times to recovery or times to death in medicine, times to bankruptcy of a bank or times to an economic crisis in economics. The important peculiarity of survival data is that the corresponding event does not necessarily occur during its observation period. In this case, the data are called censored or right-censored [3]. There are many machine learning models dealing with survival data, including models that apply and extend the Cox proportional hazard model [4], for example, the models presented in [5, 6]; models based on a survival modification of random forests, called random survival forests (RSF) [7, 8, 9, 10, 11]; and models extending neural networks [6, 12, 13, 14]. These models have gained considerable attention for their ability to analyze time-to-event data and to predict survival outcomes accurately. However, most models are perceived as black boxes, lacking interpretability.

artificial intelligence, machine learning, survbenim, (17 more...)

2312.06638

Country:

Asia > Russia (0.14)
North America > United States > New York (0.04)
North America > United States > New Jersey (0.04)

Genre:

Research Report > New Finding (0.67)
Research Report > Experimental Study (0.66)

Industry:

Health & Medicine > Therapeutic Area > Oncology (0.93)
Law > Civil Rights & Constitutional Law (0.74)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

Harel, Nimrod, Obolski, Uri, Gilad-Bachrach, Ran

Inherent Inconsistencies of Feature Importance

arXiv.org Artificial Intelligence Dec-5-2023

The rapid advancement and widespread adoption of machine learning-driven technologies have underscored the practical and ethical need for creating interpretable artificial intelligence systems. Feature importance, a method that assigns scores to the contribution of individual features on prediction outcomes, seeks to bridge this gap as a tool for enhancing human comprehension of these systems. Feature importance serves as an explanation of predictions in diverse contexts, whether by providing a global interpretation of a phenomenon across the entire dataset or by offering a localized explanation for the outcome of a specific data point. Furthermore, feature importance is being used both for explaining models and for identifying plausible causal relations in the data, independently from the model. However, it is worth noting that these various contexts have traditionally been explored in isolation, with limited theoretical foundations. This paper presents an axiomatic framework designed to establish coherent relationships among the different contexts of feature importance scores. Notably, our work unveils a surprising conclusion: when we combine the proposed properties with those previously outlined in the literature, we demonstrate the existence of an inconsistency. This inconsistency highlights that certain essential properties of feature importance scores cannot coexist harmoniously within a single framework.

feature importance score, importance score, value function, (15 more...)

2206.08204

Country:

Asia > Middle East > Israel > Tel Aviv District > Tel Aviv (0.05)
North America > United States > Hawaii > Honolulu County > Honolulu (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Health & Medicine > Therapeutic Area > Oncology (0.68)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Salaün, Corentin, Huang, Xingchang, Georgiev, Iliyan, Mitra, Niloy J., Singh, Gurprit

Efficient Gradient Estimation via Adaptive Sampling and Importance Sampling

arXiv.org Artificial Intelligence Nov-27-2023

Machine learning problems rely heavily on stochastic gradient descent (SGD) for optimization. The effectiveness of SGD is contingent upon accurately estimating gradients from a mini-batch of data samples. Instead of the commonly used uniform sampling, adaptive or importance sampling reduces noise in gradient estimation by forming mini-batches that prioritize crucial data points. Previous research has suggested that data points should be selected with probabilities proportional to their gradient norm. Nevertheless, existing algorithms have struggled to efficiently integrate importance sampling into machine learning frameworks. In this work, we make two contributions. First, we present an algorithm that can incorporate existing importance functions into our framework. Second, we propose a simplified importance function that relies solely on the loss gradient of the output layer. By leveraging our proposed gradient estimation techniques, we observe improved convergence in classification and regression tasks with minimal computational overhead. Stochastic gradient descent (SGD) combined with back-propagation and efficient gradient techniques--such as Adam [12]--has unlocked a realm of possibilities.
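The gradient-norm sampling rule mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' algorithm or their simplified importance function. It assumes per-sample gradients are already available as plain lists, and the name `importance_sampled_gradient` is hypothetical. Each index is drawn with probability proportional to its gradient norm and reweighted by 1 / (N * p_i) so the estimate of the mean gradient stays unbiased.

```python
import math
import random


def importance_sampled_gradient(per_sample_grads, batch_size, rng):
    """Unbiased mini-batch estimate of the mean gradient.

    Index i is drawn with probability p_i proportional to ||g_i||, and
    each draw is reweighted by 1 / (N * p_i) so the expectation equals
    the plain average of all per-sample gradients.
    """
    n = len(per_sample_grads)
    dim = len(per_sample_grads[0])
    norms = [math.sqrt(sum(c * c for c in g)) for g in per_sample_grads]
    total = sum(norms)
    if total == 0.0:
        # Every gradient is zero, so the mean gradient is zero as well.
        return [0.0] * dim
    probs = [v / total for v in norms]
    picks = rng.choices(range(n), weights=probs, k=batch_size)
    estimate = [0.0] * dim
    for i in picks:
        weight = 1.0 / (n * probs[i] * batch_size)
        for d in range(dim):
            estimate[d] += weight * per_sample_grads[i][d]
    return estimate


# Toy data (hypothetical): one sample dominates the gradient mass.
grads = [[0.1], [0.2], [4.0], [0.3]]
true_mean = sum(g[0] for g in grads) / len(grads)
estimate = importance_sampled_gradient(grads, batch_size=2, rng=random.Random(0))
```

In this one-dimensional toy case with same-sign gradients, every reweighted draw equals the true mean, so the estimator has zero variance. In realistic settings the cost of obtaining per-sample gradient norms is the practical obstacle, which is the integration difficulty the abstract refers to.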

algorithm, data sample, gradient, (15 more...)

2311.14468

Country:

South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.05)
Europe > Germany > Saarland > Saarbrücken (0.04)
Europe > United Kingdom > England > Greater London > London (0.04)
Europe > France > Hauts-de-France > Nord > Lille (0.04)

Genre: Research Report > New Finding (0.66)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.96)

#artificialintelligence Jun-18-2021, 18:55:28 GMT

Machine Learning Explainability

One simple method is Permutation Feature Importance, a model inspection technique that can be used for any fitted estimator when the data is tabular. The permutation feature importance is defined as the decrease in a model score when the values of a single feature are randomly shuffled. This procedure breaks the relationship between the feature and the target, so the drop in the model score indicates how much the model depends on the feature. When features are correlated, a good practice is to drop one of them based on domain understanding and then apply the Permutation Feature Importance algorithm, which gives a clearer picture of each remaining feature. Let's discuss another method to interpret the black box models.
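The shuffling procedure described above can be sketched in a few lines of plain Python. This is an illustrative sketch rather than any library's implementation. The helper names (`permutation_importance`, `neg_mse`, `toy_predict`) and the toy data are assumptions made for the example, and the "model" is a fixed function standing in for a fitted estimator.

```python
import random


def permutation_importance(predict, X, y, score, n_repeats=10, seed=0):
    """Mean drop in score when each column of X is shuffled in turn."""
    rng = random.Random(seed)
    baseline = score(y, [predict(row) for row in X])
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle only column j, leaving the other features intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            permuted = score(y, [predict(row) for row in X_perm])
            drops.append(baseline - permuted)
        importances.append(sum(drops) / n_repeats)
    return importances


def neg_mse(y_true, y_pred):
    """Negative mean squared error, so that higher is better."""
    return -sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true)


def toy_predict(row):
    """Stand-in for a fitted model: it uses feature 0 and ignores feature 1."""
    return 3.0 * row[0]


data_rng = random.Random(1)
X = [[data_rng.gauss(0, 1), data_rng.gauss(0, 1)] for _ in range(200)]
y = [3.0 * row[0] for row in X]
importances = permutation_importance(toy_predict, X, y, neg_mse)
```

Here `importances[0]` is positive because shuffling feature 0 degrades the score, while `importances[1]` is zero because the toy model never reads feature 1.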

algorithm, feature importance, machine learning explainability, (14 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.93)

Vogel, Robin, Achab, Mastane, Clémençon, Stéphan, Tillier, Charles

Weighted Empirical Risk Minimization: Sample Selection Bias Correction based on Importance Sampling

arXiv.org Machine Learning Feb-19-2020

ABSTRACT We consider statistical learning problems in which the distribution $P'$ of the training observations $Z'_1, \ldots, Z'_n$ differs from the distribution $P$ involved in the risk one seeks to minimize (referred to as the test distribution) but is still defined on the same measurable space as $P$ and dominates it. In the unrealistic case where the likelihood ratio $\Phi(z) = dP/dP'(z)$ is known, one may straightforwardly extend the Empirical Risk Minimization (ERM) approach to this specific transfer learning setup using the same idea as that behind Importance Sampling, by minimizing a weighted version of the empirical risk functional computed from the 'biased' training data $Z'_i$ with weights $\Phi(Z'_i)$. Although the importance function $\Phi(z)$ is generally unknown in practice, we show that, in various situations frequently encountered in practice, it takes a simple form and can be directly estimated from the $Z'_i$'s and some auxiliary information on the statistical population $P$. By means of linearization techniques, we then prove that the generalization capacity of the aforementioned approach is preserved when plugging the resulting estimates of the $\Phi(Z'_i)$'s into the weighted empirical risk. Beyond these theoretical guarantees, numerical results provide strong empirical evidence of the relevance of the approach promoted in this article. Keywords: Statistical Learning Theory, Importance Sampling, Transfer Learning. 1 Introduction Prediction problems are of major importance in statistical learning. The main paradigm of predictive learning is Empirical Risk Minimization (ERM in abbreviated form), see e.g. In the standard setup, $Z$ is a random variable (r.v. in short) that takes its values in a feature space $\mathcal{Z}$ with distribution $P$, $\Theta$ is a parameter space and $\ell : \Theta \times \mathcal{Z} \to \mathbb{R}$ is a (measurable) loss function. The risk is then defined by: $\forall \theta \in \Theta,\ R_P(\theta) = \mathbb{E}_P[\ell(\theta, Z)]$, (1) and more generally for any measure $Q$ on $\mathcal{Z}$: $R_Q(\theta) = \int_{\mathcal{Z}} \ell(\theta, z)\, dQ(z)$.
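The weighting idea in this abstract can be illustrated with a small sketch. It is not the paper's estimator. It assumes a stratified setting in which the population stratum proportions are known as auxiliary information, estimates the training proportions from the sample, and uses a squared loss so that the weighted empirical risk minimizer has a closed form. All names and numbers are hypothetical.

```python
from collections import Counter


def stratum_weights(train_strata, population_props):
    """Estimated likelihood ratio per observation.

    Computed as the population proportion of the observation's stratum
    divided by the empirical proportion of that stratum in the training
    sample.
    """
    n = len(train_strata)
    counts = Counter(train_strata)
    return [population_props[s] / (counts[s] / n) for s in train_strata]


def weighted_erm_mean(values, weights):
    """Minimizer over theta of sum_i w_i * (theta - z_i) ** 2."""
    return sum(w * z for w, z in zip(weights, values)) / sum(weights)


# Biased training sample: stratum "A" is over-represented (3 of 4 points),
# while the target population is assumed to be split evenly.
values = [1.0, 1.0, 1.0, 5.0]
strata = ["A", "A", "A", "B"]
population_props = {"A": 0.5, "B": 0.5}

weights = stratum_weights(strata, population_props)
theta_weighted = weighted_erm_mean(values, weights)
theta_plain = weighted_erm_mean(values, [1.0] * len(values))
```

The unweighted minimizer is the biased sample mean 2.0, whereas the reweighted one recovers 3.0, the mean under the assumed population proportions.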

dataset, information, strata, (14 more...)

2002.05145

Country: Europe > France (0.04)

Genre: Research Report (0.50)

Industry: Education (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Cheng, Jian, Druzdzel, Marek J.

Confidence Inference in Bayesian Networks

arXiv.org Artificial Intelligence Jan-10-2013

We present two sampling algorithms for probabilistic confidence inference in Bayesian networks. These two algorithms (which we call the AIS-BN-mu and AIS-BN-sigma algorithms) guarantee that estimates of posterior probabilities lie, with a given probability, within a desired precision bound. Our algorithms are based on recent advances in sampling algorithms for (1) estimating the mean of bounded random variables and (2) adaptive importance sampling in Bayesian networks. In addition to providing a simple stopping rule for sampling, the AIS-BN-mu and AIS-BN-sigma algorithms are capable of guiding the learning process in the AIS-BN algorithm. An empirical evaluation of the proposed algorithms shows excellent performance, even for very unlikely evidence.

artificial intelligence, bayesian inference, machine learning, (17 more...)

1301.226

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > California > San Mateo County > San Mateo (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Yuan, Changhe, Druzdzel, Marek J.

An Importance Sampling Algorithm Based on Evidence Pre-propagation

arXiv.org Artificial Intelligence Oct-19-2012

The precision achieved by stochastic sampling algorithms for Bayesian networks typically deteriorates in the face of extremely unlikely evidence. To address this problem, we propose the Evidence Pre-propagation Importance Sampling algorithm (EPIS-BN), an importance sampling algorithm that computes an approximate importance function using two heuristic methods: loopy belief propagation and e-cutoff. We tested the performance of e-cutoff on three large real Bayesian networks: ANDES, CPCS, and PATHFINDER. We observed that on each of these networks the EPIS-BN algorithm gives a considerable improvement over the current state-of-the-art algorithm, the AIS-BN algorithm. In addition, it avoids the costly learning stage of the AIS-BN algorithm.

artificial intelligence, bayesian inference, machine learning, (18 more...)

1212.2507

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > New York > New York County > New York City (0.05)
North America > United States > California > San Mateo County > San Mateo (0.05)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)