AITopics | correlated

Supplementary File for "Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes"

Neural Information Processing Systems | Feb-7-2026, 16:55:32 GMT

The supplementary file is organized as follows: Section 1 restates the assumptions and main theorems on the convergence of parameter iterates and the full gradient; Section 2 is devoted to the proofs of the two main theorems, while Section 3 includes the proofs of supporting lemmas; Section 4 includes additional figures from the numerical study. Under Assumptions 1.1 to 1.3, when $m > C$ for some constant $C > 0$, we have the following results under two corresponding conditions on $s_l(m)$: … First we present the following lemma, showing that the loss function has a property similar to strong convexity. For the first case discussed in Lemma 2.1, define $\tilde{g}(\theta^{(k)}) = (g(\theta^{(k)}))_2$, and for the second case define $\tilde{g}(\theta^{(k)}) = g(\theta^{(k)})$. Therefore, combining Lemma 2.1, Lemma 2.2 and (7) leads to the following conclusion. Apply (15) in Lemma 2.3 with … = 12; then for any $0 < \alpha < 1$, with probability at least $1 - 2m^{-\alpha}$, we have … In this case, we can still apply (15) in Lemma 2.3.
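
For reference, the excerpt compares the lemma's property to strong convexity. The exact inequality proved in the supplementary file is not reproduced here, so the block below shows only the textbook definition: a differentiable function $f$ is $\mu$-strongly convex if the following holds for all $x, y$ in its domain. The symbols $f$, $x$, $y$ and $\mu$ are generic and are not taken from the paper.

```latex
\[
f(y) \;\ge\; f(x) + \nabla f(x)^{\top}(y - x) + \frac{\mu}{2}\,\lVert y - x \rVert_2^2,
\qquad \mu > 0 .
\]
```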

artificial intelligence, machine learning, probability, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes

Neural Information Processing Systems | Dec-23-2025, 20:02:10 GMT

Stochastic gradient descent (SGD) and its variants have established themselves as the go-to algorithms for large-scale machine learning problems with independent samples, owing to their generalization performance and intrinsic computational advantage. However, with correlated samples the stochastic gradient is a biased estimator of the full gradient; this has led to a lack of theoretical understanding of how SGD behaves in correlated settings and has hindered its use in such cases. In this paper, we focus on the Gaussian process (GP) and take a step towards breaking this barrier by proving that minibatch SGD converges to a critical point of the full loss function and recovers model hyperparameters at rate $O(\frac{1}{K})$, up to a statistical error term depending on the minibatch size. Numerical studies on both simulated and real datasets demonstrate that minibatch SGD generalizes better than state-of-the-art GP methods while reducing the computational burden and opening a new, previously unexplored data size regime for GPs.
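
As a rough illustration of the setting the abstract describes, the sketch below runs minibatch SGD on the negative log marginal likelihood of a GP, updating only the signal variance and the noise variance. The RBF kernel, the fixed lengthscale, the $1/k$ step-size schedule, the positivity floor, and all function names (`rbf_kernel`, `minibatch_loss`, `minibatch_grad`, `sgd_gp`) are assumptions made for illustration; this is not the authors' implementation.

```python
import numpy as np


def rbf_kernel(X, lengthscale=1.0):
    """Unit-variance RBF kernel matrix for the rows of X (shape n x d)."""
    sq = np.sum(X**2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    return np.exp(-0.5 * d2 / lengthscale**2)


def minibatch_loss(theta, Xb, yb, lengthscale=1.0):
    """Per-sample negative log marginal likelihood on one minibatch (up to a constant).

    theta = (signal variance, noise variance); the lengthscale is held fixed.
    """
    m = len(yb)
    K = theta[0] * rbf_kernel(Xb, lengthscale) + theta[1] * np.eye(m)
    _, logdet = np.linalg.slogdet(K)
    return 0.5 * (yb @ np.linalg.solve(K, yb) + logdet) / m


def minibatch_grad(theta, Xb, yb, lengthscale=1.0):
    """Gradient of minibatch_loss with respect to theta."""
    m = len(yb)
    Kf = rbf_kernel(Xb, lengthscale)
    K = theta[0] * Kf + theta[1] * np.eye(m)
    K_inv = np.linalg.inv(K)
    a = K_inv @ yb
    # d/dtheta_j of 0.5 * (y^T K^-1 y + log det K)
    #   = 0.5 * (tr(K^-1 dK_j) - a^T dK_j a), with dK = Kf (signal) or I (noise).
    g_signal = 0.5 * (np.sum(K_inv * Kf) - a @ Kf @ a) / m
    g_noise = 0.5 * (np.trace(K_inv) - a @ a) / m
    return np.array([g_signal, g_noise])


def sgd_gp(X, y, m=32, n_iters=500, step0=0.5, theta0=(1.0, 1.0), floor=1e-3, seed=0):
    """Minibatch SGD over (signal variance, noise variance).

    Assumed choices: step size step0 / k and clipping at `floor` to keep
    both variances positive.
    """
    rng = np.random.default_rng(seed)
    theta = np.array(theta0, dtype=float)
    for k in range(1, n_iters + 1):
        idx = rng.choice(len(y), size=m, replace=False)  # sample one minibatch
        g = minibatch_grad(theta, X[idx], y[idx])
        theta = np.maximum(theta - (step0 / k) * g, floor)
    return theta
```

Calling `sgd_gp(X, y, m=32)` on an `(n, d)` input array and a length-`n` target vector returns the final pair of variance estimates. Each iteration only factorizes an `m x m` kernel matrix, which is where the computational saving over full-batch training comes from.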

gaussian process, name change, stochastic gradient descent, (4 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.91)

Supplementary File for "Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes"

Neural Information Processing Systems | Oct-2-2025, 08:53:06 GMT

The supplementary file is organized as follows: Section 1 restates the assumptions and main theorems on the convergence of parameter iterates and the full gradient; Section 2 is devoted to the proofs of the two main theorems, while Section 3 includes the proofs of supporting lemmas; Section 4 includes additional figures from the numerical study. Under Assumptions 1.1 to 1.3, when $m > C$ for some constant $C > 0$, we have the following results under two corresponding conditions on $s_l(m)$: … First we present the following lemma, showing that the loss function has a property similar to strong convexity. For the first case discussed in Lemma 2.1, define … Therefore, combining Lemma 2.1, Lemma 2.2 and (7) leads to the following conclusion. Proof of Theorem 2. We start from bounding … In this case, we can still apply (15) in Lemma 2.3. The following proof of this claim is very similar to the proof of Lemma 5.2 in [2].

artificial intelligence, machine learning, probability, (17 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.86)

Review for NeurIPS paper: Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes

Neural Information Processing Systems | Jan-22-2025, 07:57:06 GMT

Additional Feedback: I'd like to see main paper Figure 1 / supplementary Figure 4.1 expanded. The two questions that I don't think the figure currently answers are (1) how does the variance in the final $\sigma^{2}_{f}$ across trials compare to a full batch GP, and (2) if full batch GPs have smaller variance, do much larger batch sizes (e.g., $m = 1000$) decrease this variance further? In Figure 4.1, the variance does not seem to decrease much from $m = 16$ to $m = 64$; it would be nice to know whether the batch size is the source of the variance. If it is, then running with very large batch sizes, even up to $m = 10000$, may not be too challenging. On the point of running large batch sizes: while SGD will clearly outperform full batch training at some size $N$ (at a guess, probably somewhere in the $N$ = 100k-500k range), I don't think the results in Table 1 are necessarily representative of the settings in which you might actually want to run sgGP or EGP.

gaussian process, neurips paper, stochastic gradient descent, (7 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Review for NeurIPS paper: Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes

Neural Information Processing Systems | Jan-22-2025, 07:56:59 GMT

All the reviewers agree that the paper presents a worthwhile theoretical contribution, which may facilitate and motivate further work on more challenging problems. The main limitation of the work is its practical impact, as the proposed analysis does not apply to the lengthscales. Although R3 stands by their comments, they expressed their willingness to accept and, during discussions, recognized this work as an excellent attempt at the problem. Overall, I believe the NeurIPS community will benefit from this work, and I recommend that the authors take the reviewers' suggestions and comments into consideration.

gaussian process, neurips paper, stochastic gradient descent, (2 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.85)

Are Classification Robustness and Explanation Robustness Really Strongly Correlated? An Analysis Through Input Loss Landscape

Chen, Tiejin, Huang, Wenwang, Pang, Linsey, Luo, Dongsheng, Wei, Hua

arXiv.org Artificial Intelligence | Mar-9-2024

This paper delves into the critical area of deep learning robustness, challenging the conventional belief that classification robustness and explanation robustness in image classification systems are inherently correlated. Through a novel evaluation approach that leverages clustering for efficient assessment of explanation robustness, we demonstrate that enhancing explanation robustness does not necessarily flatten the input loss landscape with respect to explanation loss, in contrast to classification, where a flatter loss landscape indicates better robustness. To investigate this contradiction in depth, we propose a training method designed to adjust the loss landscape with respect to explanation loss. With the new training method, we find that although such adjustments can affect the robustness of explanations, they do not influence the robustness of classification. These findings not only challenge the prevailing assumption of a strong correlation between the two forms of robustness but also pave new pathways for understanding the relationship between the loss landscape and explanation loss.
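
One simple way to make "flatness of the input loss landscape" concrete is to average how much a loss rises under random input perturbations of a fixed norm. The sketch below does this for an arbitrary scalar loss. The function name `input_flatness`, the random-direction sampling, and the toy quadratic losses in the usage lines are illustrative assumptions, not the evaluation procedure used in the paper.

```python
import numpy as np


def input_flatness(loss_fn, x, radius=0.1, n_dirs=64, seed=0):
    """Mean loss increase over random input perturbations of norm `radius`.

    Smaller values mean the loss is flatter around x. `loss_fn` maps an
    input array to a scalar (e.g. a classification or explanation loss).
    """
    rng = np.random.default_rng(seed)
    base = loss_fn(x)
    # Sample directions uniformly on the unit sphere in input space.
    dirs = rng.standard_normal((n_dirs, x.size))
    dirs /= np.linalg.norm(dirs, axis=1, keepdims=True)
    increases = [loss_fn(x + radius * d.reshape(x.shape)) - base for d in dirs]
    return float(np.mean(increases))


# Usage with two toy losses: a sharp and a flat quadratic bowl around x0.
x0 = np.zeros(5)
sharp = input_flatness(lambda x: 10.0 * np.sum(x**2), x0)
flat = input_flatness(lambda x: 0.1 * np.sum(x**2), x0)
```

Evaluating the same measure once with a classification loss and once with an explanation loss on the same inputs would give two flatness scores that can be compared, which is the kind of comparison the abstract refers to.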

classification robustness, explanation robustness, robustness, (11 more...)

2403.06013

Country: North America > United States > Arizona (0.04)

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.89)
