AITopics | krf

krf

Contents of the Appendix

Neural Information Processing Systems, Apr-30-2026, 05:08:35 GMT

A.1 CIFAR-10 dataset. Figure 6 displays test accuracy curves for all six backbone algorithms under three distinct imbalance parameters in $\{0.3, 1, 10\}$. The results show that FedNAR outperforms the baselines, particularly in scenarios with imbalanced data. A.2 Shakespeare dataset. Figures 7 and 8 show the results of experiments on the Shakespeare dataset. Six backbone algorithms were used, with initial weight decay values selected from $\{10^{-3}, 10^{-4}\}$. These findings indicate that FedNAR, as an adaptive weight decay scheduling algorithm, is effective across various initial weight decay values.

algorithm, artificial intelligence, rfi, (16 more...)

Technology: Information Technology > Artificial Intelligence (0.48)

A single gradient step finds adversarial examples on random two-layers neural networks

Neural Information Processing Systems, Apr-25-2026, 22:45:01 GMT

Daniely and Schacham [2020] recently showed that gradient descent finds adversarial examples on random undercomplete two-layers ReLU neural networks. The term "undercomplete" refers to the fact that their proof only holds when the number of neurons is a vanishing fraction of the ambient dimension. We extend their result to the overcomplete case, where the number of neurons is larger than the dimension (yet also subexponential in the dimension). In fact we prove that a single step of gradient descent suffices. We also show this result for any subexponential width random neural network with smooth activation function.
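The claim that one gradient step can flip the output sign can be illustrated numerically. The sketch below is not the paper's construction: the scaling (rows of `W` drawn from N(0, I/d), output weights in {-1, +1}, input on the sphere of radius sqrt(d)), the width `k = 2d`, and the step size chosen from a first-order estimate with a hypothetical `overshoot` factor are all assumptions made for illustration.

```python
import numpy as np


def network(x, W, a):
    """Two-layer ReLU network f(x) = (1/sqrt(k)) * sum_i a_i * relu(w_i . x)."""
    return a @ np.maximum(W @ x, 0.0) / np.sqrt(a.size)


def input_gradient(x, W, a):
    """Gradient of f with respect to x (valid when no pre-activation is exactly 0)."""
    active = (W @ x > 0.0).astype(float)
    return (a * active) @ W / np.sqrt(a.size)


def single_step(x, W, a, overshoot=2.0):
    """Take one gradient step that pushes f(x) toward the opposite sign.

    The step size comes from the linearization f(x - eta*s*g) ~ f(x) - eta*s*|g|^2;
    `overshoot` is a hypothetical safety factor, not a value from the paper.
    """
    f0 = network(x, W, a)
    g = input_gradient(x, W, a)
    eta = overshoot * abs(f0) / (g @ g)
    return x - eta * np.sign(f0) * g


def run_demo(d=200, k=400, seed=0):
    """Return f(x), f(x_adv), and the relative size of the perturbation."""
    rng = np.random.default_rng(seed)
    W = rng.normal(scale=1.0 / np.sqrt(d), size=(k, d))  # random hidden weights
    a = rng.choice([-1.0, 1.0], size=k)                  # random output signs
    x = rng.normal(size=d)
    x *= np.sqrt(d) / np.linalg.norm(x)                  # place x on the sphere of radius sqrt(d)
    x_adv = single_step(x, W, a)
    rel = np.linalg.norm(x_adv - x) / np.linalg.norm(x)
    return network(x, W, a), network(x_adv, W, a), rel


if __name__ == "__main__":
    f_before, f_after, rel = run_demo()
    print(f"f(x) = {f_before:.3f}, f(x_adv) = {f_after:.3f}, relative perturbation = {rel:.3f}")
```

Here `k > d` corresponds to the overcomplete case mentioned in the abstract; the interesting quantity is how small the relative perturbation is compared with the norm of the input.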

artificial intelligence, machine learning, probability, (16 more...)

Country: Europe > United Kingdom > England (0.14)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

01db36a646c07c64dd39a92b4eceb417-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 07:38:40 GMT

apple 2, artificial intelligence, machine learning, (17 more...)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

002262941c9edfd472a79298b2ac5e17-Supplemental-Conference.pdf

Neural Information Processing Systems, Apr-24-2026, 07:14:54 GMT

A.1 Proof Sketch. We first introduce the following lemma: Lemma 1. … Lemma 2. For matrices $A, B \in \mathbb{M}_n$, if $A \preceq B$, then we have $\lambda_{\min}(A) \le \lambda_{\min}(B)$ and $\lambda_{\max}(A) \le \lambda_{\max}(B)$, where $\lambda_{\max}(\cdot)$ (resp., $\lambda_{\min}(\cdot)$) denotes taking the maximum (resp., minimum) eigenvalue. Proof of Lemma 2. For any matrix $P \in \mathbb{M}_n$ with $P^\top = P$, we have $\lambda_{\max}(P) = \max$ … We first consider the condition number of $\hat{H}$ when $X$ is in a locally convex area. By equations 3 and 4, we have $M_1 \preceq H \preceq M_2$. Rearranging the terms yields $H - M_1 \succeq 0$ and $M_2 - H \succeq 0$. Therefore, for any vector $x \in \mathbb{R}^M$, we have … We next consider the minimum singular value of $H$ and $\hat{H}$, with $\sigma_{\min}(H) = \sqrt{\lambda_{\min}(H^2)}$ and $\sigma_{\min}(\hat{H}) = \sqrt{\lambda_{\min}(\hat{H}^2)}$ in any case. Under Assumption 1 and equation 4, we have $H \preceq M_2$. Similarly, we can obtain $-H \preceq M_2$. By Lemma 2, we further have $\lambda_{\max}(H) \le \lambda_{\max}(M_2) =$ … C.1 $\|\nabla \hat{f}(\hat{X})\|_2$ vs. $\|\nabla f(X)\|_2$. In this section, we explain why we use $\|\nabla \hat{f}(\hat{X})\|_2$ rather than $\|\nabla f(X)\|_2$ to characterize the convergence rate. In general, it is hard to develop a convergence rate for objective values. However, when the global model is in a locally convex area of $f$, we can obtain the relationship between the gradient and the local optimum.
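Assuming Lemma 2 is the standard monotonicity of eigenvalues under the Loewner order (A ⪯ B meaning B − A is positive semidefinite), it can be checked numerically, together with the identity that the minimum singular value of a symmetric H equals the square root of the minimum eigenvalue of H². The following is a minimal sketch on random matrices; the helper names and matrix sizes are illustrative and not taken from the paper.

```python
import numpy as np


def random_symmetric(n, rng):
    """Random symmetric (generally indefinite) n x n matrix."""
    M = rng.normal(size=(n, n))
    return (M + M.T) / 2.0


def random_psd(n, rng):
    """Random positive semidefinite n x n matrix."""
    M = rng.normal(size=(n, n))
    return M @ M.T


def extreme_eigs_under_loewner(n=6, seed=0):
    """Build A <= B in the Loewner order and return (min A, min B, max A, max B) eigenvalues."""
    rng = np.random.default_rng(seed)
    A = random_symmetric(n, rng)
    B = A + random_psd(n, rng)        # B - A is PSD by construction
    eig_A = np.linalg.eigvalsh(A)     # eigvalsh returns eigenvalues in ascending order
    eig_B = np.linalg.eigvalsh(B)
    return eig_A[0], eig_B[0], eig_A[-1], eig_B[-1]


def min_singular_value_two_ways(n=6, seed=1):
    """Compare sigma_min(H) from the SVD with sqrt(lambda_min(H^2)) for symmetric H."""
    rng = np.random.default_rng(seed)
    H = random_symmetric(n, rng)
    sigma_min = np.linalg.svd(H, compute_uv=False).min()
    lam_min_sq = max(np.linalg.eigvalsh(H @ H)[0], 0.0)  # clip tiny negative round-off
    return sigma_min, np.sqrt(lam_min_sq)


if __name__ == "__main__":
    print(extreme_eigs_under_loewner())
    print(min_singular_value_two_ways())
```

The singular-value identity is what lets the argument handle an indefinite Hessian: H² is always positive semidefinite even when H is not.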

artificial intelligence, machine learning, rfi xti, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)

$x^\top V_0[j]$ ReLU: $\max(\cdot, 0)$

Neural Information Processing Systems, Feb-11-2026, 05:07:16 GMT

By Eq. (103) and Condition 2, we have

artificial intelligence, hrfv, machine learning, (16 more...)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.46)

There has been a longstanding dispute over which formalism is the best for representing knowledge in AI. The well-known "declarative vs. procedural controversy" is concerned with the choice of utilizing declarations or procedures as the primary mode of knowledge representation. The ongoing debate between symbolic AI and connectionist AI also revolves around the question of whether knowledge should be represented implicitly (e.g., as parametric knowledge in deep learning and large language models) or explicitly (e.g., as logical theories in traditional knowledge representation and reasoning). To address these issues, we propose a general framework to capture various knowledge representation formalisms in which we are interested. Within the framework, we find a family of universal knowledge representation formalisms, and prove that all universal formalisms are recursively isomorphic. Moreover, we show that all pairwise intertranslatable formalisms that admit the padding property are also recursively isomorphic. These imply that, up to an offline compilation, all universal (or natural and equally expressive) representation formalisms are in fact the same, which thus provides a partial answer to the aforementioned dispute.

artificial intelligence, formalism, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2412.11855

Country:

North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Zhejiang Province > Hangzhou (0.04)
Asia > China > Tianjin Province > Tianjin (0.04)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Supplementary Material Outline

Neural Information Processing Systems, Feb-18-2024, 05:19:29 GMT

Such independent samples can be obtained by querying the SO at $(x, y)$ three times. A.2 Technical Lemmas for Lipschitz Properties and Hessian Inverse Estimation. We first restate Lemma 2.2 of (Ghadimi and Wang, 2018) to characterize the smoothness properties of $y$ … Lemma A.1. Suppose Assumptions 3.3 and 3.4 hold. … Throughout this section, we assume Assumptions 3.1, 3.2, 3.3, and 3.4 hold and the step-sizes follow (5) … Therefore, under Assumption 3.3, for all $t \le T$ and all $1 \le j \le b$, we have $\mathbb{E}[\|u$ … B.2 Lemma B.2 and Its Proof. We quantify the convergence behavior of consensus errors under the choices of step-sizes (5) and (6) as follows. Lemma B.2. Suppose Assumptions 3.1, 3.2, 3.3, and 3.4 hold and the step-sizes satisfy … Lemma B.3. Suppose Assumptions 3.1, 3.2, 3.3, and 3.4 hold. … B.7 Proof of Theorem 5.1. Proof: We start our analysis by considering the term $\|\bar{y}$ … Throughout this subsection, we assume Assumptions 3.1, 3.2, 3.3, 3.4, and 5.2 hold. C.1 Lemma C.1 and Its Proof. Lemma C.1. Suppose Assumptions 3.1, 3.2, 3.3, 3.4, and 5.2 hold and, in addition, the objective $F$ satisfies the µ-PL Assumption 5.2.

apple 2, assumption 3, kr 2, (15 more...)

Country: Oceania > Australia (0.04)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (0.68)

A Proof of Theorem 1, A2, B1, B

Neural Information Processing Systems, Feb-18-2024, 04:14:14 GMT

A.1 Proof Sketch. We first introduce the following lemma: Lemma 1. … We first consider the condition number of Ĥ when X is in a locally convex area. … In general, it is hard to develop a convergence rate for objective values. However, when the global model is in a locally convex area of f, we can obtain the relationship between the gradient and the local optimum. Theorem 4. When there is no parameter heat dispersion, and X is in a µ-strongly convex area of f … We note that there is a difference between equations 18 and 21: for each client i, equation 18 involves all the parameters of the full model, while equation 21 involves only partial parameters of the submodel, which changes the lower bound of T (Y) and in turn the conclusion.
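The excerpt does not state which relationship between the gradient and the local optimum is used. Two standard inequalities for a µ-strongly convex function with minimizer x* are µ‖x − x*‖ ≤ ‖∇f(x)‖ and f(x) − f(x*) ≤ ‖∇f(x)‖² / (2µ); whether the paper relies on exactly these is an assumption. The sketch below checks both on a random quadratic, with all names chosen for illustration.

```python
import numpy as np


def make_quadratic(n, mu, rng):
    """Return (Q, b) for f(x) = 0.5 x^T Q x - b^T x with all eigenvalues of Q >= mu."""
    M = rng.normal(size=(n, n))
    Q = M @ M.T + mu * np.eye(n)
    b = rng.normal(size=n)
    return Q, b


def objective(x, Q, b):
    return 0.5 * x @ Q @ x - b @ x


def gradient(x, Q, b):
    return Q @ x - b


def check_strong_convexity_bounds(n=5, mu=0.5, seed=0, trials=100, tol=1e-9):
    """Check both inequalities at random points around the minimizer."""
    rng = np.random.default_rng(seed)
    Q, b = make_quadratic(n, mu, rng)
    x_star = np.linalg.solve(Q, b)            # unique minimizer, where the gradient vanishes
    f_star = objective(x_star, Q, b)
    for _ in range(trials):
        x = x_star + rng.normal(size=n)
        g_norm = np.linalg.norm(gradient(x, Q, b))
        distance_ok = mu * np.linalg.norm(x - x_star) <= g_norm + tol
        gap_ok = objective(x, Q, b) - f_star <= g_norm**2 / (2.0 * mu) + tol
        if not (distance_ok and gap_ok):
            return False
    return True


if __name__ == "__main__":
    print(check_strong_convexity_bounds())
```

Bounds of this kind are what allow a guarantee on the gradient norm to be converted into a guarantee on distance to the optimum or on the objective gap inside a strongly convex region.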

apple, fedsubavg, heat dispersion, (16 more...)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.69)