Plotting

Reviewer 1: " the statement in line 153 in the neighbourhood of z J

Neural Information Processing Systems

We are grateful to the reviewers for the insightful comments on our submission. All the minor comments will also be addressed in the revised manuscript. Response: We appreciate the reviewer's comment and suggestion. We will update line 153 to "f(x) does not change The domain of z can be easily adjusted by translation and dilation after the training process. Reviewer 1: "emphasize the need for gradient evaluations when you state the observation." Response: The observation statement (line 118-120) will be updated to "For a fixed pair (x, z) satisfying z = g(x), if Response: We agree with both reviewers that the caption of Figure 2 is not very clear.


Leveraging Contrastive Learning for Enhanced Node Representations in Tokenized Graph Transformers (Jinsong Chen, John E. Hopcroft)

Neural Information Processing Systems

While tokenized graph Transformers have demonstrated strong performance in node classification tasks, their reliance on a limited subset of nodes with high similarity scores for constructing token sequences overlooks valuable information from other nodes, hindering their ability to fully harness graph information for learning optimal node representations. To address this limitation, we propose a novel graph Transformer called GCFormer. Unlike previous approaches, GCFormer develops a hybrid token generator to create two types of token sequences, positive and negative, to capture diverse graph information. A tailored Transformer-based backbone is then adopted to learn meaningful node representations from these generated token sequences. Additionally, GCFormer introduces contrastive learning to extract valuable information from both positive and negative token sequences, enhancing the quality of learned node representations. Extensive experimental results across various datasets, including homophily and heterophily graphs, demonstrate the superiority of GCFormer in node classification compared to representative graph neural networks (GNNs) and graph Transformers.
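As a concrete illustration of the contrastive component described above, the following minimal PyTorch sketch shows an InfoNCE-style loss between a node representation and its positive/negative token sequences; the function name, the mean-pooling choice, and the tensor shapes are illustrative assumptions, not the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def contrastive_token_loss(anchor, pos_tokens, neg_tokens, temperature=0.5):
    """InfoNCE-style loss: pull a node's representation toward its positive
    token sequence and push it away from its negative token sequence.

    anchor:     (N, d)     node representations from the Transformer backbone
    pos_tokens: (N, Lp, d) representations of the positive token sequences
    neg_tokens: (N, Ln, d) representations of the negative token sequences
    """
    anchor = F.normalize(anchor, dim=-1)
    pos = F.normalize(pos_tokens.mean(dim=1), dim=-1)   # pool the positive sequence
    neg = F.normalize(neg_tokens, dim=-1)                # keep individual negatives

    pos_logit = (anchor * pos).sum(-1, keepdim=True) / temperature      # (N, 1)
    neg_logits = torch.einsum("nd,nld->nl", anchor, neg) / temperature  # (N, Ln)

    logits = torch.cat([pos_logit, neg_logits], dim=1)       # (N, 1 + Ln)
    labels = torch.zeros(anchor.size(0), dtype=torch.long)   # positive sits at index 0
    return F.cross_entropy(logits, labels)

# Toy usage with random tensors.
loss = contrastive_token_loss(torch.randn(8, 64), torch.randn(8, 5, 64), torch.randn(8, 5, 64))
```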


Sub-optimal Experts mitigate Ambiguity in Inverse Reinforcement Learning

Neural Information Processing Systems

Inverse Reinforcement Learning (IRL) deals with the problem of deducing a reward function that explains the behavior of an expert agent who is assumed to act optimally in an underlying unknown task. Recent works have studied the IRL problem from the perspective of recovering the feasible reward set, i.e., the class of reward functions that are compatible with a unique optimal expert. However, in several problems of interest it is possible to observe the behavior of multiple experts with different degrees of optimality (e.g., racing drivers whose skills range from amateur to professional). For this reason, in this work, we focus on the reconstruction of the feasible reward set when, in addition to demonstrations from the optimal expert, we observe the behavior of multiple sub-optimal experts. Given this setting, we first study its theoretical properties, showing that the presence of multiple sub-optimal experts, in addition to the optimal one, can significantly shrink the set of compatible rewards, ultimately mitigating the inherent ambiguity of IRL. Furthermore, we study the statistical complexity of estimating the feasible reward set with a generative model and analyze a uniform sampling algorithm that turns out to be minimax optimal whenever the sub-optimal experts' performance level is sufficiently close to that of the optimal expert.
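As a rough illustration of how sub-optimal experts constrain the feasible reward set, the tabular NumPy sketch below keeps a candidate reward only if every observed expert is within an assumed performance gap of optimal under that reward; the function names, the known gap values, and the finite-MDP setting are illustrative assumptions, not the paper's construction.

```python
import numpy as np

def policy_value(P, r, pi, gamma=0.9):
    """Evaluate a deterministic policy pi (integer array of length S) on a
    tabular MDP with transitions P of shape (S, A, S) and rewards r of shape (S, A)."""
    S = r.shape[0]
    P_pi = P[np.arange(S), pi]          # (S, S) transitions under pi
    r_pi = r[np.arange(S), pi]          # (S,)  rewards under pi
    return np.linalg.solve(np.eye(S) - gamma * P_pi, r_pi)

def optimal_value(P, r, gamma=0.9, iters=500):
    """Value iteration; returns the optimal value function V*."""
    V = np.zeros(r.shape[0])
    for _ in range(iters):
        V = np.max(r + gamma * P @ V, axis=1)
    return V

def is_feasible(P, r, expert_policies, gaps, gamma=0.9, tol=1e-6):
    """Keep a candidate reward r only if every expert i is at most gaps[i]
    away from optimal under r (gap 0 for the optimal expert). Each additional
    sub-optimal expert adds constraints, shrinking the feasible reward set."""
    V_star = optimal_value(P, r, gamma)
    for pi, xi in zip(expert_policies, gaps):
        if np.max(V_star - policy_value(P, r, pi, gamma)) > xi + tol:
            return False
    return True
```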


On the convergence of single-call stochastic extra-gradient methods

Neural Information Processing Systems

Variational inequalities have recently attracted considerable interest in machine learning as a flexible paradigm for models that go beyond ordinary loss function minimization (such as generative adversarial networks and related deep learning systems). In this setting, the optimal O(1/t) convergence rate for solving smooth monotone variational inequalities is achieved by the Extra-Gradient (EG) algorithm and its variants. Aiming to alleviate the cost of an extra gradient step per iteration (which can become quite substantial in deep learning applications), several algorithms have been proposed as surrogates to Extra-Gradient with a single oracle call per iteration. In this paper, we develop a synthetic view of such algorithms, and we complement the existing literature by showing that they retain an O(1/t) ergodic convergence rate in smooth, deterministic problems. Subsequently, beyond the monotone deterministic case, we also show that the last iterate of single-call, stochastic extra-gradient methods still enjoys an O(1/t) local convergence rate to solutions of non-monotone variational inequalities that satisfy a second-order sufficient condition.
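The single-call idea can be illustrated with Popov's (past / optimistic) extra-gradient update on a toy bilinear problem: the extrapolation step reuses the operator value from the previous iterate, so each iteration makes only one oracle call. The step size, iteration count, and averaging scheme below are illustrative choices rather than the exact algorithms analyzed in the paper.

```python
import numpy as np

def past_extra_gradient(operator, z0, step=0.1, iters=2000):
    """Single-call variant of Extra-Gradient (Popov's / the optimistic update):
    extrapolate with the operator value from the previous half-step, then make
    the one oracle call of the current iteration."""
    z, g_prev = z0.copy(), operator(z0)
    avg = np.zeros_like(z0)
    for t in range(1, iters + 1):
        z_half = z - step * g_prev          # extrapolation reusing the old operator value
        g = operator(z_half)                # the single oracle call of this iteration
        z = z - step * g
        g_prev = g
        avg += (z_half - avg) / t           # ergodic average, which enjoys the O(1/t) rate
    return avg

# Toy bilinear saddle point min_x max_y x*y, whose operator F(z) = (y, -x) is monotone.
F = lambda z: np.array([z[1], -z[0]])
print(past_extra_gradient(F, np.array([1.0, 1.0])))   # converges toward the solution (0, 0)
```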


A-FedPD: Aligning Dual-Drift is All Federated Primal-Dual Learning Needs

Neural Information Processing Systems

As a popular paradigm for balancing data privacy and collaborative training, federated learning (FL) is flourishing as a way to process large-scale heterogeneous datasets distributed across edge clients. Due to bandwidth limitations and security considerations, FL ingeniously splits the original problem into multiple subproblems to be solved in parallel, which gives primal-dual solutions great practical value in FL. In this paper, we review the recent development of classical federated primal-dual methods and point out a serious common defect of such methods in non-convex scenarios, which we call "dual drift": it is caused by the dual hysteresis of long-inactive clients under partial participation training. To address this problem, we propose a novel Aligned Federated Primal-Dual (A-FedPD) method, which constructs virtual dual updates to align the global consensus and the local dual variables of clients that have gone unselected for a long time. Meanwhile, we provide a comprehensive analysis of the optimization and generalization efficiency of the A-FedPD method on smooth non-convex objectives, which confirms its high efficiency and practicality. Extensive experiments are conducted on several classical FL setups to validate the effectiveness of the proposed method.
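To make the "dual drift" issue concrete, here is a rough NumPy sketch of a generic federated primal-dual (FedADMM-style) round with partial participation; the hyperparameters, the client dictionary layout, and the consensus rule are assumptions for illustration, and the virtual dual update that A-FedPD actually uses is only referenced in a comment, not implemented.

```python
import numpy as np

def client_update(grad_fn, w_global, w_local, dual, rho=1.0, lr=0.1, local_steps=10):
    """One client's augmented-Lagrangian (primal-dual) update:
    locally minimize f_i(w) + <dual, w - w_global> + (rho/2) * ||w - w_global||^2,
    then take a dual ascent step on the consensus constraint w = w_global."""
    w = w_local.copy()
    for _ in range(local_steps):
        g = grad_fn(w) + dual + rho * (w - w_global)
        w -= lr * g
    dual = dual + rho * (w - w_global)
    return w, dual

def server_round(clients, w_global, rho=1.0, participation=0.3, rng=np.random):
    """One communication round with partial participation. Clients that are not
    selected keep a stale dual variable -- the "dual drift" the abstract refers to.
    A-FedPD's idea is to virtually align those stale duals with the global consensus
    (the precise virtual update is defined in the paper, not here)."""
    active = [c for c in clients if rng.random() < participation]
    for c in active:
        c["w"], c["dual"] = client_update(c["grad_fn"], w_global, c["w"], c["dual"], rho)
    # Consensus update: aggregate primal and dual information from all clients.
    return np.mean([c["w"] + c["dual"] / rho for c in clients], axis=0)
```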



General response to all reviewers regarding empirical study of SCG++

Neural Information Processing Systems

We thank the reviewers for their careful consideration and constructive feedback. Below, please find our responses.

Q1: Since no experimental evidence is provided... A1: Please see our general response above.

Q2: If F is available in closed form and its gradients can be computed exactly... A2: If the gradients can be computed exactly, then in this case both SCG and SCG++ reach a (1 - 1/e - ε)-optimal solution with O(1/ε) oracle calls.

Q4: Extension to the discrete setting: I do not understand how one would compute the multilinear extension efficiently.
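Regarding Q4, one standard way to handle the multilinear extension is a Monte-Carlo estimate: sample random sets from the product distribution induced by the fractional point and average the set-function values. The sketch below, using an illustrative coverage-style submodular function, shows such an estimator; it is not code from the SCG++ paper.

```python
import numpy as np

def multilinear_extension(f, x, num_samples=200, rng=None):
    """Monte-Carlo estimate of the multilinear extension F(x) = E[f(R_x)],
    where R_x contains each element i independently with probability x[i];
    f maps a boolean membership vector to the set-function value."""
    rng = np.random.default_rng(0) if rng is None else rng
    estimates = [f(rng.random(len(x)) < x) for _ in range(num_samples)]
    return float(np.mean(estimates))

# Toy monotone submodular function: square root of a modular (weighted) function.
weights = np.array([1.0, 2.0, 0.5, 3.0])
f = lambda s: float(np.sqrt(weights[s].sum()))
print(multilinear_extension(f, np.array([0.5, 0.1, 0.9, 0.3])))
```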


GRU-ODE-Bayes: Continuous modeling of sporadically-observed time series

Neural Information Processing Systems

Modeling real-world multidimensional time series can be particularly challenging when these are sporadically observed (i.e., sampling is irregular both in time and across dimensions), such as in the case of clinical patient data. To address these challenges, we propose (1) a continuous-time version of the Gated Recurrent Unit, building upon the recent Neural Ordinary Differential Equations (Chen et al., 2018), and (2) a Bayesian update network that processes the sporadic observations. We bring these two ideas together in our GRU-ODE-Bayes method. We then demonstrate that the proposed method encodes a continuity prior for the latent process and that it can exactly represent the Fokker-Planck dynamics of complex processes driven by a multidimensional stochastic differential equation. Additionally, empirical evaluation shows that our method outperforms the state of the art on both synthetic data and real-world data, with applications in healthcare and climate forecasting. Moreover, the continuity prior is shown to be well suited to settings with few samples.
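A minimal PyTorch sketch of a continuous-time GRU cell in the spirit of GRU-ODE, integrated with a single explicit Euler step between observations; the layer sizes, the absence of an input term, and the crude integration scheme are simplifying assumptions, and the Bayesian update network is omitted entirely.

```python
import torch
import torch.nn as nn

class GRUODECell(nn.Module):
    """Continuous-time GRU: instead of the discrete update h <- z*h + (1-z)*g,
    the hidden state follows the ODE dh/dt = (1 - z) * (g - h),
    integrated here with a simple explicit Euler step."""

    def __init__(self, hidden_size):
        super().__init__()
        self.lin_r = nn.Linear(hidden_size, hidden_size)
        self.lin_z = nn.Linear(hidden_size, hidden_size)
        self.lin_g = nn.Linear(hidden_size, hidden_size)

    def forward(self, h, dt):
        r = torch.sigmoid(self.lin_r(h))       # reset gate
        z = torch.sigmoid(self.lin_z(h))       # update gate
        g = torch.tanh(self.lin_g(r * h))      # candidate state
        dh = (1.0 - z) * (g - h)                # vanishes when g == h: a continuity prior
        return h + dt * dh                      # Euler integration over the time gap dt

h = torch.zeros(1, 32)
cell = GRUODECell(32)
h = cell(h, dt=0.1)   # evolve the hidden state between two sporadic observations
```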


GRU-ODE-Bayes: Author Feedback

Neural Information Processing Systems

We thank the reviewers for the relevant comments. We first address the general questions and then give brief individual answers. GRU-ODE integrates the dynamics of the hidden process h(t) in time; the projected distributions vary smoothly as they are driven by an ODE. Point processes (Mei and Eisner, NeurIPS 2017; Gunawardana et al., NeurIPS 2011) are intrinsically continuous, as are continuous-time Bayesian networks (Nodelman et al., UAI 2002). This joint modeling of continuous measurements and events was left for future work.


SocraticLM: Exploring Socratic Personalized Teaching with Large Language Models

Neural Information Processing Systems

Large language models (LLMs) are considered a crucial technology for advancing intelligent education, since they exhibit the potential for an in-depth understanding of teaching scenarios and for providing students with personalized guidance. Nonetheless, current LLM-based applications in personalized teaching predominantly follow a "Question-Answering" paradigm, where students are passively provided with answers and explanations. In this paper, we propose SocraticLM, which achieves a Socratic "Thought-Provoking" teaching paradigm that fulfills the role of a real classroom teacher in actively engaging students in the thought process required for genuine problem-solving mastery. To build SocraticLM, we first propose a novel "Dean-Teacher-Student" multi-agent pipeline to construct a new dataset, SocraTeach, which contains 35K meticulously crafted Socratic-style multi-round (equivalent to 208K single-round) teaching dialogues grounded in fundamental mathematical problems. Our dataset simulates authentic teaching scenarios, covering interactions with six representative types of simulated students with different cognitive states and strengthening four crucial teaching abilities. SocraticLM is then fine-tuned on SocraTeach with three strategies balancing its teaching and reasoning abilities. Moreover, we contribute a comprehensive evaluation system encompassing five pedagogical dimensions for assessing the teaching quality of LLMs. Extensive experiments verify that SocraticLM achieves significant improvements in teaching performance, outperforming GPT-4 by more than 12%.
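To make the dataset construction concrete, here is a hypothetical sketch of what one multi-round Socratic teaching record and its conversion into single-round training pairs could look like; the field names, dialogue content, and flattening rule are illustrative assumptions, not the released SocraTeach schema.

```python
# A hypothetical record illustrating a multi-round Socratic teaching dialogue;
# the fields and content are invented for illustration only.
example_dialogue = {
    "problem": "Tom has 3 bags with 4 apples each. How many apples does he have?",
    "student_profile": "knows addition, unsure about multiplication",
    "turns": [
        {"role": "teacher", "text": "What do we know about each bag?"},
        {"role": "student", "text": "Each bag has 4 apples."},
        {"role": "teacher", "text": "Good. If we had just 2 bags, how would you count the apples?"},
        {"role": "student", "text": "4 + 4 = 8."},
        {"role": "teacher", "text": "Exactly. So with 3 bags, what operation repeats that idea?"},
    ],
}

# Flatten the multi-round dialogue into (context, next_teacher_utterance) pairs,
# which is one way a multi-round dialogue can yield several single-round examples.
pairs = [
    (example_dialogue["turns"][:i], turn["text"])
    for i, turn in enumerate(example_dialogue["turns"])
    if turn["role"] == "teacher"
]
print(len(pairs))   # 3 teacher turns -> 3 supervised examples
```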