AITopics | Educational Setting

Educational Setting

Unveiling Causal Reasoning in Large Language Models: Reality or Mirage?

Neural Information Processing Systems | Mar-26-2025, 23:05:58 GMT

Causal reasoning capability is critical in advancing large language models (LLMs) toward strong artificial intelligence. While versatile LLMs appear to have demonstrated capabilities in understanding contextual causality and providing responses that obey the laws of causality, it remains unclear whether they perform genuine causal reasoning akin to humans. However, current evidence indicates the contrary. Specifically, LLMs are only capable of performing shallow (level-1) causal reasoning, primarily attributed to the causal knowledge embedded in their parameters, but they lack the capacity for genuine human-like (level-2) causal reasoning. To support this hypothesis, methodologically, we delve into the autoregression mechanism of transformer-based LLMs, revealing that it is not inherently causal. Empirically, we introduce a new causal Q&A benchmark called CausalProbe-2024, whose corpora are fresh and nearly unseen for the studied LLMs. The LLMs exhibit a significant performance drop on CausalProbe-2024 compared to earlier benchmarks, indicating that they primarily engage in level-1 causal reasoning. To bridge the gap towards level-2 causal reasoning, we draw inspiration from the fact that human reasoning is usually facilitated by general knowledge and intended goals.

large language model, machine learning, natural language, (19 more...)

Country:

Asia (0.28)
Europe > United Kingdom > England (0.14)

Genre:

Research Report > Experimental Study (1.00)
Overview (0.93)

Industry:

Education > Educational Setting (0.92)
Media > News (0.67)
Information Technology (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Online Learning of Quantum States

Scott Aaronson, Xinyi Chen, Elad Hazan, Satyen Kale, Ashwin Nayak

Neural Information Processing Systems | Mar-26-2025, 23:03:45 GMT

artificial intelligence, machine learning, north america government, (17 more...)

Country: North America > United States (0.93)

Industry:

Education > Educational Setting > Online (0.53)
Government > Regional Government (0.46)

Technology:

Information Technology > Enterprise Applications > Human Resources > Learning Management (0.52)
Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.46)

PCoTTA: Continual Test-Time Adaptation for Multi-Task Point Cloud Understanding

Neural Information Processing Systems | Mar-26-2025, 22:41:27 GMT

We introduce a multi-task setting for PCoTTA, which is practical and realistic, handling multiple tasks within one unified model during the continual adaptation. Our PCoTTA involves three key components: automatic prototype mixture (APM), Gaussian Splatted feature shifting (GSFS), and contrastive prototype repulsion (CPR). Firstly, APM is designed to automatically mix the source prototypes with the learnable prototypes with a similarity balancing factor, avoiding catastrophic forgetting. Then, GSFS dynamically shifts the testing sample toward the source domain, mitigating error accumulation in an online manner. In addition, CPR is proposed to pull the nearest learnable prototype close to the testing feature and push it away from other prototypes, making each prototype distinguishable during the adaptation. Experimental comparisons lead to a new benchmark, demonstrating PCoTTA's superiority in boosting the model's transferability towards the continually changing target domain. Our source code is available at: https://github.com/Jinec98/PCoTTA.

artificial intelligence, machine learning, natural language, (14 more...)

Country: Asia > China (0.14)

Genre: Research Report > Experimental Study (0.93)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Vision (0.95)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.93)

Learning To Learn Around A Common Mean

Giulia Denevi, Carlo Ciliberto, Dimitris Stamos, Massimiliano Pontil

Neural Information Processing Systems | Mar-26-2025, 22:24:51 GMT

We show that, in this setting, the learning-to-learn (LTL) problem can be reformulated as a Least Squares (LS) problem, and we exploit a novel meta-algorithm to solve it efficiently.

algorithm, artificial intelligence, machine learning, (12 more...)

Country: North America > United States (0.28)

Industry: Education > Educational Setting (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Transfer Learning (0.65)

Benchmarking Long-context Document Understanding with Visualizations

Neural Information Processing Systems | Mar-26-2025, 22:24:18 GMT

Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities on long-context DU remain an open problem.

large language model, machine learning, natural language, (21 more...)

Country:

Europe (0.67)
North America > United States > Illinois (0.14)
Asia > Middle East > UAE (0.14)

Genre: Research Report (1.00)

Industry:

Leisure & Entertainment (0.93)
Information Technology (0.93)
Law (0.68)
(2 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Communications > Social Media (0.92)
Information Technology > Artificial Intelligence > Vision (0.88)
(2 more...)

Equitable Stable Matchings in Quadratic Time

Nikolaos Tziavelis, Ioannis Giannakopoulos, Katerina Doka, Nectarios Koziris, Panagiotis Karras

Neural Information Processing Systems | Mar-26-2025, 21:56:12 GMT

algorithm, artificial intelligence, stable matching, (12 more...)

Country:

Europe (1.00)
North America > Canada (0.28)

Genre: Research Report > New Finding (0.68)

Industry: Education > Educational Setting (0.93)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)

Computerized Adaptive Testing via Collaborative Ranking

Neural Information Processing Systems | Mar-26-2025, 21:25:26 GMT

With the deep integration of machine learning and intelligent education, Computerized Adaptive Testing (CAT) has received increasing research attention. Compared to traditional paper-and-pencil tests, CAT can deliver both personalized and interactive assessments by automatically adjusting testing questions according to the performance of students during the test process. Therefore, CAT has been recognized as an efficient testing methodology capable of accurately estimating a student's ability with a minimal number of questions, leading to its widespread adoption in mainstream selective exams such as the GMAT and GRE. However, just improving the accuracy of ability estimation is far from satisfactory in real-world scenarios, since an accurate ranking of students is usually more important (e.g., in high-stakes exams). Considering the shortage of existing CAT solutions in student ranking, this paper emphasizes the importance of aligning test outcomes (student ranks) with the true underlying abilities of students. Along this line, unlike the conventional independent testing paradigm among students, we propose a novel collaborative framework, Collaborative Computerized Adaptive Testing (CCAT), that leverages inter-student information to enhance student ranking. By using collaborative students as anchors to assist in ranking test-takers, CCAT can give both theoretical guarantees and experimental validation for ensuring ranking consistency.
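
The conventional, per-student adaptive loop that the abstract contrasts CCAT with can be illustrated with a minimal sketch. This is not the paper's CCAT method: it assumes a one-parameter Rasch response model, picks the unasked question whose difficulty is closest to the current ability estimate, and nudges the estimate with one gradient step per response. All function names, the step size, and the item difficulties are hypothetical and chosen only for illustration.

```python
import math

def p_correct(ability, difficulty):
    """Rasch (1PL) probability of a correct answer (assumed response model)."""
    return 1.0 / (1.0 + math.exp(-(ability - difficulty)))

def select_item(ability, difficulties, asked):
    """Pick the unasked item whose difficulty is closest to the current estimate."""
    candidates = [i for i in range(len(difficulties)) if i not in asked]
    return min(candidates, key=lambda i: abs(difficulties[i] - ability))

def update_ability(ability, difficulty, correct, step=0.5):
    """One gradient-ascent step on the Rasch log-likelihood of this response."""
    observed = 1.0 if correct else 0.0
    return ability + step * (observed - p_correct(ability, difficulty))

def run_cat(difficulties, answer_fn, n_questions, start=0.0):
    """Run a short adaptive test; answer_fn(item_index) returns True or False."""
    ability, asked = start, []
    for _ in range(min(n_questions, len(difficulties))):
        item = select_item(ability, difficulties, asked)
        correct = answer_fn(item)
        ability = update_ability(ability, difficulties[item], correct)
        asked.append(item)
    return ability, asked

if __name__ == "__main__":
    # Hypothetical item bank and a student who answers everything correctly.
    bank = [-2.0, -1.0, 0.0, 1.0, 2.0]
    estimate, order = run_cat(bank, lambda i: True, n_questions=3)
    print(order, round(estimate, 3))
```

A student who keeps answering correctly is routed to progressively harder questions, which is the adaptive behaviour described above. Each student is still tested independently here, so the sketch does not use the inter-student information that CCAT relies on for ranking.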

artificial intelligence, machine learning, natural language, (14 more...)

Country:

North America > United States (0.46)
Asia > China > Anhui Province (0.14)
Europe > Austria > Vienna (0.14)

Genre:

Research Report > Experimental Study (0.93)
Research Report > New Finding (0.92)

Industry: Education > Educational Setting > Online (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.94)
Information Technology > Data Science (0.93)
(3 more...)

Appendix A Ethics Statement MATH-V, we should like to demonstrate that we are far from the boundary for action or infringement

Neural Information Processing Systems | Mar-26-2025, 20:45:48 GMT

Question: What time does the measure the length of the line to shown?

large language model, machine learning, natural language, (18 more...)

Country: Europe (0.14)

Industry: Education > Educational Setting > K-12 Education (0.67)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
(2 more...)

Measuring Multimodal Mathematical Reasoning with the MATH-Vision Dataset

Ke Wang, Junting Pan, Zimu Lu

Neural Information Processing Systems | Mar-26-2025, 20:45:43 GMT

Recent advancements in Large Multimodal Models (LMMs) have shown promising results in mathematical reasoning within visual contexts, with models exceeding human-level performance on existing benchmarks such as MathVista. However, we observe significant limitations in the diversity of questions and breadth of subjects covered by these benchmarks. To address this issue, we present the MATH-Vision (MATH-V) dataset, a meticulously curated collection of 3,040 high-quality mathematical problems with visual contexts sourced from real math competitions. Spanning 16 distinct mathematical disciplines and graded across 5 levels of difficulty, our dataset provides a comprehensive and diverse set of challenges for evaluating LMMs' mathematical reasoning abilities. Through extensive experimentation, we unveil a notable performance gap between current LMMs and human performance on MATH-V, underscoring the imperative for further advancements in LMMs. Moreover, our detailed categorization allows for a thorough error analysis of LMMs, offering valuable insights to guide future research and development.

large language model, machine learning, natural language, (22 more...)

Country:

Europe (0.67)
Asia > China (0.27)
North America > United States > California (0.14)

Genre: Research Report > New Finding (0.45)

Industry:

Education > Educational Setting > K-12 Education (0.92)
Transportation (0.67)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
(4 more...)

Towards Challenging Real World Spreadsheet Manipulation

Neural Information Processing Systems | Mar-26-2025, 20:26:51 GMT

The associated spreadsheets from the forums contain a variety of tabular data such as multiple tables, non-standard relational tables, and abundant non-textual elements. Furthermore, we propose a more reliable evaluation metric akin to online judge platforms, where multiple spreadsheet files are created as test cases for each instruction, ensuring the evaluation of robust solutions capable of handling spreadsheets with varying values. Our comprehensive evaluation of various LLMs under both single-round and multi-round inference settings reveals a substantial gap between the state-of-the-art (SOTA) models and human performance, highlighting the benchmark's difficulty.

large language model, machine learning, natural language, (21 more...)

Genre: Research Report (1.00)

Industry: