AITopics | toc

Collaborating Authors

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

EDGI: Equivariant Diffusion for Planning with Embodied Agents Johann Brehmer Qualcomm AI Research

Neural Information Processing SystemsFeb-17-2026, 01:58:38 GMT

We integrate this model in a planning loop, where conditioning and classifier guidance let us softly break the symmetry for specific tasks as needed.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (0.68)

Industry:

Telecommunications (0.41)
Semiconductors & Electronics (0.41)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.50)
(2 more...)

Add feedback

ToC: Tree-of-Claims Search with Multi-Agent Language Models

Yu, Shuyang, Liang, Jianan, Hu, Hui

arXiv.org Artificial IntelligenceNov-24-2025

Optimizing patent claims is a critical yet challenging task, demanding careful balance between maximizing novelty and preserving legal scope. Manual claim drafting is labor-intensive, costly, and inherently inconsistent, while conventional Large Language Models (LLMs) often lack the structured, iterative reasoning essential for precise claim refinement. To address these challenges, we introduce Tree of Claims (ToC), an innovative framework that redefines claim editing as a guided search problem. ToC synergistically integrates Monte Carlo Tree Search (MCTS) with a collaborative multi-agent system, comprising an LLM-based EditorAgent that proposes contextually grounded edits, and an ExaminerAgent that mimics patent examiner critiques through structured, chain-of-thought analyses of novelty and prior art disclosure. Driven by a carefully designed multi-objective reward function, ToC jointly optimizes novelty, scope retention, and semantic coherence. Experimental evaluation on a benchmark of 1145 claims demonstrates that ToC significantly outperforms standard LLMs in zero-shot and few-shot scenarios, achieving an average composite score improvement of 8\%, and up to 9\% in certain cases. Extensive experiments, including detailed ablation studies, validate ToC's efficacy in generating superior, legally robust claim revisions. Overall, ToC establishes a transparent, controllable, and interpretable methodology that effectively bridges advanced LLM reasoning capabilities with strategic MCTS planning for structured patent claim optimization.The source code is available at https://github.com/ysy2003/ToC.

arxiv preprint arxiv, large language model, natural language, (16 more...)

arXiv.org Artificial Intelligence

2511.16972

Genre: Research Report (0.64)

Industry: Law > Intellectual Property & Technology Law (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Tree-of-Code: A Tree-Structured Exploring Framework for End-to-End Code Generation and Execution in Complex Task Handling

Ni, Ziyi, Li, Yifan, Yang, Ning, Shen, Dou, Lv, Pin, Dong, Daxiang

arXiv.org Artificial IntelligenceDec-19-2024

Solving complex reasoning tasks is a key real-world application of agents. Thanks to the pretraining of Large Language Models (LLMs) on code data, recent approaches like CodeAct successfully use code as LLM agents' action, achieving good results. However, CodeAct greedily generates the next action's code block by relying on fragmented thoughts, resulting in inconsistency and instability. Moreover, CodeAct lacks action-related ground-truth (GT), making its supervision signals and termination conditions questionable in multi-turn interactions. To address these issues, we first introduce a simple yet effective end-to-end code generation paradigm, CodeProgram, which leverages code's systematic logic to align with global reasoning and enable cohesive problem-solving. Then, we propose Tree-of-Code (ToC), which self-grows CodeProgram nodes based on the executable nature of the code and enables self-supervision in a GT-free scenario. Experimental results on two datasets using ten popular zero-shot LLMs show ToC remarkably boosts accuracy by nearly 20% over CodeAct with less than 1/4 turns. Several LLMs even perform better on one-turn CodeProgram than on multi-turn CodeAct. To further investigate the trade-off between efficacy and efficiency, we test different ToC tree sizes and exploration mechanisms. We also highlight the potential of ToC's end-to-end data generation for supervised and reinforced fine-tuning.

artificial intelligence, large language model, natural language, (17 more...)

arXiv.org Artificial Intelligence

2412.15305

Genre: Workflow (0.93)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)

Add feedback

Low-Latency Task-Oriented Communications with Multi-Round, Multi-Task Deep Learning

Sagduyu, Yalin E., Erpek, Tugba, Yener, Aylin, Ulukus, Sennur

arXiv.org Artificial IntelligenceNov-15-2024

In this paper, we address task-oriented (or goal-oriented) communications where an encoder at the transmitter learns compressed latent representations of data, which are then transmitted over a wireless channel. At the receiver, a decoder performs a machine learning task, specifically for classifying the received signals. The deep neural networks corresponding to the encoder-decoder pair are jointly trained, taking both channel and data characteristics into account. Our objective is to achieve high accuracy in completing the underlying task while minimizing the number of channel uses determined by the encoder's output size. To this end, we propose a multi-round, multi-task learning (MRMTL) approach for the dynamic update of channel uses in multi-round transmissions. The transmitter incrementally sends an increasing number of encoded samples over the channel based on the feedback from the receiver, and the receiver utilizes the signals from a previous round to enhance the task performance, rather than only considering the latest transmission. This approach employs multi-task learning to jointly optimize accuracy across varying number of channel uses, treating each configuration as a distinct task. By evaluating the confidence of the receiver in task decisions, MRMTL decides on whether to allocate additional channel uses in multiple rounds. We characterize both the accuracy and the delay (total number of channel uses) of MRMTL, demonstrating that it achieves the accuracy close to that of conventional methods requiring large numbers of channel uses, but with reduced delay by incorporating signals from a prior round. We consider the CIFAR-10 dataset, convolutional neural network architectures, and AWGN and Rayleigh channel models for performance evaluation. We show that MRMTL significantly improves the efficiency of task-oriented communications, balancing accuracy and latency effectively.

artificial intelligence, channel use, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2411.10385

Country:

North America > United States > Maryland > Prince George's County > College Park (0.14)
North America > United States > Ohio > Franklin County > Columbus (0.04)
North America > United States > Maryland > Montgomery County > Gaithersburg (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models?

Yao, Ben, Zhang, Yazhou, Li, Qiuchi, Qin, Jing

arXiv.org Artificial IntelligenceJul-17-2024

Elaborating a series of intermediate reasoning steps significantly improves the ability of large language models (LLMs) to solve complex problems, as such steps would evoke LLMs to think sequentially. However, human sarcasm understanding is often considered an intuitive and holistic cognitive process, in which various linguistic, contextual, and emotional cues are integrated to form a comprehensive understanding of the speaker's true intention, which is argued not be limited to a step-by-step reasoning process. To verify this argument, we introduce a new prompting framework called SarcasmCue, which contains four prompting strategies, $viz.$ chain of contradiction (CoC), graph of cues (GoC), bagging of cues (BoC) and tensor of cues (ToC), which elicits LLMs to detect human sarcasm by considering sequential and non-sequential prompting methods. Through a comprehensive empirical comparison on four benchmarking datasets, we show that the proposed four prompting methods outperforms standard IO prompting, CoT and ToT with a considerable margin, and non-sequential prompting generally outperforms sequential prompting.

language model, llm, sarcasm detection, (14 more...)

arXiv.org Artificial Intelligence

2407.12725

Country:

Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Europe > Denmark > Capital Region > Copenhagen (0.04)
South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
(9 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

The future of document indexing: GPT and Donut revolutionize table of content processing

Feyisa, Degaga Wolde, Berihun, Haylemicheal, Zewdu, Amanuel, Najimoghadam, Mahsa, Zare, Marzieh

arXiv.org Artificial IntelligenceMar-12-2024

Industrial projects rely heavily on lengthy, complex specification documents, making tedious manual extraction of structured information a major bottleneck. This paper introduces an innovative approach to automate this process, leveraging the capabilities of two cutting-edge AI models: Donut, a model that extracts information directly from scanned documents without OCR, and OpenAI GPT-3.5 Turbo, a robust large language model. The proposed methodology is initiated by acquiring the table of contents (ToCs) from construction specification documents and subsequently structuring the ToCs text into JSON data. Remarkable accuracy is achieved, with Donut reaching 85% and GPT-3.5 Turbo reaching 89% in effectively organizing the ToCs. This landmark achievement represents a significant leap forward in document indexing, demonstrating the immense potential of AI to automate information extraction tasks across diverse document types, boosting efficiency and liberating critical resources in various industries.

arxiv preprint arxiv, information, language model, (12 more...)

arXiv.org Artificial Intelligence

2403.07553

Country:

North America > Canada (0.05)
Europe > Switzerland > Vaud > Lausanne (0.04)
Europe > Italy > Calabria > Catanzaro Province > Catanzaro (0.04)
(2 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning > Generative AI (0.35)

Add feedback

A Scalable Framework for Table of Contents Extraction from Complex ESG Annual Reports

Wang, Xinyu, Gui, Lin, He, Yulan

arXiv.org Artificial IntelligenceOct-27-2023

Table of contents (ToC) extraction centres on structuring documents in a hierarchical manner. In this paper, we propose a new dataset, ESGDoc, comprising 1,093 ESG annual reports from 563 companies spanning from 2001 to 2022. These reports pose significant challenges due to their diverse structures and extensive length. To address these challenges, we propose a new framework for Toc extraction, consisting of three steps: (1) Constructing an initial tree of text blocks based on reading order and font sizes; (2) Modelling each tree node (or text block) independently by considering its contextual information captured in node-centric subtree; (3) Modifying the original tree by taking appropriate action on each tree node (Keep, Delete, or Move). This construction-modelling-modification (CMM) process offers several benefits. It eliminates the need for pairwise modelling of section headings as in previous approaches, making document segmentation practically feasible. By incorporating structured information, each section heading can leverage both local and long-distance context relevant to itself. Experimental results show that our approach outperforms the previous state-of-the-art baseline with a fraction of running time. Our framework proves its scalability by effectively handling documents of any length.

esgdoc, information, node, (15 more...)

arXiv.org Artificial Intelligence

2310.18073

Country:

North America > United States > New York > New York County > New York City (0.04)
North America > Canada > Ontario > Toronto (0.04)
Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
(5 more...)

Genre:

Collection (0.61)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

Add feedback

EDGI: Equivariant Diffusion for Planning with Embodied Agents

Brehmer, Johann, Bose, Joey, de Haan, Pim, Cohen, Taco

arXiv.org Machine LearningOct-19-2023

Embodied agents operate in a structured world, often solving tasks with spatial, temporal, and permutation symmetries. Most algorithms for planning and model-based reinforcement learning (MBRL) do not take this rich geometric structure into account, leading to sample inefficiency and poor generalization. We introduce the Equivariant Diffuser for Generating Interactions (EDGI), an algorithm for MBRL and planning that is equivariant with respect to the product of the spatial symmetry group SE(3), the discrete-time translation group Z, and the object permutation group Sn. EDGI follows the Diffuser framework (Janner et al., 2022) in treating both learning a world model and planning in it as a conditional generative modeling problem, training a diffusion model on an offline trajectory dataset. We introduce a new SE(3)xZxSn-equivariant diffusion model that supports multiple representations. We integrate this model in a planning loop, where conditioning and classifier guidance let us softly break the symmetry for specific tasks as needed. On object manipulation and navigation tasks, EDGI is substantially more sample efficient and generalizes better across the symmetry group than non-equivariant models.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

arXiv.org Machine Learning

2303.1241

Country: North America > Canada > Quebec > Montreal (0.04)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (0.70)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(2 more...)

Add feedback

The Value of AI in Logistics - TOC

#artificialintelligenceJan-21-2019, 09:16:41 GMT

Because of revolutionary advances in technology, some have said that our industry is in the midst of a "renaissance." According to Forbes Insight Research, the supply chain, logistics, and transportation processes are in an "era of profound transformation." And what is largely driving all of this change? Artificial intelligence, referred to as "AI", and machine learning, or "ML" for short. Artificial intelligence is the development of computer system's ability to perform tasks that normally only humans are able to do, like decision making and translation between languages.

artificial intelligence, machine learning, supply chain, (3 more...)

#artificialintelligence

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.40)

Add feedback

Filters

Collaborating Authors

toc

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

EDGI: Equivariant Diffusion for Planning with Embodied Agents Johann Brehmer Qualcomm AI Research

ToC: Tree-of-Claims Search with Multi-Agent Language Models

c95c049637c5c549c2a08e8d6dcbca4b-Paper-Conference.pdf

Tree-of-Code: A Tree-Structured Exploring Framework for End-to-End Code Generation and Execution in Complex Task Handling

Low-Latency Task-Oriented Communications with Multi-Round, Multi-Task Deep Learning

Is Sarcasm Detection A Step-by-Step Reasoning Process in Large Language Models?

The future of document indexing: GPT and Donut revolutionize table of content processing

A Scalable Framework for Table of Contents Extraction from Complex ESG Annual Reports

EDGI: Equivariant Diffusion for Planning with Embodied Agents

The Value of AI in Logistics - TOC