Goto

Collaborating Authors

 asg




Breaking Token Into Concepts: Exploring Extreme Compression in Token Representation Via Compositional Shared Semantics

R, Kavin V, Goyal, Pawan

arXiv.org Artificial Intelligence

Standard language models employ unique, monolithic embeddings for each token, potentially limiting their ability to capture the multifaceted nature of word meanings. We investigate whether tokens can be more effectively represented through a compositional structure that accumulates diverse semantic facets. To explore this, we propose Aggregate Semantic Grouping (ASG), a novel approach leveraging Product Quantization (PQ). We apply ASG to standard transformer architectures (mBERT, XLM-R, mT5) and evaluate this representational scheme across diverse tasks (NLI, NER, QA), as well as a biomedical domain-specific benchmark (BC5CDR) using BioBERT. Our findings demonstrate that representing tokens compositionally via ASG achieves extreme compression in embedding parameters (0.4--0.5\%) while maintaining $>$95\% task performance relative to the base model, even in generative tasks and extends to both cross lingual transfer and domain-specific settings. These results validate the principle that tokens can be effectively modeled as combinations of shared semantic building blocks. ASG offers a simple yet concrete method for achieving this, showcasing how compositional representations can capture linguistic richness while enabling compact yet semantically rich models.


PASG: A Closed-Loop Framework for Automated Geometric Primitive Extraction and Semantic Anchoring in Robotic Manipulation

Zhu, Zhihao, Zheng, Yifan, Pan, Siyu, Jin, Yaohui, Mu, Yao

arXiv.org Artificial Intelligence

The fragmentation between high-level task semantics and low-level geometric features remains a persistent challenge in robotic manipulation. While vision-language models (VLMs) have shown promise in generating affordance-aware visual representations, the lack of semantic grounding in canonical spaces and reliance on manual annotations severely limit their ability to capture dynamic semantic-affordance relationships. To address these, we propose Primitive-Aware Semantic Grounding (PASG), a closed-loop framework that introduces: (1) Automatic primitive extraction through geometric feature aggregation, enabling cross-category detection of keypoints and axes; (2) VLM-driven semantic anchoring that dynamically couples geometric primitives with functional affordances and task-relevant description; (3) A spatial-semantic reasoning benchmark and a fine-tuned VLM (Qwen2.5VL-PA). We demonstrate PASG's effectiveness in practical robotic manipulation tasks across diverse scenarios, achieving performance comparable to manual annotations. PASG achieves a finer-grained semantic-affordance understanding of objects, establishing a unified paradigm for bridging geometric primitives with task semantics in robotic manipulation.


$\texttt{SEM-CTRL}$: Semantically Controlled Decoding

Albinhassan, Mohammad, Madhyastha, Pranava, Russo, Alessandra

arXiv.org Artificial Intelligence

Ensuring both syntactic and semantic correctness in Large Language Model (LLM) outputs remains a significant challenge, despite being critical for real-world deployment. In this paper, we introduce $\texttt{SEM-CTRL}$, a unified approach that enforces rich context-sensitive constraints and task- and instance-specific semantics directly on an LLM decoder. Our approach integrates token-level MCTS, which is guided by specific syntactic and semantic constraints. The constraints over the desired outputs are expressed using Answer Set Grammars -- a logic-based formalism that generalizes context-sensitive grammars while incorporating background knowledge to represent task-specific semantics. We show that our approach guarantees correct completions for any off-the-shelf LLM without the need for fine-tuning. We evaluate $\texttt{SEM-CTRL}$ on a range of tasks, including synthetic grammar synthesis, combinatorial reasoning, and planning. Our results demonstrate that $\texttt{SEM-CTRL}$ allows small pre-trained LLMs to efficiently outperform larger variants and state-of-the-art reasoning models (e.g., o1-preview) while simultaneously guaranteeing solution correctness.


Sample Efficient Reinforcement Learning in Mixed Systems through Augmented Samples and Its Applications to Queueing Networks

Wei, Honghao, Liu, Xin, Wang, Weina, Ying, Lei

arXiv.org Artificial Intelligence

This paper considers a class of reinforcement learning problems, which involve systems with two types of states: stochastic and pseudo-stochastic. In such systems, stochastic states follow a stochastic transition kernel while the transitions of pseudo-stochastic states are deterministic given the stochastic states/transitions. We refer to such systems as mixed systems, which are widely used in various applications, including manufacturing systems, communication networks, and queueing networks. We propose a sample efficient RL method that accelerates learning by generating augmented data samples. The proposed algorithm is data-driven and learns the policy from data samples from both real and augmented samples. This method significantly improves learning by reducing the sample complexity such that the dataset only needs to have sufficient coverage of the stochastic states. We analyze the sample complexity of the proposed method under Fitted Q Iteration (FQI) and demonstrate that the optimality gap decreases as $\tilde{\mathcal{O}}(\sqrt{{1}/{n}}+\sqrt{{1}/{m}}),$ where $n$ is the number of real samples and $m$ is the number of augmented samples per real sample. It is important to note that without augmented samples, the optimality gap is $\tilde{\mathcal{O}}(1)$ due to insufficient data coverage of the pseudo-stochastic states. Our experimental results on multiple queueing network applications confirm that the proposed method indeed significantly accelerates learning in both deep Q-learning and deep policy gradient.


On the Foundations of Cycles in Bayesian Networks

Baier, Christel, Dubslaff, Clemens, Hermanns, Holger, Käfer, Nikolai

arXiv.org Artificial Intelligence

Bayesian networks (BNs) are a probabilistic graphical model widely used for representing expert knowledge and reasoning under uncertainty. Traditionally, they are based on directed acyclic graphs that capture dependencies between random variables. However, directed cycles can naturally arise when cross-dependencies between random variables exist, e.g., for modeling feedback loops. Existing methods to deal with such cross-dependencies usually rely on reductions to BNs without cycles. These approaches are fragile to generalize, since their justifications are intermingled with additional knowledge about the application context. In this paper, we present a foundational study regarding semantics for cyclic BNs that are generic and conservatively extend the cycle-free setting. First, we propose constraint-based semantics that specify requirements for full joint distributions over a BN to be consistent with the local conditional probabilities and independencies. Second, two kinds of limit semantics that formalize infinite unfolding approaches are introduced and shown to be computable by a Markov chain construction.


Answer-Set Programming for Lexicographical Makespan Optimisation in Parallel Machine Scheduling

Eiter, Thomas, Geibinger, Tobias, Musliu, Nysret, Oetsch, Johannes, Skocovsky, Peter, Stepanova, Daria

arXiv.org Artificial Intelligence

We deal with a challenging scheduling problem on parallel machines with sequence-dependent setup times and release dates from a real-world application of semiconductor work-shop production. There, jobs can only be processed by dedicated machines, thus few machines can determine the makespan almost regardless of how jobs are scheduled on the remaining ones. This causes problems when machines fail and jobs need to be rescheduled. Instead of optimising only the makespan, we put the individual machine spans in non-ascending order and lexicographically minimise the resulting tuples. This achieves that all machines complete as early as possible and increases the robustness of the schedule. We study the application of Answer-Set Programming (ASP) to solve this problem. While ASP eases modelling, the combination of timing constraints and the considered objective function challenges current solving technology. The former issue is addressed by using an extension of ASP by difference logic. For the latter, we devise different algorithms that use multi-shot solving. To tackle industrial-sized instances, we study different approximations and heuristics. Our experimental results show that ASP is indeed a promising KRR paradigm for this problem and is competitive with state-of-the-art CP and MIP solvers. Under consideration in Theory and Practice of Logic Programming (TPLP).


Say As You Wish: Fine-grained Control of Image Caption Generation with Abstract Scene Graphs

Chen, Shizhe, Jin, Qin, Wang, Peng, Wu, Qi

arXiv.org Artificial Intelligence

Humans are able to describe image contents with coarse to fine details as they wish. However, most image captioning models are intention-agnostic which can not generate diverse descriptions according to different user intentions initiatively. In this work, we propose the Abstract Scene Graph (ASG) structure to represent user intention in fine-grained level and control what and how detailed the generated description should be. The ASG is a directed graph consisting of three types of \textbf{abstract nodes} (object, attribute, relationship) grounded in the image without any concrete semantic labels. Thus it is easy to obtain either manually or automatically. From the ASG, we propose a novel ASG2Caption model, which is able to recognise user intentions and semantics in the graph, and therefore generate desired captions according to the graph structure. Our model achieves better controllability conditioning on ASGs than carefully designed baselines on both VisualGenome and MSCOCO datasets. It also significantly improves the caption diversity via automatically sampling diverse ASGs as control signals.


What's happened in MOOC Posts Analysis, Knowledge Tracing and Peer Feedbacks? A Review

Ravikiran, Manikandan

arXiv.org Artificial Intelligence

Learning Management Systems (LMS) and Educational Data Mining (EDM) are two important parts of online educational environment with the former being a centralised web-based information systems where the learning content is managed and learning activities are organised (Stone and Zheng,2014) and latter focusing on using data mining techniques for the analysis of data so generated. As part of this work, we present a literature review of three major tasks of EDM (See section 2), by identifying shortcomings and existing open problems, and a Blumenfield chart (See section 3). The consolidated set of papers and resources so used are released in https://github.com/manikandan-ravikiran/cs6460-Survey. The coverage statistics and review matrix of the survey are as shown in Figure 1 & Table 1 respectively. Acronym expansions are added in the Appendix Section 4.1.