cohesion
Online Partitioned Local Depth for semi-supervised applications
Foley, John D., Lee, Justin T.
We introduce an extension of the partitioned local depth (PaLD) algorithm that is adapted to online applications such as semi-supervised prediction. The new algorithm we present, online PaLD, is well-suited to situations where it is possible to pre-compute a cohesion network from a reference dataset. After $O(n^3)$ steps to construct a queryable data structure, online PaLD can extend the cohesion network to a new data point in $O(n^2)$ time. Our approach complements previous speed-up approaches based on approximation and parallelism. As illustrations, we present applications to online anomaly detection and semi-supervised classification for health-care datasets.
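For orientation, here is a minimal sketch of the batch (offline) cohesion computation that the $O(n^3)$ precomputation cost refers to. It assumes the standard PaLD definition from the original PaLD literature, with ties split evenly. The function name `pald_cohesion` is hypothetical, and the online update and queryable data structure from the abstract are not reproduced.

```python
import numpy as np

def pald_cohesion(D):
    """Batch PaLD cohesion from a symmetric (n x n) distance matrix D.

    Assumption: standard PaLD definition; this is NOT the online variant.
    Two loops over n with O(n) vector work each gives O(n^3) overall.
    """
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    C = np.zeros((n, n))
    for x in range(n):
        for y in range(n):
            if x == y:
                continue
            d_xy = D[x, y]
            # Local focus of the pair: points within d(x, y) of x or of y.
            in_focus = (D[x] <= d_xy) | (D[y] <= d_xy)
            size = in_focus.sum()
            # Focus points closer to x than to y support x; exact ties count half.
            closer_x = in_focus & (D[x] < D[y])
            tied = in_focus & (D[x] == D[y])
            C[x] += (closer_x + 0.5 * tied) / size
    return C / (n - 1)

# Usage: three points on a line; 0 and 1 are close, 10 is far away.
pts = np.array([0.0, 1.0, 10.0])
D = np.abs(pts[:, None] - pts[None, :])
C = pald_cohesion(D)
```

Row sums of `C` are the local depths, and under this definition they add up to n/2, which is a convenient sanity check.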
StructuredDNA: A Bio-Physical Framework for Energy-Aware Transformer Routing
The rapid scaling of large computational models has led to a critical increase in energy and compute costs. Inspired by biological systems where structure and function emerge from low-energy configurations, we introduce StructuredDNA, a sparse architecture framework for modular, energy-aware Transformer routing. StructuredDNA replaces dense Mixture-of-Experts routing with a bio-physical, energy-guided routing layer based on semantic energy minimization. Inputs are dynamically grouped into semantic codons, and routing selects a single expert by minimizing a global energy functional that combines cohesion, uncertainty, and computational cost. We validate StructuredDNA on both specialized (BioASQ) and open-domain (WikiText-103) benchmarks. On BioASQ (K = 50), we achieve a 97.7% reduction in Energy Utilization Density (EUD) and a Semantic Stability Index (SSI) of 0.998. We further demonstrate a Semantic Scaling Law on WikiText-103, showing that the architecture generalizes to open domains by scaling expert granularity (K = 2048) while maintaining more than 99% energy efficiency. StructuredDNA thus establishes a robust, domain-agnostic paradigm for future sparse computational frameworks. StructuredDNA provides an explicit link between bio-physical principles and sparse expert routing in Transformer architectures, and points toward future energy-aware, modular, and scalable computational systems. We discuss limitations of this proof-of-concept study and outline directions for scaling the approach to larger models, datasets, and hardware platforms. The StructuredDNA implementation is available at https://github.com/InnoDeep-repos/StructuredDNA .
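The routing rule described above (pick the single expert with the lowest energy) can be sketched as follows. The abstract does not give the functional's form, so the linear combination, the sign convention (higher cohesion lowers energy), the weights `alpha`, `beta`, `gamma`, and the function name `route_by_energy` are all assumptions made for illustration. This is not the StructuredDNA implementation.

```python
import numpy as np

def route_by_energy(cohesion, uncertainty, cost, alpha=1.0, beta=1.0, gamma=1.0):
    """Return (index of the minimum-energy expert, per-expert energies).

    Hypothetical energy: E_k = -alpha * cohesion_k + beta * uncertainty_k + gamma * cost_k.
    """
    cohesion = np.asarray(cohesion, dtype=float)
    uncertainty = np.asarray(uncertainty, dtype=float)
    cost = np.asarray(cost, dtype=float)
    energy = -alpha * cohesion + beta * uncertainty + gamma * cost
    # Hard top-1 routing: a single expert is selected, not a soft mixture.
    return int(np.argmin(energy)), energy

# Usage: with equal uncertainty and cost, the most cohesive expert wins.
expert, energy = route_by_energy([0.9, 0.2, 0.5], [0.1, 0.1, 0.1], [0.3, 0.3, 0.3])
# Making expert 0 expensive shifts the choice to a cheaper expert.
expert_costly, _ = route_by_energy([0.9, 0.2, 0.5], [0.1, 0.1, 0.1], [2.0, 0.1, 0.1])
```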
The Unified Cognitive Consciousness Theory for Language Models: Anchoring Semantics, Thresholds of Activation, and Emergent Reasoning
Chang, Edward Y., Kaya, Zeyneb N., Chang, Ethan
We propose semantic anchoring, a unified account of how large language models turn pretrained capacity into goal-directed behavior: external structure (in-context examples, retrieval, or light tuning) binds the model's latent patterns to desired targets. Unified Contextual Control Theory (UCCT) formalizes this via anchoring strength $S = ρ_d - d_r - \log k$, where $ρ_d$ measures target cohesion in representation space, $d_r$ measures mismatch from prior knowledge, and $k$ is the anchor budget. UCCT predicts threshold-like performance flips and strictly generalizes in-context learning, reading retrieval and fine-tuning as anchoring variants. Three controlled studies provide evidence. Experiment 1 demonstrates cross-domain anchoring rebinding strong priors in text and vision. Experiment 2 varies representational familiarity via numeral bases (base-10/8/9) at fixed complexity, yielding ordered thresholds and transfer patterns tracking $ρ_d$, $d_r$, and $S$. Experiment 3 establishes a geometry-to-behavior correlate: layer-wise peak anchoring and trajectory area predict few-shot thresholds $θ_{50}$. UCCT offers testable theory and practical metrics for optimizing prompts, retrieval, and tuning.
- Europe > Latvia > Lubāna Municipality > Lubāna (0.04)
- North America > United States > Florida > Miami-Dade County > Miami (0.04)
- Asia > Thailand > Bangkok > Bangkok (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
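As a worked illustration of the anchoring strength $S = ρ_d - d_r - \log k$ from the UCCT abstract above, the sketch below evaluates the formula directly. The natural logarithm and the function name `anchoring_strength` are assumptions; the abstract does not state the log base or how $ρ_d$ and $d_r$ are estimated.

```python
import math

def anchoring_strength(rho_d, d_r, k):
    """S = rho_d - d_r - log(k), with log taken as natural log (assumption)."""
    if k <= 0:
        raise ValueError("anchor budget k must be positive")
    return rho_d - d_r - math.log(k)

# Usage: same cohesion and mismatch, different anchor budgets.
s_one = anchoring_strength(2.0, 0.5, 1)        # log(1) = 0, so S = 1.5
s_e = anchoring_strength(2.0, 0.5, math.e)     # log(e) = 1, so S = 0.5
```

Under this formula, a larger $k$ lowers $S$ when $ρ_d$ and $d_r$ are held fixed.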
Building Resilient Information Ecosystems: Large LLM-Generated Dataset of Persuasion Attacks
Kao, Hsien-Te, Panasyuk, Aleksey, Bautista, Peter, Dupree, William, Ganberg, Gabriel, Beaubien, Jeffrey M., Cassani, Laura, Volkova, Svitlana
Organizational communication is essential for public trust, but the rise of generative AI models has introduced significant challenges by generating persuasive content that can form competing narratives with official messages from government and commercial organizations at speed and scale. This has left agencies in a reactive position, often unaware of how these models construct their persuasive strategies, making it more difficult to sustain communication effectiveness. In this paper, we introduce a large LLM-generated persuasion attack dataset, which includes 134,136 attacks generated by GPT-4, Gemma 2, and Llama 3.1 on agency news. These attacks span 23 persuasive techniques from SemEval 2023 Task 3, directed toward 972 press releases from ten agencies. The generated attacks come in two mediums, press release statements and social media posts, covering both long-form and short-form communication strategies. We analyzed the moral resonance of these persuasion attacks to understand their attack vectors. GPT-4's attacks mainly focus on Care, with Authority and Loyalty also playing a role. Gemma 2 emphasizes Care and Authority, while Llama 3.1 centers on Loyalty and Care. Analyzing LLM-generated persuasive attacks across models will enable proactive defense, make it possible to build reputation armor for organizations, and propel the development of both effective and resilient communications in the information ecosystem.
- Press Release (0.55)
- Research Report (0.50)
- Government > Regional Government > North America Government > United States Government (1.00)
- Government > Military (1.00)
Context is Enough: Empirical Validation of $\textit{Sequentiality}$ on Essays
Sunny, Amal, Gupta, Advay, Sreekumar, Vishnu
Recent work has proposed using Large Language Models (LLMs) to quantify narrative flow through a measure called sequentiality, which combines topic and contextual terms. A recent critique argued that the original results were confounded by how topics were selected for the topic-based component, and noted that the metric had not been validated against ground-truth measures of flow. That work proposed using only the contextual term as a more conceptually valid and interpretable alternative. In this paper, we empirically validate that proposal. Using two essay datasets with human-annotated trait scores, ASAP++ and ELLIPSE, we show that the contextual version of sequentiality aligns more closely with human assessments of discourse-level traits such as Organization and Cohesion. While zero-shot prompted LLMs predict trait scores more accurately than the contextual measure alone, the contextual measure adds more predictive value than both the topic-only and original sequentiality formulations when combined with standard linguistic features. Notably, this combination also outperforms the zero-shot LLM predictions, highlighting the value of explicitly modeling sentence-to-sentence flow. Our findings support the use of context-based sequentiality as a validated, interpretable, and complementary feature for automated essay scoring and related NLP tasks.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > Mexico > Mexico City > Mexico City (0.04)
- Europe > Spain > Catalonia > Barcelona Province > Barcelona (0.04)
Parameter-Efficient Conditioning for Material Generalization in Graph-Based Simulators
Manoharan, Naveen Raj, Iqbal, Hassan, Kumar, Krishna
Graph network-based simulators (GNS) have demonstrated strong potential for learning particle-based physics (such as fluids, deformable solids, and granular flows) while generalizing to unseen geometries due to their inherent inductive biases. However, existing models are typically trained for a single material type and fail to generalize across distinct constitutive behaviors, limiting their applicability in real-world engineering settings. Using granular flows as a running example, we propose a parameter-efficient conditioning mechanism that makes the GNS model adaptive to material parameters. We identify that sensitivity to material properties is concentrated in the early message-passing (MP) layers, a finding we link to the local nature of constitutive models (e.g., Mohr-Coulomb) and their effects on information propagation. We empirically validate this by showing that fine-tuning only the first few (1-5) of 10 MP layers of a pretrained model achieves test performance comparable to fine-tuning the entire network. Building on this insight, we propose a parameter-efficient Feature-wise Linear Modulation (FiLM) conditioning mechanism designed to specifically target these early layers. This approach produces accurate long-term rollouts on unseen, interpolated, or moderately extrapolated values (e.g., up to 2.5 degrees for friction angle and 0.25 kPa for cohesion) when trained exclusively on as few as 12 short simulation trajectories from new materials, representing a 5-fold data reduction compared to a baseline multi-task learning method. Finally, we validate the model's utility by applying it to an inverse problem, successfully identifying unknown cohesion parameters from trajectory data. This approach enables the use of GNS in inverse design and closed-loop control tasks where material properties are treated as design variables.
- North America > United States > Texas > Travis County > Austin (0.04)
- North America > United States > Wisconsin > Dane County > Madison (0.04)
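The FiLM conditioning named in the GNS abstract above applies a feature-wise scale and shift, `gamma * h + beta`, where both are predicted from the conditioning input (here, material parameters). The sketch below assumes single linear maps for `gamma` and `beta`. The names `film_modulate`, `W_gamma`, `b_gamma`, `W_beta`, `b_beta` and the array shapes are hypothetical and are not taken from the paper.

```python
import numpy as np

def film_modulate(h, material_params, W_gamma, b_gamma, W_beta, b_beta):
    """Feature-wise linear modulation of node features.

    h:               (num_nodes, hidden) features entering an early MP layer
    material_params: (p,) conditioning vector, e.g. friction angle and cohesion
    W_*:             (p, hidden) hypothetical linear maps; b_*: (hidden,)
    """
    gamma = material_params @ W_gamma + b_gamma  # per-feature scale
    beta = material_params @ W_beta + b_beta     # per-feature shift
    # Broadcasting applies the same (gamma, beta) to every node.
    return gamma * h + beta

# Usage: one material parameter, two hidden features, one node.
h = np.array([[1.0, 1.0]])
params = np.array([2.0])
out = film_modulate(
    h, params,
    W_gamma=np.array([[1.0, 0.0]]), b_gamma=np.zeros(2),
    W_beta=np.array([[0.0, 1.0]]), b_beta=np.zeros(2),
)
```

With `W_gamma` and `W_beta` set to zero, `b_gamma` to ones and `b_beta` to zeros, the layer reduces to the identity. That is a common starting point when conditioning is added to a pretrained network, though the abstract does not say whether the authors initialize it this way.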
Leveraging Code Cohesion Analysis to Identify Source Code Supply Chain Attacks
Reuben, Maor, Mendel, Ido, Feldman, Or, Kravchik, Moshe, Guri, Mordehai, Puzis, Rami
Supply chain attacks significantly threaten software security with malicious code injections within legitimate projects. Such attacks are very rare but may have a devastating impact. Detecting spurious code injections using automated tools is further complicated as it often requires deciphering the intention of both the inserted code and its context. In this study, we propose an unsupervised approach for highlighting spurious code injections by quantifying cohesion disruptions in the source code. Using a name-prediction-based cohesion (NPC) metric, we analyze how function cohesion changes when malicious code is introduced compared to natural cohesion fluctuations. An analysis of 54,707 functions over 369 open-source C++ repositories reveals that code injection reduces cohesion and shifts naming patterns toward shorter, less descriptive names compared to genuine function updates. Considering the sporadic nature of real supply-chain attacks, we evaluate the proposed method with extreme test-set imbalance and show that monitoring high-cohesion functions with NPC can effectively detect functions with injected code, achieving a Precision@100 of 36.41% at a 1:1,000 ratio and 12.47% at 1:10,000. These results suggest that automated cohesion measurements, in general, and name-prediction-based cohesion, in particular, may help identify supply chain attacks, improving source code integrity.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States (0.04)
- Europe > Portugal > Lisbon > Lisbon (0.04)
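The Precision@100 figures reported in the supply chain abstract above use a standard ranking metric: the fraction of the k highest-ranked items that are true positives. A minimal sketch follows. It assumes that a higher score means "more suspicious", which the abstract does not specify, and the function name `precision_at_k` is hypothetical.

```python
def precision_at_k(scores, labels, k):
    """Fraction of the k top-scored items whose label is 1 (injected)."""
    if not 0 < k <= len(scores):
        raise ValueError("k must be between 1 and the number of items")
    # Rank item indices by descending score (assumed: higher = more suspicious).
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return sum(labels[i] for i in ranked[:k]) / k

# Usage: four functions, two of which contain injected code.
scores = [0.9, 0.1, 0.8, 0.4]
labels = [1, 0, 0, 1]
p_at_2 = precision_at_k(scores, labels, 2)  # top-2 are items 0 and 2
```

At a 1:1,000 imbalance, ranking at random would give an expected precision of about 0.1%, which puts the reported 36.41% in context.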
Large Language Models Preserve Semantic Isotopies in Story Continuations
In this work, we explore the relevance of textual semantics to Large Language Models (LLMs), extending previous insights into the connection between distributional semantics and structural semantics. We investigate whether LLM-generated texts preserve semantic isotopies. We design a story continuation experiment using 10,000 ROCStories prompts completed by five LLMs. We first validate GPT-4o's ability to extract isotopies from a linguistic benchmark, then apply it to the generated stories. We then analyze structural (coverage, density, spread) and semantic properties of isotopies to assess how they are affected by completion. Results show that LLM completion within a given token horizon preserves semantic isotopies across multiple properties.
- North America > Canada > Ontario > Toronto (0.14)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- Europe > Austria > Vienna (0.14)
Bio-inspired decision making in swarms under biases from stubborn robots, corrupted communication, and independent discovery
Zakir, Raina, Carletti, Timoteo, Dorigo, Marco, Reina, Andreagiovanni
Minimalistic robot swarms offer a scalable, robust, and cost-effective approach to performing complex tasks with the potential to transform applications in healthcare, disaster response, and environmental monitoring. However, coordinating such decentralised systems remains a fundamental challenge, particularly when robots are constrained in communication, computation, and memory. In our study, individual robots frequently make errors when sensing the environment, yet the swarm can rapidly and reliably reach consensus on the best among $n$ discrete options. We compare two canonical mechanisms of opinion dynamics -- direct-switch and cross-inhibition -- which are simple yet effective rules for collective information processing observed in biological systems across scales, from neural populations to insect colonies. We generalise the existing mean-field models by considering asocial biases influencing the opinion dynamics. While swarms using direct-switch reliably select the best option in the absence of asocial dynamics, their performance deteriorates once such biases are introduced, often resulting in decision deadlocks. In contrast, bio-inspired cross-inhibition enables faster, more cohesive, accurate, robust, and scalable decisions across a wide range of biased conditions. Our findings provide theoretical and practical insights into the coordination of minimal swarms that extend to a broad class of decentralised decision-making systems in biology and engineering.
- Europe > Germany (0.04)
- Europe > Belgium > Wallonia > Namur Province > Namur (0.04)
- North America > United States > New Jersey > Middlesex County > Piscataway (0.04)
Enhancing Essay Cohesion Assessment: A Novel Item Response Theory Approach
Rosa, Bruno Alexandre, Oliveira, Hilário, Rodrigues, Luiz, Oliveira, Eduardo Araujo, Mello, Rafael Ferreira
Essays are considered a valuable mechanism for evaluating learning outcomes in writing. Textual cohesion is an essential characteristic of a text, as it facilitates the establishment of meaning between its parts. Automatically scoring cohesion in essays presents a challenge in the field of educational artificial intelligence. The machine learning algorithms used to evaluate texts generally do not consider the individual characteristics of the instances that comprise the analysed corpus. In this sense, item response theory can be adapted to the context of machine learning, characterising the ability, difficulty and discrimination of the models used. This work proposes and analyses the performance of a cohesion score prediction approach based on item response theory to adjust the scores generated by machine learning models. In this study, the corpus selected for the experiments consisted of the extended Essay-BR, which includes 6,563 essays in the style of the National High School Exam (ENEM), and the Brazilian Portuguese Narrative Essays, comprising 1,235 essays written by 5th to 9th grade students from public schools. We extracted 325 linguistic features and treated the problem as a machine learning regression task. The experimental results indicate that the proposed approach outperforms conventional machine learning models and ensemble methods in several evaluation metrics. This research explores a potential approach for improving the automatic evaluation of cohesion in educational essays.
- South America > Brazil > Pernambuco > Recife (0.04)
- Oceania > Australia > Victoria > Melbourne (0.04)
- Europe > San Marino > Fiorentino > Fiorentino (0.04)
- Education > Educational Setting > K-12 Education > Secondary School (0.48)
- Education > Educational Technology > Educational Software (0.47)
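The essay cohesion abstract above characterises models by ability, difficulty and discrimination, which are the quantities of the classical two-parameter logistic (2PL) item response model. The sketch below shows that textbook model only. How the authors adapt it to adjust regression scores is not described in the abstract, so mapping "respondent" to a machine learning model and "item" to an essay is an assumption, as is the function name `two_pl_probability`.

```python
import math

def two_pl_probability(ability, difficulty, discrimination):
    """Standard 2PL IRT: probability that a respondent succeeds on an item.

    Assumed mapping for this setting: respondent = ML model, item = essay.
    """
    return 1.0 / (1.0 + math.exp(-discrimination * (ability - difficulty)))

# Usage: ability equal to difficulty gives a 50% success probability.
p_even = two_pl_probability(0.0, 0.0, 1.0)
# Higher discrimination sharpens the response once ability exceeds difficulty.
p_sharp = two_pl_probability(1.0, 0.0, 2.0)
p_flat = two_pl_probability(1.0, 0.0, 1.0)
```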