Collaborating Authors

nexus


Nexus: Higher-Order Attention Mechanisms in Transformers

Chen, Hanting, Zhu, Chong, Han, Kai, Tian, Yuchuan, Liang, Yuchen, Guo, Tianyu, Chen, Xinghao, Tao, Dacheng, Wang, Yunhe

arXiv.org Artificial Intelligence

Transformers have achieved significant success across various domains, relying on self-attention to capture dependencies. However, the standard first-order attention mechanism is limited by a low-rank bottleneck and struggles to capture intricate, multi-hop relationships within a single layer. In this paper, we propose Nexus, a novel architecture designed to enhance representational power through a recursive framework. Unlike standard approaches that use static linear projections for Queries and Keys, Nexus dynamically refines these representations via nested self-attention mechanisms. Specifically, the Query and Key vectors are themselves outputs of inner attention loops, allowing tokens to aggregate global context and model high-order correlations prior to the final attention computation. We enforce a parameter-efficient weight-sharing strategy across recursive steps, ensuring that this enhanced expressivity incurs O(1) additional parameters. We provide theoretical analysis demonstrating that our method breaks the linear bottleneck of standard attention. Empirically, Nexus outperforms standard Transformers on multiple benchmarks.
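The nested-attention idea described in the abstract can be sketched as follows. This is a minimal NumPy illustration of one plausible reading (the refinement rule, number of inner steps, and shapes are assumptions, not the authors' implementation): Queries and Keys are themselves refined by inner attention passes that reuse the same projection weights, before the final attention is computed.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention(q, k, v):
    # Scaled dot-product attention over sequences of d-dim vectors.
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores, axis=-1) @ v

def nexus_attention(x, w_q, w_k, w_v, steps=2):
    """Higher-order attention sketch: Q and K are outputs of inner
    attention loops (weight-shared, so no extra parameters per step)
    before the final attention computation."""
    q, k = x @ w_q, x @ w_k
    for _ in range(steps):       # weight sharing: same projections reused
        q = attention(q, k, q)   # inner loop lets queries aggregate context
        k = attention(k, q, k)   # ...and keys, enabling multi-hop mixing
    return attention(q, k, x @ w_v)

rng = np.random.default_rng(0)
x = rng.normal(size=(5, 8))      # 5 tokens, model dim 8
w_q, w_k, w_v = (rng.normal(size=(8, 8)) for _ in range(3))
out = nexus_attention(x, w_q, w_k, w_v)
print(out.shape)  # (5, 8)
```

Because the inner loops reuse `w_q` and `w_k`, extra recursion depth adds computation but no parameters, matching the abstract's O(1) parameter claim.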


NEXUS: Network Exploration for eXploiting Unsafe Sequences in Multi-Turn LLM Jailbreaks

Asl, Javad Rafiei, Narula, Sidhant, Ghasemigol, Mohammad, Blanco, Eduardo, Takabi, Daniel

arXiv.org Artificial Intelligence

Large Language Models (LLMs) have revolutionized natural language processing but remain vulnerable to jailbreak attacks, especially multi-turn jailbreaks that distribute malicious intent across benign exchanges and bypass alignment mechanisms. Existing approaches often explore the adversarial space poorly, rely on hand-crafted heuristics, or lack systematic query refinement. We present NEXUS (Network Exploration for eXploiting Unsafe Sequences), a modular framework for constructing, refining, and executing optimized multi-turn attacks. NEXUS comprises: (1) ThoughtNet, which hierarchically expands a harmful intent into a structured semantic network of topics, entities, and query chains; (2) a feedback-driven Simulator that iteratively refines and prunes these chains through attacker-victim-judge LLM collaboration using harmfulness and semantic-similarity benchmarks; and (3) a Network Traverser that adaptively navigates the refined query space for real-time attacks. This pipeline uncovers stealthy, high-success adversarial paths across LLMs. On several closed-source and open-source LLMs, NEXUS increases attack success rate by 2.1% to 19.4% over prior methods. Code: https://github.com/inspire-lab/NEXUS


Improving Large Language Models Function Calling and Interpretability via Guided-Structured Templates

Dang, Hy, Liu, Tianyi, Wu, Zhuofeng, Yang, Jingfeng, Jiang, Haoming, Yang, Tao, Chen, Pei, Wang, Zhengyang, Wang, Helen, Li, Huasheng, Yin, Bing, Jiang, Meng

arXiv.org Artificial Intelligence

Large language models (LLMs) have demonstrated strong reasoning and tool-use capabilities, yet they often fail in real-world tool interactions due to incorrect parameterization, poor tool selection, or misinterpretation of user intent. These issues often stem from an incomplete understanding of user goals and inadequate comprehension of tool documentation. While Chain-of-Thought (CoT) prompting has proven effective for enhancing reasoning in general contexts, our analysis reveals that free-form CoT is insufficient and sometimes counterproductive for structured function-calling tasks. To address this, we introduce a curriculum-inspired framework that leverages structured reasoning templates to guide LLMs through more deliberate step-by-step instructions for generating function calls. Experimental results show that our method reduces tool-use errors, achieving 3-12% relative improvements over strong baselines across diverse model series and approaches. Moreover, our framework enhances the robustness, interpretability, and transparency of tool-using agents, advancing the development of more reliable AI assistants for real-world applications.
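The structured-template idea can be illustrated with a toy prompt builder. The template wording, step names, and schema format below are hypothetical stand-ins for the paper's templates; the point is that the model is walked through goal restatement, tool selection, and parameter mapping before emitting a call, rather than free-form CoT.

```python
import json

# Hypothetical guided template (illustrative, not the paper's):
TEMPLATE = """\
Step 1 - Goal: restate the user's intent in one sentence.
Step 2 - Tool: pick one tool from {tools} and justify the choice.
Step 3 - Parameters: map each required parameter to a value from the request.
Step 4 - Call: emit the function call as JSON.
"""

def build_prompt(user_request, tool_schemas):
    """Assemble a guided function-calling prompt from tool schemas."""
    tools = ", ".join(t["name"] for t in tool_schemas)
    return (TEMPLATE.format(tools=tools)
            + "\nTools:\n" + json.dumps(tool_schemas, indent=2)
            + "\nRequest: " + user_request)

schemas = [{"name": "get_weather",
            "parameters": {"city": "string", "unit": "string"}}]
prompt = build_prompt("What's the weather in Oslo in celsius?", schemas)
print(prompt)
```

The fixed step order is what distinguishes this from free-form CoT: the model cannot skip parameter mapping before emitting the call.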


When Better Eyes Lead to Blindness: A Diagnostic Study of the Information Bottleneck in CNN-LSTM Image Captioning Models

Gupta, Hitesh Kumar

arXiv.org Artificial Intelligence

Image captioning, situated at the intersection of computer vision and natural language processing, requires a sophisticated understanding of both visual scenes and linguistic structure. While modern approaches are dominated by large-scale Transformer architectures, this paper documents a systematic, iterative development of foundational image captioning models, progressing from a simple CNN-LSTM encoder-decoder to a competitive attention-based system. This paper presents a series of five models, beginning with Genesis and concluding with Nexus, an advanced model featuring an EfficientNetV2B3 backbone and a dynamic attention mechanism. The experiments chart the impact of architectural enhancements and demonstrate a key finding within the classic CNN-LSTM paradigm: merely upgrading the visual backbone without a corresponding attention mechanism can degrade performance, as the single-vector bottleneck cannot transmit the richer visual detail. This insight validates the architectural shift to attention. Trained on the MS COCO 2017 dataset, the final model, Nexus, achieves a BLEU-4 score of 31.4, surpassing several foundational benchmarks and validating the iterative design process. This work provides a clear, replicable blueprint for understanding the core architectural principles that underpin modern vision-language tasks.


Nexus: Proactive Intra-GPU Disaggregation of Prefill and Decode in LLM Serving

Shi, Xiaoxiang, Cai, Colin, Du, Junjia, Jia, Zhihao

arXiv.org Artificial Intelligence

Monolithic serving with chunked prefill improves GPU utilization by batching prefill and decode together, but suffers from fine-grained phase interference. Engine-level prefill-decode (PD) disaggregation avoids interference but incurs higher hardware and coordination overhead. Prior intra-GPU disaggregation approaches multiplex prefill and decode within a single GPU, using SLO-based tuning guided by heuristics from offline profiling or reactive feedback loops. However, these methods respond reactively to performance issues rather than anticipating them, limiting adaptability under dynamic workloads. We ask: can we achieve proactive intra-GPU disaggregation that adapts effectively to dynamic workloads? The key challenge lies in managing the conflicting resource demands of prefill and decode under varying conditions. We first show that GPU resources exhibit diminishing returns -- beyond a saturation point, more allocation yields minimal latency benefit. Second, we observe that memory bandwidth contention becomes a critical bottleneck. These insights motivate a design that dynamically partitions GPU resources across prefill and decode phases, while jointly considering compute capacity, memory footprint, and bandwidth contention. Evaluated on diverse LLMs and workloads, our system Nexus achieves up to 2.2x higher throughput, 20x lower TTFT, and 2.5x lower TBT than vLLM; outperforms SGLang by up to 2x; and matches or exceeds disaggregated vLLM.
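The diminishing-returns observation above can be sketched with a toy partitioner. The latency model, saturation points, and demand values below are made-up illustrations (not Nexus's profiler or scheduler): allocation helps each phase only up to its saturation point, so the partitioner sweeps splits and picks the one minimizing the worse phase.

```python
def latency(alloc, demand, saturation):
    """Toy latency model: improves with allocation up to a saturation
    point, then flattens (the diminishing-returns observation)."""
    effective = min(alloc, saturation)
    return demand / max(effective, 1e-9)

def partition(prefill_demand, decode_demand,
              prefill_sat=0.6, decode_sat=0.5, step=0.05):
    """Sweep GPU-share splits and pick the one minimizing the worse
    phase latency. A stand-in for a proactive partitioner; the real
    system also models memory footprint and bandwidth contention."""
    best = None
    a = step
    while a < 1.0:
        worst = max(latency(a, prefill_demand, prefill_sat),
                    latency(1.0 - a, decode_demand, decode_sat))
        if best is None or worst < best[1]:
            best = (a, worst)
        a = round(a + step, 10)
    return best

split, worst = partition(prefill_demand=3.0, decode_demand=1.0)
print(f"prefill share = {split:.2f}")  # prefill share = 0.60
```

Note how the chosen split lands exactly at the prefill saturation point: allocating beyond it would cost decode latency while buying prefill nothing, which is the core argument for saturation-aware partitioning.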


The supercomputer set to supercharge America's AI future

FOX News

A major breakthrough in artificial intelligence and high-performance computing is on the way, and it's coming from Georgia Tech. Backed by a $20 million investment from the National Science Foundation (NSF), the university is building a supercomputer named Nexus. It's expected to go online in spring 2026.


Yuval Noah Harari: 'How Do We Share the Planet With This New Superintelligence?'

WIRED

Israeli historian and philosopher Yuval Noah Harari's book Sapiens became an international bestseller by presenting a view of history driven by the fictions created by mankind. His later work Homo Deus then depicted a future for mankind brought about by the emergence of superintelligence. His latest book, Nexus: A Brief History of Information Networks From the Stone Age to AI, is a warning against the unparalleled threat of AI. A rising trend of techno-fascism driven by populism and artificial intelligence has been visible since the US presidential election in November. Nexus, which was published just a few months earlier, is a timely explainer of the potential consequences of AI on democracy and totalitarianism.


Nexus: A Lightweight and Scalable Multi-Agent Framework for Complex Tasks Automation

Sami, Humza, Islam, Mubashir ul, Charas, Samy, Gandhi, Asav, Gaillardon, Pierre-Emmanuel, Tenace, Valerio

arXiv.org Artificial Intelligence

Recent advancements in Large Language Models (LLMs) have substantially evolved Multi-Agent Systems (MASs) capabilities, enabling systems that not only automate tasks but also leverage near-human reasoning capabilities. To achieve this, LLM-based MASs need to be built around two critical principles: (i) a robust architecture that fully exploits LLM potential for specific tasks -- or related task sets -- and (ii) an effective methodology for equipping LLMs with the necessary capabilities to perform tasks and manage information efficiently. It goes without saying that a priori architectural designs can limit the scalability and domain adaptability of a given MAS. To address these challenges, in this paper we introduce Nexus: a lightweight Python framework designed to easily build and manage LLM-based MASs. Nexus introduces the following innovations: (i) a flexible multi-supervisor hierarchy, (ii) a simplified workflow design, and (iii) easy installation and open-source flexibility: Nexus can be installed via pip and is distributed under a permissive open-source license, allowing users to freely modify and extend its capabilities. Experimental results demonstrate that architectures built with Nexus exhibit state-of-the-art performance across diverse domains. In coding tasks, Nexus-driven MASs achieve a 99% pass rate on HumanEval and a flawless 100% on VerilogEval-Human, outperforming cutting-edge reasoning language models such as o3-mini and DeepSeek-R1. Moreover, these architectures display robust proficiency in complex reasoning and mathematical problem solving, achieving correct solutions for all randomly selected problems from the MATH dataset. In the realm of multi-objective optimization, Nexus-based architectures successfully address challenging timing closure tasks on designs from the VTR benchmark suite, while guaranteeing, on average, a power saving of nearly 30%.
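The multi-supervisor hierarchy mentioned in the abstract can be sketched in a few lines of plain Python. The class and method names below are illustrative, not Nexus's actual API: a supervisor routes a task to one of its children, and because children may themselves be supervisors, routing composes into a hierarchy.

```python
# Minimal sketch of a multi-supervisor hierarchy in the spirit of the
# abstract; class and routing logic are illustrative, not Nexus's API.
class Agent:
    """A leaf worker: applies a handler function to a task."""
    def __init__(self, name, handler):
        self.name, self.handler = name, handler

    def run(self, task):
        return self.handler(task)

class Supervisor(Agent):
    """Routes a task to one child; children may be workers or further
    supervisors, which is what makes the hierarchy composable."""
    def __init__(self, name, children, router):
        self.name, self.children, self.router = name, children, router

    def run(self, task):
        child = self.children[self.router(task)]
        return child.run(task)

coder = Agent("coder", lambda t: f"code for: {t}")
solver = Agent("math", lambda t: f"solution of: {t}")
root = Supervisor("root", {"code": coder, "math": solver},
                  router=lambda t: "code" if "function" in t else "math")
result = root.run("write a function that sorts a list")
print(result)  # code for: write a function that sorts a list
```

In a real LLM-based MAS the handler and router would be LLM calls; the structural point is only that supervisors nest, so workflows are assembled by composition rather than fixed a priori.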


The S2 Hierarchical Discrete Global Grid as a Nexus for Data Representation, Integration, and Querying Across Geospatial Knowledge Graphs

Stephen, Shirly, Faulk, Mitchell, Janowicz, Krzysztof, Fisher, Colby, Thelen, Thomas, Zhu, Rui, Hitzler, Pascal, Shimizu, Cogan, Currier, Kitty, Schildhauer, Mark, Rehberger, Dean, Wang, Zhangyu, Christou, Antrea

arXiv.org Artificial Intelligence

Geospatial Knowledge Graphs (GeoKGs) have become integral to the growing field of Geospatial Artificial Intelligence. Initiatives like the U.S. National Science Foundation's Open Knowledge Network program aim to create an ecosystem of nation-scale, cross-disciplinary GeoKGs that provide AI-ready geospatial data aligned with FAIR principles. However, building this infrastructure presents key challenges, including 1) managing large volumes of data, 2) the computational complexity of discovering topological relations via SPARQL, and 3) conflating multi-scale raster and vector data. Discrete Global Grid Systems (DGGS) help tackle these issues by offering efficient data integration and representation strategies. The KnowWhereGraph utilizes Google's S2 Geometry -- a DGGS framework -- to enable efficient multi-source data processing, qualitative spatial querying, and cross-graph integration. This paper outlines the implementation of S2 within KnowWhereGraph, emphasizing its role in topologically enriching and semantically compressing data. Ultimately, this work demonstrates the potential of DGGS frameworks, particularly S2, for building scalable GeoKGs.


Nexus by Yuval Noah Harari review – the AI apocalypse

The Guardian

As befits a writer whose breakout work, Sapiens, was a history of the entire human race, Yuval Noah Harari is a master of the sententious generalisation. "Human life," he writes here, "is a balancing act between endeavouring to improve ourselves and accepting who we were." Elsewhere, one might be surprised to read: "The ancient Romans had a clear understanding of what democracy means." No doubt the Romans would have been happy to hear that they would, 2,000 years in the future, be given a gold star for their comprehension of eternally stable political concepts by Yuval Noah Harari. In his 2018 book, 21 Lessons for the 21st Century, Harari wrote: "Liberals don't understand how history deviated from its preordained course, and they lack an alternative prism through which to interpret reality. Disorientation causes them to think in apocalyptic terms."