griffin


People Are Protesting Data Centers--but Embracing the Factories That Supply Them

WIRED

As the data center backlash grows, support is growing for server factories and the hundreds of jobs they're expected to bring. Last month, Pamela Griffin and two other residents of Taylor, Texas, took to the lectern at a city council meeting to object to a data center project. But later, they sat back as council members discussed a proposed tech factory. Griffin didn't speak up against that development. A similar contrast is repeating in communities across the US.


Simulating Extinct Species

Communications of the ACM

How did extinct animals move? Paleontologists are interested in figuring this out since it can tell us more about their ways of life, such as whether they were agile enough to hunt prey. It can also provide clues about how locomotion evolved; for example, when our ancestors started to walk upright. Researchers have come up with hypotheses about the movement of long-gone species by examining evidence such as fossilized bones or well-preserved footprints. Extinct animals can also be compared to similar living ones: comparing their limb length, for example, can give an idea of their speed of movement.


Scalable LLM Math Reasoning Acceleration with Low-rank Distillation

Dong, Harry, Acun, Bilge, Chen, Beidi, Chi, Yuejie

arXiv.org Artificial Intelligence

While many existing efficient inference methods have been developed with excellent performance preservation on language tasks, they often severely degrade math performance. In this paper, we propose Caprese, a resource-efficient distillation method to recover capabilities lost when deploying efficient inference methods, focused primarily on feedforward blocks. With the original weights unperturbed, roughly 1% of additional parameters, and only 20K synthetic training samples, Caprese recovers much, if not all, of the math capability lost to efficient inference for thinking LLMs, without harming language tasks for instruct LLMs. Moreover, Caprese slashes the number of active parameters (a ~2B cut for Gemma 2 9B and Llama 3.1 8B) and integrates cleanly into existing model layers to reduce latency (>16% time-to-next-token reduction) while encouraging response brevity (up to 8.5% fewer tokens).
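The general recipe described above, a small trainable correction added beside a frozen feedforward block, can be sketched in a few lines. The dimensions, zero-initialization, and ReLU activation below are illustrative assumptions, not the paper's exact configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, d_ff, rank = 512, 2048, 16  # rank << d_ff keeps the added params near 1%

# Frozen original FFN weights (left unperturbed, as in the paper's setup).
W_in = rng.standard_normal((d_model, d_ff)) * 0.02
W_out = rng.standard_normal((d_ff, d_model)) * 0.02

# Trainable low-rank correction: only A and B would be updated by distillation.
A = rng.standard_normal((d_model, rank)) * 0.02
B = np.zeros((rank, d_model))  # zero-init: the correction starts as a no-op

def ffn_with_lowrank(x):
    """Frozen FFN output plus a low-rank learned correction."""
    h = np.maximum(x @ W_in, 0.0)     # ReLU FFN (stand-in for the real block)
    return h @ W_out + (x @ A) @ B    # correction adds only 2*d_model*rank params

x = rng.standard_normal((4, d_model))
y = ffn_with_lowrank(x)
extra, base = A.size + B.size, W_in.size + W_out.size
print(y.shape, f"added params: {100 * extra / base:.2f}% of FFN")
```

With these toy sizes the correction adds under 1% of the FFN's parameter count, mirroring the overhead the abstract reports.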


Follow the Path: Reasoning over Knowledge Graph Paths to Improve LLM Factuality

Zhang, Mike, Bjerva, Johannes, Biswas, Russa

arXiv.org Artificial Intelligence

We introduce fs1, a simple yet effective method that improves the factuality of reasoning traces by sourcing them from large reasoning models (e.g., DeepSeek-R1) and grounding them by conditioning on knowledge graph (KG) paths. We fine-tune eight instruction-tuned Large Language Models (LLMs) on 3.9K factually grounded reasoning traces and rigorously evaluate them on six complex open-domain question-answering (QA) benchmarks encompassing 23.9K questions. Our results demonstrate that our fs1-tuned model (32B parameters) consistently outperforms instruction-tuned counterparts with parallel sampling by 6-14 absolute points (pass@16). Our detailed analysis shows that fs1 considerably improves model performance over more complex questions (requiring 3 or more hops on KG paths) and numerical answer types compared to the baselines. Furthermore, in single-pass inference, we notice that smaller LLMs show the most improvements. While prior works demonstrate the effectiveness of reasoning traces primarily in the STEM domains, our work shows strong evidence that anchoring reasoning to factual KG paths is a critical step in transforming LLMs for reliable knowledge-intensive tasks.
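The pass@16 metric quoted above is conventionally computed with the standard unbiased estimator over n samples per question, c of which are correct; the helper below is a generic sketch of that estimator, not the authors' evaluation code:

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased estimate of P(at least one of k drawn samples is correct),
    given n total samples per question, of which c were correct."""
    if n - c < k:
        return 1.0  # too few incorrect samples to fill k draws: guaranteed pass
    return 1.0 - comb(n - c, k) / comb(n, k)

# With 16 samples per question and 4 correct:
print(pass_at_k(16, 4, 1))   # per-sample accuracy: 0.25
print(pass_at_k(16, 4, 16))  # any correct sample counts: 1.0
```

When k equals n, pass@k reduces to "at least one of the n samples was correct", which is the setting parallel sampling with pass@16 measures.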


GLASS: Test-Time Acceleration for LLMs via Global-Local Neural Importance Aggregation

Sattarifard, Amirmohsen, Lavasani, Sepehr, Imani, Ehsan, Zhang, Kunlin, Xu, Hanlin, Sun, Fengyu, Hassanpour, Negar, Gao, Chao

arXiv.org Artificial Intelligence

Deploying Large Language Models (LLMs) on edge hardware demands aggressive, prompt-aware dynamic pruning to reduce computation without degrading quality. Static or predictor-based schemes either lock in a single sparsity pattern or incur extra runtime overhead, and recent zero-shot methods that rely on statistics from a single prompt fail in short-prompt and/or long-generation scenarios. We introduce A/I-GLASS: Activation- and Impact-based Global-Local neural importance Aggregation for feed-forward network SparSification, two training-free methods that dynamically select FFN units using a rank aggregation of prompt-local and model-intrinsic global neuron statistics. Empirical results across multiple LLMs and benchmarks demonstrate that GLASS significantly outperforms prior training-free methods, particularly in challenging long-form generation scenarios, without relying on auxiliary predictors or adding any inference overhead.
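The core idea, ranking FFN units once by a prompt-local statistic and once by a model-intrinsic global statistic, then aggregating the two rankings to pick the active units, can be sketched as follows. The random scores stand in for the paper's actual activation and impact statistics:

```python
import numpy as np

rng = np.random.default_rng(1)
n_units, keep = 8, 4  # toy FFN with 8 units, keep the top 4

local_score = rng.random(n_units)   # e.g. activation magnitude on this prompt
global_score = rng.random(n_units)  # e.g. model-intrinsic importance (weight norms)

def ranks(scores):
    """Rank 0 = most important (highest score)."""
    order = np.argsort(-scores)
    r = np.empty_like(order)
    r[order] = np.arange(len(scores))
    return r

# Aggregate by summing ranks; a smaller combined rank keeps the unit active.
combined = ranks(local_score) + ranks(global_score)
kept = np.sort(np.argsort(combined)[:keep])
print("active FFN units:", kept)
```

Because the aggregation uses ranks rather than raw scores, a unit must matter under both views to survive pruning, which is what lets a global statistic rescue the short-prompt case where local statistics alone are unreliable.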


Griffin: Towards a Graph-Centric Relational Database Foundation Model

Wang, Yanbo, Wang, Xiyuan, Gan, Quan, Wang, Minjie, Yang, Qibin, Wipf, David, Zhang, Muhan

arXiv.org Artificial Intelligence

We introduce Griffin, the first attempt at a foundation model designed specifically for Relational Databases (RDBs). Unlike previous smaller models focused on single RDB tasks, Griffin unifies the data encoder and task decoder to handle diverse tasks. It pretrains on both single-table and RDB datasets, employing advanced encoders for categorical, numerical, and metadata features, along with a cross-attention module, a novel aggregator, and enhanced message-passing neural networks (MPNNs) to capture the complexities of relational data. Evaluated on large-scale, heterogeneous, and temporal graphs extracted from RDBs across various domains (spanning over 150 million nodes), Griffin demonstrates superior or comparable performance to individually trained models, excels in low-data scenarios, and shows strong transferability with similarity and diversity in pretraining across new datasets and tasks, highlighting its potential as a universally applicable foundation model for RDBs. Code available at https://github.com/yanxwb/Griffin.
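A single message-passing step of the kind an MPNN performs over a graph extracted from an RDB can be sketched as below; the toy adjacency (think rows of two tables linked by foreign keys), feature sizes, and mean aggregation are illustrative assumptions, not Griffin's architecture:

```python
import numpy as np

rng = np.random.default_rng(2)
n_nodes, d = 5, 8

# Hypothetical adjacency over rows linked by foreign-key relationships.
A = np.array([[0, 1, 1, 0, 0],
              [1, 0, 0, 1, 0],
              [1, 0, 0, 1, 1],
              [0, 1, 1, 0, 0],
              [0, 0, 1, 0, 0]], dtype=float)

H = rng.standard_normal((n_nodes, d))        # node (row) feature embeddings
W_self = rng.standard_normal((d, d)) * 0.1   # transform of a node's own state
W_msg = rng.standard_normal((d, d)) * 0.1    # transform of aggregated messages

# One step: each node combines its own state with the mean of its
# neighbors' states, then applies a nonlinearity.
deg = A.sum(axis=1, keepdims=True)
H_next = np.tanh(H @ W_self + (A @ H / deg) @ W_msg)
print(H_next.shape)
```

Stacking such steps lets information from one table's rows reach rows of related tables, which is how an MPNN exposes relational structure to the downstream task decoder.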


Scott Bessent kicks off Milken bash by doubling down on Trump agenda

Los Angeles Times

Treasury Secretary Scott Bessent kicked off Michael Milken's annual financial bash in Beverly Hills by doubling down on President Trump's economic policy of trade reform, tax cuts and deregulation -- promising the "America First" agenda would be "the blueprint for a more abundant world." The former hedge fund manager, in a brief speech Monday that opened the Milken Institute Global Conference, said that all three elements of the policy must be taken together in order to be understood. "They are interlocking parts of an engine designed to drive long-term investment in the American economy," he said, in remarks at the Beverly Hilton. "Tariffs are engineered to encourage companies like yours to invest directly in the United States. Hire your workers here, build your factories here, make your products here. You'll be glad you did, not only because we have the most productive work force in the world, but because we will soon have the most favorable tax and regulatory environment as well," he said.


Griffin: Aerial-Ground Cooperative Detection and Tracking Dataset and Benchmark

Wang, Jiahao, Cao, Xiangyu, Zhong, Jiaru, Zhang, Yuner, Yu, Haibao, He, Lei, Xu, Shaobing

arXiv.org Artificial Intelligence

Despite significant advancements, autonomous driving systems continue to struggle with occluded objects and long-range detection due to the inherent limitations of single-perspective sensing. Aerial-ground cooperation offers a promising solution by integrating UAVs' aerial views with ground vehicles' local observations. However, progress in this emerging field has been hindered by the absence of public datasets and standardized evaluation benchmarks. To address this gap, this paper presents a comprehensive solution for aerial-ground cooperative 3D perception through three key contributions: (1) Griffin, a large-scale multi-modal dataset featuring over 200 dynamic scenes (30k+ frames) with varied UAV altitudes (20-60m), diverse weather conditions, and occlusion-aware 3D annotations, enhanced by CARLA-AirSim co-simulation for realistic UAV dynamics; (2) A unified benchmarking framework for aerial-ground cooperative detection and tracking tasks, including protocols for evaluating communication efficiency, latency tolerance, and altitude adaptability; (3) AGILE, an instance-level intermediate fusion baseline that dynamically aligns cross-view features through query-based interaction, achieving an advantageous balance between communication overhead and perception accuracy. Extensive experiments demonstrate the effectiveness of aerial-ground cooperative perception and point to directions for further research. The dataset and codes are available at https://github.com/wang-jh18-SVM/Griffin.


GRIFFIN: Effective Token Alignment for Faster Speculative Decoding

Hu, Shijing, Li, Jingyang, Xie, Xingyu, Lu, Zhihui, Toh, Kim-Chuan, Zhou, Pan

arXiv.org Artificial Intelligence

Speculative decoding accelerates inference in large language models (LLMs) by generating multiple draft tokens simultaneously. However, existing methods often struggle with token misalignment between the training and decoding phases, limiting their performance. To address this, we propose GRIFFIN, a novel framework that incorporates a token-alignable training strategy and a token-alignable draft model to mitigate misalignment. The training strategy employs a loss-masking mechanism to exclude highly misaligned tokens during training, preventing them from negatively impacting the draft model's optimization. The token-alignable draft model introduces input tokens to correct inconsistencies in generated features. Experiments on LLaMA-series and Vicuna models demonstrate that GRIFFIN achieves an average acceptance length improvement of over 7% and a speedup ratio exceeding 8%, outperforming current state-of-the-art methods, as shown in Fig. 1 (a) and (b).
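The loss-masking idea, excluding highly misaligned tokens from the draft model's cross-entropy loss, can be sketched as follows. How misalignment is detected is the paper's contribution, so the boolean flag here is a hypothetical stand-in:

```python
import numpy as np

rng = np.random.default_rng(3)
seq_len, vocab = 6, 10

logits = rng.standard_normal((seq_len, vocab))   # draft model outputs
targets = rng.integers(0, vocab, size=seq_len)   # target-model tokens

# Per-token cross-entropy via log-softmax.
logp = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
token_loss = -logp[np.arange(seq_len), targets]

# Hypothetical flag marking tokens judged highly misaligned between the
# draft and target models; these are excluded from the training objective.
misaligned = np.array([False, False, True, False, True, False])
mask = ~misaligned
masked_loss = (token_loss * mask).sum() / mask.sum()
print(float(masked_loss))
```

Averaging only over the unmasked positions keeps the gradient scale comparable across batches with different numbers of excluded tokens.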


LongRAG: A Dual-Perspective Retrieval-Augmented Generation Paradigm for Long-Context Question Answering

Zhao, Qingfei, Wang, Ruobing, Cen, Yukuo, Zha, Daren, Tan, Shicheng, Dong, Yuxiao, Tang, Jie

arXiv.org Artificial Intelligence

Long-Context Question Answering (LCQA), a challenging task, aims to reason over long-context documents to yield accurate answers to questions. Existing long-context Large Language Models (LLMs) for LCQA often struggle with the "lost in the middle" issue. Retrieval-Augmented Generation (RAG) mitigates this issue by providing external factual evidence. However, its chunking strategy disrupts the global long-context information, and its low-quality retrieval in long contexts hinders LLMs from identifying effective factual details due to substantial noise. To this end, we propose LongRAG, a general, dual-perspective, and robust LLM-based RAG system paradigm for LCQA to enhance RAG's understanding of complex long-context knowledge (i.e., global information and factual details). We design LongRAG as a plug-and-play paradigm, facilitating adaptation to various domains and LLMs. Extensive experiments on three multi-hop datasets demonstrate that LongRAG significantly outperforms long-context LLMs (up by 6.94%), advanced RAG (up by 6.16%), and Vanilla RAG (up by 17.25%). Furthermore, we conduct quantitative ablation studies and multi-dimensional analyses, highlighting the effectiveness of the system's components and fine-tuning strategies. Data and code are available at https://github.com/QingFei1/LongRAG.