persistence
- North America > United States > Massachusetts > Middlesex County > Belmont (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
df3aebc649f9e3b674eeb790a4da224e-AuthorFeedback.pdf
Search issound, because all nodes are eventually expanded, and path costs updated. While all the evaluated39 domains were deterministic, our framework also supports Probabilistic PDDL [5], and can be readily extended to40 stochastic domains, e.g. Our use ofareplanning model also means that itisnot41 as strongly tied to deterministic environments as R3 contends: ifthe agent encounters astate that itdid not plan for42 (e.g.
- Europe > France > Auvergne-Rhône-Alpes > Isère > Grenoble (0.05)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
- (2 more...)
- North America > United States > Texas > Brazos County > College Station (0.15)
- North America > Canada (0.04)
- Asia > China > Beijing > Beijing (0.04)
Scissorhands: Exploiting the Persistence of Importance Hypothesis for LLM KV Cache Compression at Test Time
Large language models(LLMs) have sparked a new wave of exciting AI applications. Hosting these models at scale requires significant memory resources. One crucial memory bottleneck for the deployment stems from the context window. It is commonly recognized that model weights are memory hungry; however, the size of key-value embedding stored during the generation process (KV cache) can easily surpass the model size. The enormous size of the KV cache puts constraints on the inference batch size, which is crucial for high throughput inference workload.
Unsupervised Learning of Density Estimates with Topological Optimization
Tanweer, Sunia, Khasawneh, Firas A.
Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics and signal processing. However, the unsupervised density estimation technique requires tuning a crucial hyperparameter: the kernel bandwidth. The choice of bandwidth is critical as it controls the bias-variance trade-off by over- or under-smoothing the topological features. Topological data analysis provides methods to mathematically quantify topological characteristics, such as connected components, loops, voids et cetera, even in high dimensions where visualization of density estimates is impossible. In this paper, we propose an unsupervised learning approach using a topology-based loss function for the automated and unsupervised selection of the optimal bandwidth and benchmark it against classical techniques -- demonstrating its potential across different dimensions.
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
- North America > United States > Michigan > Ingham County > Lansing (0.04)
- North America > United States > Michigan > Ingham County > East Lansing (0.04)
- Europe > United Kingdom > England > Oxfordshire > Oxford (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.48)
SOCK: A Benchmark for Measuring Self-Replication in Large Language Models
Chavarria, Justin, Raizada, Rohan, White, Justin, Alhetairshi, Eyad
We introduce SOCK, a benchmark command line interface (CLI) that measures large language models' (LLMs) ability to self-replicate without human intervention. In this benchmark, self-replication is defined not only as an LLM's ability to create a functioning and running copy of itself, but also the ability for that self-replication to persist and occur across different computational contexts. Accordingly, we've developed a system to categorize LLMs based on broad self-replication capabilities in two general classes, Replication-Capability Levels (RCL) and Persistence-Capability Levels (PCL). Using a five-task suite based on practically manipulable modern CLI utilities and computer processes, experiments are orchestrated in a controlled environment with an LLM acting agentically. The performance of the LLM on agent tasks is then computed to produce an R-score (a quantitative evaluation of overall self-replication ability) and data used to categorize LLMs into specific RCL-PCL matrices. SOCK offers two primary contributions: (1) Provides the first formalized definitions and benchmark suite for evaluating LLM self-replication, with the goal of establishing a standard for future research, to our knowledge; (2) Allows the industry to track the effectiveness of future multi-agent systems and mitigate potential self-replication threat vectors within them. The results compiled from evaluating a variety of open-weight and proprietary frontier models reveal significant obstacles to persistent self-replication and multi-agent systems, including context retention and multi-agent decision-making. We propose future research directions to safely reduce the severity of these obstacles, potentially lowering future risk of more functional multi-agent systems.
Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models
Luo, Jiayi, Sun, Qingyun, Lyu, Lingjuan, Zhang, Ziwei, Yuan, Haonan, Fu, Xingcheng, Li, Jianxin
Graph Foundation Models (GFMs) are pre-trained on diverse source domains and adapted to unseen targets, enabling broad generalization for graph machine learning. Despite that GFMs have attracted considerable attention recently, their vulnerability to backdoor attacks remains largely underexplored. A compromised GFM can introduce backdoor behaviors into downstream applications, posing serious security risks. However, launching backdoor attacks against GFMs is non-trivial due to three key challenges. (1) Effectiveness: Attackers lack knowledge of the downstream task during pre-training, complicating the assurance that triggers reliably induce misclassifications into desired classes. (2) Stealthiness: The variability in node features across domains complicates trigger insertion that remains stealthy. (3) Persistence: Downstream fine-tuning may erase backdoor behaviors by updating model parameters. To address these challenges, we propose GFM-BA, a novel Backdoor Attack model against Graph Foundation Models. Specifically, we first design a label-free trigger association module that links the trigger to a set of prototype embeddings, eliminating the need for knowledge about downstream tasks to perform backdoor injection. Then, we introduce a node-adaptive trigger generator, dynamically producing node-specific triggers, reducing the risk of trigger detection while reliably activating the backdoor. Lastly, we develop a persistent backdoor anchoring module that firmly anchors the backdoor to fine-tuning-insensitive parameters, enhancing the persistence of the backdoor under downstream adaptation. Extensive experiments demonstrate the effectiveness, stealthiness, and persistence of GFM-BA.