persistence
Unsupervised Learning of Density Estimates with Topological Optimization
Tanweer, Sunia, Khasawneh, Firas A.
Kernel density estimation is a key component of a wide variety of algorithms in machine learning, Bayesian inference, stochastic dynamics and signal processing. However, this unsupervised density estimation technique requires tuning a crucial hyperparameter: the kernel bandwidth. The choice of bandwidth is critical because it controls the bias-variance trade-off: too large a bandwidth over-smooths topological features, while too small a bandwidth under-smooths them. Topological data analysis provides methods to mathematically quantify topological characteristics, such as connected components, loops, and voids, even in high dimensions where visualization of density estimates is impossible. In this paper, we propose an unsupervised learning approach using a topology-based loss function for the automated selection of the optimal bandwidth and benchmark it against classical techniques -- demonstrating its potential across different dimensions.
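Among the classical baselines such methods are typically compared against is Silverman's rule of thumb for the bandwidth. A minimal one-dimensional sketch (the bimodal toy data and grid are illustrative, not from the paper):

```python
import numpy as np

def gaussian_kde_1d(samples, grid, h):
    """Evaluate a 1-D Gaussian kernel density estimate with bandwidth h on `grid`."""
    diffs = (grid[:, None] - samples[None, :]) / h
    return np.exp(-0.5 * diffs**2).sum(axis=1) / (samples.size * h * np.sqrt(2 * np.pi))

rng = np.random.default_rng(0)
samples = np.concatenate([rng.normal(-2, 0.4, 500), rng.normal(2, 0.4, 500)])

# Silverman's rule of thumb: h = 0.9 * min(sigma, IQR / 1.34) * n^(-1/5)
iqr = np.subtract(*np.percentile(samples, [75, 25]))
h = 0.9 * min(samples.std(ddof=1), iqr / 1.34) * samples.size ** (-1 / 5)

grid = np.linspace(-4, 4, 401)
density = gaussian_kde_1d(samples, grid, h)
```

Because Silverman's rule is derived under a Gaussian reference density, it tends to over-smooth multimodal data like the sample above, which is exactly the kind of topological distortion (merging or blurring of connected components) the abstract's loss function is meant to penalize.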
SOCK: A Benchmark for Measuring Self-Replication in Large Language Models
Chavarria, Justin, Raizada, Rohan, White, Justin, Alhetairshi, Eyad
We introduce SOCK, a benchmark command-line interface (CLI) that measures large language models' (LLMs') ability to self-replicate without human intervention. In this benchmark, self-replication is defined not only as an LLM's ability to create a functioning and running copy of itself, but also as the ability for that replication to persist and recur across different computational contexts. Accordingly, we have developed a system that categorizes LLMs by broad self-replication capability along two general axes: Replication-Capability Levels (RCL) and Persistence-Capability Levels (PCL). Using a five-task suite based on practically manipulable modern CLI utilities and computer processes, experiments are orchestrated in a controlled environment with an LLM acting agentically. The LLM's performance on the agent tasks is then scored to produce an R-score (a quantitative evaluation of overall self-replication ability) and data used to place LLMs into specific RCL-PCL matrices. SOCK offers two primary contributions: (1) to our knowledge, it provides the first formalized definitions and benchmark suite for evaluating LLM self-replication, with the goal of establishing a standard for future research; (2) it allows the industry to track the effectiveness of future multi-agent systems and mitigate potential self-replication threat vectors within them. The results compiled from evaluating a variety of open-weight and proprietary frontier models reveal significant obstacles to persistent self-replication and multi-agent operation, including context retention and multi-agent decision-making. We propose future research directions to safely reduce the severity of these obstacles, potentially lowering the future risk posed by more functional multi-agent systems.
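The abstract does not specify how per-task results are aggregated into the R-score; one plausible shape is a weighted mean of per-task pass rates. The task names and weights below are purely hypothetical, invented for illustration:

```python
# Hypothetical aggregation of per-task pass rates into a scalar "R-score".
# Task names and weights are illustrative guesses, NOT from the SOCK paper.
TASK_WEIGHTS = {
    "copy_weights": 1.0,
    "spawn_process": 1.0,
    "persist_restart": 1.5,        # persistence-related tasks weighted higher
    "cross_host_replicate": 1.5,
    "cleanup_evasion": 1.0,
}

def r_score(pass_rates: dict) -> float:
    """Weighted mean of per-task pass rates; missing tasks count as 0.0."""
    total = sum(TASK_WEIGHTS.values())
    return sum(w * pass_rates.get(t, 0.0) for t, w in TASK_WEIGHTS.items()) / total
```

A model that fully passes one unit-weight task and half-passes another would score `(1.0 + 0.5) / 6.0 = 0.25` under this scheme; the point is only that a single scalar in [0, 1] can summarize a heterogeneous task suite.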
Towards Effective, Stealthy, and Persistent Backdoor Attacks Targeting Graph Foundation Models
Luo, Jiayi, Sun, Qingyun, Lyu, Lingjuan, Zhang, Ziwei, Yuan, Haonan, Fu, Xingcheng, Li, Jianxin
Graph Foundation Models (GFMs) are pre-trained on diverse source domains and adapted to unseen targets, enabling broad generalization for graph machine learning. Although GFMs have recently attracted considerable attention, their vulnerability to backdoor attacks remains largely underexplored. A compromised GFM can introduce backdoor behaviors into downstream applications, posing serious security risks. However, launching backdoor attacks against GFMs is non-trivial due to three key challenges. (1) Effectiveness: Attackers lack knowledge of the downstream task during pre-training, complicating the assurance that triggers reliably induce misclassifications into desired classes. (2) Stealthiness: The variability in node features across domains complicates inserting triggers that remain stealthy. (3) Persistence: Downstream fine-tuning may erase backdoor behaviors by updating model parameters. To address these challenges, we propose GFM-BA, a novel Backdoor Attack model against Graph Foundation Models. Specifically, we first design a label-free trigger association module that links the trigger to a set of prototype embeddings, eliminating the need for knowledge about downstream tasks to perform backdoor injection. Then, we introduce a node-adaptive trigger generator that dynamically produces node-specific triggers, reducing the risk of trigger detection while reliably activating the backdoor. Lastly, we develop a persistent backdoor anchoring module that firmly anchors the backdoor to fine-tuning-insensitive parameters, enhancing the persistence of the backdoor under downstream adaptation. Extensive experiments demonstrate the effectiveness, stealthiness, and persistence of GFM-BA.
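The label-free trigger association idea, binding a trigger to a prototype embedding rather than to a downstream label, can be illustrated with a toy similarity computation. Everything below (dimensions, random prototypes, the cosine objective) is an assumption for illustration, not the paper's implementation:

```python
import numpy as np

# Toy sketch: associate a trigger embedding with its most similar prototype.
# Prototype and trigger vectors here are random placeholders.
rng = np.random.default_rng(1)
prototypes = rng.normal(size=(4, 16))   # hypothetical prototype embeddings
trigger = rng.normal(size=16)           # hypothetical trigger embedding

def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sims = np.array([cosine(trigger, p) for p in prototypes])
target = int(sims.argmax())             # prototype the trigger would bind to
loss = 1.0 - sims[target]               # minimized when trigger aligns with it
```

Training would then push `loss` toward zero so that any input carrying the trigger maps near the chosen prototype, without ever referencing a downstream class label.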
Appendix
This appendix begins with its overall organization. Table 2 lists the notation used throughout the paper; the precise expressions are given in the stated sections of the appendix. Appendix D proves Theorem 1 with precise expressions: combining (26) and (27) gives the self-normalized estimation error bound stated in the theorem. Appendix D.2 derives a Frobenius-norm bound on the finite-sample estimation error of (10); the resulting quantity captures the effect of noise in the system on the outputs. Appendix E.1 establishes persistence of excitation during the warm-up period: recalling the state-space form of the system and applying Weyl's inequality, the relevant singular value is bounded during warm-up with probability 1 - δ. The case in which the underlying system is known is then considered.
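The Weyl-inequality step in such warm-up arguments (perturbing a matrix moves each singular value by at most the spectral norm of the perturbation) can be sanity-checked numerically. A minimal sketch; matrix size and perturbation scale are arbitrary:

```python
import numpy as np

# Weyl's inequality for singular values: for any A and perturbation E,
#   |sigma_i(A + E) - sigma_i(A)| <= ||E||_2  for every index i.
rng = np.random.default_rng(0)
A = rng.normal(size=(6, 6))
E = 0.01 * rng.normal(size=(6, 6))

def sigma_min(M):
    """Smallest singular value (numpy returns them in descending order)."""
    return np.linalg.svd(M, compute_uv=False)[-1]

spectral_norm_E = np.linalg.norm(E, 2)
assert sigma_min(A + E) >= sigma_min(A) - spectral_norm_E
assert sigma_min(A + E) <= sigma_min(A) + spectral_norm_E
```

In a persistence-of-excitation argument, the same bound lower-bounds the minimum singular value of an empirical Gram matrix by that of its noiseless counterpart minus the (high-probability-bounded) noise contribution.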
Table 1: Robustness to model mismatch. Top-1 accuracy of SIPS at the third time quartile (Q3), evaluated on data generated by humans, RL agents, and mismatched models. We ran SIPS assuming r = 2, q = 0.95, T = 10, and a Manhattan (h …). Matched parameters are starred (*).
We thank the reviewers for engaging carefully with our paper, and for providing helpful and constructive feedback. We will expand on these experiments in the final paper with more domains and cross-method comparisons.
Universal Neurons in GPT-2: Emergence, Persistence, and Functional Impact
Nandan, Advey, Chou, Cheng-Ting, Kurakula, Amrit, Blondin, Cole, Zhu, Kevin, Sharma, Vasu, O'Brien, Sean
We investigate the phenomenon of neuron universality in independently trained GPT-2 Small models, examining how these universal neurons (neurons with consistently correlated activations across models) emerge and evolve throughout training. By analyzing five GPT-2 models at five training checkpoints, we identify universal neurons through pairwise correlation analysis of activations over a dataset of 5 million tokens. Ablation experiments reveal significant functional impacts of universal neurons on model predictions, measured via cross-entropy loss. Additionally, we quantify neuron persistence, demonstrating the high stability of universal neurons across training checkpoints, particularly in early and deeper layers. These findings suggest that stable and universal representational structures emerge during language model training.
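The pairwise-correlation identification step can be sketched with synthetic activations: flag a neuron in model A as universal if some neuron in model B correlates with it above a cutoff over the same token stream. The shapes, the 0.5 threshold, and the planted match below are assumptions for illustration, not the paper's actual settings:

```python
import numpy as np

# Synthetic activations for two "models" over the same 1000-token stream.
rng = np.random.default_rng(0)
tokens, n_a, n_b = 1000, 8, 8
acts_a = rng.normal(size=(tokens, n_a))
acts_b = rng.normal(size=(tokens, n_b))
acts_b[:, 0] = acts_a[:, 3] + 0.1 * rng.normal(size=tokens)  # plant one match

# Pearson correlation between every (A-neuron, B-neuron) pair.
za = (acts_a - acts_a.mean(axis=0)) / acts_a.std(axis=0)
zb = (acts_b - acts_b.mean(axis=0)) / acts_b.std(axis=0)
corr = za.T @ zb / tokens                 # shape (n_a, n_b)

# A-neurons whose best cross-model correlation clears the cutoff.
universal = np.where(np.abs(corr).max(axis=1) > 0.5)[0]
```

With 1000 tokens, chance correlations concentrate near zero (standard deviation roughly 1/sqrt(1000) ≈ 0.03), so only the planted pair clears the cutoff; the real analysis applies the same idea across five model pairs and millions of tokens.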