Appendix for Diffusion Improves Graph Learning

A  Graph diffusion as a polynomial filter

To express the generalized graph diffusion $S = \sum_{k=0}^{\infty} \theta_k T^k$ as a polynomial filter $S = \sum_{j=0}^{\infty} \xi_j L^j$ in the Laplacian, we substitute $T = I_n - L$ and use the binomial equation, i.e.

$$S = \sum_{k=0}^{\infty} \theta_k (I_n - L)^k = \sum_{k=0}^{\infty} \theta_k \sum_{j=0}^{k} \binom{k}{j} (-1)^j L^j, \qquad \text{so} \qquad \xi_j = (-1)^j \sum_{k=j}^{\infty} \binom{k}{j} \theta_k.$$

To obtain a more convenient form for $\xi_j$ we shift the summation index using $m = k - j$, i.e.

$$\xi_j = (-1)^j \sum_{m=0}^{\infty} \binom{m+j}{j} \theta_{m+j}.$$

All remaining nodes are part of the test set and are used only once, for testing. Different seeds are used for the validation and test splits. The patience is reset after an increase in accuracy on the validation set. We use the same development set for optimizing the hyperparameters for clustering.
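The identity above is easy to verify numerically. The sketch below (an illustration, not the paper's code) uses personalized PageRank coefficients $\theta_k = \alpha(1-\alpha)^k$ truncated at K terms; the truncated series and the corresponding truncated polynomial in L agree exactly, and a small K keeps the alternating coefficients numerically benign:

```python
import numpy as np
from math import comb

rng = np.random.default_rng(0)
n, alpha, K = 5, 0.15, 10
A = (rng.random((n, n)) < 0.4).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 1.0)                    # self-loops keep degrees positive
T = A / A.sum(axis=1, keepdims=True)        # row-stochastic transition matrix
L = np.eye(n) - T                           # corresponding Laplacian

theta = [alpha * (1 - alpha) ** k for k in range(K)]   # truncated PPR coefficients
S_diff = sum(t * np.linalg.matrix_power(T, k) for k, t in enumerate(theta))

xi = [(-1) ** j * sum(comb(k, j) * theta[k] for k in range(j, K)) for j in range(K)]
S_poly = sum(x * np.linalg.matrix_power(L, j) for j, x in enumerate(xi))

assert np.allclose(S_diff, S_poly)          # the two parameterizations agree
```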
Draw a Portrait of Your Graph Data: An Instance-Level Profiling Framework for Graph-Structured Data
Zhao, Tianqi, Biswas, Russa, Khosla, Megha
Graph machine learning models often achieve similar overall performance yet behave differently at the node level, failing on different subsets of nodes with varying reliability. Standard evaluation metrics such as accuracy obscure these fine-grained differences, making it difficult to diagnose when and where models fail. We introduce NodePro, a node profiling framework that enables fine-grained diagnosis of model behavior by assigning interpretable profile scores to individual nodes. These scores combine data-centric signals, such as feature dissimilarity, label uncertainty, and structural ambiguity, with model-centric measures of prediction confidence and consistency during training. By aligning model behavior with these profiles, NodePro reveals systematic differences between models, even when aggregate metrics are indistinguishable. We show that node profiles generalize to unseen nodes, supporting prediction reliability without ground-truth labels. Finally, we demonstrate the utility of NodePro in identifying semantically inconsistent or corrupted nodes in a structured knowledge graph, illustrating its effectiveness in real-world settings.
- North America > United States (0.04)
- Europe > Netherlands > South Holland > Delft (0.04)
- Europe > Denmark > North Jutland > Aalborg (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Information Technology (0.67)
- Banking & Finance > Credit (0.46)
- Information Technology > Data Science > Data Mining (1.00)
- Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)
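To make NodePro's profiling idea concrete, here is a minimal sketch that assigns each node a small profile vector. The specific signals used here (distance to the neighborhood feature mean, neighborhood label entropy, final softmax confidence, and cross-epoch prediction consistency) are plausible stand-ins chosen for illustration, not the paper's released implementation:

```python
import numpy as np

def node_profiles(X, neighbors, labels, proba_per_epoch):
    """X: (n, d) node features; neighbors: list of neighbor index arrays;
    labels: (n,) integer label array; proba_per_epoch: (epochs, n, classes)."""
    n = X.shape[0]
    # Data-centric: distance between a node's features and its neighborhood mean.
    feat_dissim = np.array([
        np.linalg.norm(X[i] - X[nb].mean(axis=0)) if len(nb) else 0.0
        for i, nb in enumerate(neighbors)
    ])
    # Data-centric: entropy of the label distribution among neighbors.
    label_unc = np.zeros(n)
    for i, nb in enumerate(neighbors):
        if len(nb):
            p = np.bincount(labels[nb]) / len(nb)
            p = p[p > 0]
            label_unc[i] = -(p * np.log(p)).sum()
    # Model-centric: final confidence and cross-epoch prediction consistency.
    conf = proba_per_epoch[-1].max(axis=1)
    final_pred = proba_per_epoch[-1].argmax(axis=1)
    consistency = (proba_per_epoch.argmax(axis=2) == final_pred).mean(axis=0)
    return np.stack([feat_dissim, label_unc, conf, consistency], axis=1)
```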
Vision as LoRA
Wang, Han, Ye, Yongjie, Li, Bingru, Nie, Yuxiang, Lu, Jinghui, Tang, Jingqun, Wang, Yanjie, Huang, Can
We introduce Vision as LoRA (VoRA), a novel paradigm for transforming an LLM into an MLLM. Unlike prevalent MLLM architectures that rely on external vision modules for vision encoding, VoRA internalizes visual capabilities by integrating vision-specific LoRA layers directly into the LLM. This design allows the added parameters to be seamlessly merged into the LLM during inference, eliminating structural complexity and minimizing computational overhead. Moreover, inheriting the LLM's ability to handle flexible context, VoRA can process inputs at arbitrary resolutions. To further strengthen VoRA's visual capabilities, we introduce a block-wise distillation method that transfers visual priors from a pre-trained ViT into the LoRA layers, effectively accelerating training by injecting visual knowledge. Additionally, we apply bi-directional attention masks to better capture the contextual information of an image. We demonstrate that, with additional pre-training data, VoRA can perform comparably to conventional encoder-based MLLMs. All training data, code, and model weights will be released at https://github.com/Hon-Wong/VoRA.
- Europe > Netherlands > North Holland > Amsterdam (0.04)
- Asia > China > Beijing > Beijing (0.04)
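VoRA's claim that the added parameters can be merged into the LLM at inference time follows from the standard LoRA parametrization. The sketch below (generic LoRA on one linear layer, with hypothetical rank and scaling values, not VoRA's released code) verifies that folding the low-rank update B A into the base weight leaves the function unchanged:

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base, self.scale = base, alpha / rank
        self.A = nn.Parameter(torch.randn(rank, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, rank))

    def forward(self, x):
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scale

    def merge(self) -> nn.Linear:
        # Fold the low-rank update into the base weight: W' = W + scale * B A.
        merged = nn.Linear(self.base.in_features, self.base.out_features)
        with torch.no_grad():
            merged.weight.copy_(self.base.weight + self.scale * (self.B @ self.A))
            merged.bias.copy_(self.base.bias)
        return merged

layer = LoRALinear(nn.Linear(64, 64))
nn.init.normal_(layer.B, std=0.02)          # give the update nonzero mass
x = torch.randn(2, 64)
assert torch.allclose(layer(x), layer.merge()(x), atol=1e-5)
```

After the merge, the forward pass is a single dense matmul, which is why no extra inference cost or structure remains.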
How Well Can Differential Privacy Be Audited in One Run?
Keinan, Amit, Shenfeld, Moshe, Ligett, Katrina
Recent methods for auditing the privacy of machine learning algorithms have improved computational efficiency by simultaneously intervening on multiple training examples in a single training run. Steinke et al. (2024) prove that one-run auditing indeed lower bounds the true privacy parameter of the audited algorithm, and give impressive empirical results. Their work leaves open the question of how precisely one-run auditing can uncover the true privacy parameter of an algorithm, and how that precision depends on the audited algorithm. In this work, we characterize the maximum achievable efficacy of one-run auditing and show that one-run auditing can only perfectly uncover the true privacy parameters of algorithms whose structure allows the effects of individual data elements to be isolated. Our characterization helps reveal how and when one-run auditing is still a promising technique for auditing real machine learning algorithms, despite these fundamental gaps.
- Asia > Middle East > Israel > Jerusalem District > Jerusalem (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > United States > New York > New York County > New York City (0.04)
- Asia > South Korea > Seoul > Seoul (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.54)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)
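The one-run design of Steinke et al. (2024) can be caricatured in a few lines: m canaries are independently included with probability 1/2 in a single training run, and the auditor guesses each inclusion bit from a per-canary score. The score model and the final bound below are deliberately crude assumptions for illustration, not the paper's accounting:

```python
import numpy as np

rng = np.random.default_rng(1)
m = 1000
included = rng.random(m) < 0.5                 # secret inclusion coin flips
# Hypothetical per-canary scores: included canaries score higher on average,
# standing in for e.g. a loss gap measured after the single training run.
scores = rng.normal(loc=included.astype(float), scale=1.0)
guesses = scores > 0.5                         # auditor's membership guesses
acc = float((guesses == included).mean())
# For one fair coin, eps-DP roughly caps the guessing accuracy p at
# e^eps / (1 + e^eps); inverting that cap gives a crude lower bound on eps.
p = max(acc, 1 - acc)
print(f"accuracy {acc:.3f} -> crude eps lower bound {np.log(p / (1 - p)):.3f}")
```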
Estimation and inference for transfer learning with high-dimensional quantile regression
Huang, Jiayu, Wang, Mingqiu, Wu, Yuanshan
Transfer learning has become an essential technique for exploiting information from a source domain to boost performance on a target task. Despite their prevalence in high-dimensional data, heterogeneity and heavy tails are insufficiently accounted for by current transfer learning approaches, which may undermine the resulting performance. We propose a transfer learning procedure in the framework of high-dimensional quantile regression models to accommodate heterogeneity and heavy tails in the source and target domains. We establish error bounds for the transfer learning estimator based on delicately selected transferable source domains, showing that lower error bounds can be achieved with a stricter selection criterion and a larger sample size of source tasks. We further propose valid confidence interval and hypothesis testing procedures for individual components of the high-dimensional quantile regression coefficients by advocating a double transfer learning estimator, a one-step debiased estimator of the transfer learning estimator in which the transfer learning technique is applied again. By adopting a data-splitting technique, we advocate a transferability detection approach that circumvents negative transfer and identifies transferable sources with high probability. Simulation results demonstrate that the proposed method exhibits favorable and compelling performance, and its practical utility is further illustrated by the analysis of a real example.
- Asia > China > Hunan Province (0.04)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- North America > United States > New York (0.04)
- (4 more...)
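The two-step structure (transfer, then debias) can be sketched generically. The following is an assumption-laden illustration of that pattern using scikit-learn's penalized quantile regression, not the authors' estimator or inference procedure:

```python
import numpy as np
from sklearn.linear_model import QuantileRegressor

def transfer_quantile(X_src, y_src, X_tgt, y_tgt, tau=0.5, alpha=0.01):
    # Step 1: pool source and target to borrow strength from the source task.
    pooled = QuantileRegressor(quantile=tau, alpha=alpha)
    pooled.fit(np.vstack([X_src, X_tgt]), np.concatenate([y_src, y_tgt]))
    # Step 2: correct source-induced bias by a quantile regression of the
    # target residuals on the target covariates.
    resid = y_tgt - pooled.predict(X_tgt)
    delta = QuantileRegressor(quantile=tau, alpha=alpha).fit(X_tgt, resid)
    return pooled.coef_ + delta.coef_, pooled.intercept_ + delta.intercept_
```

If the source is too dissimilar, the correction in step 2 has to undo the pooling, which is exactly the negative-transfer situation the paper's detection procedure is designed to screen out.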
Data Augmentation in the Underparameterized and Overparameterized Regimes
Huang, Kevin Han, Orbanz, Peter, Austern, Morgane
We provide results that exactly quantify how data augmentation affects the variance and limiting distribution of estimates, and we analyze several specific models in detail. The results confirm some observations made in machine learning practice, but also lead to unexpected findings: data augmentation may increase rather than decrease the uncertainty of estimates, such as the empirical prediction risk. It can act as a regularizer, but fails to do so in certain high-dimensional problems, and it may shift the double-descent peak of an empirical risk. Overall, the analysis shows that several properties attributed to data augmentation are neither simply true nor false, but rather depend on a combination of factors, notably the data distribution, the properties of the estimator, and the interplay of sample size, number of augmentations, and dimension. Our main theoretical tool is a limit theorem for functions of randomly transformed, high-dimensional random vectors. The proof draws on work in probability on the noise stability of functions of many variables.
- North America > United States (0.28)
- Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
- Asia > Middle East > Israel (0.04)
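The central object here is the augmented empirical risk, which averages the loss over k random transformations of every sample. A toy Monte Carlo (additive Gaussian noise as the augmentation and squared loss, both assumptions made only for illustration) exhibits the paper's first point, that augmentation can increase rather than decrease the variance of the empirical risk:

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, reps = 200, 5, 2000

def plain_risk(x):
    return (x ** 2).mean()                       # empirical risk of the estimate 0

def augmented_risk(x):
    aug = x[:, None] + rng.normal(size=(n, k))   # k noisy copies of each sample
    return (aug ** 2).mean()

plain = [plain_risk(rng.normal(size=n)) for _ in range(reps)]
augd = [augmented_risk(rng.normal(size=n)) for _ in range(reps)]
print(f"var(plain) = {np.var(plain):.4f}  var(augmented) = {np.var(augd):.4f}")
```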
On Regularization and Inference with Label Constraints
Wang, Kaifu, He, Hangfeng, Nguyen, Tin D., Kumar, Piyush, Roth, Dan
Prior knowledge and symbolic rules in machine learning are often expressed in the form of label constraints, especially in structured prediction problems. In this work, we compare two common strategies for encoding label constraints in a machine learning pipeline, regularization with constraints and constrained inference, by quantifying their impact on model performance. For regularization, we show that it narrows the generalization gap by precluding models that are inconsistent with the constraints. However, its preference for small violations introduces a bias toward a suboptimal model. For constrained inference, we show that it reduces the population risk by correcting a model's violation, and hence turns the violation into an advantage. Given these differences, we further explore the use of two approaches together and propose conditions for constrained inference to compensate for the bias introduced by regularization, aiming to improve both the model complexity and optimal risk.
- North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.14)
- Oceania > Australia > Victoria > Melbourne (0.04)
- North America > United States > Washington > King County > Seattle (0.04)
- (8 more...)
- Government > Regional Government > North America Government > United States Government (0.46)
- Government > Military (0.46)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.67)
- Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (0.48)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.46)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.46)
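The two strategies compared in the label-constraints paper can be contrasted on a toy multi-label problem with the constraint "predicting A entails B": regularization penalizes violations during training, while constrained inference restricts decoding to assignments that satisfy the constraint. A minimal sketch with hypothetical scores follows:

```python
import itertools
import numpy as np

scores = np.array([2.0, -1.0, 0.5])        # per-label logits for labels A, B, C

def violates(y):                           # constraint: predicting A entails B
    return y[0] == 1 and y[1] == 0

def constrained_decode(scores):
    # Among assignments satisfying the constraint, return the highest scoring.
    return max(
        (y for y in itertools.product([0, 1], repeat=len(scores))
         if not violates(y)),
        key=lambda y: float(np.dot(y, scores)),
    )

print(constrained_decode(scores))          # (1, 1, 1): the constraint forces B on
```

A regularization-based alternative would instead keep unconstrained decoding and add a penalty proportional to the violation to the training loss, which is exactly the bias-versus-risk trade-off the paper analyzes.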
Simultaneous Best Subset Selection and Dimension Reduction via Primal-Dual Iterations
Wen, Canhong, Dong, Ruipeng, Wang, Xueqin, Li, Weiyu, Zhang, Heping
Sparse reduced rank regression is an essential statistical learning method. In the contemporary literature, estimation is typically formulated as a nonconvex optimization that often yields a local optimum in numerical computation, yet the theoretical analysis is centered on the global optimum, resulting in a discrepancy between the statistical guarantee and the numerical computation. In this work, we offer a new algorithm to address the problem and establish an almost optimal rate for the algorithmic solution. We also demonstrate that the algorithm achieves this estimation within a polynomial number of iterations. In addition, we present a generalized information criterion to simultaneously ensure the consistency of support set recovery and rank estimation. Under the proposed criterion, we show that our algorithm can achieve the oracle reduced rank estimation with significant probability. Numerical studies and an application to ovarian cancer genetic data demonstrate the effectiveness and scalability of our approach.
- Asia > China > Anhui Province (0.04)
- North America > United States > New York (0.04)
- North America > United States > Connecticut > New Haven County > New Haven (0.04)
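The estimation target, a coefficient matrix that is simultaneously row-sparse (best subset) and low-rank (dimension reduction), can be approximated by a simple double-projection scheme. The sketch below is a generic projected-gradient baseline, not the paper's primal-dual algorithm:

```python
import numpy as np

def sparse_reduced_rank(X, Y, rank, support_size, iters=200):
    n, p = X.shape
    C = np.zeros((p, Y.shape[1]))
    step = n / np.linalg.norm(X, ord=2) ** 2         # 1 / L for the LS gradient
    for _ in range(iters):
        C = C + step * X.T @ (Y - X @ C) / n         # gradient step on LS loss
        keep = np.argsort(-np.linalg.norm(C, axis=1))[:support_size]
        mask = np.zeros(p, dtype=bool)
        mask[keep] = True
        C[~mask] = 0.0                               # best-subset (row) projection
        U, d, Vt = np.linalg.svd(C, full_matrices=False)
        C = (U[:, :rank] * d[:rank]) @ Vt[:rank]     # rank projection; zero rows stay zero
    return C
```

Note the order of the projections: truncating the SVD of a matrix whose off-support rows are zero keeps those rows zero, so each iterate satisfies both constraints.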
Reward-Punishment Symmetric Universal Intelligence
Alexander, Samuel Allen, Hutter, Marcus
Can an agent's intelligence level be negative? We extend the Legg-Hutter agent-environment framework to include punishments and argue for an affirmative answer to that question. We show that if the background encodings and Universal Turing Machine (UTM) admit certain Kolmogorov complexity symmetries, then the resulting Legg-Hutter intelligence measure is symmetric about the origin. In particular, this implies reward-ignoring agents have Legg-Hutter intelligence 0 according to such UTMs.
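For reference, a compact statement of the quantities involved (the Legg-Hutter definition is standard; the mirror-environment condition below is a paraphrase of the symmetry assumption, not the paper's exact statement):

```latex
% Legg-Hutter universal intelligence of a policy \pi (standard definition):
%   K(\mu) is the Kolmogorov complexity of environment \mu, and
%   V^{\pi}_{\mu} the expected total return of \pi in \mu, which with
%   punishments may be negative.
\Upsilon(\pi) = \sum_{\mu \in E} 2^{-K(\mu)} \, V^{\pi}_{\mu}
% If every \mu has a mirror \bar{\mu} with K(\bar{\mu}) = K(\mu) and, for a
% reward-ignoring policy \pi, V^{\pi}_{\bar{\mu}} = -V^{\pi}_{\mu}, then the
% terms cancel pairwise and \Upsilon(\pi) = 0.
```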
Diffusion Improves Graph Learning
Klicpera, Johannes, Weißenberger, Stefan, Günnemann, Stephan
Graph convolution is the core of most Graph Neural Networks (GNNs) and usually approximated by message passing between direct (one-hop) neighbors. In this work, we remove the restriction of using only the direct neighbors by introducing a powerful, yet spatially localized graph convolution: Graph diffusion convolution (GDC). GDC leverages generalized graph diffusion, examples of which are the heat kernel and personalized PageRank. It alleviates the problem of noisy and often arbitrarily defined edges in real graphs. We show that GDC is closely related to spectral-based models and thus combines the strengths of both spatial (message passing) and spectral methods. We demonstrate that replacing message passing with graph diffusion convolution consistently leads to significant performance improvements across a wide range of models on both supervised and unsupervised tasks and a variety of datasets. Furthermore, GDC is not limited to GNNs but can trivially be combined with any graph-based model or algorithm (e.g. spectral clustering) without requiring any changes to the latter or affecting its computational complexity. Our implementation is available online.
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- North America > Canada (0.04)
- Europe > Germany > Bavaria > Upper Bavaria > Munich (0.04)
- Asia > Middle East > Jordan (0.04)
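The GDC pipeline described in the abstract (diffuse, sparsify, renormalize) fits in a few lines. A minimal sketch using closed-form personalized PageRank and row normalization follows; the paper also considers symmetric normalization, top-k sparsification, and the heat kernel, and the values here are illustrative:

```python
import numpy as np

def gdc_ppr(A, alpha=0.15, eps=1e-3):
    """A: dense (n, n) adjacency matrix with positive row sums."""
    T = A / A.sum(axis=1, keepdims=True)          # row-stochastic transition matrix
    n = A.shape[0]
    # Closed-form PPR diffusion: S = alpha * (I - (1 - alpha) T)^{-1}.
    S = alpha * np.linalg.inv(np.eye(n) - (1 - alpha) * T)
    S[S < eps] = 0.0                              # sparsification: drop tiny entries
    return S / S.sum(axis=1, keepdims=True)       # renormalize the sparse result
```

In practice a sparse solver or truncated series replaces the dense inverse, and the returned matrix simply substitutes for the adjacency fed to any downstream model (PyTorch Geometric, for example, ships a GDC transform).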