mrd
MRD: Multi-resolution Retrieval-Detection Fusion for High-Resolution Image Understanding
Understanding high-resolution images remains a significant challenge for multimodal large language models (MLLMs). Recent study address this issue by dividing the image into smaller crops and computing the semantic similarity between each crop and a query using a pretrained retrieval-augmented generation (RAG) model. The most relevant crops are then selected to localize the target object and suppress irrelevant information. However, such crop-based processing can fragment complete objects across multiple crops, thereby disrupting the computation of semantic similarity. In our experiments, we find that image crops of objects with different sizes are better handled at different resolutions. Based on this observation, we propose Multi-resolution Retrieval-Detection (MRD), a training-free framework for high-resolution image understanding. To address the issue of semantic similarity bias caused by objects being split across different image crops, we propose a multi-resolution semantic fusion method, which integrates semantic similarity maps obtained at different resolutions to produce more accurate semantic information and preserve the integrity of target objects. Furthermore, to achieve direct localization of target objects at a global scale, we introduce an open-vocalbulary object detection (OVD) model that identifies object regions using a sliding-window approach.Experiments on high-resolution image understanding benchmarks using different MLLMs demonstrate the effectiveness of our approach.
- Asia > China > Heilongjiang Province > Harbin (0.40)
- Asia > China > Guangdong Province > Shenzhen (0.40)
- Asia > Singapore (0.04)
- Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
- Information Technology > Artificial Intelligence > Vision > Image Understanding (0.81)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
A Neuro-inspired Interpretation of Unlearning in Large Language Models through Sample-level Unlearning Difficulty
Feng, Xiaohua, Li, Yuyuan, Wang, Chengye, Liu, Junlin, Zhang, Li, Chen, Chaochao
Driven by privacy protection laws and regulations, unlearning in Large Language Models (LLMs) is gaining increasing attention. However, current research often neglects the interpretability of the unlearning process, particularly concerning sample-level unlearning difficulty. Existing studies typically assume a uniform unlearning difficulty across samples. This simplification risks attributing the performance of unlearning algorithms to sample selection rather than the algorithm's design, potentially steering the development of LLM unlearning in the wrong direction. Thus, we investigate the relationship between LLM unlearning and sample characteristics, with a focus on unlearning difficulty. Drawing inspiration from neuroscience, we propose a Memory Removal Difficulty ($\mathrm{MRD}$) metric to quantify sample-level unlearning difficulty. Using $\mathrm{MRD}$, we analyze the characteristics of hard-to-unlearn versus easy-to-unlearn samples. Furthermore, we propose an $\mathrm{MRD}$-based weighted sampling method to optimize existing unlearning algorithms, which prioritizes easily forgettable samples, thereby improving unlearning efficiency and effectiveness. We validate the proposed metric and method using public benchmarks and datasets, with results confirming its effectiveness.
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- North America > United States > California > Santa Clara County > Stanford (0.04)
- Asia > Middle East > Jordan > Amman Governorate > Amman (0.04)
- Africa > Madagascar (0.04)
- Law (1.00)
- Information Technology > Security & Privacy (1.00)
- Health & Medicine > Therapeutic Area > Neurology (0.66)
KAA: Kolmogorov-Arnold Attention for Enhancing Attentive Graph Neural Networks
Fang, Taoran, Gao, Tianhong, Wang, Chunping, Shang, Yihao, Chow, Wei, Chen, Lei, Yang, Yang
Graph neural networks (GNNs) with attention mechanisms, often referred to as attentive GNNs, have emerged as a prominent paradigm in advanced GNN models in recent years. However, our understanding of the critical process of scoring neighbor nodes remains limited, leading to the underperformance of many existing attentive GNNs. In this paper, we unify the scoring functions of current attentive GNNs and propose Kolmogorov-Arnold Attention (KAA), which integrates the Kolmogorov-Arnold Network (KAN) architecture into the scoring process. KAA enhances the performance of scoring functions across the board and can be applied to nearly all existing attentive GNNs. To compare the expressive power of KAA with other scoring functions, we introduce Maximum Ranking Distance (MRD) to quantitatively estimate their upper bounds in ranking errors for node importance. Our analysis reveals that, under limited parameters and constraints on width and depth, both linear transformation-based and MLP-based scoring functions exhibit finite expressive power. In contrast, our proposed KAA, even with a single-layer KAN parameterized by zero-order B-spline functions, demonstrates nearly infinite expressive power. Extensive experiments on both node-level and graph-level tasks using various backbone models show that KAA-enhanced scoring functions consistently outperform their original counterparts, achieving performance improvements of over 20% in some cases.
- Europe > Latvia > Riga Municipality > Riga (0.04)
- South America > Chile > Santiago Metropolitan Region > Santiago Province > Santiago (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- Materials (0.46)
- Health & Medicine (0.46)
Mutual Regression Distance
The maximum mean discrepancy and Wasserstein distance are popular distance measures between distributions and play important roles in many machine learning problems such as metric learning, generative modeling, domain adaption, and clustering. However, since they are functions of pair-wise distances between data points in two distributions, they do not exploit the potential manifold properties of data such as smoothness and hence are not effective in measuring the dissimilarity between the two distributions in the form of manifolds. In this paper, different from existing measures, we propose a novel distance called Mutual Regression Distance (MRD) induced by a constrained mutual regression problem, which can exploit the manifold property of data. We prove that MRD is a pseudometric that satisfies almost all the axioms of a metric. Since the optimization of the original MRD is costly, we provide a tight MRD and a simplified MRD, based on which a heuristic algorithm is established. We also provide kernel variants of MRDs that are more effective in handling nonlinear data. Our MRDs especially the simplified MRDs have much lower computational complexity than the Wasserstein distance. We provide theoretical guarantees, such as robustness, for MRDs. Finally, we apply MRDs to distribution clustering, generative models, and domain adaptation. The numerical results demonstrate the effectiveness and superiority of MRDs compared to the baselines.
- Asia > China > Hong Kong (0.04)
- Asia > China > Guangdong Province > Shenzhen (0.04)
- North America > United States > Michigan (0.04)
- (2 more...)
Are Synthetic Time-series Data Really not as Good as Real Data?
Fu, Fanzhe, Chen, Junru, Zhang, Jing, Yang, Carl, Ma, Lvbin, Yang, Yang
Integrating universal Issues: The fine-tuning process for temporal data needs data synthesis methods holds promise in improving to be handled carefully as it may contain adversarial or noisy generalization. However, current methods cannot examples, which could impact the model's robustness; (2) guarantee that the generator's output covers Bias and Vulnerabilities: The use of temporal data may all unseen real data. In this paper, we introduce cause the model to inherit biases or vulnerabilities from the InfoBoost-a highly versatile cross-domain data data, thereby reducing its robustness in real-world applications; synthesizing framework with time series representation (3) Generalization Problems: Despite being trained learning capability. We have developed on vast datasets, time-series models may not generalize a method based on synthetic data that enables well to unseen or out-of-distribution data. Time-series and model training without the need for real data, surpassing spatio-temporal data may exhibit sudden shifts or trends, the performance of models trained with potentially leading to unreliable outputs, highlighting the real data. Additionally, we have trained a universal need for robust generalization (Jin et al., 2023).
- Asia > China > Zhejiang Province > Hangzhou (0.04)
- North America > United States > Georgia > Fulton County > Atlanta (0.04)
- Europe > Switzerland > Basel-City > Basel (0.04)
- Asia > China > Beijing > Beijing (0.04)
- Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
- Information Technology > Data Science > Data Mining (0.93)
- Information Technology > Data Science > Data Quality (0.68)
- Information Technology > Artificial Intelligence > Natural Language > Large Language Model (0.68)
WavPool: A New Block for Deep Neural Networks
McDermott, Samuel D., Voetberg, M., Nord, Brian
Modern deep neural networks comprise many operational layers, such as dense or convolutional layers, which are often collected into blocks. In this work, we introduce a new, wavelet-transform-based network architecture that we call the multi-resolution perceptron: by adding a pooling layer, we create a new network block, the WavPool. The first step of the multi-resolution perceptron is transforming the data into its multi-resolution decomposition form by convolving the input data with filters of fixed coefficients but increasing size. Following image processing techniques, we are able to make scale and spatial information simultaneously accessible to the network without increasing the size of the data vector. WavPool outperforms a similar multilayer perceptron while using fewer parameters, and outperforms a comparable convolutional neural network by ~ 10% on relative accuracy on CIFAR-10.
- North America > United States > Illinois > Cook County > Chicago (0.05)
- North America > United States > Illinois > Kane County > Batavia (0.04)
- Europe > Italy > Piedmont > Turin Province > Turin (0.04)
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Energy (0.46)
- Government > Regional Government (0.46)
Learning Better Masking for Better Language Model Pre-training
Yang, Dongjie, Zhang, Zhuosheng, Zhao, Hai
Masked Language Modeling (MLM) has been widely used as the denoising objective in pre-training language models (PrLMs). Existing PrLMs commonly adopt a Random-Token Masking strategy where a fixed masking ratio is applied and different contents are masked by an equal probability throughout the entire training. However, the model may receive complicated impact from pre-training status, which changes accordingly as training time goes on. In this paper, we show that such time-invariant MLM settings on masking ratio and masked content are unlikely to deliver an optimal outcome, which motivates us to explore the influence of time-variant MLM settings. We propose two scheduled masking approaches that adaptively tune the masking ratio and masked content in different training stages, which improves the pre-training efficiency and effectiveness verified on the downstream tasks. Our work is a pioneer study on time-variant masking strategy on ratio and content and gives a better understanding of how masking ratio and masked content influence the MLM pre-training.
- North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
- Asia > China > Shanghai > Shanghai (0.04)
- Africa > Ethiopia > Addis Ababa > Addis Ababa (0.04)
- (2 more...)
IDEAL: Toward High-efficiency Device-Cloud Collaborative and Dynamic Recommendation System
Lv, Zheqi, Chen, Zhengyu, Zhang, Shengyu, Kuang, Kun, Zhang, Wenqiao, Li, Mengze, Ooi, Beng Chin, Wu, Fei
Recommendation systems have shown great potential to solve the information explosion problem and enhance user experience in various online applications, which recently present two emerging trends: (i) Collaboration: single-sided model trained on-cloud (separate learning) to the device-cloud collaborative recommendation (collaborative learning). (ii) Real-time Dynamic: the network parameters are the same across all the instances (static model) to adaptive network parameters generation conditioned on the real-time instances (dynamic model). The aforementioned two trends enable the device-cloud collaborative and dynamic recommendation, which deeply exploits the recommendation pattern among cloud-device data and efficiently characterizes different instances with different underlying distributions based on the cost of frequent device-cloud communication. Despite promising, we argue that most of the communications are unnecessary to request the new parameters of the recommendation system on the cloud since the on-device data distribution are not always changing. To alleviate this issue, we designed a Intelligent DEvice-Cloud PArameter Request ModeL (IDEAL) that can be deployed on the device to calculate the request revenue with low resource consumption, so as to ensure the adaptive device-cloud communication with high revenue. We envision a new device intelligence learning task to implement IDEAL by detecting the data out-of-domain. Moreover, we map the user's real-time behavior to a normal distribution, the uncertainty is calculated by the multi-sampling outputs to measure the generalization ability of the device model to the current user behavior. Our experimental study demonstrates IDEAL's effectiveness and generalizability on four public benchmarks, which yield a higher efficient device-cloud collaborative and dynamic recommendation paradigm.
- North America > United States > California > Los Angeles County > Long Beach (0.05)
- North America > United States > District of Columbia > Washington (0.04)
- Asia > China > Shanghai > Shanghai (0.04)
- (2 more...)
Multi-view Learning as a Nonparametric Nonlinear Inter-Battery Factor Analysis
Damianou, Andreas, Lawrence, Neil D., Ek, Carl Henrik
Factor analysis aims to determine latent factors, or traits, which summarize a given data set. Inter-battery factor analysis extends this notion to multiple views of the data. In this paper we show how a nonlinear, nonparametric version of these models can be recovered through the Gaussian process latent variable model. This gives us a flexible formalism for multi-view learning where the latent variables can be used both for exploratory purposes and for learning representations that enable efficient inference for ambiguous estimation tasks. Learning is performed in a Bayesian manner through the formulation of a variational compression scheme which gives a rigorous lower bound on the log likelihood. Our Bayesian framework provides strong regularization during training, allowing the structure of the latent space to be determined efficiently and automatically. We demonstrate this by producing the first (to our knowledge) published results of learning from dozens of views, even when data is scarce.
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Asia > Middle East > Jordan (0.04)
- Europe > United Kingdom > England > South Yorkshire > Sheffield (0.04)
- (6 more...)
- Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
- Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.88)
A Fast Multi-Resolution Method for Detection of Significant Spatial Disease Clusters
Neill, Daniel B., Moore, Andrew W.
Given an N N grid of squares, where each square has a count and an underlying population,our goal is to find the square region with the highest density, and to calculate its significance by randomization. Any density measure D, dependent on the total count and total population of a region, canbe used. For example, if each count represents the number of disease cases occurring in that square, we can use Kulldorff's spatial scan statistic D
- North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.14)
- North America > United States > New York (0.04)