 Deep Learning: Overviews


Semi-supervised Semantic Segmentation with Prototype-based Consistency Regularization

Neural Information Processing Systems

Semi-supervised semantic segmentation requires the model to effectively propagate label information from a limited number of annotated images to unlabeled ones. A challenge for such a per-pixel prediction task is the large intra-class variation: regions belonging to the same class may exhibit very different appearances, even within the same picture. This diversity makes label propagation from pixel to pixel difficult. To address this problem, we propose a novel approach that regularizes the distribution of within-class features to ease label propagation. Specifically, our approach encourages consistency between the prediction of a linear predictor and the output of a prototype-based predictor, which implicitly pushes features from the same pseudo-class to be close to at least one within-class prototype while staying far from the between-class prototypes.
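A rough sketch of this prototype-based consistency idea, assuming a single prototype per class, a symmetric cross-supervision loss, and an arbitrary temperature (function names and details are illustrative, not the paper's exact formulation):

```python
import torch
import torch.nn.functional as F

def prototype_logits(features, prototypes, temperature=0.1):
    """Cosine similarity between pixel features and class prototypes.

    features:   (N, D) pixel embeddings
    prototypes: (C, D) one prototype per class (a simplifying assumption here)
    returns:    (N, C) similarity scores scaled by a temperature
    """
    f = F.normalize(features, dim=1)
    p = F.normalize(prototypes, dim=1)
    return f @ p.t() / temperature

def consistency_loss(linear_logits, features, prototypes):
    """Encourage the prototype-based prediction to agree with the
    pseudo-labels produced by the linear predictor, and vice versa."""
    proto_logits = prototype_logits(features, prototypes)
    pseudo_from_linear = linear_logits.argmax(dim=1).detach()
    pseudo_from_proto = proto_logits.argmax(dim=1).detach()
    # Symmetric cross-supervision between the two predictors.
    return (F.cross_entropy(proto_logits, pseudo_from_linear)
            + F.cross_entropy(linear_logits, pseudo_from_proto)) / 2
```

In practice the prototypes would typically be estimated from labeled or pseudo-labeled features of each class and updated during training.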


Characteristics of Harmful Text: Towards Rigorous Benchmarking of Language Models

Neural Information Processing Systems

Large language models produce human-like text that drives a growing number of applications. However, recent literature and, increasingly, real-world observations have demonstrated that these models can generate language that is toxic, biased, untruthful, or otherwise harmful. Though work to evaluate language model harms is under way, translating foresight about which harms may arise into rigorous benchmarks is not straightforward. To facilitate this translation, we outline six ways of characterizing harmful text which merit explicit consideration when designing new benchmarks. We then use these characteristics as a lens to identify trends and gaps in existing benchmarks. Finally, we apply them in a case study of the Perspective API, a toxicity classifier that is widely used in harm benchmarks. Our characteristics provide one piece of the bridge that translates between foresight and effective evaluation.



Active Learning with LLMs for Partially Observed and Cost-Aware Scenarios

Neural Information Processing Systems

Conducting experiments and collecting data for machine learning models is a complex and expensive endeavor, particularly when confronted with limited information. Typically, extensive experiments to obtain features and labels come with a significant acquisition cost, making it impractical to carry out all of them. Therefore, it becomes crucial to strategically determine what to acquire in order to maximize predictive performance while minimizing costs. To perform this task, existing data acquisition methods assume the availability of an initial dataset that is both fully observed and labeled, crucially overlooking the partial observability of features characteristic of many real-world scenarios. In response to this challenge, we present Partially Observable Cost-Aware Active-Learning (POCA), a new learning approach aimed at improving model generalization in data-scarce and data-costly scenarios through label and/or feature acquisition. Introducing µPOCA as an instantiation, we maximize the reduction in the predictive model's uncertainty when obtaining labels and features, while accounting for the associated costs.
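A minimal sketch of the general cost-aware acquisition principle described above: score each candidate acquisition by its expected uncertainty reduction per unit cost and pick the best. The function name and the simple ratio form are our own illustration, not necessarily µPOCA's exact objective:

```python
import numpy as np

def acquisition_scores(entropy_before, expected_entropy_after, costs):
    """Generic cost-aware acquisition score: expected reduction in
    predictive uncertainty per unit acquisition cost.

    entropy_before:         (K,) current model uncertainty for each candidate
                            acquisition (a label or a feature)
    expected_entropy_after: (K,) expected uncertainty if candidate k were
                            acquired (e.g. estimated by averaging over
                            imputed values of the missing quantity)
    costs:                  (K,) acquisition cost of each candidate
    """
    gain = entropy_before - expected_entropy_after
    return gain / np.maximum(costs, 1e-12)

# Example usage: acquire the candidate with the best gain-per-cost.
# scores = acquisition_scores(h_before, h_after, costs)
# best_candidate = int(np.argmax(scores))
```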


A Some Concepts in Linear Algebra

Neural Information Processing Systems

In the interest of self-containedness, we provide a brief review of some concepts from linear algebra utilized in this work that might be considered more advanced. The presented results are all standard; a very thorough reference is [24].

In fact, one might also consider infinite sums of Hilbert spaces: the space $\bigoplus_{n=1}^{\infty} \mathbb{C}$ consists of all square-summable sequences. This means, for example, that the vector $(1, 0, 0, 0, \dots)$ is in $\bigoplus_{n=1}^{\infty} \mathbb{C}$, while $(1, 1, 1, 1, \dots)$ is not.

Operator Norm: Let $J \colon H \to \tilde{H}$ be a linear operator between Hilbert spaces. We measure its "size" by what is called the operator norm, denoted by $\|J\|_{\mathrm{op}} := \sup_{\|f\|_H = 1} \|Jf\|_{\tilde{H}}$.

Special instances of normal operators (operators $A$ satisfying $A^{*}A = AA^{*}$) are self-adjoint operators, for which we have the stronger property $A^{*} = A$. If $\dim H = d$, we may write $\sigma(A) = \{\lambda_1, \dots, \lambda_d\}$.

Resolvent of a (normal) Operator: Given a normal operator $A$ on some Hilbert space $H$, the operator $(A - z) \colon H \to H$ is invertible precisely if $z \notin \sigma(A)$. In this case we write $R(z, A) = (A - z)^{-1}$.

For example, if $g(\cdot) = |\cdot|$, we obtain the absolute value $|A|$ of $A$ by specifying for all $f \in H$ that $|A| f = \sum_{\lambda \in \sigma(A)} |\lambda| \, \langle f, e_\lambda \rangle \, e_\lambda$, where $(e_\lambda)$ is an orthonormal eigenbasis of $A$. This now allows us to apply tools from complex analysis also to operators: if a function $g$ is analytic (i.e., it can be expanded into a power series), we have, by Cauchy's integral formula, $g(\lambda) = \frac{1}{2\pi i} \oint \frac{g(z)}{z - \lambda} \, dz$. It is a standard exercise to show that this is independent of the choice of orthonormal basis.
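A small worked example of these notions (our own illustration, not drawn from the text or [24]):

```latex
% Worked example: a self-adjoint (hence normal) 2x2 operator.
\[
A = \begin{pmatrix} 2 & 0 \\ 0 & -3 \end{pmatrix},
\qquad
\sigma(A) = \{2, -3\},
\qquad
R(z, A) = (A - z)^{-1}
        = \begin{pmatrix} \tfrac{1}{2 - z} & 0 \\ 0 & \tfrac{1}{-3 - z} \end{pmatrix}
\quad (z \notin \sigma(A)).
\]
% Functional calculus with g(x) = |x| gives the absolute value of A:
\[
|A| = \begin{pmatrix} 2 & 0 \\ 0 & 3 \end{pmatrix}.
\]
```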


Unitary convolutions for learning on graphs and groups

Neural Information Processing Systems

In recent years, the design of specialized machine learning architectures for structured data has received a surge of interest. Of particular interest are architectures for data domains with inherent symmetries, such as permutation-invariance in graphs and sets, translation-invariance in images, and other symmetries that arise from fundamental laws of physics in scientific data.
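As background for the unitary-convolution idea, a tiny sketch of one standard way to obtain a norm-preserving (orthogonal, i.e. real unitary) operator: exponentiate a skew-symmetric matrix. This is a generic illustration of the property such layers rely on, not the paper's architecture:

```python
import numpy as np
from scipy.linalg import expm

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
S = W - W.T          # skew-symmetric: S^T = -S
U = expm(S)          # matrix exponential of a skew-symmetric matrix is orthogonal

x = rng.standard_normal(8)
print(np.allclose(U.T @ U, np.eye(8)))                         # U^T U = I
print(np.isclose(np.linalg.norm(U @ x), np.linalg.norm(x)))    # norms are preserved
```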



CaptainCook4D: A Dataset for Understanding Errors in Procedural Activities

Neural Information Processing Systems

Following step-by-step procedures is an essential component of various activities carried out by individuals in their daily lives. These procedures serve as a guiding framework that helps to achieve goals efficiently, whether it is assembling furniture or preparing a recipe. However, the complexity and duration of procedural activities inherently increase the likelihood of making errors. Understanding such procedural activities from a sequence of frames is a challenging task that demands an accurate interpretation of visual information and the ability to reason about the structure of the activity. To this end, we collect CaptainCook4D, a new egocentric 4D dataset comprising 384 recordings (94.5 hours) of people performing recipes in real kitchen environments. This dataset consists of two distinct types of activities: one in which participants adhere to the provided recipe instructions and another in which they deviate and induce errors. We provide 5.3K step annotations and 10K fine-grained action annotations, and benchmark the dataset on the following tasks: error recognition, multi-step localization, and procedure learning.


Benchmarking Structural Inference Methods for Interacting Dynamical Systems with Synthetic Data (Aoran Wang, Tsz Pan Tong, Jun Pang)

Neural Information Processing Systems

Understanding complex dynamical systems begins with identifying their topological structures, which expose the organization of the systems. This requires robust structural inference methods that can deduce structure from observed behavior. However, existing methods are often domain-specific and lack a standardized, objective comparison framework. We address this gap by benchmarking 13 structural inference methods from various disciplines on simulations representing two types of dynamics and 11 interaction graph models, supplemented by a biological experimental dataset to mirror real-world applications. We evaluated the methods for accuracy, scalability, robustness, and sensitivity to graph properties. Our findings indicate that deep learning methods excel with multi-dimensional data, while approaches based on classical statistics and information theory are notably accurate and robust.
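For context, a minimal sketch of how an inferred interaction structure is commonly scored against a ground-truth graph. Using AUROC over off-diagonal entries is our illustrative choice and may differ from the metrics used in the paper:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

def structure_auroc(edge_scores, true_adjacency):
    """Score an inferred interaction structure against the ground truth.

    edge_scores:    (N, N) confidence that node j influences node i
    true_adjacency: (N, N) binary ground-truth interaction graph
    Self-loops are excluded and the remaining entries are compared
    with AUROC, a common choice for this kind of benchmark.
    """
    n = true_adjacency.shape[0]
    mask = ~np.eye(n, dtype=bool)   # ignore the diagonal (self-loops)
    return roc_auc_score(true_adjacency[mask].ravel(),
                         edge_scores[mask].ravel())
```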


DiffuPac: Contextual Mimicry in Adversarial Packets Generation via Diffusion Model

Neural Information Processing Systems

In the cybersecurity domain, recent advancements in Machine Learning (ML) and Deep Learning (DL) have significantly enhanced Network Intrusion Detection Systems (NIDS), improving the effectiveness of cybersecurity operations. However, attackers have also leveraged ML/DL to develop sophisticated models that generate adversarial packets capable of evading NIDS detection. Consequently, defenders must study and analyze these models to prepare for the evasion attacks that exploit NIDS detection mechanisms. Unfortunately, conventional generation models often rely on unrealistic assumptions about attackers' knowledge of NIDS components, making them impractical for real-world scenarios. To address this issue, we present DiffuPac, a first-of-its-kind generation model designed to generate adversarial packets that evade detection without relying on specific NIDS components. DiffuPac integrates a pre-trained Bidirectional Encoder Representations from Transformers (BERT) model with a diffusion model, which, through its capability for conditional denoising and classifier-free guidance, effectively addresses the real-world constraint of limited attacker knowledge. By concatenating malicious packets with contextually relevant normal packets and applying targeted noising only to the malicious packets, DiffuPac seamlessly blends adversarial packets into genuine network traffic. Through evaluations on real-world datasets, we demonstrate that DiffuPac achieves strong evasion capabilities against sophisticated NIDS, outperforming conventional methods by an average of 6.69 percentage points, while preserving the functionality and practicality of the generated adversarial packets.
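A rough sketch of the targeted-noising idea described above, in which forward-diffusion noise is applied only to the malicious positions of a concatenated (normal + malicious) sequence. The names, shapes, and noise-schedule interface here are our own assumptions, not DiffuPac's actual implementation:

```python
import torch

def targeted_noising(embeddings, malicious_mask, t, alpha_bars):
    """Apply forward-diffusion noise only to malicious packets.

    embeddings:     (L, D) packet/token embeddings, e.g. from a BERT encoder
    malicious_mask: (L,)  1 for malicious positions, 0 for contextual normal ones
    t:              diffusion timestep index
    alpha_bars:     1-D tensor of cumulative noise-schedule values (alpha-bar)
    Normal packets are left untouched so they can serve as conditioning context.
    """
    alpha_bar = alpha_bars[t]
    noise = torch.randn_like(embeddings)
    noised = alpha_bar.sqrt() * embeddings + (1 - alpha_bar).sqrt() * noise
    mask = malicious_mask.unsqueeze(-1).float()
    # Keep normal positions as-is; replace malicious positions with noised ones.
    return mask * noised + (1 - mask) * embeddings
```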