Supplementary Materials for House of Cans: Covert Transmission of Internal Datasets via Capacity-Aware Neuron Steganography

Neural Information Processing Systems

However, considering the ever-evolving paradigms in deep learning, employees with ulterior motives may fabricate reasons, such as the requirements of data augmentation [6] or the purpose of multimodal learning [3], to apply for access to both relevant and irrelevant private datasets, a tactic common in social engineering [4].



Why and How Auxiliary Tasks Improve JEPA Representations

Yu, Jiacan, Chen, Siyi, Liu, Mingrui, Horiuchi, Nono, Braverman, Vladimir, Xu, Zicheng, Haramati, Dan, Balestriero, Randall

arXiv.org Artificial Intelligence

Joint-Embedding Predictive Architecture (JEPA) is increasingly used for visual representation learning and as a component in model-based RL, but its behavior remains poorly understood. We provide a theoretical characterization of a simple, practical JEPA variant that has an auxiliary regression head trained jointly with latent dynamics. We prove a No Unhealthy Representation Collapse theorem: in deterministic MDPs, if training drives both the latent-transition consistency loss and the auxiliary regression loss to zero, then any pair of non-equivalent observations, i.e., those that do not have the same transition dynamics or auxiliary value, must map to distinct latent representations. Thus, the auxiliary task anchors which distinctions the representation must preserve. Controlled ablations in a counting environment corroborate the theory and show that training the JEPA model jointly with the auxiliary head generates a richer representation than training them separately. Our work indicates a path to improve JEPA encoders: training them with an auxiliary function that, together with the transition dynamics, encodes the right equivalence relations.
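To make the setup concrete, here is a minimal sketch of a JEPA-style encoder trained jointly with a latent dynamics predictor (the transition-consistency loss) and an auxiliary regression head (the auxiliary loss), as the abstract describes. All module names, shapes, the stop-gradient target, and the (obs, action, next_obs, aux_value) batch format are illustrative assumptions, not the paper's implementation.

```python
# Sketch: joint JEPA training with latent dynamics + auxiliary regression head.
# Shapes, names, and the detached target are assumptions for illustration.
import torch
import torch.nn as nn

class JEPAWithAuxHead(nn.Module):
    def __init__(self, obs_dim, act_dim, latent_dim=64):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(obs_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
        # Latent transition model: predicts z_{t+1} from (z_t, a_t).
        self.dynamics = nn.Sequential(
            nn.Linear(latent_dim + act_dim, 128), nn.ReLU(),
            nn.Linear(128, latent_dim))
        # Auxiliary head: regresses a scalar auxiliary value from z_t.
        self.aux_head = nn.Linear(latent_dim, 1)

    def losses(self, obs, action, next_obs, aux_target):
        z = self.encoder(obs)
        z_next = self.encoder(next_obs)
        z_next_pred = self.dynamics(torch.cat([z, action], dim=-1))
        # Transition-consistency loss: predicted next latent vs. encoded next
        # observation (stop-gradient on the target is one common choice).
        trans_loss = ((z_next_pred - z_next.detach()) ** 2).mean()
        # Auxiliary regression loss: anchors which distinctions z must preserve.
        aux_loss = ((self.aux_head(z).squeeze(-1) - aux_target) ** 2).mean()
        return trans_loss, aux_loss

model = JEPAWithAuxHead(obs_dim=10, act_dim=4)
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
obs, action = torch.randn(32, 10), torch.randn(32, 4)
next_obs, aux_target = torch.randn(32, 10), torch.randn(32)
trans_loss, aux_loss = model.losses(obs, action, next_obs, aux_target)
(trans_loss + aux_loss).backward()  # joint training, per the theorem's premise
opt.step()
```

In the theorem's terms: if, in a deterministic MDP, both losses are driven to zero under this joint objective, then two observations that differ in transition dynamics or auxiliary value cannot be mapped to the same latent code.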






Relational inductive biases on attention mechanisms

Mijangos, Víctor, Gutierrez-Vasques, Ximena, Arriola, Verónica E., Rodríguez-Domínguez, Ulises, Cervantes, Alexis, Almanzara, José Luis

arXiv.org Artificial Intelligence

Inductive learning aims to construct general models from specific examples, guided by biases that influence hypothesis selection and determine generalization capacity. In this work, we characterize the relational inductive biases present in attention mechanisms, understood as assumptions about the underlying relationships between data elements. From the perspective of geometric deep learning, we analyze the most common attention mechanisms in terms of their equivariance properties with respect to permutation subgroups, which allows us to propose a classification based on their relational biases. From this perspective, we show that different attention layers are characterized by the underlying relationships they assume about the input data.
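The baseline case of the equivariance analysis can be checked numerically: unmasked self-attention without positional encodings is equivariant to the full permutation group acting on token positions, i.e., permuting the input rows permutes the output rows the same way. The single-head, unbatched formulation below is a simplification for illustration, not the paper's code.

```python
# Numerical check: plain self-attention (no positional encoding, no mask)
# is permutation-equivariant. Shapes here are illustrative.
import torch

torch.manual_seed(0)
n, d = 5, 8
Wq, Wk, Wv = (torch.randn(d, d) for _ in range(3))

def self_attention(X):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = torch.softmax(Q @ K.T / d ** 0.5, dim=-1)  # row-wise attention weights
    return A @ V

X = torch.randn(n, d)
perm = torch.randperm(n)
out_then_perm = self_attention(X)[perm]  # apply f, then permute rows
perm_then_out = self_attention(X[perm])  # permute rows, then apply f
print(torch.allclose(out_then_perm, perm_then_out, atol=1e-5))  # True
```

Adding positional encodings or a causal mask breaks this symmetry, restricting equivariance to a subgroup of permutations, which is the kind of distinction the proposed classification captures.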