Goto

Collaborating Authors

 difference



TITAN: Graph-Executable Reasoning for Cyber Threat Intelligence

Simoni, Marco, Fontana, Aleksandar, Saracino, Andrea, Mori, Paolo

arXiv.org Artificial Intelligence

TITAN (Threat Intelligence Through Automated Navigation) is a framework that connects natural-language cyber-threat queries with executable reasoning over a structured knowledge graph. It integrates a path-planner model, which predicts logical relation chains from text, and a graph executor that traverses the TITAN Ontology to retrieve factual answers and supporting evidence. Unlike traditional retrieval systems, TITAN operates on a typed, bidirectional graph derived from MITRE ATT&CK, allowing reasoning to move clearly and reversibly between threats, behaviors, and defenses. To support training and evaluation, we introduce the TITAN Dataset, a corpus of 88,209 examples (Train: 74,258; Test: 13,951) pairing natural-language questions with executable reasoning paths and step-by-step Chain-of-Thought explanations. Empirical evaluations show that TITAN enables models to generate syntactically valid and semantically coherent reasoning paths that can be deterministically executed on the underlying graph.



Single-Loop Stochastic Algorithms for Difference of Max-Structured Weakly Convex Functions

Neural Information Processing Systems

In this paper, we study a class of non-smooth non-convex problems in the form of \min_{x}[\max_{y\in\mathcal Y}\phi(x, y) - \max_{z\in\mathcal Z}\psi(x, z)], where both \Phi(x) \max_{y\in\mathcal Y}\phi(x, y) and \Psi(x) \max_{z\in\mathcal Z}\psi(x, z) are weakly convex functions, and \phi(x, y), \psi(x, z) are strongly concave functions in terms of y and z, respectively. It covers two families of problems that have been studied but are missing single-loop stochastic algorithms, i.e., difference of weakly convex functions and weakly convex strongly-concave min-max problems. We propose a stochastic Moreau envelope approximate gradient method dubbed SMAG, the first single-loop algorithm for solving these problems, and provide a state-of-the-art non-asymptotic convergence rate. The key idea of the design is to compute an approximate gradient of the Moreau envelopes of \Phi, \Psi using only one step of stochastic gradient update of the primal and dual variables. Empirically, we conduct experiments on positive-unlabeled (PU) learning and partial area under ROC curve (pAUC) optimization with an adversarial fairness regularizer to validate the effectiveness of our proposed algorithms.


Difference of Convex Functions Programming for Reinforcement Learning

Bilal Piot, Matthieu Geist, Olivier Pietquin

Neural Information Processing Systems

Large Markov Decision Processes are usually solved using Approximate Dynamic Programming methods such as Approximate Value Iteration or Approximate Policy Iteration. The main contribution of this paper is to show that, alternatively, the optimal state-action value function can be estimated using Difference of Convex functions (DC) Programming.


Adapting to Evolving Adversaries with Regularized Continual Robust Training

Dai, Sihui, Cianfarani, Christian, Bhagoji, Arjun, Sehwag, Vikash, Mittal, Prateek

arXiv.org Artificial Intelligence

Robust training methods typically defend against specific attack types, such as Lp attacks with fixed budgets, and rarely account for the fact that defenders may encounter new attacks over time. A natural solution is to adapt the defended model to new adversaries as they arise via fine-tuning, a method which we call continual robust training (CRT). However, when implemented naively, fine-tuning on new attacks degrades robustness on previous attacks. This raises the question: how can we improve the initial training and fine-tuning of the model to simultaneously achieve robustness against previous and new attacks? We present theoretical results which show that the gap in a model's robustness against different attacks is bounded by how far each attack perturbs a sample in the model's logit space, suggesting that regularizing with respect to this logit space distance can help maintain robustness against previous attacks. Extensive experiments on 3 datasets (CIFAR-10, CIFAR-100, and ImageNette) and over 100 attack combinations demonstrate that the proposed regularization improves robust accuracy with little overhead in training time. Our findings and open-source code lay the groundwork for the deployment of models robust to evolving attacks.


Disentanglement in Difference: Directly Learning Semantically Disentangled Representations by Maximizing Inter-Factor Differences

Zhang, Xingshen, Liu, Shuangrong, Lu, Xintao, Pang, Chaoran, Wang, Lin, Yang, Bo

arXiv.org Artificial Intelligence

In this study, Disentanglement in Difference(DiD) is proposed to address the inherent inconsistency between the statistical independence of latent variables and the goal of semantic disentanglement in disentanglement representation learning. Conventional disentanglement methods achieve disentanglement representation by improving statistical independence among latent variables. However, the statistical independence of latent variables does not necessarily imply that they are semantically unrelated, thus, improving statistical independence does not always enhance disentanglement performance. To address the above issue, DiD is proposed to directly learn semantic differences rather than the statistical independence of latent variables. In the DiD, a Difference Encoder is designed to measure the semantic differences; a contrastive loss function is established to facilitate inter-dimensional comparison. Both of them allow the model to directly differentiate and disentangle distinct semantic factors, thereby resolving the inconsistency between statistical independence and semantic disentanglement. Experimental results on the dSprites and 3DShapes datasets demonstrate that the proposed DiD outperforms existing mainstream methods across various disentanglement metrics.


The Differences Between Direct Alignment Algorithms are a Blur

Gorbatovski, Alexey, Shaposhnikov, Boris, Sinii, Viacheslav, Malakhov, Alexey, Gavrilov, Daniil

arXiv.org Artificial Intelligence

Direct Alignment Algorithms (DAAs) simplify language model alignment by replacing reinforcement learning (RL) and reward modeling (RM) in Reinforcement Learning from Human Feedback (RLHF) with direct policy optimization. DAAs can be classified by their ranking losses (pairwise vs. pointwise), by the rewards used in those losses (e.g., likelihood ratios of policy and reference policy, or odds ratios), or by whether a Supervised Fine-Tuning (SFT) phase is required (two-stage vs. one-stage). We first show that one-stage methods underperform two-stage methods. To address this, we incorporate an explicit SFT phase and introduce the $\beta$ parameter, controlling the strength of preference optimization, into single-stage ORPO and ASFT. These modifications improve their performance in Alpaca Eval 2 by +$3.46$ (ORPO) and +$8.27$ (ASFT), matching two-stage methods like DPO. Further analysis reveals that the key factor is whether the approach uses pairwise or pointwise objectives, rather than the specific implicit reward or loss function. These results highlight the importance of careful evaluation to avoid premature claims of performance gains or overall superiority in alignment algorithms.


FLawN-T5: An Empirical Examination of Effective Instruction-Tuning Data Mixtures for Legal Reasoning

Niklaus, Joel, Zheng, Lucia, McCarthy, Arya D., Hahn, Christopher, Rosen, Brian M., Henderson, Peter, Ho, Daniel E., Honke, Garrett, Liang, Percy, Manning, Christopher

arXiv.org Artificial Intelligence

Instruction tuning is an important step in making language models useful for direct user interaction. However, many legal tasks remain out of reach for most open LLMs and there do not yet exist any large scale instruction datasets for the domain. This critically limits research in this application area. In this work, we curate LawInstruct, a large legal instruction dataset, covering 17 jurisdictions, 24 languages and a total of 12M examples. We present evidence that domain-specific pretraining and instruction tuning improve performance on LegalBench, including improving Flan-T5 XL by 8 points or 16\% over the baseline. However, the effect does not generalize across all tasks, training regimes, model sizes, and other factors. LawInstruct is a resource for accelerating the development of models with stronger information processing and decision making capabilities in the legal domain.


PINNslope: seismic data interpolation and local slope estimation with physics informed neural networks

Brandolin, Francesco, Ravasi, Matteo, Alkhalifah, Tariq

arXiv.org Artificial Intelligence

Interpolation of aliased seismic data constitutes a key step in a seismic processing workflow to obtain high quality velocity models and seismic images. Building on the idea of describing seismic wavefields as a superposition of local plane waves, we propose to interpolate seismic data by utilizing a physics informed neural network (PINN). In the proposed framework, two feed-forward neural networks are jointly trained using the local plane wave differential equation as well as the available data as two terms in the objective function: a primary network assisted by positional encoding is tasked with reconstructing the seismic data, whilst an auxiliary, smaller network estimates the associated local slopes. Results on synthetic and field data validate the effectiveness of the proposed method in handling aliased (coarsely sampled) data and data with large gaps. Our method compares favorably against a classic least-squares inversion approach regularized by the local plane-wave equation as well as a PINN-based approach with a single network and pre-computed local slopes. We find that introducing a second network to estimate the local slopes whilst at the same time interpolating the aliased data enhances the overall reconstruction capabilities and convergence behavior of the primary network. Moreover, an additional positional encoding layer embedded as the first layer of the wavefield network confers to the network the ability to converge faster improving the accuracy of the data term.