Indian Ocean
Adaptive Split Balancing for Optimal Random Forest
Zhang, Yuqian, Ji, Weijie, Bradic, Jelena
While random forests are commonly used for regression problems, existing methods often lack adaptability in complex situations or lose optimality under simple, smooth scenarios. In this study, we introduce the adaptive split balancing forest (ASBF), capable of learning tree representations from data while simultaneously achieving minimax optimality under the Lipschitz class. To exploit higher-order smoothness levels, we further propose a localized version that attains the minimax rate under the H\"older class $\mathcal{H}^{q,\beta}$ for any $q\in\mathbb{N}$ and $\beta\in(0,1]$. Rather than relying on the widely-used random feature selection, we consider a balanced modification to existing approaches. Our results indicate that an over-reliance on auxiliary randomness may compromise the approximation power of tree models, leading to suboptimal results. Conversely, a less random, more balanced approach demonstrates optimality. Additionally, we establish uniform upper bounds and explore the application of random forests in average treatment effect estimation problems. Through simulation studies and real-data applications, we demonstrate the superior empirical performance of the proposed methods over existing random forests.
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages
Ye, Junjie, Li, Sixian, Li, Guanyu, Huang, Caishuang, Gao, Songyang, Wu, Yilong, Zhang, Qi, Gui, Tao, Huang, Xuanjing
Tool learning is widely acknowledged as a foundational approach for deploying large language models (LLMs) in real-world scenarios. While current research primarily emphasizes leveraging tools to augment LLMs, it frequently neglects emerging safety considerations tied to their application. To fill this gap, we present $ToolSword$, a comprehensive framework dedicated to meticulously investigating safety issues linked to LLMs in tool learning. Specifically, ToolSword delineates six safety scenarios for LLMs in tool learning, encompassing $malicious$ $queries$ and $jailbreak$ $attacks$ in the input stage, $noisy$ $misdirection$ and $risky$ $cues$ in the execution stage, and $harmful$ $feedback$ and $error$ $conflicts$ in the output stage. Experiments conducted on 11 open-source and closed-source LLMs reveal enduring safety challenges in tool learning, such as handling harmful queries, employing risky tools, and delivering detrimental feedback, which even GPT-4 is susceptible to. Moreover, we conduct further studies with the aim of fostering research on tool learning safety. The data is released at https://github.com/Junjie-Ye/ToolSword.
Can We Verify Step by Step for Incorrect Answer Detection?
Xu, Xin, Diao, Shizhe, Yang, Can, Wang, Yang
Chain-of-Thought (CoT) prompting has marked a significant advancement in enhancing the reasoning capabilities of large language models (LLMs). Previous studies have developed various extensions of CoT, which focus primarily on enhancing end-task performance. In addition, there has been research on assessing the quality of reasoning chains in CoT. This raises an intriguing question: Is it possible to predict the accuracy of LLM outputs by scrutinizing the reasoning chains they generate? To answer this research question, we introduce a benchmark, R2PE, designed specifically to explore the relationship between reasoning chains and performance in various reasoning tasks spanning five different domains. This benchmark aims to measure the falsehood of the final output of LLMs based on the reasoning steps. To make full use of information in multiple reasoning chains, we propose the process discernibility score (PDS) framework that beats the answer-checking baseline by a large margin. Concretely, this resulted in an average of 5.1% increase in the F1 score across all 45 subsets within R2PE. We further demonstrate our PDS's efficacy in advancing open-domain QA accuracy. Data and code are available at https://github.com/XinXU-USTC/R2PE.
After months fighting Houthis on the USS Eisenhower, sailors face a new kind of sea threat
Kirk Lippold discusses the reported three U.S. strikes against Houthis in Yemen on 'Your World.' Sailors aboard the aircraft carrier USS Dwight D. Eisenhower and its accompanying warships have spent four months straight at sea defending against ballistic missiles and flying attack drones fired by Iranian-backed Houthis, and are now more regularly also defending against a new threat -- fast unmanned vessels that are fired at them through the water. While the Houthis have launched unmanned surface vessels, or USVs, in the past against Saudi coalition forces that have intervened in Yemen's civil war, they were used for the first time against U.S. military and commercial vessels in the Red Sea on Jan. 4. In the weeks since, the Navy has had to intercept and destroy multiple USVs. It's "more of an unknown threat that we don't have a lot of intel on, that could be extremely lethal -- an unmanned surface vessel," said Rear Adm. Marc Miguez, commander of Carrier Strike Group Two, of which the Eisenhower is the flagship. The Houthis "have ways of obviously controlling them just like they do the (unmanned aerial vehicles), and we have very little fidelity as to all the stockpiles of what they have USV-wise," Miguez said.
Recovering the Pre-Fine-Tuning Weights of Generative Models
Horwitz, Eliahu, Kahana, Jonathan, Hoshen, Yedid
The dominant paradigm in generative modeling consists of two steps: i) pre-training on a large-scale but unsafe dataset, ii) aligning the pre-trained model with human values via fine-tuning. This practice is considered safe, as no current method can recover the unsafe, pre-fine-tuning model weights. In this paper, we demonstrate that this assumption is often false. Concretely, we present Spectral DeTuning, a method that can recover the weights of the pre-fine-tuning model using a few low-rank (LoRA) fine-tuned models. In contrast to previous attacks that attempt to recover pre-fine-tuning capabilities, our method aims to recover the exact pre-fine-tuning weights. Our approach exploits this new vulnerability against large-scale models such as a personalized Stable Diffusion and an aligned Mistral.
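The abstract above does not spell out how the weights are recovered, so the following is only a minimal sketch of one way to exploit the low-rank structure, assuming each fine-tuned matrix equals a shared base matrix plus a rank-r update. It alternates between fitting a rank-r residual per model and re-averaging the base estimate. The function names, the toy data, and the alternating scheme itself are illustrative assumptions, not the authors' Spectral DeTuning implementation.

```python
import numpy as np


def low_rank_approx(mat, rank):
    """Best rank-`rank` approximation of `mat` via truncated SVD."""
    u, s, vt = np.linalg.svd(mat, full_matrices=False)
    return (u[:, :rank] * s[:rank]) @ vt[:rank]


def recover_base_weights(finetuned, rank, n_iters=200):
    """Estimate a shared base matrix from several low-rank fine-tuned copies.

    Assumption (hypothetical setup): finetuned[i] = base + B_i @ A_i with
    rank(B_i @ A_i) <= rank. This is an alternating-fit sketch only.
    """
    stacked = np.stack(finetuned)
    base_est = stacked.mean(axis=0)  # naive starting point
    for _ in range(n_iters):
        # Step 1: fit a rank-`rank` residual for each model given the base.
        residuals = np.stack(
            [low_rank_approx(w - base_est, rank) for w in stacked]
        )
        # Step 2: re-estimate the base given the fitted residuals.
        base_est = (stacked - residuals).mean(axis=0)
    return base_est


def make_toy_problem(dim=32, rank=2, n_models=6, seed=0):
    """Synthetic base matrix plus random low-rank updates (toy data)."""
    rng = np.random.default_rng(seed)
    base = rng.normal(size=(dim, dim))
    finetuned = [
        base + rng.normal(size=(dim, rank)) @ rng.normal(size=(rank, dim))
        for _ in range(n_models)
    ]
    return base, finetuned


if __name__ == "__main__":
    base, finetuned = make_toy_problem()
    naive = np.mean(finetuned, axis=0)
    recovered = recover_base_weights(finetuned, rank=2)
    print("naive mean error:", np.linalg.norm(naive - base))
    print("alternating-fit error:", np.linalg.norm(recovered - base))
```

On toy data of this kind, the alternating fit is expected to land closer to the base matrix than a plain average of the fine-tuned copies, because the average keeps the mean of the low-rank updates as error.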
Machine Learning in management of precautionary closures caused by lipophilic biotoxins
Molares-Ulloa, Andres, Fernandez-Blanco, Enrique, Pazos, Alejandro, Rivero, Daniel
Mussel farming is one of the most important aquaculture industries. The main risk to mussel farming is harmful algal blooms (HABs), which pose a risk to human consumption. In Galicia, Spain's main producer of cultivated mussels, the opening and closing of the production areas is controlled by a monitoring program. In addition to the closures resulting from the presence of toxicity exceeding the legal threshold, in the absence of a confirmatory sampling and the existence of risk factors, precautionary closures may be applied. These decisions are made by experts without the support or formalisation of the experience on which they are based. Therefore, this work proposes a predictive model capable of supporting the application of precautionary closures. Achieving sensitivity, accuracy and kappa index values of 97.34%, 91.83% and 0.75 respectively, the kNN algorithm has provided the best results. This allows the creation of a system capable of helping in complex situations where forecast errors are more common.
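For readers unfamiliar with the three reported metrics, the sketch below shows how sensitivity, accuracy and Cohen's kappa are commonly computed from a binary confusion matrix. The counts used are hypothetical and chosen only for illustration; they are not the study's data, and the function name is an assumption.

```python
def binary_metrics(tp, fp, fn, tn):
    """Sensitivity, accuracy and Cohen's kappa from binary confusion counts."""
    total = tp + fp + fn + tn
    sensitivity = tp / (tp + fn)       # share of true positives that were detected
    accuracy = (tp + tn) / total       # observed agreement
    # Agreement expected by chance, from the marginal frequencies.
    p_yes = ((tp + fp) / total) * ((tp + fn) / total)
    p_no = ((fn + tn) / total) * ((fp + tn) / total)
    expected = p_yes + p_no
    kappa = (accuracy - expected) / (1 - expected)
    return sensitivity, accuracy, kappa


if __name__ == "__main__":
    # Hypothetical counts, not taken from the paper.
    sens, acc, kappa = binary_metrics(tp=40, fp=10, fn=5, tn=45)
    print(f"sensitivity={sens:.4f} accuracy={acc:.4f} kappa={kappa:.4f}")
```

Kappa corrects accuracy for chance agreement, which is why a model can report accuracy above 90% while its kappa sits noticeably lower.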
Houthis using Iranian missiles, drones to attack civilian, military targets across Middle East, DIA confirms
Houthi militants in Yemen are using Iranian-supplied missiles and drones to attack civilian and military targets across the Middle East, analysis from the Defense Intelligence Agency (DIA) shows. The report, "Iran: Enabling Houthi Attacks Across the Middle East," aims to provide more insight into the relationship between Iran and the Houthis. The militant group, stationed in Yemen, has for months been striking commercial vessels traveling through the Red Sea in protest of Palestinian civilians killed during Israel's ongoing offensive against Hamas members in Gaza. Houthi fighters stage a rally in support of the Palestinians in the Gaza Strip and against the U.S.-led airstrikes on Yemen, in Sanaa, Yemen, Monday, Jan. 29, 2024. Most recently, Houthi rebels fired ballistic missiles at two ships traveling through Middle East waters.
Generative Flows on Discrete State-Spaces: Enabling Multimodal Flows with Applications to Protein Co-Design
Campbell, Andrew, Yim, Jason, Barzilay, Regina, Rainforth, Tom, Jaakkola, Tommi
Combining discrete and continuous data is an important capability for generative models. We present Discrete Flow Models (DFMs), a new flow-based model of discrete data that provides the missing link in enabling flow-based generative models to be applied to multimodal continuous and discrete data problems. Our key insight is that the discrete equivalent of continuous space flow matching can be realized using Continuous Time Markov Chains. DFMs benefit from a simple derivation that includes discrete diffusion models as a specific instance while allowing improved performance over existing diffusion-based approaches. We utilize our DFMs method to build a multimodal flow-based modeling framework. We apply this capability to the task of protein co-design, wherein we learn a model for jointly generating protein structure and sequence. Our approach achieves state-of-the-art co-design performance while allowing the same multimodal model to be used for flexible generation of the sequence or structure.
OIL-AD: An Anomaly Detection Framework for Sequential Decision Sequences
Wang, Chen, Erfani, Sarah, Alpcan, Tansu, Leckie, Christopher
Anomaly detection in decision-making sequences is a challenging problem due to the complexity of normality representation learning and the sequential nature of the task. Most existing methods based on Reinforcement Learning (RL) are difficult to implement in the real world due to unrealistic assumptions, such as having access to environment dynamics, reward signals, and online interactions with the environment. To address these limitations, we propose an unsupervised method named Offline Imitation Learning based Anomaly Detection (OIL-AD), which detects anomalies in decision-making sequences using two extracted behaviour features: action optimality and sequential association. Our offline learning model is an adaptation of behavioural cloning with a transformer policy network, where we modify the training process to learn a Q function and a state value function from normal trajectories. We propose that the Q function and the state value function can provide sufficient information about agents' behavioural data, from which we derive two features for anomaly detection. The intuition behind our method is that the action optimality feature derived from the Q function can differentiate the optimal action from others at each local state, and the sequential association feature derived from the state value function has the potential to maintain the temporal correlations between decisions (state-action pairs). Our experiments show that OIL-AD can achieve outstanding online anomaly detection performance with up to 34.8% improvement in F1 score over comparable baselines.
INSIDE: LLMs' Internal States Retain the Power of Hallucination Detection
Chen, Chao, Liu, Kai, Chen, Ze, Gu, Yi, Wu, Yue, Tao, Mingyuan, Fu, Zhihang, Ye, Jieping
Knowledge hallucinations have raised widespread concerns for the security and reliability of deployed LLMs. Previous efforts in detecting hallucinations have been employed at logit-level uncertainty estimation or language-level self-consistency evaluation, where the semantic information is inevitably lost during the token decoding procedure. Thus, we propose to explore the dense semantic information retained within LLMs' INternal States for hallucInation DEtection (INSIDE). In particular, a simple yet effective EigenScore metric is proposed to better evaluate responses' self-consistency, which exploits the eigenvalues of responses' covariance matrix to measure the semantic consistency/diversity in the dense embedding space. Furthermore, from the perspective of self-consistent hallucination detection, a test-time feature clipping approach is explored to truncate extreme activations in the internal states, which reduces overconfident generations and potentially benefits the detection of overconfident hallucinations. Extensive experiments and ablation studies are performed on several popular LLMs and question-answering (QA) benchmarks, showing the effectiveness of our proposal. Large Language Models (LLMs) have recently achieved a milestone breakthrough and demonstrated impressive abilities in various applications (Ouyang et al., 2022; OpenAI, 2023). However, it has been widely observed that even the state-of-the-art LLMs often make factually incorrect or nonsense generations (Cohen et al., 2023; Ren et al., 2022; Kuhn et al., 2022), which is also known as knowledge hallucination (Ji et al., 2023). The potentially unreliable generations make it risky to deploy LLMs in practical scenarios.
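The idea of scoring self-consistency through the eigenvalues of the responses' covariance matrix can be sketched as follows. This is a minimal illustration under stated assumptions: the embeddings are random stand-ins for LLM internal states, and the specific form used here (mean log of regularized eigenvalues, with a hypothetical regularizer `alpha`) is an assumed reading of the abstract, not a confirmed reproduction of the paper's definition.

```python
import numpy as np


def eigenscore(embeddings, alpha=1e-3):
    """Diversity score over K response embeddings of dimension d.

    Assumed form: mean log-eigenvalue of the regularized K x K covariance
    between responses. Higher values mean more semantically diverse
    responses, which is read as a hint of hallucination.
    """
    z = np.asarray(embeddings, dtype=float)           # shape (K, d)
    centered = z - z.mean(axis=1, keepdims=True)      # center each embedding
    cov = centered @ centered.T                       # shape (K, K)
    eigvals = np.linalg.eigvalsh(cov)                 # symmetric eigenvalues
    eigvals = np.clip(eigvals, 0.0, None)             # guard tiny negatives
    return float(np.mean(np.log(eigvals + alpha)))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    base = rng.normal(size=64)
    # Stand-in for K=5 near-identical responses (self-consistent).
    consistent = base + 0.01 * rng.normal(size=(5, 64))
    # Stand-in for K=5 unrelated responses (inconsistent).
    diverse = rng.normal(size=(5, 64))
    print("consistent:", eigenscore(consistent))
    print("diverse:   ", eigenscore(diverse))
```

When the sampled responses nearly coincide, the covariance is close to rank one and most eigenvalues collapse toward zero, pulling the score down; unrelated responses spread the spectrum and raise it.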