Indian Ocean
Better Zero-Shot Reasoning with Self-Adaptive Prompting
Wan, Xingchen, Sun, Ruoxi, Dai, Hanjun, Arik, Sercan O., Pfister, Tomas
Modern large language models (LLMs) have demonstrated impressive capabilities at sophisticated tasks, often through step-by-step reasoning similar to humans. This is made possible by their strong few and zero-shot abilities -- they can effectively learn from a handful of handcrafted, completed responses ("in-context examples"), or are prompted to reason spontaneously through specially designed triggers. Nonetheless, some limitations have been observed. First, performance in the few-shot setting is sensitive to the choice of examples, whose design requires significant human effort. Moreover, given the diverse downstream tasks of LLMs, it may be difficult or laborious to handcraft per-task labels. Second, while the zero-shot setting does not require handcrafting, its performance is limited due to the lack of guidance to the LLMs. To address these limitations, we propose Consistency-based Self-adaptive Prompting (COSP), a novel prompt design method for LLMs. Requiring neither handcrafted responses nor ground-truth labels, COSP selects and builds the set of examples from the LLM zero-shot outputs via carefully designed criteria that combine consistency, diversity and repetition. In the zero-shot setting for three different LLMs, we show that using only LLM predictions, COSP improves performance up to 15% compared to zero-shot baselines and matches or exceeds few-shot baselines for a range of reasoning tasks.
Semantic Structure Enhanced Event Causality Identification
Hu, Zhilei, Li, Zixuan, Jin, Xiaolong, Bai, Long, Guan, Saiping, Guo, Jiafeng, Cheng, Xueqi
Event Causality Identification (ECI) aims to identify causal relations between events in unstructured texts. This is a very challenging task, because causal relations are usually expressed by implicit associations between events. Existing methods usually capture such associations by directly modeling the texts with pre-trained language models, which underestimate two kinds of semantic structures vital to the ECI task, namely, event-centric structure and event-associated structure. The former includes important semantic elements related to the events to describe them more precisely, while the latter contains semantic paths between two events to provide possible supports for ECI. In this paper, we study the implicit associations between events by modeling the above explicit semantic structures, and propose a Semantic Structure Integration model (SemSIn). It utilizes a GNN-based event aggregator to integrate the event-centric structure information, and employs an LSTM-based path aggregator to capture the event-associated structure information between two events. Experimental results on three widely used datasets show that SemSIn achieves significant improvements over baseline methods.
SHINE: Syntax-augmented Hierarchical Interactive Encoder for Zero-shot Cross-lingual Information Extraction
Ma, Jun-Yu, Gu, Jia-Chen, Ling, Zhen-Hua, Liu, Quan, Liu, Cong, Hu, Guoping
Zero-shot cross-lingual information extraction(IE) aims at constructing an IE model for some low-resource target languages, given annotations exclusively in some rich-resource languages. Recent studies based on language-universal features have shown their effectiveness and are attracting increasing attention. However, prior work has neither explored the potential of establishing interactions between language-universal features and contextual representations nor incorporated features that can effectively model constituent span attributes and relationships between multiple spans. In this study, a syntax-augmented hierarchical interactive encoder (SHINE) is proposed to transfer cross-lingual IE knowledge. The proposed encoder is capable of interactively capturing complementary information between features and contextual information, to derive language-agnostic representations for various IE tasks. Concretely, a multi-level interaction network is designed to hierarchically interact the complementary information to strengthen domain adaptability. Besides, in addition to the well-studied syntax features of part-of-speech and dependency relation, a new syntax feature of constituency structure is introduced to model the constituent span information which is crucial for IE. Experiments across seven languages on three IE tasks and four benchmarks verify the effectiveness and generalization ability of the proposed method.
A Comparative Study of Methods for Estimating Conditional Shapley Values and When to Use Them
Olsen, Lars Henry Berge, Glad, Ingrid Kristine, Jullum, Martin, Aas, Kjersti
Shapley values originated in cooperative game theory but are extensively used today as a model-agnostic explanation framework to explain predictions made by complex machine learning models in the industry and academia. There are several algorithmic approaches for computing different versions of Shapley value explanations. Here, we focus on conditional Shapley values for predictive models fitted to tabular data. Estimating precise conditional Shapley values is difficult as they require the estimation of non-trivial conditional expectations. In this article, we develop new methods, extend earlier proposed approaches, and systematize the new refined and existing methods into different method classes for comparison and evaluation. The method classes use either Monte Carlo integration or regression to model the conditional expectations. We conduct extensive simulation studies to evaluate how precisely the different method classes estimate the conditional expectations, and thereby the conditional Shapley values, for different setups. We also apply the methods to several real-world data experiments and provide recommendations for when to use the different method classes and approaches. Roughly speaking, we recommend using parametric methods when we can specify the data distribution almost correctly, as they generally produce the most accurate Shapley value explanations. When the distribution is unknown, both generative methods and regression models with a similar form as the underlying predictive model are good and stable options. Regression-based methods are often slow to train but produce the Shapley value explanations quickly once trained. The vice versa is true for Monte Carlo-based methods, making the different methods appropriate in different practical situations.
ChatPLUG: Open-Domain Generative Dialogue System with Internet-Augmented Instruction Tuning for Digital Human
Tian, Junfeng, Chen, Hehong, Xu, Guohai, Yan, Ming, Gao, Xing, Zhang, Jianhai, Li, Chenliang, Liu, Jiayi, Xu, Wenshen, Xu, Haiyang, Qian, Qi, Wang, Wei, Ye, Qinghao, Zhang, Jiejing, Zhang, Ji, Huang, Fei, Zhou, Jingren
In this paper, we present ChatPLUG, a Chinese open-domain dialogue system for digital human applications that instruction finetunes on a wide range of dialogue tasks in a unified internet-augmented format. Different from other open-domain dialogue models that focus on large-scale pre-training and scaling up model size or dialogue corpus, we aim to build a powerful and practical dialogue system for digital human with diverse skills and good multi-task generalization by internet-augmented instruction tuning. To this end, we first conduct large-scale pre-training on both common document corpus and dialogue data with curriculum learning, so as to inject various world knowledge and dialogue abilities into ChatPLUG. Then, we collect a wide range of dialogue tasks spanning diverse features of knowledge, personality, multi-turn memory, and empathy, on which we further instruction tune \modelname via unified natural language instruction templates. External knowledge from an internet search is also used during instruction finetuning for alleviating the problem of knowledge hallucinations. We show that \modelname outperforms state-of-the-art Chinese dialogue systems on both automatic and human evaluation, and demonstrates strong multi-task generalization on a variety of text understanding and generation tasks. In addition, we deploy \modelname to real-world applications such as Smart Speaker and Instant Message applications with fast inference. Our models and code will be made publicly available on ModelScope: https://modelscope.cn/models/damo/ChatPLUG-3.7B and Github: https://github.com/X-PLUG/ChatPLUG .
Adaptive Bias Correction for Improved Subseasonal Forecasting
Mouatadid, Soukayna, Orenstein, Paulo, Flaspohler, Genevieve, Cohen, Judah, Oprescu, Miruna, Fraenkel, Ernest, Mackey, Lester
Subseasonal forecasting -- predicting temperature and precipitation 2 to 6 weeks ahead -- is critical for effective water allocation, wildfire management, and drought and flood mitigation. Recent international research efforts have advanced the subseasonal capabilities of operational dynamical models, yet temperature and precipitation prediction skills remain poor, partly due to stubborn errors in representing atmospheric dynamics and physics inside dynamical models. Here, to counter these errors, we introduce an adaptive bias correction (ABC) method that combines state-of-the-art dynamical forecasts with observations using machine learning. We show that, when applied to the leading subseasonal model from the European Centre for Medium-Range Weather Forecasts (ECMWF), ABC improves temperature forecasting skill by 60-90% (over baseline skills of 0.18-0.25) and precipitation forecasting skill by 40-69% (over baseline skills of 0.11-0.15) in the contiguous U.S. We couple these performance improvements with a practical workflow to explain ABC skill gains and identify higher-skill windows of opportunity based on specific climate conditions.
Are Machine Rationales (Not) Useful to Humans? Measuring and Improving Human Utility of Free-Text Rationales
Joshi, Brihi, Liu, Ziyi, Ramnath, Sahana, Chan, Aaron, Tong, Zhewei, Nie, Shaoliang, Wang, Qifan, Choi, Yejin, Ren, Xiang
Among the remarkable emergent capabilities of large language models (LMs) is free-text rationalization; beyond a certain scale, large LMs are capable of generating seemingly useful rationalizations, which in turn, can dramatically enhance their performances on leaderboards. This phenomenon raises a question: can machine generated rationales also be useful for humans, especially when lay humans try to answer questions based on those machine rationales? We observe that human utility of existing rationales is far from satisfactory, and expensive to estimate with human studies. Existing metrics like task performance of the LM generating the rationales, or similarity between generated and gold rationales are not good indicators of their human utility. While we observe that certain properties of rationales like conciseness and novelty are correlated with their human utility, estimating them without human involvement is challenging. We show that, by estimating a rationale's helpfulness in answering similar unseen instances, we can measure its human utility to a better extent. We also translate this finding into an automated score, GEN-U, that we propose, which can help improve LMs' ability to generate rationales with better human utility, while maintaining most of its task performance. Lastly, we release all code and collected data with this project.
US evacuates private citizens from Sudan for first time
Bryan Stern and Mark Geist discusses helping Americans out of a war zone after being left behind by the Biden admin. The U.S. has evacuated its first group of American citizens and permanent residents from Sudan since war broke out in the capital weeks ago. The land evacuation started Friday with efforts to bus a large group of Americans to the Red Sea via Port Sudan. Officials revealed Saturday that unmanned aircraft provided armed overwatch as a bus convoy carried 200 to 300 Americans over 500 miles. Smoke is seen in Khartoum, Sudan, Wednesday, April 19, 2023.
Dependence of Physiochemical Features on Marine Chlorophyll Analysis with Learning Techniques
Adhikary, Subhrangshu, Chaturvedi, Sudhir Kumar, Banerjee, Saikat, Basu, Sourav
Marine chlorophyll which is present within phytoplankton are the basis of photosynthesis and they have a high significance in sustaining ecological balance as they highly contribute toward global primary productivity and comes under the food chain of many marine organisms. Imbalance in the concentrations of phytoplankton can disrupt the ecological balance. The growth of phytoplankton depends upon the optimum concentrations of physiochemical constituents like iron, nitrates, phosphates, pH level, salinity, etc. and deviations from an ideal concentration can affect the growth of phytoplankton which can ultimately disrupt the ecosystem at a large scale. Thus the analysis of such constituents has high significance to estimate the probable growth of marine phytoplankton. The advancements of remote sensing technologies have improved the scope to remotely study the physiochemical constituents on a global scale. The machine learning techniques have made it possible to predict the marine chlorophyll levels based on physiochemical properties and deep learning helped to do the same but in a more advanced manner simulating the working principle of a human brain. In this study, we have used machine learning and deep learning for the Bay of Bengal to establish a regression model of chlorophyll levels based on physiochemical features and discussed its reliability and performance for different regression models. This could help to estimate the amount of chlorophyll present in water bodies based on physiochemical features so we can plan early in case there arises a possibility of disruption in the ecosystem due to imbalance in marine phytoplankton.
US Navy sails first drone boat through Strait of Hormuz between Iran, Oman
Fox News Flash top headlines are here. Check out what's clicking on Foxnews.com. The U.S. Navy sailed its first drone boat through the strategic Strait of Hormuz on Wednesday, a crucial waterway for global energy supplies where American sailors often faces tense encounters with Iranian forces. The trip by the L3 Harris Arabian Fox MAST-13, a 41-foot speedboat carrying sensors and cameras, drew the attention of Iran's Revolutionary Guard, but took place without incident, said Navy spokesman Cmdr. Two U.S. Coast Guard cutters, the USCGC Charles Moulthrope and USCGC John Scheuerman, accompanied the drone.