Wang, Xinyi
Air Bumper: A Collision Detection and Reaction Framework for Autonomous MAV Navigation
Wang, Ruoyu, Guo, Zixuan, Chen, Yizhou, Wang, Xinyi, Chen, Ben M.
Autonomous navigation in unknown environments with obstacles remains challenging for micro aerial vehicles (MAVs) due to their limited onboard computing and sensing resources. Although various collision avoidance methods have been developed, it is still possible for drones to collide with unobserved obstacles due to unpredictable disturbances, sensor limitations, and control uncertainty. Instead of attempting to avoid collisions entirely, this article proposes Air Bumper, a collision detection and reaction framework for fully autonomous flight in 3D environments, to improve the safety of drones. Our framework utilizes only the onboard inertial measurement unit (IMU) to detect and estimate collisions. We further design a collision recovery controller for rapid recovery and a collision-aware mapping scheme to integrate collision information into general LiDAR-based sensing and planning frameworks. Our simulation and experimental results show that the quadrotor can rapidly detect, estimate, and recover from collisions with obstacles in 3D space and continue its flight smoothly with the help of the collision-aware map. Air Bumper will be released as open-source software on GitHub.
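The IMU-based detection idea lends itself to a compact illustration. The sketch below is a minimal stand-in, not the paper's actual estimator (the function name, inputs, and threshold are all illustrative assumptions): a collision is flagged when the measured acceleration departs sharply from the commanded one, and the residual's direction serves as a crude impact estimate.

```python
import numpy as np

def detect_collision(accel_meas, accel_cmd, threshold=8.0):
    """Flag a collision when measured body acceleration deviates from the
    commanded (expected) acceleration by more than `threshold` (m/s^2).

    Returns (collided, direction), where direction is the unit vector of the
    residual, usable as a rough estimate of the impact direction.
    """
    residual = np.asarray(accel_meas, dtype=float) - np.asarray(accel_cmd, dtype=float)
    magnitude = np.linalg.norm(residual)
    if magnitude > threshold:
        return True, residual / magnitude
    return False, None

# Nominal hover: small residual, no collision flagged.
collided, _ = detect_collision([0.1, 0.0, 9.8], [0.0, 0.0, 9.8])
# Impact: a large lateral spike relative to the commanded acceleration.
hit, direction = detect_collision([12.0, 0.0, 9.8], [0.0, 0.0, 9.8])
```

A real system would additionally filter the IMU signal and gate on duration to reject vibration and gust transients.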
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Wang, Xinyi, Wieting, John, Clark, Jonathan H.
Learning paradigms for large language models (LLMs) currently tend to fall within either in-context learning (ICL) or full fine-tuning. Each of these comes with its own trade-offs based on available data, model size, compute cost, ease of use, and final quality, with neither solution performing well across the board. In this article, we first describe ICL and fine-tuning paradigms in a way that highlights their natural connections. Some of their most exciting capabilities, such as producing logical reasoning to solve a problem, are found to emerge only when the model size is over a certain threshold, often hundreds of billions of parameters (Wei et al., 2022b;a). The impressive capabilities of these models to produce high-quality responses without any task-specific tuning, along with the very high cost of further tuning such models, have led much recent work to focus on the paradigm of in-context learning (ICL) -- placing a few task-specific examples and instructions into the model's input (Brown et al., 2020; Chowdhery et al., 2022; Google et al., 2023; OpenAI, 2023). Although prior work has seen that fine-tuning a model on task data can often lead to superior performance on the downstream task compared to ICL (Scao & Rush, 2021; Schick & Schütze, 2020a;b; Asai et al., 2023), there are significantly fewer recent efforts on fine-tuning models for tasks with limited data, perhaps because the time and compute costs associated with tuning a very large model drive practitioners toward smaller models, abandoning the ability to take advantage of emergent model capabilities.
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies
Pan, Liangming, Saxon, Michael, Xu, Wenda, Nathani, Deepak, Wang, Xinyi, Wang, William Yang
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks. However, their efficacy is undermined by undesired and inconsistent behaviors, including hallucination, unfaithful reasoning, and toxic content. A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output. Techniques leveraging automated feedback -- either produced by the LLM itself or some external system -- are of particular interest as they are a promising way to make LLM-based solutions more practical and deployable with minimal human feedback. This paper presents a comprehensive review of this emerging class of techniques. We analyze and taxonomize a wide array of recent work utilizing these strategies, including training-time, generation-time, and post-hoc correction. We also summarize the major applications of this strategy and conclude by discussing future directions and challenges.
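The post-hoc correction strategies surveyed above share a simple skeleton: draft, obtain automated feedback, revise, repeat. The sketch below illustrates that loop with hypothetical `generate` and `critique` callables standing in for an LLM and an automated critic; it is a schematic of the pattern, not any specific method from the survey.

```python
def self_correct(generate, critique, prompt, max_rounds=3):
    """Generic post-hoc self-correction loop: draft an answer, obtain
    automated feedback, and revise until the critic reports no remaining
    issue or the round budget is exhausted. `generate` and `critique`
    are placeholders for calls to an LLM or an external verifier."""
    output = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(prompt, output)
        if feedback is None:  # critic found no remaining issue
            return output
        output = generate(
            f"{prompt}\nPrevious answer: {output}\nFeedback: {feedback}\nRevise."
        )
    return output

# Toy instantiation: the 'critic' demands an explanation in the answer.
def toy_generate(p):
    return "answer because reasons" if "Feedback" in p else "answer"

def toy_critique(p, out):
    return None if "because" in out else "add an explanation"

result = self_correct(toy_generate, toy_critique, "why?")
```

Training-time and generation-time variants replace the outer loop with fine-tuning on feedback or with feedback-guided decoding, respectively.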
Non-parametric Probabilistic Time Series Forecasting via Innovations Representation
Wang, Xinyi, Lee, Meijen, Zhao, Qing, Tong, Lang
Probabilistic time series forecasting predicts the conditional probability distribution of a time series at a future time given past realizations. Such techniques are critical in risk-based decision-making and planning under uncertainty. Existing approaches are primarily based on parametric or semi-parametric time-series models that are restrictive, difficult to validate, and challenging to adapt to varying conditions. This paper proposes a nonparametric method based on the classic notion of {\em innovations}, pioneered by Norbert Wiener and Gopinath Kallianpur, that causally transforms a nonparametric random process into an independent and identically distributed (i.i.d.) uniform {\em innovations process}. We present a machine-learning architecture and a learning algorithm that circumvent two limitations of the original Wiener-Kallianpur innovations representation: (i) the need for known probability distributions of the time series and (ii) the existence of a causal decoder that reproduces the original time series from the innovations representation. We develop a deep-learning approach and a Monte Carlo sampling technique to obtain a generative model for the predicted conditional probability distribution of the time series based on a weak notion of the Wiener-Kallianpur innovations representation. The efficacy of the proposed probabilistic forecasting technique is demonstrated on a variety of electricity price datasets, showing marked improvement over leading probabilistic forecasting benchmarks.
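For intuition, the mapping to uniform innovations can be illustrated in the degenerate i.i.d. case with the classical probability integral transform; the paper's actual contribution is learning a causal neural transform (and decoder) for dependent processes, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_pit(x):
    """Map samples to (0, 1) via ranks under the empirical CDF (the
    probability integral transform). For an i.i.d. series this yields an
    approximately uniform sequence -- the simplest instance of an
    'innovations' representation."""
    ranks = np.argsort(np.argsort(x))          # rank of each sample, 0..n-1
    return (ranks + 0.5) / len(x)              # midpoint-adjusted uniform scores

x = rng.normal(loc=3.0, scale=2.0, size=1000)  # non-uniform raw samples
u = empirical_pit(x)                           # approximately Uniform(0, 1)
```

For a dependent process the transform must instead condition each sample on the past, which is exactly what the learned causal encoder provides.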
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Ruder, Sebastian, Clark, Jonathan H., Gutkin, Alexander, Kale, Mihir, Ma, Min, Nicosia, Massimo, Rijhwani, Shruti, Riley, Parker, Sarr, Jean-Michel A., Wang, Xinyi, Wieting, John, Gupta, Nitish, Katanova, Anna, Kirov, Christo, Dickinson, Dana L., Roark, Brian, Samanta, Bidisha, Tao, Connie, Adelani, David I., Axelrod, Vera, Caswell, Isaac, Cherry, Colin, Garrette, Dan, Ingle, Reeve, Johnson, Melvin, Panteleev, Dmitry, Talukdar, Partha
Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot; its focus on user-centric tasks -- tasks with broad adoption by speakers of high-resource languages; and its focus on under-represented languages where this scarce-data scenario tends to be most realistic. XTREME-UP evaluates the capabilities of language models across 88 under-represented languages over 9 key user-centric technologies including ASR, OCR, MT, and information access tasks that are of general utility. We create new datasets for OCR, autocomplete, semantic parsing, and transliteration, and build on and refine existing datasets for other tasks. XTREME-UP provides a methodology for evaluating many modeling scenarios including text-only, multi-modal (vision, audio, and text), supervised parameter tuning, and in-context learning. We evaluate commonly used models on the benchmark. We release all code and scripts to train and evaluate models.
mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations
Pfeiffer, Jonas, Piccinno, Francesco, Nicosia, Massimo, Wang, Xinyi, Reid, Machel, Ruder, Sebastian
Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot settings. To address these challenges, we propose mmT5, a modular multilingual sequence-to-sequence model. mmT5 utilizes language-specific modules during pre-training, which disentangle language-specific information from language-agnostic information. We identify representation drift during fine-tuning as a key limitation of modular generative models and develop strategies that enable effective zero-shot transfer. Our model outperforms mT5 at the same parameter sizes by a large margin on representative natural language understanding and generation tasks in 40+ languages. Compared to mT5, mmT5 raises the rate of generating text in the correct language under zero-shot settings from 7% to 99%, thereby greatly alleviating the source language hallucination problem.
Serial Contrastive Knowledge Distillation for Continual Few-shot Relation Extraction
Wang, Xinyi, Wang, Zitao, Hu, Wei
Conventional RE methods (Heist and Paulheim, 2017; Zhang et al., 2018) mainly assume a fixed pre-defined relation set and train on a fixed dataset. However, they cannot work well with the new relations that continue emerging in some real-world scenarios of RE. Continual RE (Wang et al., 2019; Han et al., 2020; Wu et al., 2021) was proposed as a new paradigm to solve this situation, which applies the idea of continual learning (Parisi et al., 2019) to the field of RE. Compared with conventional RE, continual RE is more challenging. It requires the model to learn emerging relations while maintaining a stable and accurate classification of old relations, i.e., the so-called catastrophic forgetting problem (Thrun and Mitchell, 1995). Therefore, the continual few-shot RE paradigm (Qin and Joty, 2022) was proposed to simulate real human learning scenarios, where new knowledge can be acquired from a small number of new samples. As illustrated in Figure 1, the continual few-shot RE paradigm expects the model to continuously learn new relations through abundant training data only for the first task, but through sparse training data for all subsequent tasks. Thus, the model needs to identify the growing relations well with few labeled data for them while retaining the knowledge on old relations without re-training from scratch. As relations grow, the confusion about relation representations leads to catastrophic forgetting.
Sampling-based path planning under temporal logic constraints with real-time adaptation
Chen, Yizhou, Wang, Ruoyu, Wang, Xinyi, Chen, Ben M.
Replanning in temporal logic tasks is extremely difficult during the online execution of robots. This study introduces an effective path planner that computes solutions for temporal logic goals and instantly adapts to non-static and partially unknown environments. Given prior knowledge and a task specification, the planner first identifies an initial feasible solution by growing a sampling-based search tree. While carrying out the computed plan, the robot maintains a solution library to continuously enhance the unfinished part of the plan and store backup plans. The planner updates existing plans when meeting unexpected obstacles or recognizing flaws in prior knowledge. Once a high-level path is obtained, a trajectory generator tracks the path by dividing it into segments of motion primitives. Our planner is integrated into an autonomous mobile robot system and further deployed on a multicopter with limited onboard processing power. In simulation and real-world experiments, our planner is demonstrated to swiftly and effectively adjust to environmental uncertainties.
Causal Balancing for Domain Generalization
Wang, Xinyi, Saxon, Michael, Li, Jiachen, Zhang, Hongyang, Zhang, Kun, Wang, William Yang
While machine learning models rapidly advance the state-of-the-art on various real-world tasks, out-of-domain (OOD) generalization remains a challenging problem given the vulnerability of these models to spurious correlations. We propose a balanced mini-batch sampling strategy to transform a biased data distribution into a spurious-free balanced distribution, based on the invariance of the underlying causal mechanisms for the data generation process. We argue that the Bayes optimal classifiers trained on such a balanced distribution are minimax optimal across a diverse enough environment space. We also provide an identifiability guarantee for the latent variable model of the proposed data generation process when enough training environments are utilized. Experiments are conducted on DomainBed, demonstrating empirically that our method obtains the best performance across 20 baselines reported on the benchmark.
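A minimal sketch of the balanced-sampling idea follows. It is a deliberate simplification: it balances over observed (label, environment) groups, whereas the paper balances with respect to a learned latent causal model; all names here are illustrative.

```python
import random
from collections import defaultdict

def balanced_minibatch(dataset, per_group=4, seed=0):
    """Draw a mini-batch with an equal number of examples from every
    (label, environment) group, so the batch distribution no longer
    reflects spurious label-environment correlations in the raw data.
    `dataset` is a list of (x, label, env) triples."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for x, label, env in dataset:
        groups[(label, env)].append((x, label, env))
    batch = []
    for key in sorted(groups):
        batch.extend(rng.sample(groups[key], k=min(per_group, len(groups[key]))))
    return batch

# A biased toy dataset: label 1 co-occurs mostly with environment "A".
data = [(i, 1, "A") for i in range(90)] + [(i, 1, "B") for i in range(10)] \
     + [(i, 0, "A") for i in range(10)] + [(i, 0, "B") for i in range(90)]
batch = balanced_minibatch(data, per_group=4)
```

Although label 1 appears with environment "A" 90% of the time in the raw data, each of the four groups contributes equally to the batch, removing that correlation from the training signal.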
PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets through Progressive Evaluation of Cluster Outliers
Saxon, Michael, Wang, Xinyi, Xu, Wenda, Wang, William Yang
Building natural language inference (NLI) benchmarks that are both challenging for modern techniques and free from shortcut biases is difficult. Chief among these biases is "single sentence label leakage," where annotator-introduced spurious correlations yield datasets where the logical relation between (premise, hypothesis) pairs can be accurately predicted from only a single sentence, something that should in principle be impossible. We demonstrate that despite efforts to reduce this leakage, it persists in modern datasets that have been introduced since its 2018 discovery. To enable future amelioration efforts, we introduce a novel model-driven technique, the progressive evaluation of cluster outliers (PECO), which enables both the objective measurement of leakage and the automated detection of subpopulations in the data which maximally exhibit it.
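The leakage phenomenon itself can be caricatured with a tiny hypothesis-only probe: if labels are predictable from the hypothesis alone, the dataset leaks. This word-counting probe is a deliberately crude stand-in (PECO is model-driven and cluster-based, and a real measurement would use held-out data); it only shows why single-sentence predictability signals an artifact.

```python
from collections import Counter, defaultdict

def hypothesis_only_accuracy(pairs):
    """Crude single-sentence probe: tally which labels each word co-occurs
    with, then predict each example's label from its words' label votes.
    For brevity this scores in-sample; accuracy far above chance indicates
    annotator artifacts, since the premise is never consulted."""
    word_labels = defaultdict(Counter)
    for hypothesis, label in pairs:
        for w in hypothesis.lower().split():
            word_labels[w][label] += 1
    correct = 0
    for hypothesis, label in pairs:
        votes = Counter()
        for w in hypothesis.lower().split():
            votes.update(word_labels[w])
        if votes and votes.most_common(1)[0][0] == label:
            correct += 1
    return correct / len(pairs)

# Toy artifact: negation words correlate with "contradiction".
toy = [("the cat is not here", "contradiction"),
       ("nobody is outside", "contradiction"),
       ("a dog runs", "entailment"),
       ("a man smiles", "entailment")]
acc = hypothesis_only_accuracy(toy)
```

Here the probe labels every example correctly without ever seeing a premise, which is exactly the impossibility that flags leakage.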