Wang, Xinyi
Air Bumper: A Collision Detection and Reaction Framework for Autonomous MAV Navigation
Wang, Ruoyu, Guo, Zixuan, Chen, Yizhou, Wang, Xinyi, Chen, Ben M.
Autonomous navigation in unknown environments with obstacles remains challenging for micro aerial vehicles (MAVs) due to their limited onboard computing and sensing resources. Although various collision avoidance methods have been developed, it is still possible for drones to collide with unobserved obstacles due to unpredictable disturbances, sensor limitations, and control uncertainty. Instead of attempting to avoid collisions entirely, this article proposes Air Bumper, a collision detection and reaction framework for fully autonomous flight in 3D environments, to improve the safety of drones. Our framework utilizes only the onboard inertial measurement unit (IMU) to detect and estimate collisions. We further design a collision recovery controller for rapid recovery and a collision-aware mapping scheme to integrate collision information into general LiDAR-based sensing and planning frameworks. Our simulation and experimental results show that the quadrotor can rapidly detect, estimate, and recover from collisions with obstacles in 3D space and continue its flight smoothly with the help of the collision-aware map. Air Bumper will be released as open-source software on GitHub.
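The IMU-based detection idea lends itself to a compact illustration. The sketch below is a minimal stand-in, not the paper's actual estimator (the function name, inputs, and threshold are all illustrative assumptions): a collision is flagged when the measured acceleration departs sharply from the commanded one, and the residual's direction serves as a crude impact estimate.

```python
import numpy as np

def detect_collision(accel_meas, accel_cmd, threshold=8.0):
    """Flag a collision when measured body acceleration deviates from the
    commanded (expected) acceleration by more than `threshold` (m/s^2).

    Returns (collided, direction), where direction is the unit vector of the
    residual, usable as a rough estimate of the impact direction.
    """
    residual = np.asarray(accel_meas, dtype=float) - np.asarray(accel_cmd, dtype=float)
    magnitude = np.linalg.norm(residual)
    if magnitude > threshold:
        return True, residual / magnitude
    return False, None

# Nominal hover: small residual, no collision flagged.
collided, _ = detect_collision([0.1, 0.0, 9.8], [0.0, 0.0, 9.8])
# Impact: a large lateral spike relative to the commanded acceleration.
hit, direction = detect_collision([12.0, 0.0, 9.8], [0.0, 0.0, 9.8])
```

A real system would additionally filter the IMU signal and gate on duration to reject vibration and gust transients.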
FIAT: Fusing learning paradigms with Instruction-Accelerated Tuning
Wang, Xinyi, Wieting, John, Clark, Jonathan H.
Learning paradigms for large language models (LLMs) currently tend to fall within either in-context learning (ICL) or full fine-tuning. Each of these comes with its own trade-offs based on available data, model size, compute cost, ease of use, and final quality, with neither solution performing well across the board. In this article, we first describe ICL and fine-tuning paradigms in a way that highlights their natural connections. Some of their most exciting capabilities, such as producing logical reasoning to solve a problem, are found to emerge only when the model size is over a certain threshold, often hundreds of billions of parameters (Wei et al., 2022b;a). The impressive capabilities of these models to produce high-quality responses without any task-specific tuning, along with the very high cost of further tuning such models, have led much recent work to focus on the paradigm of in-context learning (ICL) -- placing a few task-specific examples and instructions into the model's input (Brown et al., 2020; Chowdhery et al., 2022; Google et al., 2023; OpenAI, 2023). Although prior work has seen that fine-tuning a model on task data can often lead to superior performance on the downstream task compared to ICL (Scao & Rush, 2021; Schick & Schütze, 2020a;b; Asai et al., 2023), there are significantly fewer recent efforts on fine-tuning models for tasks with limited data, perhaps because the time and compute costs associated with tuning a very large model drive practitioners toward smaller models, abandoning the ability to take advantage of emergent model capabilities.
Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies
Pan, Liangming, Saxon, Michael, Xu, Wenda, Nathani, Deepak, Wang, Xinyi, Wang, William Yang
Large language models (LLMs) have demonstrated remarkable performance across a wide array of NLP tasks. However, their efficacy is undermined by undesired and inconsistent behaviors, including hallucination, unfaithful reasoning, and toxic content. A promising approach to rectify these flaws is self-correction, where the LLM itself is prompted or guided to fix problems in its own output. Techniques leveraging automated feedback -- either produced by the LLM itself or some external system -- are of particular interest as they are a promising way to make LLM-based solutions more practical and deployable with minimal human feedback. This paper presents a comprehensive review of this emerging class of techniques. We analyze and taxonomize a wide array of recent work utilizing these strategies, including training-time, generation-time, and post-hoc correction. We also summarize the major applications of this strategy and conclude by discussing future directions and challenges.
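The post-hoc correction strategies surveyed above share a simple skeleton: draft, obtain automated feedback, revise, repeat. The sketch below illustrates that loop with hypothetical `generate` and `critique` callables standing in for an LLM and an automated critic; it is a schematic of the pattern, not any specific method from the survey.

```python
def self_correct(generate, critique, prompt, max_rounds=3):
    """Generic post-hoc self-correction loop: draft an answer, obtain
    automated feedback, and revise until the critic reports no remaining
    issue or the round budget is exhausted. `generate` and `critique`
    are placeholders for calls to an LLM or an external verifier."""
    output = generate(prompt)
    for _ in range(max_rounds):
        feedback = critique(prompt, output)
        if feedback is None:  # critic found no remaining issue
            return output
        output = generate(
            f"{prompt}\nPrevious answer: {output}\nFeedback: {feedback}\nRevise."
        )
    return output

# Toy instantiation: the 'critic' demands an explanation in the answer.
def toy_generate(p):
    return "answer because reasons" if "Feedback" in p else "answer"

def toy_critique(p, out):
    return None if "because" in out else "add an explanation"

result = self_correct(toy_generate, toy_critique, "why?")
```

Training-time and generation-time variants replace the outer loop with fine-tuning on feedback or with feedback-guided decoding, respectively.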
Non-parametric Probabilistic Time Series Forecasting via Innovations Representation
Wang, Xinyi, Lee, Meijen, Zhao, Qing, Tong, Lang
Probabilistic time series forecasting predicts the conditional probability distribution of a time series at a future time given past realizations. Such techniques are critical in risk-based decision-making and planning under uncertainty. Existing approaches are primarily based on parametric or semi-parametric time-series models that are restrictive, difficult to validate, and challenging to adapt to varying conditions. This paper proposes a nonparametric method based on the classic notion of {\em innovations}, pioneered by Norbert Wiener and Gopinath Kallianpur, that causally transforms a nonparametric random process into an independent and identically distributed (i.i.d.) uniform {\em innovations process}. We present a machine-learning architecture and a learning algorithm that circumvent two limitations of the original Wiener-Kallianpur innovations representation: (i) the need for known probability distributions of the time series and (ii) the existence of a causal decoder that reproduces the original time series from the innovations representation. We develop a deep-learning approach and a Monte Carlo sampling technique to obtain a generative model for the predicted conditional probability distribution of the time series based on a weak notion of the Wiener-Kallianpur innovations representation. The efficacy of the proposed probabilistic forecasting technique is demonstrated on a variety of electricity price datasets, showing marked improvement over leading probabilistic forecasting benchmarks.
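For intuition, the mapping to uniform innovations can be illustrated in the degenerate i.i.d. case with the classical probability integral transform; the paper's actual contribution is learning a causal neural transform (and decoder) for dependent processes, which this sketch does not attempt.

```python
import numpy as np

rng = np.random.default_rng(0)

def empirical_pit(x):
    """Map samples to (0, 1) via ranks under the empirical CDF (the
    probability integral transform). For an i.i.d. series this yields an
    approximately uniform sequence -- the simplest instance of an
    'innovations' representation."""
    ranks = np.argsort(np.argsort(x))          # rank of each sample, 0..n-1
    return (ranks + 0.5) / len(x)              # midpoint-adjusted uniform scores

x = rng.normal(loc=3.0, scale=2.0, size=1000)  # non-uniform raw samples
u = empirical_pit(x)                           # approximately Uniform(0, 1)
```

For a dependent process the transform must instead condition each sample on the past, which is exactly what the learned causal encoder provides.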
XTREME-UP: A User-Centric Scarce-Data Benchmark for Under-Represented Languages
Ruder, Sebastian, Clark, Jonathan H., Gutkin, Alexander, Kale, Mihir, Ma, Min, Nicosia, Massimo, Rijhwani, Shruti, Riley, Parker, Sarr, Jean-Michel A., Wang, Xinyi, Wieting, John, Gupta, Nitish, Katanova, Anna, Kirov, Christo, Dickinson, Dana L., Roark, Brian, Samanta, Bidisha, Tao, Connie, Adelani, David I., Axelrod, Vera, Caswell, Isaac, Cherry, Colin, Garrette, Dan, Ingle, Reeve, Johnson, Melvin, Panteleev, Dmitry, Talukdar, Partha
Data scarcity is a crucial issue for the development of highly multilingual NLP systems. Yet for many under-represented languages (ULs) -- languages for which NLP research is particularly far behind in meeting user needs -- it is feasible to annotate small amounts of data. Motivated by this, we propose XTREME-UP, a benchmark defined by: its focus on the scarce-data scenario rather than zero-shot; its focus on user-centric tasks -- tasks with broad adoption by speakers of high-resource languages; and its focus on under-represented languages where this scarce-data scenario tends to be most realistic. XTREME-UP evaluates the capabilities of language models across 88 under-represented languages over 9 key user-centric technologies including ASR, OCR, MT, and information access tasks that are of general utility. We create new datasets for OCR, autocomplete, semantic parsing, and transliteration, and build on and refine existing datasets for other tasks. XTREME-UP provides a methodology for evaluating many modeling scenarios including text-only, multi-modal (vision, audio, and text), supervised parameter tuning, and in-context learning. We evaluate commonly used models on the benchmark. We release all code and scripts to train and evaluate models.
mmT5: Modular Multilingual Pre-Training Solves Source Language Hallucinations
Pfeiffer, Jonas, Piccinno, Francesco, Nicosia, Massimo, Wang, Xinyi, Reid, Machel, Ruder, Sebastian
Multilingual sequence-to-sequence models perform poorly with increased language coverage and fail to consistently generate text in the correct target language in few-shot settings. To address these challenges, we propose mmT5, a modular multilingual sequence-to-sequence model. mmT5 utilizes language-specific modules during pre-training, which disentangle language-specific information from language-agnostic information. We identify representation drift during fine-tuning as a key limitation of modular generative models and develop strategies that enable effective zero-shot transfer. Our model outperforms mT5 at the same parameter sizes by a large margin on representative natural language understanding and generation tasks in 40+ languages. Compared to mT5, mmT5 raises the rate of generating text in the correct language under zero-shot settings from 7% to 99%, thereby greatly alleviating the source language hallucination problem.
Serial Contrastive Knowledge Distillation for Continual Few-shot Relation Extraction
Wang, Xinyi, Wang, Zitao, Hu, Wei
Conventional RE methods (Heist and Paulheim, 2017; Zhang et al., 2018) mainly assume a fixed pre-defined relation set and train on a fixed dataset. However, they cannot work well with the new relations that continue emerging in some real-world scenarios of RE. Continual RE (Wang et al., 2019; Han et al., 2020; Wu et al., 2021) was proposed as a new paradigm to solve this situation, which applies the idea of continual learning (Parisi et al., 2019) to the field of RE. Compared with conventional RE, continual RE is more challenging. It requires the model to learn emerging relations while maintaining a stable and accurate classification of old relations, i.e., the so-called catastrophic forgetting problem (Thrun and Mitchell, 1995). Therefore, the continual few-shot RE paradigm (Qin and Joty, 2022) was proposed to simulate real human learning scenarios, where new knowledge can be acquired from a small number of new samples. As illustrated in Figure 1, the continual few-shot RE paradigm expects the model to continuously learn new relations through abundant training data only for the first task, but through sparse training data for all subsequent tasks. Thus, the model needs to identify the growing relations well with few labeled data for them while retaining the knowledge on old relations without re-training from scratch. As relations grow, the confusion about relation representations leads to catastrophic forgetting.
Sampling-based path planning under temporal logic constraints with real-time adaptation
Chen, Yizhou, Wang, Ruoyu, Wang, Xinyi, Chen, Ben M.
Replanning in temporal logic tasks is extremely difficult during the online execution of robots. This study introduces an effective path planner that computes solutions for temporal logic goals and instantly adapts to non-static and partially unknown environments. Given prior knowledge and a task specification, the planner first identifies an initial feasible solution by growing a sampling-based search tree. While carrying out the computed plan, the robot maintains a solution library to continuously enhance the unfinished part of the plan and store backup plans. The planner updates existing plans when meeting unexpected obstacles or recognizing flaws in prior knowledge. Once a high-level path is obtained, a trajectory generator tracks the path by dividing it into segments of motion primitives. Our planner is integrated into an autonomous mobile robot system and further deployed on a multicopter with limited onboard processing power. In simulation and real-world experiments, our planner is demonstrated to swiftly and effectively adjust to environmental uncertainties.
Causal Balancing for Domain Generalization
Wang, Xinyi, Saxon, Michael, Li, Jiachen, Zhang, Hongyang, Zhang, Kun, Wang, William Yang
While machine learning models rapidly advance the state-of-the-art on various real-world tasks, out-of-domain (OOD) generalization remains a challenging problem given the vulnerability of these models to spurious correlations. We propose a balanced mini-batch sampling strategy to transform a biased data distribution into a spurious-free balanced distribution, based on the invariance of the underlying causal mechanisms for the data generation process. We argue that the Bayes optimal classifiers trained on such a balanced distribution are minimax optimal across a diverse enough environment space. We also provide an identifiability guarantee for the latent variable model of the proposed data generation process when enough training environments are utilized. Experiments are conducted on DomainBed, demonstrating empirically that our method obtains the best performance across 20 baselines reported on the benchmark.
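A minimal sketch of the balanced-sampling idea follows. It is a deliberate simplification: it balances over observed (label, environment) groups, whereas the paper balances with respect to a learned latent causal model; all names here are illustrative.

```python
import random
from collections import defaultdict

def balanced_minibatch(dataset, per_group=4, seed=0):
    """Draw a mini-batch with an equal number of examples from every
    (label, environment) group, so the batch distribution no longer
    reflects spurious label-environment correlations in the raw data.
    `dataset` is a list of (x, label, env) triples."""
    rng = random.Random(seed)
    groups = defaultdict(list)
    for x, label, env in dataset:
        groups[(label, env)].append((x, label, env))
    batch = []
    for key in sorted(groups):
        batch.extend(rng.sample(groups[key], k=min(per_group, len(groups[key]))))
    return batch

# A biased toy dataset: label 1 co-occurs mostly with environment "A".
data = [(i, 1, "A") for i in range(90)] + [(i, 1, "B") for i in range(10)] \
     + [(i, 0, "A") for i in range(10)] + [(i, 0, "B") for i in range(90)]
batch = balanced_minibatch(data, per_group=4)
```

Although label 1 appears with environment "A" 90% of the time in the raw data, each of the four groups contributes equally to the batch, removing that correlation from the training signal.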
PECO: Examining Single Sentence Label Leakage in Natural Language Inference Datasets through Progressive Evaluation of Cluster Outliers
Saxon, Michael, Wang, Xinyi, Xu, Wenda, Wang, William Yang
Building natural language inference (NLI) benchmarks that are both challenging for modern techniques and free from shortcut biases is difficult. Chief among these biases is "single sentence label leakage," where annotator-introduced spurious correlations yield datasets where the logical relation between (premise, hypothesis) pairs can be accurately predicted from only a single sentence, something that should in principle be impossible. We demonstrate that despite efforts to reduce this leakage, it persists in modern datasets that have been introduced since its 2018 discovery. To enable future amelioration efforts, we introduce a novel model-driven technique, the progressive evaluation of cluster outliers (PECO), which enables both the objective measurement of leakage and the automated detection of subpopulations in the data which maximally exhibit it.
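The leakage phenomenon itself can be caricatured with a tiny hypothesis-only probe: if labels are predictable from the hypothesis alone, the dataset leaks. This word-counting probe is a deliberately crude stand-in (PECO is model-driven and cluster-based, and a real measurement would use held-out data); it only shows why single-sentence predictability signals an artifact.

```python
from collections import Counter, defaultdict

def hypothesis_only_accuracy(pairs):
    """Crude single-sentence probe: tally which labels each word co-occurs
    with, then predict each example's label from its words' label votes.
    For brevity this scores in-sample; accuracy far above chance indicates
    annotator artifacts, since the premise is never consulted."""
    word_labels = defaultdict(Counter)
    for hypothesis, label in pairs:
        for w in hypothesis.lower().split():
            word_labels[w][label] += 1
    correct = 0
    for hypothesis, label in pairs:
        votes = Counter()
        for w in hypothesis.lower().split():
            votes.update(word_labels[w])
        if votes and votes.most_common(1)[0][0] == label:
            correct += 1
    return correct / len(pairs)

# Toy artifact: negation words correlate with "contradiction".
toy = [("the cat is not here", "contradiction"),
       ("nobody is outside", "contradiction"),
       ("a dog runs", "entailment"),
       ("a man smiles", "entailment")]
acc = hypothesis_only_accuracy(toy)
```

Here the probe labels every example correctly without ever seeing a premise, which is exactly the impossibility that flags leakage.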