Standardizing and Benchmarking Crisis-related Social Media Datasets for Humanitarian Information Processing

arXiv.org Artificial Intelligence

Time-critical analysis of social media streams is important for humanitarian organizations to plan rapid response during disasters. The crisis informatics research community has developed several techniques and systems to process and classify big crisis-related data posted on social media. However, due to the dispersed nature of the datasets used in the literature, it is not possible to compare results and measure the progress made towards better models for crisis informatics. In this work, we attempt to bridge this gap by standardizing various existing crisis-related datasets. We consolidate the labels of eight annotated data sources and provide 166.1k and 141.5k tweets for the informativeness and humanitarian classification tasks, respectively. The consolidation results in a larger dataset that makes it possible to train more sophisticated models. To that end, we provide baseline results using CNN and BERT models.
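As a concrete illustration of the kind of baseline the abstract mentions, here is a minimal sketch of fine-tuning a BERT classifier for the binary informativeness task with the HuggingFace transformers library. The toy tweets, label convention, and hyperparameters are assumptions for illustration, not the authors' released pipeline.

```python
# Minimal sketch (assumed setup, not the authors' code): binary
# "informative vs. not informative" tweet classification with BERT.
import torch
from torch.utils.data import DataLoader
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

def encode(tweets, labels):
    # Tokenize tweets to fixed-length tensors; 128 tokens is an assumption.
    enc = tokenizer(tweets, truncation=True, padding="max_length",
                    max_length=128, return_tensors="pt")
    return list(zip(enc["input_ids"], enc["attention_mask"],
                    torch.tensor(labels)))

# Toy stand-ins for the consolidated dataset: 1 = informative, 0 = not.
train_tweets = ["Bridge collapsed near the river, people trapped",
                "Good morning everyone, have a nice day"]
train_labels = [1, 0]

loader = DataLoader(encode(train_tweets, train_labels),
                    batch_size=2, shuffle=True)
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)

model.train()
for input_ids, attention_mask, labels in loader:
    optimizer.zero_grad()
    out = model(input_ids=input_ids, attention_mask=attention_mask,
                labels=labels)
    out.loss.backward()
    optimizer.step()
```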


A Comprehensive Survey on Traffic Prediction

arXiv.org Artificial Intelligence

Traffic prediction plays an essential role in intelligent transportation systems. Accurate traffic prediction can assist route planning, guide vehicle dispatching, and mitigate traffic congestion. This problem is challenging due to the complicated and dynamic spatio-temporal dependencies between different regions in the road network. Recently, a significant amount of research effort has been devoted to this area, greatly advancing traffic prediction abilities. The purpose of this paper is to provide a comprehensive survey of traffic prediction. Specifically, we first summarize the existing traffic prediction methods and give a taxonomy of them. Second, we list the common applications of traffic prediction and the state of the art in these applications. Third, we collect and organize the widely used public datasets in the existing literature. Furthermore, we conduct extensive experiments to compare the performance of methods for traffic demand and speed prediction, respectively, on two datasets. Finally, we discuss potential future directions.
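To make the evaluation setting concrete, the following is a minimal sketch of the kind of comparison a traffic-speed benchmark runs: a trivial historical-average baseline scored with MAE and RMSE. The array shapes, sampling interval, and the baseline itself are illustrative assumptions, not the survey's experimental setup.

```python
# Hypothetical baseline comparison for traffic speed prediction:
# predict each road segment's next readings from its historical average.
import numpy as np

rng = np.random.default_rng(0)
# speeds[t, s]: observed speed at time step t on road segment s (km/h);
# one simulated day at 5-minute intervals over 50 segments.
speeds = 60 + 10 * rng.standard_normal((288, 50))

train, test = speeds[:240], speeds[240:]
prediction = np.tile(train.mean(axis=0), (len(test), 1))  # historical average

mae = np.abs(prediction - test).mean()
rmse = np.sqrt(((prediction - test) ** 2).mean())
print(f"MAE={mae:.2f} km/h  RMSE={rmse:.2f} km/h")
```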


There is Strength in Numbers: Avoiding the Hypothesis-Only Bias in Natural Language Inference via Ensemble Adversarial Training

arXiv.org Artificial Intelligence

Natural Language Inference (NLI) datasets contain annotation artefacts resulting in spurious correlations between the natural language utterances and their respective entailment classes. These artefacts are exploited by neural networks even when only considering the hypothesis and ignoring the premise, leading to unwanted biases. Previous work proposed tackling this problem via adversarial training, but this leads to learned sentence representations that still suffer from the same biases. As a solution, we propose using an ensemble of adversaries during training, encouraging the model to jointly decrease the accuracy of these different adversaries while fitting the data. We show that using an ensemble of adversaries can prevent the bias from being relearned after model training is completed, further improving how well the model generalises to different NLI datasets. In particular, these models outperformed previous approaches when tested on 12 different NLI datasets not used in the model training. Finally, the optimal number of adversarial classifiers depends on the dimensionality of the sentence representations, with higher-dimensional representations benefiting from training with a greater number of adversaries.
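A minimal PyTorch sketch of the ensemble-of-adversaries idea follows: K hypothesis-only classifiers try to predict the entailment label from the hypothesis representation alone, and a gradient-reversal layer pushes the shared encoder to make all of them fail while the main task head is trained normally. The encoder architecture, layer sizes, and the reversal weight are assumptions, not the paper's configuration.

```python
# Sketch of ensemble adversarial training against hypothesis-only bias.
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; flips (and scales) gradients backward."""
    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lambd * grad_output, None

dim, num_labels, K, lambda_adv = 256, 3, 4, 1.0   # assumed hyperparameters
encoder = nn.GRU(input_size=300, hidden_size=dim, batch_first=True)
task_head = nn.Linear(2 * dim, num_labels)         # premise ++ hypothesis
adversaries = nn.ModuleList(nn.Linear(dim, num_labels) for _ in range(K))
ce = nn.CrossEntropyLoss()

def training_loss(premise_emb, hypothesis_emb, labels):
    _, h_p = encoder(premise_emb)                  # (1, B, dim)
    _, h_h = encoder(hypothesis_emb)
    h_p, h_h = h_p.squeeze(0), h_h.squeeze(0)
    task_loss = ce(task_head(torch.cat([h_p, h_h], dim=-1)), labels)
    # Each adversary sees only the hypothesis; reversed gradients push the
    # encoder to remove hypothesis-only predictive signal.
    reversed_h = GradReverse.apply(h_h, lambda_adv)
    adv_loss = sum(ce(adv(reversed_h), labels) for adv in adversaries) / K
    return task_loss + adv_loss

# Toy batch: 8 examples, 12 tokens, 300-d word vectors.
p, h = torch.randn(8, 12, 300), torch.randn(8, 12, 300)
y = torch.randint(0, num_labels, (8,))
training_loss(p, h, y).backward()
```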


Logical Natural Language Generation from Open-Domain Tables

arXiv.org Artificial Intelligence

Neural natural language generation (NLG) models have recently shown remarkable progress in fluency and coherence. However, existing studies on neural NLG are primarily focused on surface-level realizations, with limited emphasis on logical inference, an important aspect of human thinking and language. In this paper, we suggest a new NLG task where a model is tasked with generating natural language statements that can be \emph{logically entailed} by the facts in an open-domain semi-structured table. To facilitate the study of the proposed logical NLG problem, we use the existing TabFact dataset \cite{chen2019tabfact}, which features a wide range of logical/symbolic inferences, as our testbed, and propose new automatic metrics to evaluate the fidelity of generation models w.r.t.\ logical inference. The new task poses challenges to existing monotonic generation frameworks due to the mismatch between sequence order and logical order. In our experiments, we comprehensively survey different generation architectures (LSTM, Transformer, Pre-Trained LM) trained with different algorithms (RL, Adversarial Training, Coarse-to-Fine) on the dataset and make the following observations: 1) Pre-Trained LMs can significantly boost both the fluency and logical fidelity metrics, 2) RL and Adversarial Training trade fluency for fidelity, 3) Coarse-to-Fine generation can help partially alleviate the fidelity issue while maintaining high language fluency. The code and data are available at \url{https://github.com/wenhuchen/LogicNLG}.
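As a toy illustration of what logical fidelity asks for (not the paper's proposed metrics), the snippet below checks a generated statement against a small table by executing a hand-written logical form with pandas; in the paper this role is played by learned parsing- and NLI-based metrics, and the table here is invented.

```python
# Toy fidelity check: a generated statement should be entailed by the table.
import pandas as pd

table = pd.DataFrame({
    "team":   ["Hawks", "Bulls", "Celtics"],
    "wins":   [50, 45, 53],
    "losses": [32, 37, 29],
})

def won_most_games(df: pd.DataFrame, claimed_team: str) -> bool:
    # Execute the statement's logical form: argmax over the wins column.
    return df.loc[df["wins"].idxmax(), "team"] == claimed_team

print(won_most_games(table, "Celtics"))  # True: logically faithful statement
print(won_most_games(table, "Hawks"))    # False: fluent but unfaithful claim
```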


Detecting fake news for the new coronavirus by reasoning on the Covid-19 ontology

arXiv.org Artificial Intelligence

In the context of the Covid-19 pandemic, many were quick to spread deceptive information. I investigate here how reasoning in Description Logics (DLs) can detect inconsistencies between trusted medical sources and untrusted ones. The untrusted information comes in natural language (e.g. "Covid-19 affects only the elderly"). To automatically convert it into DL, I use the FRED converter. Reasoning in Description Logics is then performed with the Racer tool.
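The following is a minimal sketch of the consistency-checking idea using the owlready2 library and its bundled HermiT reasoner as stand-ins for FRED and Racer; the DL formalisation of the untrusted claim is written by hand here rather than produced automatically, and the toy ontology is an assumption for illustration.

```python
# Sketch: an untrusted claim asserted as a DL axiom contradicts trusted facts.
from owlready2 import (Thing, AllDisjoint, get_ontology, sync_reasoner,
                       OwlReadyInconsistentOntologyError)

onto = get_ontology("http://example.org/covid_demo.owl")  # hypothetical IRI

with onto:
    # Trusted medical background knowledge (hand-written here).
    class Person(Thing): pass
    class Elderly(Person): pass
    class Young(Person): pass
    AllDisjoint([Elderly, Young])
    class AffectedByCovid(Person): pass

    # Trusted fact: a documented young patient affected by Covid-19.
    young_patient = AffectedByCovid("patient_1")
    young_patient.is_a.append(Young)

    # Untrusted claim "Covid-19 affects only the elderly", formalised as
    # AffectedByCovid subClassOf Elderly.
    AffectedByCovid.is_a.append(Elderly)

try:
    sync_reasoner()  # runs the bundled HermiT reasoner (requires Java)
    print("No contradiction detected.")
except OwlReadyInconsistentOntologyError:
    print("Untrusted claim contradicts the trusted medical knowledge.")
```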


From Understanding Genetic Drift to a Smart-Restart Parameter-less Compact Genetic Algorithm

arXiv.org Artificial Intelligence

One of the key difficulties in using estimation-of-distribution algorithms is choosing the population size appropriately: too small values lead to genetic drift, which can cause enormous difficulties. In the regime with no genetic drift, however, the runtime is often roughly proportional to the population size, which renders large population sizes inefficient. Based on a recent quantitative analysis of which population sizes lead to genetic drift, we propose a parameter-less version of the compact genetic algorithm that automatically finds a suitable population size without spending too much time in situations rendered unfavorable by genetic drift. We prove a simple mathematical runtime guarantee for this algorithm and conduct an extensive experimental analysis on four classic benchmark problems. The former shows that, under a natural assumption, our algorithm performs similarly to what the best population size would achieve. The latter confirms that missing the right population size can be highly detrimental and shows that our algorithm, as well as a previously proposed parameter-less one based on parallel runs, avoids such pitfalls. Comparing the two approaches, ours profits from its ability to abort runs that are likely to be stuck in a genetic drift situation.
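A minimal Python sketch of the smart-restart idea on the OneMax benchmark is shown below: each compact-GA run uses a hypothetical population size K and is aborted once an iteration budget on the order of K^2 (the regime in which genetic drift becomes likely) is exhausted, after which K is doubled and the run restarts. The budget constant, the starting value of K, and the benchmark are illustrative assumptions, not the paper's exact choices.

```python
# Smart-restart compact GA sketch on OneMax (assumed parameters).
import random

def onemax(x):
    return sum(x)

def cga_run(n, K, max_iters):
    """One compact-GA run with virtual population size K, capped at max_iters."""
    p = [0.5] * n
    for _ in range(max_iters):
        x = [1 if random.random() < pi else 0 for pi in p]
        y = [1 if random.random() < pi else 0 for pi in p]
        if onemax(x) < onemax(y):
            x, y = y, x                      # x is now the winner
        if onemax(x) == n:
            return x
        for i in range(n):
            if x[i] != y[i]:
                p[i] += (1.0 / K) if x[i] == 1 else (-1.0 / K)
                p[i] = min(1 - 1.0 / n, max(1.0 / n, p[i]))  # frequency borders
    return None                              # budget exhausted: drift likely

def smart_restart_cga(n, c=16, K0=4):
    K = K0
    while True:
        solution = cga_run(n, K, max_iters=c * K * K)
        if solution is not None:
            return solution, K
        K *= 2                               # abort and restart with larger K

if __name__ == "__main__":
    sol, K = smart_restart_cga(50)
    print(f"OneMax solved with virtual population size K={K}")
```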


Warm-Start AlphaZero Self-Play Search Enhancements

arXiv.org Artificial Intelligence

Recently, AlphaZero has achieved landmark results in deep reinforcement learning by providing a single self-play architecture that learned three different games at superhuman level. AlphaZero is a large and complicated system with many parameters, and success requires substantial compute power and fine-tuning. Reproducing its results in other games is a challenge, and many researchers are looking for ways to improve results while reducing computational demands. AlphaZero's design is purely based on self-play and makes no use of labeled expert data or domain-specific enhancements; it is designed to learn from scratch. We propose a novel approach to deal with this cold-start problem by employing simple search enhancements in the beginning phase of self-play training, namely Rollout, Rapid Action Value Estimate (RAVE), dynamically weighted combinations of these with the neural network, and Rolling Horizon Evolutionary Algorithms (RHEA). Our experiments indicate that most of these enhancements improve the performance of their baseline player in three different (small) board games, with RAVE-based variants in particular playing strongly.
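The following is a minimal sketch of one way such a warm start could be wired into MCTS leaf evaluation: early self-play iterations blend a classic rollout estimate with the still-untrained value network, and the weight shifts toward the network as training progresses. The linear schedule and the warmup_iters parameter are illustrative assumptions, not the paper's tuned enhancement schedule.

```python
# Warm-start leaf evaluation sketch (assumed schedule, not the paper's).

def warm_start_weight(iteration: int, warmup_iters: int = 10) -> float:
    """Fraction of the leaf value taken from the rollout during warm-up."""
    return max(0.0, 1.0 - iteration / warmup_iters)

def leaf_value(state, iteration, rollout_value, network_value,
               warmup_iters: int = 10) -> float:
    """Blend rollout and network evaluations for an MCTS leaf node.

    rollout_value(state): estimate from a random (or RAVE-guided) playout.
    network_value(state): estimate from the value head of the neural network.
    """
    w = warm_start_weight(iteration, warmup_iters)
    return w * rollout_value(state) + (1.0 - w) * network_value(state)

# After warmup_iters self-play iterations, w reaches 0 and evaluation falls
# back to the standard network-only AlphaZero scheme.
```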


The Moral Burden of Ambiguity Aversion

arXiv.org Artificial Intelligence

In their article, "Egalitarianism under Severe Uncertainty", Philosophy and Public Affairs, 46:3, 2018, Thomas Rowe and Alex Voorhoeve develop an original moral decision theory for cases under uncertainty, called "pluralist egalitarianism under uncertainty". In this paper, I firstly sketch their views and arguments. I then elaborate on their moral decision theory by discussing how it applies to choice scenarios in health ethics. Finally, I suggest a new two-stage Ellsberg thought experiment challenging the core of the principle of their theory. In such an experiment pluralist egalitarianism seems to suggest the wrong, morally and rationally speaking, course of action -- no matter whether I consider my thought experiment in a simultaneous or a sequential setting.


Cpp-Taskflow v2: A General-purpose Parallel and Heterogeneous Task Programming System at Scale

arXiv.org Artificial Intelligence

The Cpp-Taskflow project addresses the long-standing question: how can we make it easier for developers to write parallel and heterogeneous programs with both high performance and high productivity? Cpp-Taskflow develops a simple and powerful task programming model to enable efficient implementations of heterogeneous decomposition strategies. Our programming model empowers users with both static and dynamic task graph construction to incorporate a broad range of computational patterns, including hybrid CPU-GPU computing, dynamic control flow, and irregularity. We develop an efficient heterogeneous work-stealing strategy that adapts worker threads to the available task parallelism at any time during graph execution. We have demonstrated promising performance of Cpp-Taskflow on both micro-benchmarks and real-world applications. As an example, we solved a large machine learning workload up to 1.5x faster, with 1.6x less memory and 1.7x fewer lines of code, than two industrial-strength systems, oneTBB and StarPU, on a machine with 40 CPUs and 4 GPUs.


Challenge Closed-book Science Exam: A Meta-learning Based Question Answering System

arXiv.org Artificial Intelligence

Prior work on standardized science exams requires support from a large text corpus, such as a targeted science corpus built from Wikipedia or SimpleWikipedia. However, retrieving knowledge from a large corpus is time-consuming, and questions with complex semantic representations may interfere with retrieval. Inspired by the dual process theory in cognitive science, we propose a MetaQA framework, where system 1 is an intuitive meta-classifier and system 2 is a reasoning module. Specifically, our method is based on meta-learning and the large language model BERT, and it can efficiently solve science problems by learning from related example questions without relying on external knowledge bases. We evaluate our method on the AI2 Reasoning Challenge (ARC), and the experimental results show that the meta-classifier yields considerable classification performance on emerging question types. The information provided by the meta-classifier significantly improves the accuracy of the reasoning module from 46.6% to 64.2%, giving it a competitive advantage over retrieval-based QA methods.
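To illustrate the System-1 role of the meta-classifier, the sketch below embeds questions, builds one prototype per (hypothetical) question type from a handful of labelled support questions, and routes a new question to the nearest prototype. In the paper the encoder is BERT and the classifier is meta-learned; here a bag-of-words embedding stands in so the example stays self-contained, and the question types and vocabulary are invented.

```python
# Prototype-based question-type routing sketch (assumed data and encoder).
import numpy as np

VOCAB = ["gravity", "force", "cell", "organism", "circuit", "current", "energy"]

def embed(question: str) -> np.ndarray:
    # Bag-of-words stand-in for a BERT question encoder.
    words = question.lower().split()
    v = np.array([words.count(w) for w in VOCAB], dtype=float)
    norm = np.linalg.norm(v)
    return v / norm if norm > 0 else v

# A few labelled support questions per (hypothetical) question type.
support = {
    "physics": ["what force keeps planets in orbit gravity",
                "how does current flow in a circuit"],
    "biology": ["what is the smallest unit of an organism the cell",
                "which organism is made of a single cell"],
}

prototypes = {qtype: np.mean([embed(q) for q in qs], axis=0)
              for qtype, qs in support.items()}

def classify(question: str) -> str:
    scores = {qtype: float(embed(question) @ proto)
              for qtype, proto in prototypes.items()}
    return max(scores, key=scores.get)

print(classify("why does a dropped ball fall gravity"))  # -> physics
print(classify("what does a cell need to survive"))      # -> biology
```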