 Indian Ocean


Toward Robust Uncertainty Estimation with Random Activation Functions

arXiv.org Artificial Intelligence

Recent advances in deep neural networks have demonstrated remarkable performance in a wide variety of applications, ranging from recommendation systems and improving user experience to natural language processing and speech recognition (Abiodun et al. 2018). Nevertheless, blindly relying on the outcome of these models can have harmful effects, especially in high-stake domains such as healthcare and autonomous driving, as models can provide inaccurate predictions when queried in out-of-distribution data points (Amodei et al. 2016). Consequently, correctly quantifying the uncertainty of models' predictions is an admissible mechanism to distinguish where a model can or cannot

In this paper, we focus on ensemble UQ techniques, either Bayesian or non-Bayesian, as this group is less explored compared to the solely Bayesian techniques. An ensemble model aggregates the predictions of multiple individual base-learners (or ensemble members), which in our case are neural networks (NNs), and the empirical variance of their predictions gives an approximate measure of uncertainty. The idea behind this heuristic is highly intuitive: the more the base-learners disagree on the outcome, the more uncertain they are. Therefore, the goal of ensemble members is to have a great level of disagreement (variability) in the areas where little or no data is available, and to have a high level of agreement in regions with abundance of data (Pearce et al. 2018).
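The ensemble heuristic in this excerpt (empirical variance of the members' predictions as an approximate uncertainty measure) can be illustrated with a small sketch. This is not the paper's method: the paper uses neural networks as base-learners, whereas the sketch below substitutes bootstrap-trained least-squares lines so that it runs with the standard library alone. The function names `fit_line`, `train_ensemble`, and `predict_with_uncertainty`, the toy data, and all parameter values are assumptions made for illustration.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b; returns (a, b)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def train_ensemble(xs, ys, n_members=20, seed=0):
    """Train each base-learner on its own bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    members = []
    for _ in range(n_members):
        idx = [rng.randrange(n) for _ in range(n)]
        members.append(fit_line([xs[i] for i in idx], [ys[i] for i in idx]))
    return members


def predict_with_uncertainty(members, x):
    """Return the ensemble mean and the empirical variance of member predictions."""
    preds = [a * x + b for a, b in members]
    return statistics.fmean(preds), statistics.pvariance(preds)


# Hypothetical toy data: noisy samples of y = 2x + 1 on the interval [0, 4.9].
data_rng = random.Random(1)
xs = [i / 10 for i in range(50)]
ys = [2.0 * x + 1.0 + data_rng.gauss(0.0, 0.5) for x in xs]

members = train_ensemble(xs, ys)
_, var_near = predict_with_uncertainty(members, 2.5)   # inside the training range
_, var_far = predict_with_uncertainty(members, 50.0)   # far outside the training range
print(f"variance near data: {var_near:.4f}, far from data: {var_far:.4f}")
```

In this toy setting the members agree closely inside the training range and diverge far from it, which is the behaviour the excerpt describes as the goal for ensemble members.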


FLSea: Underwater Visual-Inertial and Stereo-Vision Forward-Looking Datasets

arXiv.org Artificial Intelligence

Visibility underwater is challenging, and degrades as the distance between the subject and camera increases, making vision tasks in the forward-looking direction more difficult. We have collected underwater forward-looking stereo-vision and visual-inertial image sets in the Mediterranean and Red Sea. To our knowledge, there are no other public datasets in the underwater environment acquired with this camera-sensor orientation and published with ground truth. These datasets are critical for the development of several underwater applications, including obstacle avoidance, visual odometry, 3D tracking, Simultaneous Localization and Mapping (SLAM) and depth estimation. The stereo datasets include synchronized stereo images in dynamic underwater environments with objects of known size. The visual-inertial datasets contain monocular images and IMU measurements, aligned with millisecond-resolution timestamps, and objects of known size which were placed in the scene. Both sensor configurations allow for scale estimation, with the calibrated baseline in the stereo setup and the IMU in the visual-inertial setup. Ground truth depth maps were created offline for both dataset types using photogrammetry. The ground truth is validated with multiple known measurements placed throughout the imaged environment. There are 5 stereo and 8 visual-inertial datasets in total, each containing thousands of images, with a range of different underwater visibility and ambient light conditions, natural and man-made structures and dynamic camera motions. The forward-looking orientation of the camera makes these datasets unique and ideal for testing underwater obstacle-avoidance algorithms and for navigation close to the seafloor in dynamic environments. With our datasets, we hope to encourage the advancement of autonomous functionality for underwater vehicles in dynamic and/or shallow water environments.
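The abstract notes that a calibrated stereo baseline allows for scale estimation. A minimal sketch of why, assuming an ideal rectified pinhole stereo pair, is the textbook relation depth = focal length × baseline / disparity. The function name `depth_from_disparity` and the numeric values are hypothetical and are not taken from the FLSea calibration; underwater refraction effects are also ignored here.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Metric depth of a point seen by a rectified stereo pair.

    focal_px:     focal length in pixels (assumed equal for both cameras)
    baseline_m:   distance between the two camera centres in metres
    disparity_px: horizontal pixel shift of the point between the two images
    """
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a point in front of the cameras")
    return focal_px * baseline_m / disparity_px


# Hypothetical values: 800 px focal length, 0.25 m baseline, 40 px disparity.
depth_m = depth_from_disparity(800.0, 0.25, 40.0)
print(depth_m)
```

Because the baseline is known in metres, the resulting depth is metric. A monocular camera alone recovers structure only up to scale, which is why the visual-inertial sets rely on the IMU instead.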


Benchmarks for Automated Commonsense Reasoning: A Survey

arXiv.org Artificial Intelligence

More than one hundred benchmarks have been developed to test the commonsense knowledge and commonsense reasoning abilities of artificial intelligence (AI) systems. However, these benchmarks are often flawed and many aspects of common sense remain untested. Consequently, we do not currently have any reliable way of measuring to what extent existing AI systems have achieved these abilities. This paper surveys the development and uses of AI commonsense benchmarks. We discuss the nature of common sense; the role of common sense in AI; the goals served by constructing commonsense benchmarks; and desirable features of commonsense benchmarks. We analyze the common flaws in benchmarks, and we argue that it is worthwhile to invest the work needed to ensure that benchmark examples are consistently high quality. We survey the various methods of constructing commonsense benchmarks. We enumerate 139 commonsense benchmarks that have been developed: 102 text-based, 18 image-based, 12 video-based, and 7 simulated physical environments. We discuss the gaps in the existing benchmarks and aspects of commonsense reasoning that are not addressed in any existing benchmark. We conclude with a number of recommendations for future development of commonsense AI benchmarks.


US Navy official says Iranian attacks in Middle East 'have the attention of everyone'

FOX News

Iranian attacks in the waterways of the Middle East and elsewhere in the region "have the attention of everyone" as tensions rise over Tehran's advancing nuclear program, the head of the U.S. Navy's 5th Fleet said Tuesday. Vice Adm. Brad Cooper also told The Associated Press that he's seen a rise in what he described as Iran's "malign activities" in the region over his two years leading the Bahrain-based 5th Fleet. While Cooper pointed to recent seizures of weapons by American and allied forces in the region as a success, he acknowledged that Iran has been able to carry out drone attacks targeting shipping in the Mideast and other assaults in the region.


What happens before and after: Multi-Event Commonsense in Event Coreference Resolution

arXiv.org Artificial Intelligence

Event coreference models cluster event mentions pertaining to the same real-world event. Recent models rely on contextualized representations to recognize coreference among lexically or contextually similar mentions. However, models typically fail to leverage commonsense inferences, which is particularly limiting for resolving lexically divergent mentions. We propose a model that extends event mentions with temporal commonsense inferences. Given a complex sentence with multiple events, e.g., "The man killed his wife and got arrested", with the target event "arrested", our model generates plausible events that happen before the target event, such as "the police arrived", and after it, such as "he was sentenced". We show that incorporating such inferences into an existing event coreference model improves its performance, and we analyze the coreferences in which such temporal knowledge is required.


Towards Fine-Grained Information: Identifying the Type and Location of Translation Errors

arXiv.org Artificial Intelligence

Fine-grained information on translation errors is helpful for the translation evaluation community. Existing approaches cannot consider error position and type at the same time, and so fail to integrate the two kinds of error information. In this paper, we propose the Fine-Grained Translation Error Detection (FG-TED) task, which aims to identify both the position and the type of translation errors in given source-hypothesis sentence pairs. We also build an FG-TED model to predict addition and omission errors, two typical translation accuracy errors. First, we use a word-level classification paradigm to form our model and use shortcut learning reduction to relieve the influence of monolingual features. Second, we construct synthetic datasets for model training and resolve data-labeling disagreements in authoritative datasets, making the experimental benchmark concordant. Experiments show that our model can identify both error type and position concurrently, and achieves state-of-the-art results on the restored dataset. Our model also delivers more reliable predictions in low-resource and transfer scenarios than existing baselines. The related datasets and the source code will be released in the future.


A Generative Adversarial Network for Climate Tipping Point Discovery (TIP-GAN)

arXiv.org Artificial Intelligence

We propose a new Tipping Point Generative Adversarial Network (TIP-GAN) for better characterizing potential climate tipping points in Earth system models. We describe an adversarial game to explore the parameter space of these models, detect upcoming tipping points, and discover the drivers of tipping points. In this setup, a set of generators learn to construct model configurations that will invoke a climate tipping point. The discriminator learns to identify which generators are generating each model configuration and whether a given configuration will lead to a tipping point. The discriminator is trained using an oracle (a surrogate climate model) to test if a generated model configuration leads to a tipping point or not. We demonstrate the application of this GAN to invoke the collapse of the Atlantic Meridional Overturning Circulation (AMOC). We share experimental results of modifying the loss functions and the number of generators to exploit the area of uncertainty in model state space near a climate tipping point. In addition, we show that our trained discriminator can predict AMOC collapse with a high degree of accuracy without the use of the oracle. This approach could generalize to other tipping points, and could augment climate modeling research by directing users interested in studying tipping points to parameter sets likely to induce said tipping points in their computationally intensive climate models.


Multimodal Chain-of-Thought Reasoning in Language Models

arXiv.org Artificial Intelligence

Large language models (LLMs) have shown impressive performance on complex reasoning by leveraging chain-of-thought (CoT) prompting to generate intermediate reasoning chains as the rationale to infer the answer. However, existing CoT studies have focused on the language modality. We propose Multimodal-CoT, which incorporates language (text) and vision (images) modalities into a two-stage framework that separates rationale generation and answer inference. In this way, answer inference can leverage better generated rationales that are based on multimodal information. With Multimodal-CoT, our model under 1 billion parameters outperforms the previous state-of-the-art LLM (GPT-3.5) by 16 percentage points (75.17% to 91.68% accuracy) on the ScienceQA benchmark and even surpasses human performance. Code is publicly available at https://github.com/amazon-science/mm-cot.


US condemns Russian use of Iranian drones in Ukraine

FOX News

American defense officials on Tuesday sought to dispel any doubt that Iran is supplying drones for Russia's war in Ukraine, releasing photos and analysis of unmanned aircraft deployed in the conflict to demonstrate Tehran's involvement. During a briefing in London, analysts from the Defense Intelligence Agency displayed photos of drones that attacked Ukraine alongside images of those previously traced to Iran. A comparison of design details such as tail fins, nose cones and landing gear shows that the weapons used in Ukraine are "indistinguishable" from Shahed-131 and -136 attack drones and Mohajer 6 unmanned aerial vehicles used in the Middle East. The effort to "show the homework" is intended to help persuade governments or international agencies of Tehran's involvement. Iran has said it supplied a "small number" of drones to Russia before the invasion of Ukraine but has denied providing any more since troops crossed the border last February. The evidence proves otherwise, an official from the Defense Intelligence Agency said while speaking on condition of anonymity because of the sensitivity of the information. "Iran is a partner in the conflict with Russia," the official said.


Dark solitons in Bose-Einstein condensates: a dataset for many-body physics research

arXiv.org Artificial Intelligence

We establish a dataset of over 1.6 × 10^4 experimental images of Bose-Einstein condensates containing solitonic excitations to enable machine learning (ML) for many-body physics research. About 33% of this dataset has manually assigned and carefully curated labels. The remainder is automatically labeled using SolDet, an implementation of a physics-informed ML data analysis framework consisting of a convolutional-neural-network-based classifier and object detector (OD) as well as a statistically motivated physics-informed classifier and a quality metric. This technical note constitutes the definitive reference of the dataset, providing an opportunity for the data science community to develop more sophisticated analysis tools, to further understand nonlinear many-body physics, and even advance cold atom experiments.