Deep Particulate Matter Forecasting Model Using Correntropy-Induced Loss

arXiv.org Machine Learning

Forecasting the particulate matter (PM) concentration in South Korea has become urgently necessary owing to its strong negative impact on human life. In most statistical or machine learning methods, independent and identically distributed data, for example, a Gaussian distribution, are assumed; however, time series such as air pollution and weather data do not meet this assumption. In this study, the maximum correntropy criterion for regression (MCCR) loss is used in an analysis of the statistical characteristics of air pollution and weather data. Rigorous seasonality adjustment of the air pollution and weather data was performed because of their complex seasonality patterns and the heavy-tailed distribution of data even after deseasonalization. The MCCR loss was applied to multiple models including conventional statistical models and state-of-the-art machine learning models. The results show that the MCCR loss is more appropriate than the conventional mean squared error loss for forecasting extreme values.
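For readers unfamiliar with the loss named in this abstract, the following is a minimal sketch and not the authors' implementation. It assumes the commonly cited correntropy-induced form, loss(r) = sigma^2 * (1 - exp(-r^2 / sigma^2)), which the abstract itself does not spell out. The function names and the default `sigma` are illustrative.

```python
import math


def mse_loss(residual):
    """Squared error for a single residual (prediction minus target)."""
    return residual ** 2


def mccr_loss(residual, sigma=1.0):
    """Correntropy-induced loss for a single residual.

    Assumed form: sigma^2 * (1 - exp(-r^2 / sigma^2)).
    It behaves like the squared error for small residuals and
    saturates at sigma^2 for very large ones.
    """
    return sigma ** 2 * (1.0 - math.exp(-(residual ** 2) / sigma ** 2))


if __name__ == "__main__":
    for r in (0.1, 1.0, 5.0, 50.0):
        print(r, mse_loss(r), mccr_loss(r, sigma=2.0))
```

Under this assumed form, the scale parameter `sigma` controls where the loss stops growing: residuals much larger than `sigma` all contribute roughly `sigma ** 2`, whereas the squared error keeps growing without bound.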


The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation

arXiv.org Artificial Intelligence

One of the biggest challenges hindering progress in low-resource and multilingual machine translation is the lack of good evaluation benchmarks. Current evaluation benchmarks either lack good coverage of low-resource languages, consider only restricted domains, or are low quality because they are constructed using semi-automatic procedures. In this work, we introduce the FLORES-101 evaluation benchmark, consisting of 3001 sentences extracted from English Wikipedia and covering a variety of topics and domains. These sentences have been translated into 101 languages by professional translators through a carefully controlled process. The resulting dataset enables better assessment of model quality on the long tail of low-resource languages, including the evaluation of many-to-many multilingual translation systems, as all translations are multilingually aligned. By publicly releasing such a high-quality and high-coverage dataset, we hope to foster progress in the machine translation community and beyond.


Expectations and perceptions of healthcare professionals for robot deployment in hospital environments during the COVID-19 pandemic

Robohub

The recent outbreak of the severe acute respiratory syndrome, caused by coronavirus 2 (SARS-CoV-2), also referred to as COVID-19, has spread globally in an unprecedented way. In response, the efforts of most countries have been focused on containing and mitigating the effects of the pandemic. Given the transmission rate of the virus, the World Health Organization (WHO) recommended several strategies, such as physical distancing, to contain the spread and widespread transmission. Driven by other factors, including the effects of this pandemic on the economy, some countries are now resuming economic activities, making it all the more necessary to ensure compliance with bio-safety protocols to contain further spread of the virus. So, the background to our project is this diverse landscape of different public health measures that are having to be adopted around the world, oftentimes with the measures being iteratively refined by the authorities as their impacts on the social, economic, and political sectors become clearer. In the health sector, all levels and different stakeholders of the world's health systems have been unwaveringly focused on providing medical care during this pandemic.


Senior Software Engineer, Machine Learning

#artificialintelligence

Ripple's mission is to enable payments every way, everywhere for everyone. We believe connecting traditional financial entities like banks, payment providers and corporations with emerging blockchain technologies and users is the path to an open, decentralized, and more inclusive financial future. This Internet of Value gives any internet-enabled person, application or device access to financial services that are transparent, fast, reliable, and cheap. Delivering this vision is a challenge of massive scale spanning $155 trillion in annual cross border fiat payments and the $1.5 trillion market of digital assets that has grown 10X in the last year. We are looking for an experienced software engineer to join a new team charged with determining and delivering optimal liquidity for every customer in the world in a cost-effective, robust and scalable manner.


GraphMI: Extracting Private Graph Data from Graph Neural Networks

arXiv.org Artificial Intelligence

As machine learning becomes more widely used for critical applications, the need to study its privacy implications has become urgent. Given access to the target model and auxiliary information, the model inversion attack aims to infer sensitive features of the training dataset, which raises serious privacy concerns. Despite its success in grid-like domains, directly applying model inversion techniques to non-grid domains such as graphs achieves poor attack performance because of the difficulty of fully exploiting the intrinsic properties of graphs and the attributes of nodes used in Graph Neural Networks (GNN). To bridge this gap, we present the Graph Model Inversion attack (GraphMI), which aims to extract private graph data of the training graph by inverting GNN, one of the state-of-the-art graph analysis tools. Specifically, we first propose a projected gradient module to tackle the discreteness of graph edges while preserving the sparsity and smoothness of graph features. Then we design a graph auto-encoder module to efficiently exploit graph topology, node attributes, and target model parameters for edge inference. With the proposed methods, we study the connection between model inversion risk and edge influence and show that edges with greater influence are more likely to be recovered. Extensive experiments over several public datasets demonstrate the effectiveness of our method. We also show that differential privacy in its canonical form can hardly defend against our attack while preserving decent utility.
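The idea of a projected gradient step on discrete edges can be illustrated with a generic sketch. This is not the GraphMI module itself. It assumes the binary adjacency matrix is relaxed to continuous values, updated by plain gradient descent, and projected back onto the interval [0, 1] by clipping. The function names, learning rate, and threshold are hypothetical.

```python
def projected_gradient_step(adjacency, gradient, learning_rate=0.1):
    """One projected-gradient update on a relaxed adjacency matrix.

    adjacency and gradient are lists of rows of floats of equal shape.
    """
    updated = []
    for a_row, g_row in zip(adjacency, gradient):
        row = []
        for a, g in zip(a_row, g_row):
            step = a - learning_rate * g          # ordinary descent step
            row.append(min(1.0, max(0.0, step)))  # projection onto [0, 1]
        updated.append(row)
    return updated


def binarize(adjacency, threshold=0.5):
    """Turn relaxed edge scores back into 0/1 edges (illustrative threshold)."""
    return [[1 if a >= threshold else 0 for a in row] for row in adjacency]


if __name__ == "__main__":
    relaxed = [[0.0, 0.5], [0.5, 0.0]]
    grad = [[1.0, -10.0], [-10.0, 1.0]]
    relaxed = projected_gradient_step(relaxed, grad)
    print(binarize(relaxed))
```

Clipping keeps every entry interpretable as an edge score between "no edge" and "edge", so a final thresholding step can recover a discrete graph.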


Data sets, fraud, and the future « Jon Rappoport's Blog

#artificialintelligence

Right off the bat, here is a scene from the near-future: AI takes a look at John Jones' medical records, does instant collating, and comes up with a disease diagnosis. Via Zoom, the doctor's AI assistant slaps on a diagnosis, and an hour later, two bottles of medical drugs arrive at Jones' door. One problem: the data set assembled by AI is preposterous. Jones' so-called symptoms don't add up to a disease. Only in another data set, held by the CDC, do the symptoms require a disease-label.


Be Considerate: Objectives, Side Effects, and Deciding How to Act

arXiv.org Artificial Intelligence

Recent work in AI safety has highlighted that in sequential decision making, objectives are often underspecified or incomplete. This gives discretion to the acting agent to realize the stated objective in ways that may result in undesirable outcomes. We contend that to learn to act safely, a reinforcement learning (RL) agent should include contemplation of the impact of its actions on the wellbeing and agency of others in the environment, including other acting agents and reactive processes. We endow RL agents with the ability to contemplate such impact by augmenting their reward based on expectation of future return by others in the environment, providing different criteria for characterizing impact. We further endow these agents with the ability to differentially factor this impact into their decision making, manifesting behavior that ranges from self-centred to selfless, as demonstrated by experiments in gridworld environments.
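The reward augmentation described in this abstract can be sketched roughly as follows. This is an illustrative assumption, not the paper's formulation: it blends the agent's own reward with the mean of the other agents' estimated future returns, using a hypothetical `caring` weight between 0 (self-centred) and 1 (selfless). The aggregation by mean and all names are made up for the example.

```python
def considerate_reward(own_reward, others_expected_returns, caring=0.5):
    """Blend the agent's reward with others' expected future returns.

    own_reward: the environment reward the agent received.
    others_expected_returns: estimated future returns of the other
        agents or processes after this step (hypothetical inputs).
    caring: 0.0 ignores others entirely, 1.0 ignores the agent's own reward.
    """
    if others_expected_returns:
        others_term = sum(others_expected_returns) / len(others_expected_returns)
    else:
        others_term = 0.0  # nobody else in the environment
    return (1.0 - caring) * own_reward + caring * others_term


if __name__ == "__main__":
    for caring in (0.0, 0.5, 1.0):
        print(caring, considerate_reward(1.0, [4.0, 2.0], caring=caring))
```

Other aggregation criteria, such as the minimum over the others' returns, could replace the mean; the abstract mentions that several criteria are considered but does not list them.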


Layered gradient accumulation and modular pipeline parallelism: fast and efficient training of large language models

arXiv.org Artificial Intelligence

The advent of the transformer has sparked rapid growth in the size of language models, far outpacing hardware improvements. (Dense) transformers are expected to reach the trillion-parameter scale in the near future, for which training requires thousands or even tens of thousands of GPUs. We investigate the challenges of training at this scale and beyond on commercially available hardware. In particular, we analyse the shortest possible training time for different configurations of distributed training, leveraging empirical scaling laws for language models to estimate the optimal (critical) batch size. Contrary to popular belief, we find no evidence for a memory wall, and instead argue that the real limitation, other than the cost, lies in the training duration. In addition to this analysis, we introduce two new methods, layered gradient accumulation and modular pipeline parallelism, which together cut the shortest training time by half. The methods also reduce data movement, lowering the network requirement to a point where a fast InfiniBand connection is not necessary. This increased network efficiency also improves on the methods introduced with the ZeRO optimizer, reducing the memory usage to a tiny fraction of the available GPU memory.


Model-agnostic and Scalable Counterfactual Explanations via Reinforcement Learning

arXiv.org Machine Learning

Counterfactual instances are a powerful tool to obtain valuable insights into automated decision processes, describing the necessary minimal changes in the input space to alter the prediction towards a desired target. Most previous approaches require a separate, computationally expensive optimization procedure per instance, making them impractical for both large amounts of data and high-dimensional data. Moreover, these methods are often restricted to certain subclasses of machine learning models (e.g. differentiable or tree-based models). In this work, we propose a deep reinforcement learning approach that transforms the optimization procedure into an end-to-end learnable process, allowing us to generate batches of counterfactual instances in a single forward pass. Our experiments on real-world data show that our method i) is model-agnostic (does not assume differentiability), relying only on feedback from model predictions; ii) allows for generating target-conditional counterfactual instances; iii) allows for flexible feature range constraints for numerical and categorical attributes, including the immutability of protected features (e.g. gender, race); iv) is easily extended to other data modalities such as images.


The Signed Cumulative Distribution Transform for 1-D Signal Analysis and Classification

arXiv.org Artificial Intelligence

This paper presents a new mathematical signal transform that is especially suitable for decoding information related to non-rigid signal displacements. We provide a measure-theoretic framework to extend the existing Cumulative Distribution Transform [ACHA 45 (2018), no. 3, 616-641] to arbitrary (signed) signals on $\overline{\mathbb{R}}$. We present both forward (analysis) and inverse (synthesis) formulas for the transform, and describe several of its properties, including translation, scaling, convexity, and linear separability. Finally, we describe a metric in transform space, and demonstrate the application of the transform in classifying (detecting) signals under random displacements.
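To give a feel for the translation property mentioned in this abstract, here is a small sketch of the classical (unsigned) idea only, not the signed extension the paper introduces. It assumes a non-negative signal and a uniform reference, in which case one common formulation of the transform reduces to the signal's quantile function (the inverse of its normalized cumulative sum). The function name, the discretization, and the linear interpolation are illustrative choices.

```python
def quantile_transform(x, s, levels):
    """Quantile function of a non-negative signal s sampled at sorted points x.

    levels are probabilities strictly between 0 and 1.
    """
    total = sum(s)
    cdf, running = [], 0.0
    for value in s:
        running += value / total  # normalized cumulative mass
        cdf.append(running)
    out = []
    for q in levels:
        # first grid index whose cumulative mass reaches q
        i = next(k for k, c in enumerate(cdf) if c >= q)
        if i == 0:
            out.append(x[0])
        else:
            # linear interpolation between the two neighbouring grid points
            t = (q - cdf[i - 1]) / (cdf[i] - cdf[i - 1])
            out.append(x[i - 1] + t * (x[i] - x[i - 1]))
    return out


if __name__ == "__main__":
    x = [float(i) for i in range(11)]
    s = [0, 1, 2, 3, 4, 5, 4, 3, 2, 1, 0]
    levels = [0.1, 0.25, 0.5, 0.75, 0.9]
    original = quantile_transform(x, s, levels)
    shifted = quantile_transform([xi + 3.0 for xi in x], s, levels)
    print([b - a for a, b in zip(original, shifted)])
```

In this simplified setting, translating the signal along the x-axis adds the same constant to every transform value, which is one reason such transforms are convenient for signals that differ by displacements.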