AITopics

2005.06394

Country:

North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)

Genre: Research Report (0.64)

Industry: Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Bellinger, Colin, Coles, Rory, Crowley, Mark, Tamblyn, Isaac

Reinforcement Learning in a Physics-Inspired Semi-Markov Environment

arXiv.org Artificial IntelligenceApr-15-2020

Reinforcement learning (RL) has been demonstrated to have great potential in many applications of scientific discovery and design. Recent work includes, for example, the design of new structures and compositions of molecules for therapeutic drugs. Much of the existing work related to the application of RL to scientific domains, however, assumes that the available state representation obeys the Markov property. For reasons associated with time, cost, sensor accuracy, and gaps in scientific knowledge, many scientific design and discovery problems do not satisfy the Markov property. Thus, something other than a Markov decision process (MDP) should be used to plan / find the optimal policy. In this paper, we present a physics-inspired semi-Markov RL environment, namely the phase change environment. In addition, we evaluate the performance of value-based RL algorithms for both MDPs and partially observable MDPs (POMDPs) on the proposed environment. Our results demonstrate deep recurrent Q-networks (DRQN) significantly outperform deep Q-networks (DQN), and that DRQNs benefit from training with hindsight experience replay. Implications for the use of semi-Markovian RL and POMDPs for scientific laboratories are also discussed.

agent, change environment, phase change environment, (15 more...)

2004.07333

Country:

North America > Canada > Ontario > National Capital Region > Ottawa (0.14)
North America > United States > District of Columbia > Washington (0.04)
North America > Canada > Ontario > Waterloo Region > Waterloo (0.04)
(2 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.76)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

arXiv.org Machine LearningApr-1-2020

Statistical Queries and Statistical Algorithms: Foundations and Applications

Reyzin, Lev

Over 20 years ago, Kearns [1998] introduced statistical queries as a framework for designing machine learning algorithms that are tolerant to noise. The statistical query model restricts a learning algorithm to ask certain types of queries to an oracle that responds with approximately correct answers. This framework has has proven useful, not only for designing noise-tolerant algorithms, but also for its connections to other noise models, for its ability to capture many of our current techniques, and for its explanatory power about the hardness of many important problems. Researchers have also found many connections between statistical queries and a variety of modern topics, including to evolvability, differential privacy, and adaptive data analysis. Statistical queries are now both an important tool and remain a foundational topic with many important questions. The aim of this survey is to illustrate these connections and bring researchers to the forefront of our understanding of this important area.

algorithm, query, statistical query, (14 more...)

2004.00557

Country:

North America > United States > California > Los Angeles County > Los Angeles (0.14)
Oceania > Australia > Australian Capital Territory > Canberra (0.04)
North America > United States > Pennsylvania > Philadelphia County > Philadelphia (0.04)
(19 more...)

Genre:

Instructional Material > Course Syllabus & Notes (0.68)
Overview (0.66)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

arXiv.org Machine LearningMar-17-2020

Improving predictions by nonlinear regression models from outlying input data

Hsieh, William W.

When applying machine learning/statistical methods to the environmental sciences, nonlinear regression (NLR) models often perform only slightly better and occasionally worse than linear regression (LR). The proposed reason for this conundrum is that NLR models can give predictions much worse than LR when given input data which lie outside the domain used in model training. Continuous unbounded variables are widely used in environmental sciences, whence not uncommon for new input data to lie far outside the training domain. For six environmental datasets, inputs in the test data were classified as "outliers" and "non-outliers" based on the Mahalanobis distance from the training input data. The prediction scores (mean absolute error, Spearman correlation) showed NLR to outperform LR for the non-outliers, but often underperform LR for the outliers. An approach based on Occam's Razor (OR) was proposed, where linear extrapolation was used instead of nonlinear extrapolation for the outliers. The linear extrapolation to the outlier domain was based on the NLR model within the non-outlier domain. This NLR$_{\mathrm{OR}}$ approach reduced occurrences of very poor extrapolation by NLR, and it tended to outperform NLR and LR for the outliers. In conclusion, input test data should be screened for outliers. For outliers, the unreliable NLR predictions can be replaced by NLR$_{\mathrm{OR}}$ or LR predictions, or by issuing a "no reliable prediction" warning.

extrapolation, outlier, training data, (15 more...)

2003.07926

Country:

North America > Canada > Ontario (0.14)
Asia > China > Beijing > Beijing (0.05)
North America > United States > New York (0.04)
(6 more...)

Genre: Research Report (1.00)

Industry:

Transportation > Air (0.68)
Transportation > Infrastructure & Services > Airport (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Regression (1.00)

Jain, Piyush, Coogan, Sean C P, Subramanian, Sriram Ganapathi, Crowley, Mark, Taylor, Steve, Flannigan, Mike D

A review of machine learning applications in wildfire science and management

arXiv.org Machine LearningMar-1-2020

Artificial intelligence has been applied in wildfire science and management since the 1990s, with early applications including neural networks and expert systems. Since then the field has rapidly progressed congruently with the wide adoption of machine learning (ML) in the environmental sciences. Here, we present a scoping review of ML in wildfire science and management. Our objective is to improve awareness of ML among wildfire scientists and managers, as well as illustrate the challenging range of problems in wildfire science available to data scientists. We first present an overview of popular ML approaches used in wildfire science to date, and then review their use in wildfire science within six problem domains: 1) fuels characterization, fire detection, and mapping; 2) fire weather and climate change; 3) fire occurrence, susceptibility, and risk; 4) fire behavior prediction; 5) fire effects; and 6) fire management. We also discuss the advantages and limitations of various ML approaches and identify opportunities for future advances in wildfire science and management within a data science context. We identified 298 relevant publications, where the most frequently used ML methods included random forests, MaxEnt, artificial neural networks, decision trees, support vector machines, and genetic algorithms. There exists opportunities to apply more current ML methods (e.g., deep learning and agent based learning) in wildfire science. However, despite the ability of ML models to learn on their own, expertise in wildfire science is necessary to ensure realistic modelling of fire processes across multiple scales, while the complexity of some ML methods requires sophisticated knowledge for their application. Finally, we stress that the wildfire research and management community plays an active role in providing relevant, high quality data for use by practitioners of ML methods.

agricultural and forest meteorology, classification and regression problem, geoscience and remote sensing letter, (16 more...)

2003.00646

Country:

Asia > China > Fujian Province (0.14)
North America > United States > California > San Mateo County > San Mateo (0.13)
Europe > Greece (0.04)
(84 more...)

Genre:

Research Report > New Finding (1.00)
Overview (1.00)
Research Report > Experimental Study (0.68)

Industry:

Energy (1.00)
Law Enforcement & Public Safety > Fire & Emergency Services (0.93)
Government > Regional Government > North America Government > United States Government (0.92)
Health & Medicine > Pharmaceuticals & Biotechnology (0.92)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Fuzzy Logic (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
(10 more...)

arXiv.org Machine LearningFeb-25-2020

Attacks Which Do Not Kill Training Make Adversarial Learning Stronger

Zhang, Jingfeng, Xu, Xilie, Han, Bo, Niu, Gang, Cui, Lizhen, Sugiyama, Masashi, Kankanhalli, Mohan

Adversarial training based on the minimax formulation is necessary for obtaining adversarial robustness of trained models. However, it is conservative or even pessimistic so that it sometimes hurts the natural generalization. In this paper, we raise a fundamental question---do we have to trade off natural generalization for adversarial robustness? We argue that adversarial training is to employ confident adversarial data for updating the current model. We propose a novel approach of friendly adversarial training (FAT): rather than employing most adversarial data maximizing the loss, we search for least adversarial (i.e., friendly adversarial) data minimizing the loss, among the adversarial data that are confidently misclassified. Our novel formulation is easy to implement by just stopping the most adversarial data searching algorithms such as PGD (projected gradient descent) early, which we call early-stopped PGD. Theoretically, FAT is justified by an upper bound of the adversarial risk. Empirically, early-stopped PGD allows us to answer the earlier question negatively---adversarial robustness can indeed be achieved without compromising the natural generalization.

accuracy, adversarial data, test accuracy, (15 more...)

2002.11242

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > California > Los Angeles County > Long Beach (0.04)
(11 more...)

Genre: Research Report > Promising Solution (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

arXiv.org Machine LearningFeb-11-2020

Deep Autotuner: a Pitch Correcting Network for Singing Performances

Wager, Sanna, Tzanetakis, George, Wang, Cheng-i, Kim, Minje

We introduce a data-driven approach to automatic pitch correction of solo singing performances. The proposed approach predicts note-wise pitch shifts from the relationship between the respective spectrograms of the singing and accompaniment. This approach differs from commercial systems, where vocal track notes are usually shifted to be centered around pitches in a user-defined score, or mapped to the closest pitch among the twelve equal-tempered scale degrees. The proposed system treats pitch as a continuous value rather than relying on a set of discretized notes found in musical scores, thus allowing for improvisation and harmonization in the singing performance. We train our neural network model using a dataset of 4,702 amateur karaoke performances selected for good intonation. Our model is trained on both incorrect intonation, for which it learns a correction, and intentional pitch variation, which it learns to preserve. The proposed deep neural network with gated recurrent units on top of convolutional layers shows promising performance on the real-world score-free singing pitch correction task of autotuning.

correction, pitch correction, pitch shift, (13 more...)

2002.05511

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
North America > United States > Indiana > Monroe County > Bloomington (0.04)
North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)

Genre: Research Report (1.00)

Industry:

Media > Music (1.00)
Leisure & Entertainment (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

arXiv.org Artificial IntelligenceFeb-2-2020

Solving Billion-Scale Knapsack Problems

Zhang, Xingwen, Qi, Feng, Hua, Zhigang, Yang, Shuang

Knapsack problems (KPs) are common in industry, but solving KPs is known to be NP-hard and has been tractable only at a relatively small scale. This paper examines KPs in a slightly generalized form and shows that they can be solved nearly optimally at scale via distributed algorithms. The proposed approach can be implemented fairly easily with off-the-shelf distributed computing frameworks (e.g. MPI, Hadoop, Spark). As an example, our implementation leads to one of the most efficient KP solvers known to date -- capable to solve KPs at an unprecedented scale (e.g., KPs with 1 billion decision variables and 1 billion constraints can be solved within 1 hour). The system has been deployed to production and called on a daily basis, yielding significant business impacts at Ant Financial.

algorithm, constraint, local constraint, (14 more...)

2002.00352

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.14)
Asia > Taiwan > Taiwan Province > Taipei (0.05)
(7 more...)

Genre: Research Report (1.00)

Industry:

Banking & Finance (0.68)
Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Computational Learning Theory (0.92)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.68)

arXiv.org Artificial IntelligenceJan-27-2020

Interpreting Cloud Computer Vision Pain-Points: A Mining Study of Stack Overflow

Cummaudo, Alex, Vasa, Rajesh, Barnett, Scott, Grundy, John, Abdelrazek, Mohamed

Intelligent services are becoming increasingly more pervasive; application developers want to leverage the latest advances in areas such as computer vision to provide new services and products to users, and large technology firms enable this via RESTful APIs. While such APIs promise an easy-to-integrate on-demand machine intelligence, their current design, documentation and developer interface hides much of the underlying machine learning techniques that power them. Such APIs look and feel like conventional APIs but abstract away data-driven probabilistic behaviour - the implications of a developer treating these APIs in the same way as other, traditional cloud services, such as cloud storage, is of concern. The objective of this study is to determine the various pain-points developers face when implementing systems that rely on the most mature of these intelligent services, specifically those that provide computer vision. We use Stack Overflow to mine indications of the frustrations that developers appear to face when using computer vision services, classifying their questions against two recent classification taxonomies (documentation-related and general questions). We find that, unlike mature fields like mobile development, there is a contrast in the types of questions asked by developers. These indicate a shallow understanding of the underlying technology that empower such systems. We discuss several implications of these findings via the lens of learning taxonomies to suggest how the software engineering community can improve these services and comment on the nature by which developers use them.

developer, intelligent service, taxonomy, (12 more...)

2001.1013

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Asia > South Korea > Seoul > Seoul (0.05)
Oceania > Australia > Victoria (0.04)
(13 more...)

Genre:

Research Report > New Finding (1.00)
Instructional Material > Course Syllabus & Notes (0.93)

Industry:

Information Technology > Services (1.00)
Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)

arXiv.org Artificial IntelligenceJan-8-2020

Semi-Sequential Probabilistic Model For Indoor Localization Enhancement

Hoang, Minh Tu, Yuen, Brosnan, Dong, Xiaodai, Lu, Tao, Westendorp, Robert, Reddy, Kishore

This paper proposes a semi-sequential probabilistic model (SSP) that applies an additional short term memory to enhance the performance of the probabilistic indoor localization. The conventional probabilistic methods normally treat the locations in the database indiscriminately. In contrast, SSP leverages the information of the previous position to determine the probable location since the user's speed in an indoor environment is bounded and locations near the previous one have higher probability than the other locations. Although the SSP utilizes the previous location information, it does not require the exact moving speed and direction of the user. On-site experiments using the received signal strength indicator (RSSI) and channel state information (CSI) fingerprints for localization demonstrate that SSP reduces the maximum error and boosts the performance of existing probabilistic approaches by 25% - 30%.

artificial intelligence, localization, machine learning, (18 more...)

doi: 10.1109/JSEN.2020.2972850

2001.024

Country:

North America > Canada > British Columbia > Vancouver Island > Capital Regional District > Victoria (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Burnaby (0.04)
Europe > Switzerland > Basel-City > Basel (0.04)
Asia > Middle East > Jordan (0.04)

Genre: Research Report (1.00)

Industry: Information Technology (0.68)

Technology:

Information Technology > Communications > Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.93)