Items or Relations -- what do Artificial Neural Networks learn?

Krause, Renate, Reimann, Stefan

arXiv.org Artificial Intelligence

What has an Artificial Neural Network (ANN) learned after being successfully trained to solve a task - the set of training items or the relations between them? This question is difficult to answer for modern applied ANNs because of their enormous size and complexity. Therefore, here we consider a low-dimensional network and a simple task, i.e., the network has to reproduce a set of training items identically. We construct the family of solutions analytically and use standard learning algorithms to obtain numerical solutions. These numerical solutions differ depending on the optimization algorithm and the weight initialization and are shown to be particular members of the family of analytical solutions. In this simple setting, we observe that the general structure of the network weights represents the training set's symmetry group, i.e., the relations between training items. As a consequence, linear networks generalize, i.e., reproduce items that were not part of the training set but are consistent with the symmetry of the training set. In contrast, non-linear networks tend to learn individual training items and show associative memory. At the same time, their ability to generalize is limited. A higher degree of generalization is obtained for networks whose activation function contains a linear regime, such as tanh. Our results suggest that an ANN's ability to generalize - instead of learning items - could be improved by providing a sufficiently large set of elementary operations for representing relations, and that this ability strongly depends on the applied non-linearity.
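The linear case described in this abstract can be illustrated in a few lines of NumPy: a least-squares linear map trained to reproduce a set of items becomes the projector onto their span, so it also reproduces unseen items that are consistent with the training set (i.e., lie in its span). This is a minimal sketch of the phenomenon, not the authors' construction:

```python
import numpy as np

# Training items: two basis vectors in R^3; the task is to reproduce each item.
X = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])

# Least-squares solution of X @ W = X; W = pinv(X) @ X is the orthogonal
# projector onto the span of the training items.
W = np.linalg.pinv(X) @ X

# An unseen item inside the span is reproduced exactly ("generalization") ...
new_in_span = np.array([2.0, -3.0, 0.0])
print(new_in_span @ W)        # [ 2. -3.  0.]

# ... while a direction absent from the training set is projected away.
new_off_span = np.array([0.0, 0.0, 1.0])
print(new_off_span @ W)       # [0. 0. 0.]
```

The linear solver never memorizes individual rows of X; it only learns the subspace they generate, which is the toy analogue of learning relations rather than items.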


Network Generality, Training Required, and Precision Required

Neural Information Processing Systems

We show how to estimate (1) the number of functions that can be implemented by a particular network architecture, (2) how much analog precision is needed in the connections in the network, and (3) the number of training examples the network must see before it can be expected to form reliable generalizations. Consider the following objectives: First, the network should be very powerful and versatile, i.e., it should implement any function (truth table) you like, and secondly, it should learn easily, forming meaningful generalizations from a small number of training examples. Well, it is information-theoretically impossible to create such a network. We will present here a simplified argument; a more complete and sophisticated version can be found in Denker et al. (1987). It is customary to regard learning as a dynamical process: adjusting the weights (etc.) in a single network.
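The counting argument behind this impossibility result is easy to reproduce: over n binary inputs there are 2^(2^n) truth tables, so pinning down one of them requires on the order of 2^n bits of training information. A back-of-the-envelope sketch, not the Denker et al. derivation:

```python
import math

n = 4                            # number of binary inputs
num_rows = 2 ** n                # rows in a truth table
num_functions = 2 ** num_rows    # possible truth tables: 2^(2^n)

# Each training example (input, output bit) supplies at most one bit of
# information, so selecting one function among all truth tables needs at
# least log2(num_functions) = 2^n examples in the worst case.
bits_needed = math.log2(num_functions)
print(num_functions, bits_needed)   # 65536 16.0
```

A fully versatile network over 4 inputs must be shown essentially the entire truth table before it can generalize reliably, which is why versatility and easy generalization trade off against each other.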


How to Introduce Machine Learning to your Business

#artificialintelligence

Artificial intelligence systems usually learn by example and are likely to learn better with high-quality examples. Low-quality or insufficient training data can lead to unreliable systems that make poor decisions, reach the wrong conclusions, introduce or perpetuate bias, and cannot handle real-world variation, among other issues. Besides, poor data is costly. According to IBM, poor data quality in the US costs the country about 3.1 trillion dollars each year. A successful training data strategy is one designed to collect and structure the data you need to tune, test, and train AI systems.


6 Tips for Building a Training Data Strategy for Machine Learning

#artificialintelligence

Artificial intelligence (AI) and machine learning (ML) are frequently used terms these days. AI refers to the concept of machines mimicking human cognition. ML is an approach used to create AI. If AI is when a computer can carry out a set of tasks based on instruction, ML is a machine's ability to ingest, parse, and learn from data itself in order to become more accurate or precise at accomplishing that task. Executives in industries such as automotive, finance, government, healthcare, retail, and tech may already have a basic understanding of ML and AI.


An Optimal Control View of Adversarial Machine Learning

Zhu, Xiaojin

arXiv.org Machine Learning

I describe an optimal control view of adversarial machine learning, where the dynamical system is the machine learner, the inputs are adversarial actions, and the control costs are defined by the adversary's goals to do harm and be hard to detect. This view encompasses many types of adversarial machine learning, including test-item attacks, training-data poisoning, and adversarial reward shaping. The view encourages adversarial machine learning researchers to utilize advances in control theory and reinforcement learning.
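A toy instance of this view: let the "learner" be a running mean, let the adversary's actions be injected training points (the control inputs), and let a per-point magnitude bound stand in for the "hard to detect" cost. Everything below is a hypothetical illustration, not the paper's formulation:

```python
# State: theta, the learner's running mean after n points.
# Control: x, the adversary's injected point, bounded to stay "hard to detect".

def poisoned_mean(theta0, n0, target, steps, bound=5.0):
    """Greedy adversary: each injected point nudges the running mean
    toward `target` while respecting a per-point magnitude bound."""
    theta, n = theta0, n0
    for _ in range(steps):
        # Unconstrained best action would set the new mean to the target:
        # (n*theta + x)/(n+1) = target  =>  x = target*(n+1) - n*theta
        x = target * (n + 1) - n * theta
        x = max(-bound, min(bound, x))     # detectability constraint
        theta = (n * theta + x) / (n + 1)  # learner's update (the dynamics)
        n += 1
    return theta

# Drive a mean that starts at 0 (over 10 clean points) toward 1.0.
print(poisoned_mean(theta0=0.0, n0=10, target=1.0, steps=50))
```

The learner's update rule plays the role of the system dynamics, and the clipping of x is a crude stand-in for the adversary's detectability cost; a full control formulation would optimize the whole action sequence rather than act greedily.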


Training Set Debugging Using Trusted Items

Zhang, Xuezhou (University of Wisconsin-Madison) | Zhu, Xiaojin (University of Wisconsin-Madison) | Wright, Stephen (University of Wisconsin-Madison)

AAAI Conferences

Training set bugs are flaws in the data that adversely affect machine learning. The training set is usually too large for manual inspection, but one may have the resources to verify a few trusted items. The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning. Specifically, our approach seeks the smallest set of changes to the training set labels such that the model learned from this corrected training set predicts labels of the trusted items correctly. We flag the items whose labels are changed as potential bugs, whose labels can be checked for veracity by human experts. To find the bugs in this way is a challenging combinatorial bilevel optimization problem, but it can be relaxed into a continuous optimization problem. Experiments on toy and real data demonstrate that our approach can identify training set bugs effectively and suggest appropriate changes to the labels. Our algorithm is a step toward trustworthy machine learning.
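On very small problems, the combinatorial version of this search can be run directly: try label-flip sets in increasing order of size until a model refit on the corrected labels predicts every trusted item correctly. A brute-force sketch with a 1-nearest-neighbour learner standing in for the paper's model (the paper itself relaxes the bilevel problem to a continuous one):

```python
from itertools import combinations

def nn_predict(train, x):
    """1-nearest-neighbour prediction over (point, label) pairs."""
    return min(train, key=lambda p: abs(p[0] - x))[1]

def debug_training_set(points, labels, trusted):
    """Smallest set of label flips so the refit model gets all trusted items right."""
    for k in range(len(labels) + 1):
        for flips in combinations(range(len(labels)), k):
            fixed = [1 - y if i in flips else y for i, y in enumerate(labels)]
            train = list(zip(points, fixed))
            if all(nn_predict(train, x) == y for x, y in trusted):
                return set(flips)   # indices flagged as potential bugs
    return None

# The item at x=2.0 carries a buggy label; two trusted items reveal it.
points = [0.0, 1.0, 2.0, 3.0]
labels = [0, 0, 0, 1]            # true decision boundary is near 1.5
trusted = [(1.9, 1), (0.1, 0)]
print(debug_training_set(points, labels, trusted))   # {2}
```

Because flip sets are enumerated smallest-first, the returned set is minimal, matching the "smallest set of changes" objective; the exponential enumeration is exactly what makes the continuous relaxation necessary at scale.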


Training Set Debugging Using Trusted Items

Zhang, Xuezhou, Zhu, Xiaojin, Wright, Stephen J.

arXiv.org Machine Learning

Training set bugs are flaws in the data that adversely affect machine learning. The training set is usually too large for manual inspection, but one may have the resources to verify a few trusted items. The set of trusted items may not by itself be adequate for learning, so we propose an algorithm that uses these items to identify bugs in the training set and thus improves learning. Specifically, our approach seeks the smallest set of changes to the training set labels such that the model learned from this corrected training set predicts labels of the trusted items correctly. We flag the items whose labels are changed as potential bugs, whose labels can be checked for veracity by human experts. To find the bugs in this way is a challenging combinatorial bilevel optimization problem, but it can be relaxed into a continuous optimization problem. Experiments on toy and real data demonstrate that our approach can identify training set bugs effectively and suggest appropriate changes to the labels. Our algorithm is a step toward trustworthy machine learning.


Machine Teaching: An Inverse Problem to Machine Learning and an Approach Toward Optimal Education

Zhu, Xiaojin (University of Wisconsin-Madison)

AAAI Conferences

I draw the reader's attention to machine teaching, the problem of finding an optimal training set given a machine learning algorithm and a target model. In addition to generating fascinating mathematical questions for computer scientists to ponder, machine teaching holds the promise of enhancing education and personnel training. The Socratic dialogue style aims to stimulate critical thinking.


Tracking Epidemics with Natural Language Processing and Crowdsourcing

Munro, Robert (Stanford University) | Gunasekara, Lucky (EpidemicIQ) | Nevins, Stephanie (EpidemicIQ) | Polepeddi, Lalith (EpidemicIQ) | Rosen, Evan (Stanford)

AAAI Conferences

The first indication of a new outbreak is often in unstructured data (natural language) and reported openly in traditional or social media as a new `flu-like' or `malaria-like' illness weeks or months before the new pathogen is eventually isolated. We present a system for tracking these early signals globally, using natural language processing and crowdsourcing. By comparison, search-log-based approaches, while innovative and inexpensive, are often a trailing signal that follows open reports in plain language. Concentrating on discovering outbreak-related reports in big open data, we show how crowdsourced workers can create near-real-time training data for adaptive active-learning models, addressing the lack of broad-coverage training data for tracking epidemics. This is well suited to an outbreak information-flow context, where sudden bursts of information about new diseases/locations need to be manually processed quickly at short notice.
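The adaptive active-learning loop mentioned in this abstract follows a standard pattern that can be sketched in a few lines: score the unlabelled pool by model uncertainty, send the most uncertain items to crowd workers, retrain, and repeat. A schematic with a trivial one-dimensional threshold classifier standing in for the real NLP model, and an oracle function standing in for the crowd:

```python
def uncertainty_sampling(pool, label_fn, rounds=3, batch=2, threshold=0.5):
    """Schematic active-learning loop: repeatedly label the pool items
    closest to the current decision threshold, then refit the threshold."""
    labelled = []
    unlabelled = list(pool)
    for _ in range(rounds):
        # Most uncertain = closest to the decision boundary.
        unlabelled.sort(key=lambda x: abs(x - threshold))
        batch_items, unlabelled = unlabelled[:batch], unlabelled[batch:]
        labelled += [(x, label_fn(x)) for x in batch_items]  # crowd workers
        # "Retrain": place the threshold between the classes seen so far.
        pos = [x for x, y in labelled if y == 1]
        neg = [x for x, y in labelled if y == 0]
        if pos and neg:
            threshold = (min(pos) + max(neg)) / 2
    return threshold, labelled

pool = [0.05, 0.2, 0.4, 0.45, 0.55, 0.6, 0.8, 0.95]
oracle = lambda x: int(x > 0.5)      # stand-in for crowdsourced labels
threshold, labelled = uncertainty_sampling(pool, oracle)
print(round(threshold, 3), len(labelled))
```

The key property for the burst-driven epidemic setting is that labelling effort concentrates near the model's current decision boundary, so each crowdsourced batch is maximally informative for the next retraining step.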


Network Generality, Training Required, and Precision Required

Denker, John S., Wittner, Ben S.

Neural Information Processing Systems

We show how to estimate (1) the number of functions that can be implemented by a particular network architecture, (2) how much analog precision is needed in the connections in the network, and (3) the number of training examples the network must see before it can be expected to form reliable generalizations.