AITopics | McCallum, Andrew

Collaborating Authors

McCallum, Andrew

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

OpenKI: Integrating Open Information Extraction and Knowledge Bases with Relation Inference

Zhang, Dongxu, Mukherjee, Subhabrata, Lockard, Colin, Dong, Xin Luna, McCallum, Andrew

arXiv.org Machine LearningApr-12-2019

In this paper, we consider advancing web-scale knowledge extraction and alignment by integrating OpenIE extractions in the form of (subject, predicate, object) triples with Knowledge Bases (KB). Traditional techniques from universal schema and from schema mapping fall in two extremes: either they perform instance-level inference relying on embedding for (subject, object) pairs, thus cannot handle pairs absent in any existing triples; or they perform predicate-level mapping and completely ignore background evidence from individual entities, thus cannot achieve satisfying quality. We propose OpenKI to handle sparsity of OpenIE extractions by performing instance-level inference: for each entity, we encode the rich information in its neighborhood in both KB and OpenIE extractions, and leverage this information in relation inference by exploring different methods of aggregation and attention. In order to handle unseen entities, our model is designed without creating entity-specific parameters. Extensive experiments show that this method not only significantly improves state-of-the-art for conventional OpenIE extractions like ReVerb, but also boosts the performance on OpenIE from semi-structured data, where new entity pairs are abundant and data are fairly sparse.

bayesian inference, relation, survey article, (21 more...)

arXiv.org Machine Learning

1904.12606

Country:

North America > United States > New York (0.14)
North America > United States > Massachusetts (0.14)
North America > United States > California (0.14)

Genre: Research Report (0.64)

Industry:

Media > Film (0.47)
Leisure & Entertainment (0.47)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.71)
(2 more...)

Add feedback

Inorganic Materials Synthesis Planning with Literature-Trained Neural Networks

Kim, Edward, Jensen, Zach, van Grootel, Alexander, Huang, Kevin, Staib, Matthew, Mysore, Sheshera, Chang, Haw-Shiuan, Strubell, Emma, McCallum, Andrew, Jegelka, Stefanie, Olivetti, Elsa

arXiv.org Machine LearningFeb-17-2019

College of Information and Computer Sciences, University of Massachusetts Amherst, Amherst, MA, USA (Dated: February 17, 2019) Leveraging new data sources is a key step in accelerating the pace of materials design and discovery. To complement the strides in synthesis planning driven by historical, experimental, and computed data, we present an automated method for connecting scientific literature to synthesis insights. Starting from natural language text, we apply word embeddings from language models, which are fed into a named entity recognition model, upon which a conditional variational autoencoder is trained to generate syntheses for arbitrary materials. We show the potential of this technique by predicting precursors for two perovskite materials, using only training data published over a decade prior to their first reported syntheses. We demonstrate that the model learns representations of materials corresponding to synthesis-related properties, and that the model's behavior complements existing thermodynamic knowledge.

neural network, precursor, text processing, (20 more...)

arXiv.org Machine Learning

1901.00032

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.54)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.69)

Add feedback

Compact Representation of Uncertainty in Clustering

Greenberg, Craig, Monath, Nicholas, Kobren, Ari, Flaherty, Patrick, McGregor, Andrew, McCallum, Andrew

Neural Information Processing SystemsDec-31-2018

For many classic structured prediction problems, probability distributions over the dependent variables can be efficiently computed using widely-known algorithms and data structures (such as forward-backward, and its corresponding trellis for exact probability distributions in Markov models). However, we know of no previous work studying efficient representations of exact distributions over clusterings. This paper presents definitions and proofs for a dynamic-programming inference procedure that computes the partition function, the marginal probability of a cluster, and the MAP clustering---all exactly. Rather than the Nth Bell number, these exact solutions take time and space proportional to the substantially smaller powerset of N. Indeed, we improve upon the time complexity of the algorithm introduced by Kohonen and Corander (2016) for this problem by a factor of N. While still large, this previously unknown result is intellectually interesting in its own right, makes feasible exact inference for important real-world small data applications (such as medicine), and provides a natural stepping stone towards sparse-trellis approximations that enable further scalability (which we also explore). In experiments, we demonstrate the superiority of our approach over approximate methods in analyzing real-world gene expression data used in cancer treatment.

artificial intelligence, machine learning, partition function, (18 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Compact Representation of Uncertainty in Clustering

Greenberg, Craig, Monath, Nicholas, Kobren, Ari, Flaherty, Patrick, McGregor, Andrew, McCallum, Andrew

Neural Information Processing SystemsDec-31-2018

oncology, partition function, us government, (21 more...)

Neural Information Processing Systems

Country: North America > United States > Massachusetts (0.14)

Industry:

Health & Medicine > Therapeutic Area > Oncology (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.48)

Add feedback

Go for a Walk and Arrive at the Answer: Reasoning Over Paths in Knowledge Bases using Reinforcement Learning

Das, Rajarshi, Dhuliawala, Shehzaad, Zaheer, Manzil, Vilnis, Luke, Durugkar, Ishan, Krishnamurthy, Akshay, Smola, Alex, McCallum, Andrew

arXiv.org Artificial IntelligenceDec-30-2018

Knowledge bases (KB), both automatically and manually constructed, are often incomplete --- many valid facts can be inferred from the KB by synthesizing existing information. A popular approach to KB completion is to infer new relations by combinatory reasoning over the information found along other paths connecting a pair of entities. Given the enormous size of KBs and the exponential number of paths, previous path-based models have considered only the problem of predicting a missing relation given two entities or evaluating the truth of a proposed triple. Additionally, these methods have traditionally used random paths between fixed entity pairs or more recently learned to pick paths between them. We propose a new algorithm MINERVA, which addresses the much more difficult and practical task of answering questions where the relation is known, but only one entity. Since random walks are impractical in a setting with combinatorially many destinations from a start node, we present a neural reinforcement learning approach which learns how to navigate the graph conditioned on the input query to find predictive paths. Empirically, this approach obtains state-of-the-art results on several datasets, significantly outperforming prior methods.

deep learning, minerva, neural network, (22 more...)

arXiv.org Artificial Intelligence

1711.05851

Country:

North America > United States > Texas (0.14)
North America > United States > Massachusetts (0.14)

Genre: Research Report (1.00)

Industry:

Media (0.68)
Government > Regional Government > North America Government > United States Government (0.46)
Leisure & Entertainment > Sports > Basketball (0.46)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
(2 more...)

Add feedback

Search-Guided, Lightly-supervised Training of Structured Prediction Energy Networks

Rooshenas, Amirmohammad, Zhang, Dongxu, Sharma, Gopal, McCallum, Andrew

arXiv.org Machine LearningDec-22-2018

In structured output prediction tasks, labeling ground-truth training output is often expensive. However, for many tasks, even when the true output is unknown, we can evaluate predictions using a scalar reward function, which may be easily assembled from human knowledge or non-differentiable pipelines. But searching through the entire output space to find the best output with respect to this reward function is typically intractable. In this paper, we instead use efficient truncated randomized search in this reward function to train structured prediction energy networks (SPENs), which provide efficient test-time inference using gradient-based search on a smooth, learned representation of the score landscape, and have previously yielded state-of-the-art results in structured prediction. In particular, this truncated randomized search in the reward function yields previously unknown local improvements, providing effective supervision to SPENs, avoiding their traditional need for labeled training data.

deep learning, neural network, reward function, (20 more...)

arXiv.org Machine Learning

1812.09603

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Industry: Energy > Power Industry (0.61)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Supervised Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Inductive Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)

Add feedback

Embedded-State Latent Conditional Random Fields for Sequence Labeling

Thai, Dung, Ramesh, Sree Harsha, Murty, Shikhar, Vilnis, Luke, McCallum, Andrew

arXiv.org Machine LearningSep-27-2018

Complex textual information extraction tasks are often posed as sequence labeling or \emph{shallow parsing}, where fields are extracted using local labels made consistent through probabilistic inference in a graphical model with constrained transitions. Recently, it has become common to locally parametrize these models using rich features extracted by recurrent neural networks (such as LSTM), while enforcing consistent outputs through a simple linear-chain model, representing Markovian dependencies between successive labels. However, the simple graphical model structure belies the often complex non-local constraints between output labels. For example, many fields, such as a first name, can only occur a fixed number of times, or in the presence of other fields. While RNNs have provided increasingly powerful context-aware local features for sequence tagging, they have yet to be integrated with a global graphical model of similar expressivity in the output distribution. Our model goes beyond the linear chain CRF to incorporate multiple hidden states per output label, but parametrizes their transitions parsimoniously with low-rank log-potential scoring matrices, effectively learning an embedding space for hidden states. This augmented latent space of inference variables complements the rich feature representation of the RNN, and allows exact global inference obeying complex, learned non-local output constraints. We experiment with several datasets and show that the model outperforms baseline CRF+RNN models when global output constraints are necessary at inference-time, and explore the interpretable latent structure.

constraint, deep learning, neural network, (19 more...)

arXiv.org Machine Learning

1809.10835

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Industry: Health & Medicine (0.47)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

A Systematic Classification of Knowledge, Reasoning, and Context within the ARC Dataset

Boratko, Michael, Padigela, Harshit, Mikkilineni, Divyendra, Yuvraj, Pritish, Das, Rajarshi, McCallum, Andrew, Chang, Maria, Fokoue-Nkoutche, Achille, Kapanipathi, Pavan, Mattei, Nicholas, Musa, Ryan, Talamadupula, Kartik, Witbrock, Michael

arXiv.org Artificial IntelligenceJun-1-2018

The recent work of Clark et al. introduces the AI2 Reasoning Challenge (ARC) and the associated ARC dataset that partitions open domain, complex science questions into an Easy Set and a Challenge Set. That paper includes an analysis of 100 questions with respect to the types of knowledge and reasoning required to answer them; however, it does not include clear definitions of these types, nor does it offer information about the quality of the labels. We propose a comprehensive set of definitions of knowledge and reasoning types necessary for answering the questions in the ARC dataset. Using ten annotators and a sophisticated annotation interface, we analyze the distribution of labels across the Challenge Set and statistics related to them. Additionally, we demonstrate that although naive information retrieval methods return sentences that are irrelevant to answering the query, sufficient supporting text is often present in the (ARC) corpus. Evaluating with human-selected relevant sentences improves the performance of a neural machine comprehension model by 42 points.

dataset, educational setting, neural network, (20 more...)

arXiv.org Artificial Intelligence

1806.00358

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre:

Research Report (0.64)
Overview (0.46)

Industry: Education > Educational Setting > K-12 Education (0.68)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

Probabilistic Embedding of Knowledge Graphs with Box Lattice Measures

Vilnis, Luke, Li, Xiang, Murty, Shikhar, McCallum, Andrew

arXiv.org Machine LearningMay-17-2018

Embedding methods which enforce a partial order or lattice structure over the concept space, such as Order Embeddings (OE) (Vendrov et al., 2016), are a natural way to model transitive relational data (e.g. entailment graphs). However, OE learns a deterministic knowledge base, limiting expressiveness of queries and the ability to use uncertainty for both prediction and learning (e.g. learning from expectations). Probabilistic extensions of OE (Lai and Hockenmaier, 2017) have provided the ability to somewhat calibrate these denotational probabilities while retaining the consistency and inductive bias of ordered models, but lack the ability to model the negative correlations found in real-world knowledge. In this work we show that a broad class of models that assign probability measures to OE can never capture negative correlation, which motivates our construction of a novel box lattice and accompanying probability measure to capture anticorrelation and even disjoint concepts, while still providing the benefits of probabilistic modeling, such as the ability to perform rich joint and conditional queries over arbitrary sets of concepts, and both learning from and predicting calibrated uncertainty. We show improvements over previous approaches in modeling the Flickr and WordNet entailment graphs, and investigate the power of the model.

inductive learning, neural network, probability, (20 more...)

arXiv.org Machine Learning

1805.06627

Country: North America > United States > Massachusetts (0.14)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.94)
(2 more...)

Add feedback

Active Bias: Training More Accurate Neural Networks by Emphasizing High Variance Samples

Chang, Haw-Shiuan, Learned-Miller, Erik, McCallum, Andrew

arXiv.org Machine LearningJan-6-2018

Self-paced learning and hard example mining re-weight training instances to improve learning accuracy. This paper presents two improved alternatives based on lightweight estimates of sample uncertainty in stochastic gradient descent (SGD): the variance in predicted probability of the correct class across iterations of mini-batch SGD, and the proximity of the correct class probability to the decision threshold. Extensive experimental results on six datasets show that our methods reliably improve accuracy in various network architectures, including additional gains on top of other popular training techniques, such as residual learning, momentum, ADAM, batch normalization, dropout, and distillation.

deep learning, epoch, neural network, (17 more...)

arXiv.org Machine Learning

1704.07433

Country: North America > United States > Massachusetts > Hampshire County > Amherst (0.14)

Genre: Research Report (1.00)

Industry:

Education (0.67)
Government (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.55)

Add feedback