AITopics | linear-chain crf

Collaborating Authors

linear-chain crf

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Regular-pattern-sensitive CRFs for Distant Label Interactions

Papay, Sean, Klinger, Roman, Pado, Sebastian

arXiv.org Artificial IntelligenceNov-19-2024

Linear-chain conditional random fields (CRFs) are a common model component for sequence labeling tasks when modeling the interactions between different labels is important. However, the Markov assumption limits linear-chain CRFs to only directly modeling interactions between adjacent labels. Weighted finite-state transducers (FSTs) are a related approach which can be made to model distant label-label interactions, but exact label inference is intractable for these models in the general case, and the task of selecting an appropriate automaton structure for the desired interaction types poses a practical challenge. In this work, we present regular-pattern-sensitive CRFs (RPCRFs), a method of enriching standard linear-chain CRFs with the ability to learn long-distance label interactions which occur in user-specified patterns. This approach allows users to write regular-expression label patterns concisely specifying which types of interactions the model should take into account, allowing the model to learn from data whether and in which contexts these patterns occur. The result can be interpreted alternatively as a CRF augmented with additional, non-local potentials, or as a finite-state transducer whose structure is defined by a set of easily-interpretable patterns. Critically, unlike the general case for FSTs (and for non-chain CRFs), exact training and inference are tractable for many pattern sets. In this work, we detail how a RPCRF can be automatically constructed from a set of user-specified patterns, and demonstrate the model's effectiveness on synthetic data, showing how different types of patterns can capture different nonlocal dependency structures in label sequences.

crf, interaction, sequence, (15 more...)

arXiv.org Artificial Intelligence

2411.12484

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > California > San Diego County > San Diego (0.04)
Europe > Germany > Baden-Württemberg > Stuttgart Region > Stuttgart (0.04)
(7 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Add feedback

Energy-Based Models with Applications to Speech and Language Processing

Ou, Zhijian

arXiv.org Artificial IntelligenceMar-16-2024

Energy-Based Models (EBMs) are an important class of probabilistic models, also known as random fields and undirected graphical models. EBMs are un-normalized and thus radically different from other popular self-normalized probabilistic models such as hidden Markov models (HMMs), autoregressive models, generative adversarial nets (GANs) and variational auto-encoders (VAEs). Over the past years, EBMs have attracted increasing interest not only from the core machine learning community, but also from application domains such as speech, vision, natural language processing (NLP) and so on, due to significant theoretical and algorithmic progress. The sequential nature of speech and language also presents special challenges and needs a different treatment from processing fix-dimensional data (e.g., images). Therefore, the purpose of this monograph is to present a systematic introduction to energy-based models, including both algorithmic progress and applications in speech and language processing. First, the basics of EBMs are introduced, including classic models, recent models parameterized by neural networks, sampling methods, and various learning methods from the classic learning algorithms to the most advanced ones. Then, the application of EBMs in three different scenarios is presented, i.e., for modeling marginal, conditional and joint distributions, respectively. 1) EBMs for sequential data with applications in language modeling, where the main focus is on the marginal distribution of a sequence itself; 2) EBMs for modeling conditional distributions of target sequences given observation sequences, with applications in speech recognition, sequence labeling and text generation; 3) EBMs for modeling joint distributions of both sequences of observations and targets, and their applications in semi-supervised learning and calibrated natural language understanding.

hamiltonian monte carlo, semi-supervised image classification, transition probability, (16 more...)

arXiv.org Artificial Intelligence

doi: 10.1561/2000000117

2403.10961

Country:

North America > Canada > Ontario > Toronto (0.14)
Asia > Middle East > Jordan (0.04)
Asia > China > Beijing > Beijing (0.04)
(4 more...)

Genre:

Research Report (1.00)
Overview (1.00)

Industry:

Health & Medicine (0.67)
Education (0.67)
Energy (0.45)

Technology:

Information Technology > Artificial Intelligence > Speech > Speech Recognition (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(4 more...)

Add feedback

On equivalence between linear-chain conditional random fields and hidden Markov chains

Azeraf, Elie, Monfrini, Emmanuel, Pieczynski, Wojciech

arXiv.org Machine LearningNov-14-2021

Practitioners successfully use hidden Markov chains (HMCs) in different problems for about sixty years. HMCs belong to the family of generative models and they are often compared to discriminative models, like conditional random fields (CRFs). Authors usually consider CRFs as quite different from HMCs, and CRFs are often presented as interesting alternative to HMCs. In some areas, like natural language processing (NLP), discriminative models have completely supplanted generative models. However, some recent results show that both families of models are not so different, and both of them can lead to identical processing power. In this paper we compare the simple linear-chain CRFs to the basic HMCs. We show that HMCs are identical to CRFs in that for each CRF we explicitly construct an HMC having the same posterior distribution. Therefore, HMCs and linear-chain CRFs are not different but just differently parametrized models.

conditional random field, hmcs, linear-chain crf, (10 more...)

arXiv.org Machine Learning

2111.07376

Country:

Europe > France > Île-de-France > Paris > Paris (0.05)
Asia > India (0.04)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Entity extraction using Deep Learning – Towards Data Science

@machinelearnbotApr-4-2018, 22:35:24 GMT

Entity extraction from text is a major Natural Language Processing (NLP) task. As the recent advancement in the deep learning(DL) enable us to use them for NLP tasks and producing huge differences in accuracy compared to traditional methods. I have attempted to extract the information from article using both deep learning and traditional methods. Result was amazing as DL method got accuracy of 85% over 65% from legacy methods. The aim of the project is to tag each words of the articles into 4 categories: organisation, person, miscellaneous, and other.

neural network, representation, vector, (16 more...)

@machinelearnbot

Country: North America > United States > New York (0.05)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Efficient Dependency-Guided Named Entity Recognition

Jie, Zhanming (Singapore University of Technology and Design) | Muis, Aldrian Obaja (Singapore University of Technology and Design) | Lu, Wei (Singapore University of Technology and Design)

AAAI ConferencesFeb-14-2017

Named entity recognition (NER), which focuses on the extraction of semantically meaningful named entities and their semantic classes from text, serves as an indispensable component for several down-stream natural language processing (NLP) tasks such as relation extraction and event extraction. Dependency trees, on the other hand, also convey crucial semantic-level information. It has been shown previously that such information can be used to improve the performance of NER. In this work, we investigate on how to better utilize the structured information conveyed by dependency trees to improve the performance of NER. Specifically, unlike existing approaches which only exploit dependency information for designing local features, we show that certain global structured information of the dependency trees can be exploited when building NER models where such information can provide guided learning and inference. Through extensive experiments, we show that our proposed novel dependency-guided NER model performs competitively with models based on conventional semi-Markov conditional random fields, while requiring significantly less running time.

artificial intelligence, dependency tree, text processing, (17 more...)

AAAI Conferences

Thirty-First AAAI Conference on Artificial Intelligence

Country: Asia > Middle East (0.14)

Genre: Research Report > New Finding (0.93)

Industry:

Government (0.46)
Energy > Oil & Gas (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback

Efficiently Inducing Features of Conditional Random Fields

McCallum, Andrew

arXiv.org Machine LearningOct-19-2012

Conditional Random Fields (CRFs) are undirected graphical models, a special case of which correspond to conditionally-trained finite state machines. A key advantage of these models is their great flexibility to include a wide array of overlapping, multi-granularity, non-independent features of the input. In face of this freedom, an important question that remains is, what features should be used? This paper presents a feature induction method for CRFs. Founded on the principle of constructing only those feature conjunctions that significantly increase log-likelihood, the approach is based on that of Della Pietra et al [1997], but altered to work with conditional rather than joint probabilities, and with additional modifications for providing tractability specifically for a sequence model. In comparison with traditional approaches, automated feature induction offers both improved accuracy and more than an order of magnitude reduction in feature count; it enables the use of richer, higher-order Markov models, and offers more freedom to liberally guess about which atomic features may be relevant to a task. The induction method applies to linear-chain CRFs, as well as to more arbitrary CRF structures, also known as Relational Markov Networks [Taskar & Koller, 2002]. We present experimental results on a named entity extraction task.

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Machine Learning

1212.2504

Country: North America > United States > Massachusetts (0.28)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.94)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (0.88)

Add feedback

An Introduction to Conditional Random Fields

Sutton, Charles, McCallum, Andrew

arXiv.org Machine LearningNov-17-2010

Often we wish to predict a large number of variables that depend on each other as well as on other observed variables. Structured prediction methods are essentially a combination of classification and graphical modeling, combining the ability of graphical models to compactly model multivariate data with the ability of classification methods to perform prediction using large sets of input features. This tutorial describes conditional random fields, a popular probabilistic method for structured prediction. CRFs have seen wide application in natural language processing, computer vision, and bioinformatics. We describe methods for inference and parameter estimation for CRFs, including practical issues for implementing large scale CRFs. We do not assume previous knowledge of graphical modeling, so this tutorial is intended to be useful to practitioners in a wide variety of fields.

algorithm, neural network, optimization problem, (23 more...)

arXiv.org Machine Learning

1011.4088

Country:

North America > United States > Massachusetts (0.28)
Asia > Middle East (0.28)
Europe > Germany (0.27)
(3 more...)

Genre: Instructional Material > Course Syllabus & Notes (1.00)

Industry:

Health & Medicine (1.00)
Energy > Oil & Gas (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)
(6 more...)

Add feedback