AITopics

doi: 10.1007/s00799-019-00269-0

2003.05258

Country:

Europe > Netherlands > South Holland > The Hague (0.04)
Europe > Germany > Bavaria > Lower Franconia > Würzburg (0.04)
North America > United States > Illinois > Cook County > Chicago (0.04)
(14 more...)

Genre: Research Report > New Finding (0.46)

Industry: Government (0.93)

Technology:

Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Cognitive Science > Problem Solving (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.92)

arXiv.org Artificial Intelligence, Mar-7-2020

PathVQA: 30000+ Questions for Medical Visual Question Answering

He, Xuehai, Zhang, Yichen, Mou, Luntian, Xing, Eric, Xie, Pengtao

Is it possible to develop an "AI Pathologist" to pass the board-certified examination of the American Board of Pathology? To achieve this goal, the first step is to create a visual question answering (VQA) dataset where the AI agent is presented with a pathology image together with a question and is asked to give the correct answer. Our work makes the first attempt to build such a dataset. Unlike creating general-domain VQA datasets, where the images are widely accessible and many crowdsourcing workers are available and capable of generating question-answer pairs, developing a medical VQA dataset is much more challenging. First, due to privacy concerns, pathology images are usually not publicly available. Second, only well-trained pathologists can understand pathology images, but they rarely have time to help create datasets for AI research. To address these challenges, we resort to pathology textbooks and online digital libraries. We develop a semi-automated pipeline to extract pathology images and captions from textbooks and generate question-answer pairs from captions using natural language processing. We collect 32,799 open-ended questions from 4,998 pathology images, where each question is manually checked to ensure correctness. To the best of our knowledge, this is the first dataset for pathology VQA. Our dataset will be released publicly to promote research in medical VQA.

dataset, method 3, question-answer pair, (12 more...)

2003.10286

Country:

North America > United States > California > San Diego County > San Diego (0.04)
Asia > China > Beijing > Beijing (0.04)

Genre: Instructional Material (0.93)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.94)
Information Technology > Artificial Intelligence > Natural Language > Question Answering (0.87)

Singkul, Sattaya, Khampingyot, Borirat, Maharattamalai, Nattasit, Taerungruang, Supawat, Chalothorn, Tawunrat

Parsing Thai Social Data: A New Challenge for Thai NLP

arXiv.org Artificial Intelligence, Mar-6-2020

Dependency parsing (DP) is a task that analyzes text for syntactic structure and relationships between words. DP is widely used to improve natural language processing (NLP) applications in many languages such as English. Previous work on DP is generally applicable to formally written language. However, it does not apply to informal language such as that used in social networks. Therefore, DP has to be researched and explored with such social network data. In this paper, we explore and identify a DP model that is suitable for Thai social network data. We then identify the appropriate linguistic unit to use as input. The results show that the transition-based model called improve Elkared dependency parser outperforms the others, with a UAS of 81.42%.
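For readers unfamiliar with the metric quoted above, UAS (unlabeled attachment score) is the fraction of tokens whose predicted head matches the gold-standard head. The following is a minimal sketch of that calculation for one sentence; the function name and the sample head lists are hypothetical and are not taken from the paper, whose evaluation setup is not described in this abstract.

```python
def unlabeled_attachment_score(gold_heads, predicted_heads):
    """Return the fraction of tokens whose predicted head equals the gold head."""
    if len(gold_heads) != len(predicted_heads):
        raise ValueError("gold and predicted head lists must have the same length")
    if not gold_heads:
        raise ValueError("cannot score an empty sentence")
    correct = sum(1 for g, p in zip(gold_heads, predicted_heads) if g == p)
    return correct / len(gold_heads)


# Hypothetical 5-token sentence; head index 0 denotes the artificial root.
gold = [2, 0, 2, 5, 3]
pred = [2, 0, 2, 3, 3]  # only the fourth token is attached to the wrong head
uas = unlabeled_attachment_score(gold, pred)
```

Reported scores are normally computed over all tokens in a test set rather than averaged per sentence; the same function applies if the head lists of all sentences are concatenated.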

computational linguistic, dataset, dependency, (14 more...)

2003.03069

Country:

Europe > Spain > Valencian Community > Valencia Province > Valencia (0.04)
Europe > Bulgaria > Sofia City Province > Sofia (0.04)
Asia > Thailand > Chiang Mai > Chiang Mai (0.04)
(25 more...)

Genre: Research Report > New Finding (0.54)

Industry: Information Technology (1.00)

Technology:

Information Technology > Communications (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.99)

#artificialintelligence, Mar-5-2020, 09:25:44 GMT

Why you should not use (f)lex, yacc and bison - Federico Tomassetti - Software Architect

In the field of parsing, Lex and Yacc, as well as their respective successors flex and GNU Bison, have a sort of venerable status. And you could still use them today. But you should not do that. In this article we will explain why they have problems and show you some alternatives. Lex and Yacc were the first popular and efficient lexer and parser generators; flex and Bison were the first widespread open-source versions compatible with the original software. Each of these tools has more than 30 years of history, which is an achievement in itself. For some people these are still the first tools they think about when talking about parsing. So, why should you avoid them? Well, we found a few reasons based on our experience developing parsers for our clients. For example, we had to work with existing lexers in flex and found it difficult to add modern features, like Unicode support, or to make the lexer re-entrant (i.e., usable in many threads). With Bison our clients had trouble organizing large codebases, and we found it difficult to improve the efficiency of a parser without rewriting large parts of the grammar. The short version is that there are tools that are more flexible and productive, like ANTLR.

algorithm, grammar, software, (16 more...)

#artificialintelligence

Genre: Research Report (0.34)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.76)
Information Technology > Software > Programming Languages (0.46)

Journal of Artificial Intelligence Research, Feb-26-2020

Jointly Improving Parsing and Perception for Natural Language Commands through Human-Robot Dialog

In this work, we present methods for using human-robot dialog to improve language understanding for a mobile robot agent. The agent parses natural language to underlying semantic meanings and uses robotic sensors to create multi-modal models of perceptual concepts like red and heavy. The agent can be used for showing navigation routes, delivering objects to people, and relocating objects from one location to another. We use dialog clarification questions both to understand commands and to generate additional parsing training data. The agent employs opportunistic active learning to select questions about how words relate to objects, improving its understanding of perceptual concepts. We evaluated this agent on Amazon Mechanical Turk. After training on data induced from conversations, the agent reduced the number of dialog questions it asked while receiving higher usability ratings. Additionally, we demonstrated the agent on a robotic platform, where it learned new perceptual concepts on the fly while completing a real-world task.

agent, parsing and perception, robot, (14 more...)

Journal of Artificial Intelligence Research

doi: 10.1613/jair.1.11485

AI Access Foundation

11485

Country:

North America > United States > Texas > Travis County > Austin (0.04)
Asia > Middle East > Republic of Türkiye > Aksaray Province > Aksaray (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
(5 more...)

Genre:

Questionnaire & Opinion Survey (1.00)
Research Report > Experimental Study (0.46)

Technology:

Information Technology > Artificial Intelligence > Robots (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Agents (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Özateş, Şaziye Betül, Özgür, Arzucan, Güngör, Tunga, Öztürk, Balkız

A Hybrid Approach to Dependency Parsing: Combining Rules and Morphology with Deep Learning

arXiv.org Artificial Intelligence, Feb-24-2020

Fully data-driven, deep learning-based models are usually designed to be language-independent and have been shown to be successful for many natural language processing tasks. However, when the studied language is low-resourced and the amount of training data is insufficient, these models can benefit from the integration of natural language grammar-based information. We propose two approaches to dependency parsing, especially for languages with a restricted amount of training data. Our first approach combines a state-of-the-art deep learning-based parser with a rule-based approach, and the second one incorporates morphological information into the parser. In the rule-based approach, the parsing decisions made by the rules are encoded and concatenated with the vector representations of the input words as additional information to the deep network. The morphology-based approach proposes different methods to include the morphological structure of words in the parser network. Experiments are conducted on the IMST-UD Treebank, and the results suggest that integrating explicit knowledge about the target language into a neural parser through a rule-based parsing system and morphological analysis leads to more accurate annotations and hence increases the parsing performance in terms of attachment scores. The proposed methods are developed for Turkish, but can be adapted to other languages as well.
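The rule-based idea described above, encoding a rule's decision and concatenating it with each word's vector, can be illustrated with a small sketch. Everything here is an assumption made for illustration: the decision labels, the one-hot encoding, and the toy two-dimensional word vectors are hypothetical, and the abstract does not say how the authors actually encode rule decisions.

```python
# Hypothetical set of decisions a rule-based component might emit per token.
RULE_DECISIONS = ["no_rule", "attach_left", "attach_right"]


def one_hot(decision):
    """Encode a rule decision as a one-hot vector over RULE_DECISIONS."""
    if decision not in RULE_DECISIONS:
        raise ValueError(f"unknown rule decision: {decision}")
    return [1.0 if decision == label else 0.0 for label in RULE_DECISIONS]


def augment(word_vectors, rule_decisions):
    """Concatenate each word vector with the encoding of its rule decision."""
    if len(word_vectors) != len(rule_decisions):
        raise ValueError("need exactly one rule decision per word")
    return [vec + one_hot(dec) for vec, dec in zip(word_vectors, rule_decisions)]


# Toy 2-dimensional embeddings for a 3-token sentence (illustrative values).
word_vectors = [[0.1, -0.3], [0.7, 0.2], [-0.5, 0.4]]
rule_decisions = ["attach_right", "no_rule", "attach_left"]
augmented = augment(word_vectors, rule_decisions)
```

Each augmented vector has the original embedding dimensions followed by the rule encoding, so a downstream network sees the rule's suggestion as extra input features and remains free to override it.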

artificial intelligence, machine learning, natural language, (19 more...)

doi: 10.1109/ACCESS.2022.3202947

2002.10116

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Europe > Finland > Southwest Finland > Turku (0.04)
Europe > Belgium > Brussels-Capital Region > Brussels (0.04)
(11 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Johnson, Mark, Demuth, Katherine, Jones, Bevan, Black, Michael J.

Synergies in learning words and their referents

Neural Information Processing Systems, Feb-15-2020, 01:41:23 GMT

This paper presents Bayesian non-parametric models that simultaneously learn to segment words from phoneme strings and learn the referents of some of those words, and shows that there is a synergistic interaction in the acquisition of these two kinds of linguistic information. The models themselves are novel kinds of Adaptor Grammars that are an extension of an embedding of topic models into PCFGs. These models simultaneously segment phoneme sequences into words and learn the relationship between non-linguistic objects and the words that refer to them. We show (i) that modelling inter-word dependencies improves the accuracy not only of the word segmentation but also of the word-object relationships, and (ii) that a model that simultaneously learns word-object relationships and word segmentation segments more accurately than one that just learns word segmentation on its own. We argue that these results support an interactive view of language acquisition that can take advantage of synergies such as these.

referent, synergy, word segmentation, (1 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Collins, Michael, Cohen, Shay B.

Tensor Decomposition for Fast Parsing with Latent-Variable PCFGs

Neural Information Processing Systems, Feb-14-2020, 23:58:11 GMT

We describe an approach to speed up inference with latent-variable PCFGs, which have been shown to be highly effective for natural language parsing. Our approach is based on a tensor formulation recently introduced for spectral estimation of latent-variable PCFGs, coupled with a tensor decomposition algorithm well known in the multilinear algebra literature. We also describe an error bound for this approximation, which bounds the difference between the probabilities calculated by the algorithm and the true probabilities that the approximated model gives. Empirical evaluation on real-world natural language parsing data demonstrates a significant speed-up at minimal cost for parsing performance.
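To see why a tensor decomposition can speed up this kind of inference, consider contracting a three-way tensor with two vectors, an operation of the sort that arises when combining scores for two child constituents. The sketch below assumes a rank-R decomposition into sums of outer products; the abstract does not name the decomposition the authors use, so this choice, along with all function names and numbers, is an illustrative assumption. With m latent states, the dense contraction costs on the order of m^3 operations, while the factored form costs on the order of R*m.

```python
from itertools import product


def dense_from_factors(U, V, W):
    """Build T[i][j][k] = sum_r U[r][i] * V[r][j] * W[r][k]."""
    R, m = len(U), len(U[0])
    return [[[sum(U[r][i] * V[r][j] * W[r][k] for r in range(R))
              for k in range(m)] for j in range(m)] for i in range(m)]


def contract_dense(T, a, b):
    """out[i] = sum over j, k of T[i][j][k] * a[j] * b[k]; O(m^3) work."""
    m = len(T)
    return [sum(T[i][j][k] * a[j] * b[k]
                for j, k in product(range(m), repeat=2))
            for i in range(m)]


def contract_factored(U, V, W, a, b):
    """Same contraction computed from the factors alone; O(R * m) work."""
    out = [0.0] * len(U[0])
    for u, v, w in zip(U, V, W):
        # Two dot products per rank-one component, then one scaled vector add.
        scale = (sum(vj * aj for vj, aj in zip(v, a))
                 * sum(wk * bk for wk, bk in zip(w, b)))
        for i, ui in enumerate(u):
            out[i] += ui * scale
    return out


# Illustrative rank-2 factors over m = 3 states (all vectors have length m).
U = [[1.0, 0.0, 2.0], [0.5, 1.0, 0.0]]
V = [[0.0, 1.0, 1.0], [2.0, 0.0, 1.0]]
W = [[1.0, 1.0, 0.0], [0.0, 3.0, 1.0]]
a = [0.2, 0.3, 0.5]
b = [0.6, 0.1, 0.3]

dense_result = contract_dense(dense_from_factors(U, V, W), a, b)
factored_result = contract_factored(U, V, W, a, b)
```

Here the tensor is exactly rank 2, so the two results agree up to rounding. In practice a low-rank factorization only approximates the original tensor, which is why an error bound on the resulting probabilities matters.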

fast parsing, latent-variable pcfg, tensor decomposition, (2 more...)

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Hsu, Daniel J., Kakade, Sham M., Liang, Percy S.

Identifiability and Unmixing of Latent Parse Trees

Neural Information Processing Systems, Feb-14-2020, 22:57:54 GMT

This paper explores unsupervised learning of parsing models along two directions. First, which models are identifiable from infinite data? We use a general technique for numerically checking identifiability based on the rank of a Jacobian matrix, and apply it to several standard constituency and dependency parsing models. Second, for identifiable models, how do we estimate the parameters efficiently? EM suffers from local optima, while recent work using spectral methods cannot be directly applied since the topology of the parse tree varies across sentences.
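The Jacobian-rank check mentioned above can be illustrated on a toy problem. The idea, as a sketch: view a model as a map from parameters to observable quantities, estimate the Jacobian of that map numerically at a parameter point, and compare its rank with the number of parameters. A rank below the parameter count means some direction in parameter space leaves the observables unchanged, so the model is not locally identifiable at that point. The two toy maps, the finite-difference scheme, and the tolerances below are assumptions chosen for illustration; they are not the parsing models or the procedure from the paper.

```python
def numerical_jacobian(f, theta, eps=1e-6):
    """Central-difference Jacobian of f at theta, returned as a list of rows."""
    n_out = len(f(theta))
    J = [[0.0] * len(theta) for _ in range(n_out)]
    for j in range(len(theta)):
        up, down = list(theta), list(theta)
        up[j] += eps
        down[j] -= eps
        f_up, f_down = f(up), f(down)
        for i in range(n_out):
            J[i][j] = (f_up[i] - f_down[i]) / (2 * eps)
    return J


def matrix_rank(M, tol=1e-6):
    """Numerical rank via Gaussian elimination with partial pivoting."""
    A = [row[:] for row in M]
    rows, cols = len(A), len(A[0])
    rank = 0
    for c in range(cols):
        if rank == rows:
            break
        pivot = max(range(rank, rows), key=lambda r: abs(A[r][c]))
        if abs(A[pivot][c]) < tol:
            continue  # no usable pivot in this column
        A[rank], A[pivot] = A[pivot], A[rank]
        for r in range(rank + 1, rows):
            factor = A[r][c] / A[rank][c]
            for k in range(c, cols):
                A[r][k] -= factor * A[rank][k]
        rank += 1
    return rank


def sum_only(theta):
    """Toy observables that depend on the parameters only through a + b."""
    s = theta[0] + theta[1]
    return [s, s ** 2, s ** 3]


def sum_and_product(theta):
    """Toy observables that pin down a and b up to swapping them."""
    return [theta[0] + theta[1], theta[0] * theta[1]]


point = [0.3, 0.8]  # arbitrary parameter point with a != b
rank_degenerate = matrix_rank(numerical_jacobian(sum_only, point))
rank_full = matrix_rank(numerical_jacobian(sum_and_product, point))
```

The first map has two parameters but a rank-1 Jacobian, so it is not locally identifiable; the second reaches full rank 2 at this point. A full-rank result at one point says nothing about global symmetries such as swapping the two parameters.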

identifiability and unmixing, latent parse tree

Technology: Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)

Zhao, Yibiao, Zhu, Song-chun

Image Parsing with Stochastic Scene Grammar

Neural Information Processing Systems, Feb-14-2020, 21:27:18 GMT

In contrast to previous scene labeling work that applied discriminative classifiers to pixels (or super-pixels), we use a generative Stochastic Scene Grammar (SSG). This grammar represents the compositional structures of visual entities from scene categories, 3D foreground/background, 2D faces, to 1D lines. The grammar includes three types of production rules and two types of contextual relations. Production rules: (i) AND rules represent the decomposition of an entity into sub-parts; (ii) OR rules represent the switching among sub-types of an entity; (iii) SET rules represent an ensemble of visual entities. Contextual relations: (i) Cooperative "+" relations represent positive links between binding entities, such as hinged faces of an object or aligned boxes; (ii) Competitive "-" relations represent negative links between competing entities, such as mutually exclusive boxes. We design an efficient MCMC inference algorithm, namely Hierarchical cluster sampling, to search in the large solution space of scene configurations. The algorithm has two stages: (i) Clustering: It forms all possible higher-level structures (clusters) from lower-level entities by production rules and contextual relations.

algorithm, image parsing, stochastic scene grammar, (7 more...)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (0.85)
Information Technology > Artificial Intelligence > Machine Learning (0.62)