AITopics | Finin, Tim

Collaborating Authors

Finin, Tim

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

A Practical Entity Linking System for Tables in Scientific Literature

Mulwad, Varish, Finin, Tim, Kumar, Vijay S., Williams, Jenny Weisenberg, Dixit, Sharad, Joshi, Anupam

arXiv.org Artificial IntelligenceJun-11-2023

Entity linking is an important step towards constructing knowledge graphs that facilitate advanced question answering over scientific documents, including the retrieval of relevant information included in tables within these documents. This paper introduces a general-purpose system for linking entities to items in the Wikidata knowledge base. It describes how we adapt this system for linking domain-specific entities, especially for those entities embedded within tables drawn from COVID-19-related scientific literature. We describe the setup of an efficient offline instance of the system that enables our entity-linking approach to be more feasible in practice. As part of a broader approach to infer the semantic meaning of scientific tables, we leverage the structural and semantic characteristics of the tables to improve overall entity linking performance.

artificial intelligence, natural language, wikidata, (17 more...)

arXiv.org Artificial Intelligence

2306.10044

Country: North America > United States > Maryland > Baltimore County (0.46)

Genre: Research Report (0.82)

Industry:

Health & Medicine > Therapeutic Area > Infections and Infectious Diseases (1.00)
Health & Medicine > Pharmaceuticals & Biotechnology (1.00)
Health & Medicine > Therapeutic Area > Immunology (0.92)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (0.94)
Information Technology > Communications > Web > Semantic Web (0.68)

Add feedback

Jointly Identifying and Fixing Inconsistent Readings from Information Extraction Systems

Padia, Ankur, Ferraro, Francis, Finin, Tim

arXiv.org Artificial IntelligenceJan-26-2023

KGCleaner is a framework to identify and correct errors in data produced and delivered by an information extraction system. These tasks have been understudied and KGCleaner is the first to address both. We introduce a multi-task model that jointly learns to predict if an extracted relation is credible and repair it if not. We evaluate our approach and other models as instance of our framework on two collections: a Wikidata corpus of nearly 700K facts and 5M fact-relevant sentences and a collection of 30K facts from the 2015 TAC Knowledge Base Population task. For credibility classification, parameter efficient simple shallow neural network can achieve an absolute performance gain of 30 $F_1$ points on Wikidata and comparable performance on TAC. For the repair task, significant performance (at more than twice) gain can be obtained depending on the nature of the dataset and the models.

machine learning, natural language, provenance sentence, (20 more...)

arXiv.org Artificial Intelligence

1808.04816

Country: North America > United States > Maryland > Baltimore County (0.46)

Genre: Research Report (1.00)

Industry:

Government > Regional Government > North America Government > United States Government (0.94)
Health & Medicine (0.68)
Government > Military (0.68)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Information Extraction (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.35)

Add feedback

Recognizing and Extracting Cybersecurtity-relevant Entities from Text

Hanks, Casey, Maiden, Michael, Ranade, Priyanka, Finin, Tim, Joshi, Anupam

arXiv.org Artificial IntelligenceAug-2-2022

Cyber Threat Intelligence (CTI) is information describing threat vectors, vulnerabilities, and attacks and is often used as training data for AI-based cyber defense systems such as Cybersecurity Knowledge Graphs (CKG). There is a strong need to develop community-accessible datasets to train existing AI-based cybersecurity pipelines to efficiently and accurately extract meaningful insights from CTI. We have created an initial unstructured CTI corpus from a variety of open sources that we are using to train and test cybersecurity entity models using the spaCy framework and exploring self-learning methods to automatically recognize cybersecurity entities. We also describe methods to apply cybersecurity domain entity linking with existing world knowledge from Wikidata. Our future work will survey and test spaCy NLP tools and create methods for continuous integration of new information extracted from text.

annotation, machine learning, natural language, (15 more...)

arXiv.org Artificial Intelligence

2208.01693

Country: North America > United States > Maryland > Baltimore County > Baltimore (0.28)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
(2 more...)

Add feedback

Generating Fake Cyber Threat Intelligence Using Transformer-Based Models

Ranade, Priyanka, Piplai, Aritran, Mittal, Sudip, Joshi, Anupam, Finin, Tim

arXiv.org Artificial IntelligenceFeb-8-2021

Cyber-defense systems are being developed to automatically ingest Cyber Threat Intelligence (CTI) that contains semi-structured data and/or text to populate knowledge graphs. A potential risk is that fake CTI can be generated and spread through Open-Source Intelligence (OSINT) communities or on the Web to effect a data poisoning attack on these systems. Adversaries can use fake CTI examples as training input to subvert cyber defense systems, forcing the model to learn incorrect inputs to serve their malicious needs. In this paper, we automatically generate fake CTI text descriptions using transformers. We show that given an initial prompt sentence, a public language model like GPT-2 with fine-tuning, can generate plausible CTI text with the ability of corrupting cyber-defense systems. We utilize the generated fake CTI text to perform a data poisoning attack on a Cybersecurity Knowledge Graph (CKG) and a cybersecurity corpus. The poisoning attack introduced adverse impacts such as returning incorrect reasoning outputs, representation poisoning, and corruption of other dependent AI-based cyber defense systems. We evaluate with traditional approaches and conduct a human evaluation study with cybersecurity professionals and threat hunters. Based on the study, professional threat hunters were equally likely to consider our fake generated CTI as true.

cti, cyberwarfare, deep learning, (22 more...)

arXiv.org Artificial Intelligence

2102.04351

Country: North America > United States > Maryland (0.28)

Genre:

Research Report (1.00)
Overview (0.68)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)
Government > Regional Government > North America Government > United States Government (0.68)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Gazetteer Generation for Neural Named Entity Recognition

Song, Chan Hee (University of Notre Dame ) | Lawrie, Dawn (John Hopkins University) | Finin, Tim (University of Maryland Baltimore County) | Mayfield, James (John Hopkins University)

AAAI ConferencesMay-16-2020

We present a way to generate gazetteers from the Wikidata knowledge graph and use the lists to improve a neural NER system by adding an input feature indicating that a word is part of a name in the gazetteer. We empirically show that the approach yields performance gains in two distinct languages: a high-resource, word-based language, English and a high-resource, character-based language, Chinese. We apply the approach to a low-resource language, Russian, using a new annotated Russian NER corpus from Reddit tagged with four core and eleven extended types, and show a baseline score.

gazetteer generation, named entity recognition

AAAI Conferences

The Thirty-Third International Flairs Conference

Technology: Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

Unfolding the Structure of a Document using Deep Learning

Rahman, Muhammad Mahbubur, Finin, Tim

arXiv.org Machine LearningSep-29-2019

Understanding and extracting of information from large documents, such as business opportunities, academic articles, medical documents and technical reports, poses challenges not present in short documents. Such large documents may be multi-themed, complex, noisy and cover diverse topics. We describe a framework that can analyze large documents and help people and computer systems locate desired information in them. We aim to automatically identify and classify different sections of documents and understand their purpose within the document. A key contribution of our research is modeling and extracting the logical and semantic structure of electronic documents using deep learning techniques. We evaluate the effectiveness and robustness of our framework through extensive experiments on two collections: more than one million scholarly articles from arXiv and a collection of requests for proposal documents from government sources.

deep learning, neural network, section header, (19 more...)

arXiv.org Machine Learning

1910.03678

Country:

North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine (0.48)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Cyber-All-Intel: An AI for Security related Threat Intelligence

Mittal, Sudip, Joshi, Anupam, Finin, Tim

arXiv.org Artificial IntelligenceMay-7-2019

Keeping up with threat intelligence is a must for a security analyst today. There is a volume of information present in `the wild' that affects an organization. We need to develop an artificial intelligence system that scours the intelligence sources, to keep the analyst updated about various threats that pose a risk to her organization. A security analyst who is better `tapped in' can be more effective. In this paper we present, Cyber-All-Intel an artificial intelligence system to aid a security analyst. It is a system for knowledge extraction, representation and analytics in an end-to-end pipeline grounded in the cybersecurity informatics domain. It uses multiple knowledge representations like, vector spaces and knowledge graphs in a 'VKG structure' to store incoming intelligence. The system also uses neural network models to pro-actively improve its knowledge. We have also created a query engine and an alert system that can be used by an analyst to find actionable cybersecurity insights.

cyberwarfare, deep learning, vkg structure, (24 more...)

arXiv.org Artificial Intelligence

1905.02895

Country:

North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)

Genre: Research Report (0.40)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (0.59)
Government > Regional Government > North America Government > United States Government (0.46)

Technology:

Information Technology > Data Science > Data Mining (1.00)
Information Technology > Communications > Social Media (1.00)
Information Technology > Artificial Intelligence > Cognitive Science (1.00)
(4 more...)

Add feedback

Knowledge Graph Fact Prediction via Knowledge-Enriched Tensor Factorization

Padia, Ankur, Kalpakis, Kostantinos, Ferraro, Francis, Finin, Tim

arXiv.org Machine LearningFeb-8-2019

We present a family of novel methods for embedding knowledge graphs into real-valued tensors. These tensor-based embeddings capture the ordered relations that are typical in the knowledge graphs represented by semantic web languages like RDF. Unlike many previous models, our methods can easily use prior background knowledge provided by users or extracted automatically from existing knowledge graphs. In addition to providing more robust methods for knowledge graph embedding, we provide a provably-convergent, linear tensor factorization algorithm. We demonstrate the efficacy of our models for the task of predicting new facts across eight different knowledge graphs, achieving between 5% and 50% relative improvement over existing state-of-the-art knowledge graph embedding techniques. Our empirical evaluation shows that all of the tensor decomposition models perform well when the average degree of an entity in a graph is high, with constraint-based models doing better on graphs with a small number of highly similar relations and regularization-based models dominating for graphs with relations of varying degrees of similarity.

relation, semantic web, survey article, (20 more...)

arXiv.org Machine Learning

1902.03077

Country:

North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)

Genre: Research Report > Promising Solution (0.34)

Industry: Information Technology > Security & Privacy (0.68)

Technology:

Information Technology > Communications > Web > Semantic Web (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Semantic Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)

Add feedback

Ontology-Grounded Topic Modeling for Climate Science Research

Sleeman, Jennifer, Finin, Tim, Halem, Milton

arXiv.org Artificial IntelligenceJul-30-2018

In scientific disciplines where research findings have a strong impact on society, reducing the amount of time it takes to understand, synthesize and exploit the research is invaluable. Topic modeling is an effective technique for summarizing a collection of documents to find the main themes among them and to classify other documents that have a similar mixture of co-occurring words. We show how grounding a topic model with an ontology, extracted from a glossary of important domain phrases, improves the topics generated and makes them easier to understand. We apply and evaluate this method to the climate science domain. The result improves the topics generated and supports faster research understanding, discovery of social networks among researchers, and automatic ontology generation.

artificial intelligence, ontology, text processing, (20 more...)

arXiv.org Artificial Intelligence

1807.10965

Country: North America > United States > Maryland > Baltimore County (0.46)

Genre: Research Report (0.40)

Industry:

Government (1.00)
Energy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (1.00)

Add feedback

UCO: A Unified Cybersecurity Ontology

Syed, Zareen (University of Maryland Baltimore County) | Padia, Ankur (University of Maryland, Baltimore County) | Finin, Tim (University of Maryland, Baltimore County) | Mathews, Lisa (University of Maryland, Baltimore County) | Joshi, Anupam (University of Maryland, Baltimore County)

AAAI ConferencesApr-12-2016

In this paper we describe the Unified Cybersecurity Ontology (UCO) that is intended to support information integration and cyber situational awareness in cybersecurity systems. The ontology incorporates and integratesheterogeneous data and knowledge schemas from different cybersecurity systems and most commonly usedcybersecurity standards for information sharing and exchange. The UCO ontology has also been mapped to anumber of existing cybersecurity ontologies as well asconcepts in the Linked Open Data cloud (Berners-Lee,Bizer, and Heath 2009). Similar to DBpedia (Auer etal. 2007) which serves as the core for general knowledge in Linked Open Data cloud, we envision UCO toserve as the core for cybersecurity domain, which wouldevolve and grow with the passage of time with additional cybersecurity data sets as they become available.We also present a prototype system and concrete usecases supported by the UCO ontology. To the best of ourknowledge, this is the first cybersecurity ontology thathas been mapped to general world ontologies to support broader and diverse security use cases. We comparethe resulting ontology with previous efforts, discuss itsstrengths and limitations, and describe potential futurework directions.

artificial intelligence, cyberwarfare, ontology, (14 more...)

AAAI Conferences

Workshops at the Thirtieth AAAI Conference on Artificial Intelligence

Country:

North America > United States > Maryland > Baltimore County (0.14)
North America > United States > Maryland > Baltimore (0.14)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Ontologies (1.00)

Add feedback