AITopics

2305.02118

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Africa > Middle East > Libya (0.05)
Oceania > Niue (0.04)
(21 more...)

Genre:

Research Report > Promising Solution (0.34)
Research Report > New Finding (0.34)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (1.00)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.93)

Beck, Tilman, Waldis, Andreas, Gurevych, Iryna

Contextual information integration for stance detection via cross-attention

arXiv.org Artificial IntelligenceMay-25-2023

Stance detection deals with identifying an author's stance towards a target. Most existing stance detection models are limited because they do not consider relevant contextual information which allows for inferring the stance correctly. Complementary context can be found in knowledge bases but integrating the context into pretrained language models is non-trivial due to the graph structure of standard knowledge bases. To overcome this, we explore an approach to integrate contextual information as text which allows for integrating contextual information from heterogeneous sources, such as structured knowledge sources and by prompting large language models. Our approach can outperform competitive baselines on a large and diverse stance detection benchmark in a cross-target setup, i.e. for targets unseen during training. We demonstrate that it is more robust to noisy context and can regularize for unwanted correlations between labels and target-specific vocabulary. Finally, it is independent of the pretrained language model in use.

artificial intelligence, computational linguistic, natural language, (18 more...)

2211.01874

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Hong Kong (0.04)
Oceania > Australia > Victoria > Melbourne (0.04)
(23 more...)

Genre: Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Information Fusion (0.64)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.54)

arXiv.org Artificial IntelligenceMay-25-2023

AaKOS: Aspect-adaptive Knowledge-based Opinion Summarization

Wang, Guan, Li, Weihua, Lai, Edmund M-K., Bai, Quan

The rapid growth of information on the Internet has led to an overwhelming amount of opinions and comments on various activities, products, and services. This makes it difficult and time-consuming for users to process all the available information when making decisions. Text summarization, a Natural Language Processing (NLP) task, has been widely explored to help users quickly retrieve relevant information by generating short and salient content from long or multiple documents. Recent advances in pre-trained language models, such as ChatGPT, have demonstrated the potential of Large Language Models (LLMs) in text generation. However, LLMs require massive amounts of data and resources and are challenging to implement as offline applications. Furthermore, existing text summarization approaches often lack the ``adaptive" nature required to capture diverse aspects in opinion summarization, which is particularly detrimental to users with specific requirements or preferences. In this paper, we propose an Aspect-adaptive Knowledge-based Opinion Summarization model for product reviews, which effectively captures the adaptive nature required for opinion summarization. The model generates aspect-oriented summaries given a set of reviews for a particular product, efficiently providing users with useful information on specific aspects they are interested in, ensuring the generated summaries are more personalized and informative. Extensive experiments have been conducted using real-world datasets to evaluate the proposed model. The results demonstrate that our model outperforms state-of-the-art approaches and is adaptive and efficient in generating summaries that focus on particular aspects, enabling users to make well-informed decisions and catering to their diverse interests and preferences.

large language model, machine learning, natural language, (18 more...)

2306.05537

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > New York > New York County > New York City (0.04)
Oceania > New Zealand > North Island > Auckland Region > Auckland (0.04)
(12 more...)

Genre:

Research Report > Promising Solution (0.48)
Research Report > New Finding (0.48)
Overview > Innovation (0.48)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.66)

Communications of the ACMMay-24-2023, 23:40:23 GMT

Conjoined Twins: Artificial Intelligence and the Invention of Computer Science

Thomas Haigh (thomas.haigh@gmail.com) is a professor of history at the University of Wisconsin--Milwaukee, WI, USA, and a Comenius visiting professor at Siegen University, Germany.

artificial intelligence, logic & formal reasoning, machine learning, (15 more...)

Communications of the ACM

Country:

North America > United States > Wisconsin > Milwaukee County > Milwaukee (0.24)
Europe > Germany (0.24)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.05)
(6 more...)

Genre: Personal (0.48)

Industry:

Leisure & Entertainment > Games (0.69)
Education (0.46)

Technology:

Information Technology > Artificial Intelligence > History (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.69)
Information Technology > Artificial Intelligence > Robots (0.69)
(3 more...)

Kazemnejad, Amirhossein, Rezagholizadeh, Mehdi, Parthasarathi, Prasanna, Chandar, Sarath

Measuring the Knowledge Acquisition-Utilization Gap in Pretrained Language Models

arXiv.org Artificial IntelligenceMay-24-2023

While pre-trained language models (PLMs) have shown evidence of acquiring vast amounts of knowledge, it remains unclear how much of this parametric knowledge is actually usable in performing downstream tasks. We propose a systematic framework to measure parametric knowledge utilization in PLMs. Our framework first extracts knowledge from a PLM's parameters and subsequently constructs a downstream task around this extracted knowledge. Performance on this task thus depends exclusively on utilizing the model's possessed knowledge, avoiding confounding factors like insufficient signal. As an instantiation, we study factual knowledge of PLMs and measure utilization across 125M to 13B parameter PLMs. We observe that: (1) PLMs exhibit two gaps - in acquired vs. utilized knowledge, (2) they show limited robustness in utilizing knowledge under distribution shifts, and (3) larger models close the acquired knowledge gap but the utilized knowledge gap remains. Overall, our study provides insights into PLMs' capabilities beyond their acquired knowledge.

knowledge management, large language model, machine learning, (21 more...)

2305.14775

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > Canada > Quebec > Montreal (0.14)
Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
(8 more...)

Genre: Research Report > New Finding (1.00)

Technology:

Information Technology > Knowledge Management > Knowledge Engineering (1.00)
Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Natural Language > Chatbot (0.69)
(2 more...)

Cui, Wanyun, Chen, Xingran

Free Lunch for Efficient Textual Commonsense Integration in Language Models

arXiv.org Artificial IntelligenceMay-24-2023

Recent years have witnessed the emergence of textual commonsense knowledge bases, aimed at providing more nuanced and context-rich knowledge. The integration of external commonsense into language models has been shown to be a key enabler in advancing the state-of-the-art for a wide range of NLP tasks. However, incorporating textual commonsense descriptions is computationally expensive, as compared to encoding conventional symbolic knowledge. In this paper, we propose a method to improve its efficiency without modifying the model. We group training samples with similar commonsense descriptions into a single batch, thus reusing the encoded description across multiple samples. One key observation is that the upper bound of batch partitioning can be reduced to the classic {\it graph k-cut problem}. Consequently, we propose a spectral clustering-based algorithm to solve this problem. Extensive experiments illustrate that the proposed batch partitioning approach effectively reduces the computational cost while preserving performance. The efficiency improvement is more pronounced on larger datasets and on devices with more memory capacity, attesting to its practical utility for large-scale applications.

artificial intelligence, knowledge, natural language, (18 more...)

2305.15516

Country:

Europe > Italy > Tuscany > Florence (0.04)
North America > United States > Michigan (0.04)
North America > Canada > British Columbia > Metro Vancouver Regional District > Vancouver (0.04)
(3 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.70)
Information Technology > Artificial Intelligence > Representation & Reasoning > Commonsense Reasoning (0.69)

Iteratively Improving Biomedical Entity Linking and Event Extraction via Hard Expectation-Maximization

Li, Xiaochu, Liu, Minqian, Xu, Zhiyang, Huang, Lifu

Biomedical entity linking and event extraction are two crucial tasks to support text understanding and retrieval in the biomedical domain. These two tasks intrinsically benefit each other: entity linking disambiguates the biomedical concepts by referring to external knowledge bases and the domain knowledge further provides additional clues to understand and extract the biological processes, while event extraction identifies a key trigger and entities involved to describe each biological process which also captures the structural context to better disambiguate the biomedical entities. However, previous research typically solves these two tasks separately or in a pipeline, leading to error propagation. What's more, it's even more challenging to solve these two tasks together as there is no existing dataset that contains annotations for both tasks. To solve these challenges, we propose joint biomedical entity linking and event extraction by regarding the event structures and entity references in knowledge bases as latent variables and updating the two task-specific models in a hard Expectation-Maximization (EM) fashion: (1) predicting the missing variables for each partially annotated dataset based on the current two task-specific models, and (2) updating the parameters of each model on the corresponding pseudo completed dataset. Experimental results on two benchmark datasets: Genia 2011 for event extraction and BC4GO for entity linking, show that our joint framework significantly improves the model for each individual task and outperforms the strong baselines for both tasks. We will make the code and model checkpoints publicly available once the paper is accepted.

information retrieval, machine learning, natural language, (20 more...)

2305.14645

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
North America > United States > Oregon > Multnomah County > Portland (0.04)
Asia > India (0.04)
(5 more...)

Genre: Research Report > New Finding (0.46)

Industry: Health & Medicine > Pharmaceuticals & Biotechnology (1.00)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.70)
Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.68)
Information Technology > Artificial Intelligence > Natural Language > Information Retrieval (0.47)

Cadavid-Sanchez, Sebastian, Kacem, Khalil, Frade, Rafael Aparecido Martins, Boehm, Johannes, Chaney, Thomas, Lashkari, Danial, Simig, Daniel

Evaluating end-to-end entity linking on domain-specific knowledge bases: Learning about ancient technologies from museum collections

To study social, economic, and historical questions, researchers in the social sciences and humanities have started to use increasingly large unstructured textual datasets. While recent advances in NLP provide many tools to efficiently process such data, most existing approaches rely on generic solutions whose performance and suitability for domain-specific tasks is not well understood. This work presents an attempt to bridge this domain gap by exploring the use of modern Entity Linking approaches for the enrichment of museum collection data. We collect a dataset comprising of more than 1700 texts annotated with 7,510 mention-entity pairs, evaluate some off-the-shelf solutions in detail using this dataset and finally fine-tune a recent end-to-end EL model on this data. We show that our fine-tuned model significantly outperforms other approaches currently available in this domain and present a proof-of-concept use case of this model. We release our dataset and our best model.

artificial intelligence, expert system, natural language, (19 more...)

2305.14588

Country:

Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
Europe > Western Europe (0.04)
(9 more...)

Genre: Research Report (0.50)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Text Processing (0.94)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.88)

Panigrahi, Damodar, Anderson, William, Whitman, Joshua, Mittal, Sudip, Blakely, Benjamin A

REGARD: Rules of EngaGement for Automated cybeR Defense to aid in Intrusion Response

Automated Intelligent Cyberdefense Agents (AICAs) that are part Intrusion Detection Systems (IDS) and part Intrusion Response Systems (IRS) are being designed to protect against sophisticated and automated cyber-attacks. An AICA based on the ideas of Self-Adaptive Autonomic Computing Systems (SA-ACS) can be considered as a managing system that protects a managed system like a personal computer, web application, critical infrastructure, etc. An AICA, specifically the IRS components, can compute a wide range of potential responses to meet its security goals and objectives, such as taking actions to prevent the attack from completing, restoring the system to comply with the organizational security policy, containing or confining an attack, attack eradication, deploying forensics measures to enable future attack analysis, counterattack, and so on. To restrict its activities in order to minimize collateral/organizational damage, such an automated system must have set Rules of Engagement (RoE). Automated systems must determine which operations can be completely automated (and when), which actions require human operator confirmation, and which actions must never be undertaken. In this paper, to enable this control functionality over an IRS, we create Rules of EngaGement for Automated cybeR Defense (REGARD) system which holds a set of Rules of Engagement (RoE) to protect the managed system according to the instructions provided by the human operator. These rules help limit the action of the IRS on the managed system in compliance with the recommendations of the domain expert. We provide details of execution, management, operation, and conflict resolution for Rules of Engagement (RoE) to constrain the actions of an automated IRS. We also describe REGARD system implementation, security case studies for cyber defense, and RoE demonstrations.

artificial intelligence, intermediate action, machine learning, (18 more...)

2305.13967

Country:

North America > United States > Mississippi > Mississippi County > Mississippi State (0.04)
North America > United States > Iowa > Polk County > Ankeny (0.04)
Europe > France > Île-de-France > Paris > Paris (0.04)
Europe > Austria > Vienna (0.04)

Genre: Overview (0.93)

Industry:

Information Technology > Security & Privacy (1.00)
Government > Regional Government > North America Government > United States Government (1.00)
Government > Military > Cyberwarfare (1.00)

Technology:

Information Technology > Security & Privacy (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Rule-Based Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Expert Systems (0.88)

Message Intercommunication for Inductive Relation Reasoning

Liang, Ke, Meng, Lingyuan, Zhou, Sihang, Wang, Siwei, Tu, Wenxuan, Liu, Yue, Liu, Meng, Liu, Xinwang

Inductive relation reasoning for knowledge graphs, aiming to infer missing links between brand-new entities, has drawn increasing attention. The models developed based on Graph Inductive Learning, called GraIL-based models, have shown promising potential for this task. However, the uni-directional message-passing mechanism hinders such models from exploiting hidden mutual relations between entities in directed graphs. Besides, the enclosing subgraph extraction in most GraIL-based models restricts the model from extracting enough discriminative information for reasoning. Consequently, the expressive ability of these models is limited. To address the problems, we propose a novel GraIL-based inductive relation reasoning model, termed MINES, by introducing a Message Intercommunication mechanism on the Neighbor-Enhanced Subgraph. Concretely, the message intercommunication mechanism is designed to capture the omitted hidden mutual information. It introduces bi-directed information interactions between connected entities by inserting an undirected/bi-directed GCN layer between uni-directed RGCN layers. Moreover, inspired by the success of involving more neighbors in other graph-based tasks, we extend the neighborhood area beyond the enclosing subgraph to enhance the information collection for inductive relation reasoning. Extensive experiments on twelve inductive benchmark datasets demonstrate that our MINES outperforms existing state-of-the-art models, and show the effectiveness of our intercommunication mechanism and reasoning on the neighbor-enhanced subgraph.

data mining, machine learning, subgraph, (20 more...)

2305.14074

Country:

North America > United States > California > San Francisco County > San Francisco (0.14)
Europe > Austria > Vienna (0.14)
North America > United States > New York > New York County > New York City (0.04)
(15 more...)

Genre: Research Report (0.84)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Data Science > Data Mining (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)
(3 more...)